INFORMATION STORAGE AND NEURAL CONTROL

Tenth Annual Scientific Meeting of the Houston Neurological Society

Jointly Sponsored by the Department of Neurology, Baylor University College of Medicine, Texas Medical Center, Houston, Texas

Compiled and Edited by

WILLIAM S. FIELDS, M.D., Professor and Chairman, Department of Neurology, Baylor University College of Medicine

and

WALTER ABBOTT, Ph.D., Assistant Professor of Epidemiology; Director, Biomathematics Research Laboratory, Baylor University College of Medicine

CHARLES C THOMAS • PUBLISHER
Springfield • Illinois • U.S.A.

Published and Distributed Throughout the World by CHARLES C THOMAS • PUBLISHER, Bannerstone House, 301-327 East Lawrence Avenue, Springfield, Illinois, U.S.A.

This book is protected by copyright. No part of it may be reproduced in any manner without written permission from the publisher.

© 1963, by CHARLES C THOMAS • PUBLISHER

Library of Congress Catalog Card Number: 62-20580

With THOMAS BOOKS careful attention is given to all details of manufacturing and design. It is the Publisher's desire to present books that are satisfactory as to their physical qualities and artistic possibilities and appropriate for their particular use. THOMAS BOOKS will be true to those laws of quality that assure a good name and good will.

Printed in the United States of America

CONTRIBUTORS

Gregory Bateson, M.A.: Ethnologist, Veterans Administration Hospital, Palo Alto, California; Professor (Visiting), Department of Anthropology, Stanford University, Stanford, California.

Mary A. B. Brazier, D.Sc.: National Institutes of Health Career Professor at the Brain Research Institute, University of California, Los Angeles, California.

Neil R. Burch, M.D.: Associate Professor of Psychiatry, Baylor University College of Medicine, Houston, Texas.

Harold E. Childers: Assistant Professor of Biophysics, Baylor University College of Medicine, Houston, Texas.

James E. Darnell, Jr., M.D.: Department of Biology, Division of Microbiology, Massachusetts Institute of Technology, Cambridge, Massachusetts.

Harrison Echols, Ph.D.: Assistant Professor, Department of Biochemistry, College of Agriculture, The University of Wisconsin, Madison, Wisconsin.

Ralph W. Gerard, M.D., Ph.D.: Director of Laboratories, Mental Health Research Institute, The University of Michigan, Ann Arbor, Michigan.

Robert T. Gregory, Ph.D.: Associate Professor of Mathematics; Senior Research Mathematician, Computation Center, The University of Texas, Austin, Texas.

E. Roy John, Ph.D.: Professor and Director, Center for Brain Research, The University of Rochester, College of Arts and Science, Rochester, New York.

Saul Kit, Ph.D.: Biochemist, Department of Biochemistry; Head, Section of Nucleoprotein Metabolism, The University of Texas M. D. Anderson Hospital and Tumor Institute, Houston, Texas.

Robert K. Lindsay, Ph.D.: Assistant Professor of Psychology; Research Scientist, Computation Center, The University of Texas, Austin, Texas.

Warren S. McCulloch, M.D.: Head, Neurophysiology Group, Division of Sponsored Research, Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts.

James G. Miller, M.D.: Director, Mental Health Research Institute, The University of Michigan, Ann Arbor, Michigan.
Frank Morrell, M.D.: Professor of Neurology, Stanford University School of Medicine, Palo Alto, California.

Bernard C. Patten, Ph.D.: Associate Professor of Marine Science, Virginia Institute of Marine Science, College of William and Mary, Gloucester Point, Virginia.

Bernard Saltzberg: Senior Scientist, The Bissett-Berman Corporation, Santa Monica, California.

FOREWORD

This volume, entitled Information Storage and Neural Control, is compiled from the proceedings of the Tenth Annual Scientific Meeting of the Houston Neurological Society. This meeting, like its predecessors, was concerned with the exploration of a specific area of current biomedical investigation. For some of those persons who may have occasion to read the contributions presented here by scientists in various disciplines, there may be little that is immediately applicable in clinical medicine. Many of the concepts and techniques which are described relate at this time only to fundamental research, but there is no doubt that in the future a better appreciation of these facts will be exceedingly important to clinicians.

Progress in the biological sciences has been impeded to a considerable extent by our inability to obtain objective quantitative data in many critical areas of research. Biostatisticians and geneticists were among the first to recognize this serious defect and to make attempts to fill in the gaps. The concept of varying information content was introduced when I — the information value of a group of observations — was defined as the reciprocal of the variance of the data. At first glance, this concept appears to be in direct conflict with the modern idea of high information content for a low probability datum. This need not, necessarily, be the case, since a narrow range of variation implies inclusion of low probability observations from the extremes of the normal curve, with the correspondingly high information content of these low probability observations.

The next important forward step resulted from the application in the biological sciences of physical and chemical laws derived from the exact sciences. This period began with the publication of A. J. Lotka's Elements of Physical Biology in 1925, in which the theoretical concepts of modern mathematics, physics, and physical chemistry were applied rigorously to models of biological systems. Many of the laws and theories now widely accepted in various special areas of research can be traced to this basic work. For example, the studies of Gause and Witt on competitive action in biological systems constitute experimental verification of several of the models described by Lotka. The modern era of physical biology, or biological physics, cannot be dated precisely, but great impetus was given to this field of investigation by the publication of Shannon's work, The Mathematical Theory of Communication, in 1948. The full impact of this monumental contribution is only now beginning to be realized.

The symposium from which these papers were compiled was organized for the specific purpose of presenting to both basic scientists and clinicians a spectrum of applications of information theory in biology. The audience, as well as the contributors, represented a diversity of disciplines including mathematics, physics, chemistry, virology, ecology, physiology, and several fields of clinical medicine, such as neurology, psychiatry, and internal medicine.
It is hoped that this volume will serve as a source of reference for clinicians and basic scientists alike.

We wish to acknowledge the continued support of Dr. Hampton C. Robinson, whose financial aid has made possible the presentations of these symposia. Assistance in underwriting publication costs of the proceedings has been given to us by the M. B. and Fannie Finkelstein Foundation. We also wish to express our appreciation to Dr. Wayne H. Holtzman, Director of The Hogg Foundation for Mental Health, Austin, Texas, for his helpful suggestions in the formulation of the program. We are grateful for the wonderful cooperation given us by the contributors to this volume and for the editorial assistance of Thelma Armstrong and Joan Chambers.

W. S. F.
W. A.

CONTENTS

Contributors
Foreword

Part I — Introduction
Moderator: William S. Fields, M.D.
I. What Is Information Theory? — Bernard Saltzberg
Discussion of Chapter I
II. Binary Representation of Information — Robert T. Gregory
III. Information Processing Theory — Robert K. Lindsay

Part II — Information in Biological Systems
Moderator: Heather D. Mayor, Ph.D.
IV. Genetic Control of Protein Synthesis — Harrison Echols
Discussion of Chapter IV
V. Coding by Purine and Pyrimidine Moieties in Animals, Plants, and Bacteria — Saul Kit
Discussion of Chapter V
VI. Virus Action and Replication — James E. Darnell, Jr.
Discussion of Chapter VI
VII. The Information Concept in Ecology: Some Aspects of Information-Gathering Behavior in Plankton — Bernard C. Patten
Discussion of Chapter VII
VIII. Exchange of Information About Patterns of Human Behavior — Gregory Bateson
Discussion of Chapter VIII

Part III — Neurophysiological Aspects of Information Storage and Transfer
Moderator: Hebbel E. Hoff, M.D., Ph.D.
IX. Information Storage in Nerve Cells — Frank Morrell
X. How Can Models From Information Theory Be Used in Neurophysiology? — Mary A. B. Brazier
Discussion of Chapter X
XI. Neural Mechanisms of Decision Making — E. Roy John
Discussion of Chapter XI
XII. Anastomotic Nets Combating Noise — Warren S. McCulloch
Discussion of Chapter XII

Part IV — The Human Nervous System
Moderator: Wayne H. Holtzman, Ph.D.
XIII. The Individual as an Information Processing System — James G. Miller
XIV. Information Processing in the Time Domain — Neil R. Burch and Harold E. Childers
Discussion of Chapter XIV

Part V — Summary and General Discussion
Moderator: Ralph W. Gerard, M.D., Ph.D.
XV. Summary — Ralph W. Gerard
General Discussion

Appendix A
Introduction — Michael H. Arbib
A Logical Calculus of the Ideas Immanent in Nervous Activity — Warren S. McCulloch and Walter H. Pitts

Index

INFORMATION STORAGE AND NEURAL CONTROL

PART I — INTRODUCTION

Moderator: William S. Fields, M.D.

CHAPTER I

WHAT IS INFORMATION THEORY?

Bernard Saltzberg

INTRODUCTION

The purpose of this paper is to describe the principles underlying the quantitative aspects of the storage and communication of information so that a better understanding may be gained of the nature of efficient information storage, with its attendant implications in coding and control processes, including neural control. Insofar as possible, the discussion will avoid abstract mathematical arguments and will be directed to those with little or no previous acquaintance with probability or information theory.
INFORMATION MEASURE

Although information theory is an essentially mathematical subject, a basic understanding of the underlying principles can be acquired without resorting to complex mathematical arguments. In simple qualitative terms, information, as defined by Shannon,* is merely a measure of how much uncertainty has been removed by the receipt of a message. For example, if you are told that the baby Dr. Jones delivered today is a boy, then you have been given one bit of information** (by definition one bit is the amount of information necessary to resolve two equally likely alternatives). If the uncertainty is greater, the amount of information necessary to remove it is greater. Therefore, a message which identifies one of 32 equally likely alternatives contains more information (five bits, as we shall explain later) than a message which resolves 16 equally likely alternatives (four bits).

*The formal development of information theory originated in the work of Claude E. Shannon of Bell Telephone Laboratories, who published his fundamental paper, "The Mathematical Theory of Communication," in 1948. In this paper he set up a mathematical scheme in which the concepts of the production and transmission of information could be defined quantitatively. Historically, however, Shannon's work stems from certain early basic observations in theoretical physics concerning entropy. Boltzmann (1894) observed that entropy is related to "missing information," inasmuch as it is related to the number of alternatives which remain possible to a physical system after all the macroscopically observable information concerning it has been recorded. Leo Szilard (1925) extended this idea to a general discussion of information in physics, and von Neumann (1932) treated information in quantum mechanics and particle physics. Information theory, as developed by Shannon, connects more directly with certain ideas generated about thirty years ago by H. Nyquist and R. V. L. Hartley, both of Bell Telephone Laboratories. Professor Norbert Wiener's work in the study of Cybernetics, which deals mainly with the use of information to effect certain control actions, has been a major impetus in applying information theory to biological and central nervous system phenomena.

**Information theory does not deal with the importance of the information in a message. For example, the information in the message, "the baby is a boy," is one bit independent of whether you are the father. This comment is made in order to emphasize the fact that information theory does not deal with the subjective value of information, which falls more properly into the domain of semantics, but rather with objective measures of information.

Some elementary examples to convey the basic notions of information measure may be helpful in developing some qualitative insights. Consider a simple game in which you are asked to guess a number with possible values from one to eight. With no a priori knowledge of which of these numbers is the correct choice, the probability of guessing the correct number is 1/8. In the language of information theory, this situation might be described as follows: a system is in one of eight equally probable states, and the state of the system is completely unknown to the receiver. It is appropriate to ask how much information is conveyed to the receiver by completely resolving the uncertainty of the receiver's knowledge.

Let us designate the eight equally probable states of the system by the numbers from one to eight. Assume that you are to ask only binary questions, i.e., questions which admit only of a yes or no answer, in an attempt to determine the state of this system. It is a simple matter to discover that the minimum number of such questions certain to establish the state of the system is three.
In this simple illustration we have introduced the basic concepts from which the quantitative definition of information can be formulated: namely, the number of equally probable states of a system (in this case eight), the number of alternatives resolved by each question (two, because of the binary nature of the question), and the minimum number of questions necessary to determine the state of the system (three in this case). It is easily seen that the relationship between these numbers is 2^3 = 8. In the vernacular of information theory, we say that three bits of information are necessary to determine the state of such a system; i.e., three appropriately chosen questions, each of which resolves two alternatives, usually designated as 1 or 0, corresponding to yes or no, are all that is necessary to reduce indeterminacy to certainty.

The problem of choosing the appropriate questions is analogous to that of choosing a good code. For example, asking the question, "Is the number 3?" would correspond to very inefficient coding of information. Phrasing or coding questions in this way would require that you be allowed to ask eight questions in order to be certain to determine the state of the system. In this illustration a correct way of coding or phrasing the questions would be as follows: Question 1: "Is the number greater than 4?" If yes, then ask Question 2: "Is the number greater than 6?" If no, then ask Question 3: "Is the number 5?" If no, then the system must be in state 6.

As previously mentioned, the probability of having guessed the correct state before receiving these three bits of information was 1/8 in the example used. After the first bit of information is received the probability of guessing correctly is increased from 1/8 to 1/4, after the second bit from 1/4 to 1/2, and after the third bit from 1/2 to 1. Thus, each successive bit received has reduced our uncertainty as to the state of the system until all the uncertainty is removed. In this example, the receipt of any more information is unnecessary or redundant. However, as we shall discuss later, the redundancy may be useful in correcting errors due to noise in the communication channel.

In order to use a more general illustration which is not restricted to a system with equally probable states, let us consider the game of Twenty Questions. In most situations the probabilities of some states, i.e., the possible set of objects to be identified, are higher than the probabilities of others. A good information theorist with some a priori knowledge of the probabilities of these states would ask questions in accordance with his a priori knowledge of these probability states. Information theory as well as intuition tells us that a good strategy would be to inquire about the most likely probability states first. This point might be more clearly illustrated by considering the information storage problem, which is equivalent in principle.
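The halving strategy just described is easy to verify by computation. A minimal Python sketch follows; the function name and the simulation are illustrative additions, not part of the original text.

```python
import math

def guess_by_halving(secret, low=1, high=8):
    """Count the yes/no questions needed to identify `secret` by repeated halving."""
    questions = 0
    while low < high:
        mid = (low + high) // 2
        questions += 1
        if secret > mid:        # "Is the number greater than mid?"
            low = mid + 1
        else:
            high = mid
    return questions

# Three questions always suffice for eight equally probable states...
print([guess_by_halving(n) for n in range(1, 9)])   # [3, 3, 3, 3, 3, 3, 3, 3]

# ...which is exactly the logarithmic measure of the uncertainty removed.
print(math.log2(8))                                 # 3.0 bits
```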
INFORMATION STORAGE

Mathematically, there is no important difference between the application of information theory to communications systems through which information flows continuously and to static systems used for storing information. The problem of storing information is essentially one of making a representation. The representation can take any form as long as the original or something equivalent to it can be reconstructed at will. It is clear, for example, that even though information exists as sound, there is no need to store it acoustically. There is no objection to the use of a reversible code, since information is invariant under such a transformation and, therefore, can be stored equally well electrically or magnetically, as for example on a recording tape. We simply have to insure that every possible event to be recorded can be represented in the store. This implies that an empty store must merely be capable of being put into different states and that the precise nature of these states is quite immaterial to the question of how much information can be stored.

Thus, the capacity of an empty information store depends only on the total number of distinguishable states of which it admits. Hence, the larger the number of states, the larger the capacity. If a storage unit such as a knob with click positions has n possible states, then two such units provide altogether n^2 states. From this it is clear that duplication of the basic units is a powerful way to increase storage capacity. Since, physically, it is generally easier to make two n-state devices than one single device with n^2 states, practical storage systems will generally be found to consist of a multiplicity of smaller units. Thus, 1000 two-state devices can provide a total of 2^1000 possible states.

The exponential dependence of the number of states on the number of units immediately suggests a logarithmic measure of information capacity and, in fact, the information capacity of a storage system is defined by the equation C = log n, where n is the number of distinguishable states. This makes the capacity of a compound storage system equal to the capacity of a basic storage unit multiplied by the number of units in the system. If the logarithm is taken to the base 2, then C is the equivalent number of binary storage units (bits); and if the logarithm is taken to base 10, then the information capacity is given in units called Hartleys. For example, the capacity of a knob with 32 click positions is equal to that of five two-position switches (five bits). A ten-position knob, on the other hand, has a capacity of one Hartley, and two ten-position knobs capable of being placed in 100 different states have an information capacity of two Hartleys. Since storage elements which are binary in nature (two positions) are much less susceptible to error and are easier to mechanize, it is more common to deal with binary units (bits) of information than with decimal units of information (Hartleys).
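The capacity figures quoted above all follow from C = log n. A brief Python sketch of the arithmetic (the helper names are illustrative, not the author's):

```python
import math

def capacity_bits(n_states):
    """Capacity C = log2(n) of a store with n distinguishable states, in bits."""
    return math.log2(n_states)

def capacity_hartleys(n_states):
    """Capacity C = log10(n), in Hartleys (decimal units)."""
    return math.log10(n_states)

print(capacity_bits(32))          # 5.0 -- a 32-position knob equals five two-position switches
print(capacity_hartleys(10))      # 1.0 -- a ten-position knob stores one Hartley
print(capacity_hartleys(100))     # 2.0 -- two ten-position knobs together
print(1000 * capacity_bits(2))    # 1000.0 -- 1000 two-state devices, i.e. 2**1000 states
```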
So far we have discussed information storage and, correspondingly, information capacity. There is an important distinction, however, between information capacity and information content. The information content of a message may be defined as the minimum capacity required for storage. To illustrate this important point, consider a two-state message such as a reply to some question which admits only yes or no. If someone in this auditorium is asked, "Are you a doctor?", then a reply admits of two possible message states and it will certainly be possible to store the reply in one binary storage unit. Intuition tells us that the message contains one bit of information, for, by itself, it cannot be stored any more efficiently. However, our previous discussion has demonstrated that a bit of information should substantially reduce uncertainty. In view of the fact that most of the people in this auditorium are doctors, I could simply guess "yes" for each person questioned and be correct most of the time. Thus, one would expect the average information per question to be less than one bit, as indeed it is.

Using a numerical example from Woodward, suppose that 128 people in this auditorium are questioned and the 128 binary messages have to be stored in a system of binary storage units. (It will be assumed that we are interested in preserving the exact order of the replies and not simply in counting the number of yeses and noes.) Proceeding in the most obvious manner and using one storage unit for each message, we should set down a sequence such as this: YYYYYYYYNYYYYYYYYYYYYYYYNYYY . . . The question, "Are you a doctor?", expects the answer yes from this medical group, and of the 128 messages there will be only one or two no states. Therefore, it would be more economical to store the positions of the noes in the sequence and convert the numbers 9 and 25, corresponding to the noes in the above sequence, into the binary forms 0001001 and 0011001. Seven digits are allowed because there are 2^7 = 128 messages altogether. Thus, the sequence could be coded into the sequence 00010010011001 . . . This makes use of binary storage units just as in the original sequence, but a much smaller number of them. It is understood, as part of the code, that decoding proceeds in blocks of seven; this avoids marking off groups of digits, which would violate the binary form.

The preceding code, which is only one of many that could be devised, shows that a set of two-state messages can sometimes be stored in such a way that each message occupies, on the average, less than one bit of storage capacity. From such considerations, the following definition of information content is suggested:

I = -Σ p_i log p_i   (the sum taken over i = 1, . . ., n)

where p_i = probability of the ith state and n = total number of states. If all n states are equally probable, then it follows that p_i = 1/n for all values of i. Thus, substituting p_i = 1/n into the expression for I, we note that I = log n. From this it follows that if all n states are equally probable, the information content is exactly equal to the information capacity of a store with n states. In other words, if all the states are equally probable, then it is not possible to store the information any more efficiently than one bit of information per message, on the average. If there are some preferred states, i.e., if the p_i are not equally probable, then it can be shown that the average information per message can range from zero to one bit. Zero information corresponds to the condition where a single state has unity probability and all the other states have probability zero. As stated before, the other extreme is attained when all the states are equally probable. In other words, on the average, one must receive more information to resolve fully the states of a completely random system (all states equally probable) than to resolve the states of a less random system (all states not equally probable).
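The definition of information content can be checked numerically against Woodward's example. The sketch below assumes, for illustration, two "no" replies among the 128; the function name is not from the original text.

```python
import math

def entropy_bits(probs):
    """Information content I = -sum(p_i * log2 p_i) of a set of state probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

p_no = 2 / 128                                   # assumed: two noes among 128 replies
print(entropy_bits([1 - p_no, p_no]))            # ~0.116 bits per reply, well under one bit
print(entropy_bits([0.5, 0.5]))                  # 1.0 bit when both states are equally probable
print(entropy_bits([1.0, 0.0]))                  # 0.0 bits when one state is certain

# 128 replies at ~0.116 bits each come to about 15 bits, close to the
# 14 binary digits used when only the two "no" positions are stored.
print(128 * entropy_bits([1 - p_no, p_no]))      # ~14.9
```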
COMMUNICATION OF INFORMATION

Let us now consider information theory as it pertains to the communication of information. For this purpose, we define information received as the difference between the state of knowledge of the recipient before and after the communication. In more precise terms, information received is given by:

I = log (p_a / p_b)

where I = information received, p_a = probability of the event at the receiver after the message is received, and p_b = probability of the event at the receiver before the message is received.

In receiving a message regarding the sex of a baby, for example, this expression implies that if the receiver does not know the baby's sex, then p_b = 1/2, and if you (the receiver) receive a signal that "the baby is a boy," then p_a = 1 (provided the message is not noisy) and, therefore, I = log (1 / (1/2)) = log 2 = 1 bit. If the message were a noisy one, then you might not be quite certain that you received the signal for "boy" correctly. You may nevertheless be willing to give four to one odds that it is a boy, based on the noisy signal you received. In this case p_a = .8 and, thus, I = log (.8 / .5) = log 1.6 = .68 bits, which demonstrates the quantitative reduction in information due to noise. In the case of no noise it is clear that p_a is always unity and I = -log p_b.

The important problems in the communication of information are, however, concerned with the effects of noise. The maximum amount of information that can be sent through a communication channel in the presence of noise is a topic of particular usefulness which we shall examine briefly. Getting back to the expression I = log (p_a / p_b), it is interesting to note the implications of this definition. If, for example, a communication system is so noisy that the message has not reduced the receiver's uncertainty as to the event (i.e., p_a = p_b), then I = log 1 = 0 and no information has been received. Thus, it is seen that a communication does not necessarily convey any information. The communication must reduce the recipient's uncertainty as to the events in question in order to convey information. The mathematical definition I = log (p_a / p_b) is, therefore, consistent with intuitive requirements for a measure of information.
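A short numerical check of the definition I = log (p_a / p_b), with logarithms taken to base 2; the function name is an illustrative assumption.

```python
import math

def info_received(p_before, p_after):
    """Information received, I = log2(p_a / p_b)."""
    return math.log2(p_after / p_before)

print(info_received(0.5, 1.0))   # 1.0  bit  -- noiseless "the baby is a boy"
print(info_received(0.5, 0.8))   # ~0.68 bits -- noisy signal, four-to-one odds
print(info_received(0.5, 0.5))   # 0.0  bits -- the message removed no uncertainty
```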
One of the important problems in communication theory has to do with the maximum rate at which information can be sent over a communication channel which is disturbed by random noise. This problem has fundamental implications for information transfer rates in biological systems as well as for the neurophysiological aspects of information transfer, which are to be treated in later papers at this symposium. In order to discuss this problem, it is necessary to define a few terms, namely: B = the bandwidth of the communication channel (this defines the range of frequencies which can pass through a system), S = received effective signal power, and N = received effective noise power.

In any communication system, the message from which the recipient derives information is a combination of signal plus noise. It can be shown (not without some mathematical difficulty, however) that the maximum rate at which information can be sent through a channel — which is 1) signal power limited by S, and 2) disturbed by random noise of power N — is given by R = B log (1 + S/N). In other words, the maximum information that can be sent in a time T is RT, or I = BT log (1 + S/N). The important implication of these formulae in the design of communication systems resides in the fact that S/N, the signal-to-noise ratio, is a function of B, the bandwidth of the channel. Therefore, if one determines the dependence of signal-to-noise ratio on bandwidth, it is possible to achieve a tradeoff between S/N and B, which optimizes the information handling capacity of the system.
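The formula R = B log (1 + S/N) is straightforward to evaluate. In the sketch below the bandwidth and the 30 dB signal-to-noise ratio are arbitrary illustrative values, not figures from the text.

```python
import math

def channel_capacity(bandwidth_hz, signal_power, noise_power):
    """Maximum rate R = B * log2(1 + S/N), in bits per second."""
    return bandwidth_hz * math.log2(1 + signal_power / noise_power)

snr = 10 ** (30 / 10)                    # an assumed 30 dB signal-to-noise ratio (S/N = 1000)
print(channel_capacity(3000, snr, 1))    # ~29,900 bits per second for a 3,000-cycle channel
print(channel_capacity(1500, snr, 1))    # ~14,950 bits per second when the bandwidth is halved
```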
EQUIVOCATION

This leads us to the more involved concepts of equivocation and channel capacity and to Shannon's basic theorems on error correction. The previously mentioned maximum rate at which information can be sent through a channel, usually referred to as the channel capacity C, is intimately related to these ideas and, therefore, requires some elaboration and clarification.

As Shannon has stated, it may seem surprising that we should define a definite capacity C for a noisy channel, since we can never send certain (i.e., probability equal one) information over such a channel. It is clear, however, that by sending the information in a redundant form, the probability of errors can be reduced. For example, by repeating the message many times and by a statistical study of the different versions of the message, the probability of errors can be made very small. One would expect, however, that to make this probability of errors approach zero, the redundancy of the encoding must increase indefinitely and the rate of transmission must therefore approach zero. This is by no means true. If it were, there would not be a well-defined capacity, but only a capacity for a given frequency of errors or a given equivocation, the capacity going down as the error requirements are made more stringent. Actually, the capacity C defined earlier has a very definite significance. It is possible to send information at the rate C through the channel, with as small a frequency of errors or equivocation as desired, by proper encoding. This statement is not true for any rate greater than C. If an attempt is made to transmit at a higher rate than C, then there will necessarily be an equivocation equal to or greater than the excess.

To clarify the concept of equivocation, let us suppose there are two possible symbols, 0 and 1, and that we are transmitting at a rate of 1,000 symbols per second with probabilities p_0 = p_1 = 1/2. Thus, our source is producing information at the rate of 1,000 bits per second (Shannon refers to this as the entropy of the source). During transmission, noise introduces errors so that, on the average, one symbol in 100 is received incorrectly (a 0 as 1, or 1 as 0). What is the rate of transmission of information? Certainly less than 1,000 bits per second since about one per cent of the received symbols are incorrect. Our first impulse might be to say the rate is 990 bits per second, merely subtracting the expected number of errors. This is not satisfactory since it fails to take into account the recipient's lack of knowledge of where the errors occur.

We may carry this to an extreme case and suppose the noise so great that the received signals are entirely independent of the transmitted signals. The probability of receiving 1 is one-half whatever was transmitted, and the same is true for zero. Since about one-half of the received symbols are correct due to chance alone, we could give the system credit for transmitting 500 bits per second while actually no information was being transmitted at all. Equally good transmission would be obtained by dispensing with the channel entirely and flipping a coin at the receiving end.

The proper correction to apply to the amount of information transmitted is the uncertainty of what was actually sent after we have received a signal. This reduction in received information is the conditional entropy of the message and is called the equivocation. It measures the average ambiguity of the received signal or, in other words, the average uncertainty in the message when the signal is known. For definiteness, let us calculate the equivocation of the first example. In this example, noise caused an error in about one out of each 100 symbols, so that if a zero was received, the a posteriori probability that a zero was transmitted is .99 and that a 1 was transmitted is .01. The equivocation, or the uncertainty associated with each symbol, is exactly the entropy associated with these conditional probabilities. Thus,

Equivocation per symbol = -[.99 log .99 + .01 log .01] = .081 bits.

Since the source is producing information at a rate of 1,000 bits per second, the equivocation rate is 1,000 × .081 = 81 bits per second. Therefore, we may say that the system is transmitting at a rate of 1,000 - 81 = 919 bits per second. Again, in the extreme case where a 0 is equally likely to be received as a 0 or 1 and a 1 as a 1 or 0, the a posteriori probabilities are 1/2 and 1/2, and

Equivocation = -[1/2 log 1/2 + 1/2 log 1/2] = 1 bit per symbol,

or 1,000 bits per second. The rate of transmission is then zero, as it should be. These examples have demonstrated that noise causes a reduction in received information and have shown precisely how this loss in information is measured.

Before leaving this subject, I would like to quote a theorem (due to Shannon) which emphasizes why this quantitative measure, called the equivocation, is so important. Shannon has shown that (in a noisy communication system) if a correction channel is added which has a capacity equal to the equivocation of the system, then it is possible to encode the correction data so as to send it over this channel and correct all but an arbitrarily small fraction of the errors. This is not possible if the channel capacity is less than the equivocation. Roughly then, the equivocation may be considered as the amount of additional information that must be supplied per second at the receiving point to correct the received message.
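The two equivocation figures computed above can be reproduced in a few lines; the helper function is an illustrative addition.

```python
import math

def binary_entropy(p):
    """Entropy -[p*log2(p) + (1-p)*log2(1-p)] of a two-state distribution, in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

equivocation = binary_entropy(0.01)        # one symbol in 100 received incorrectly
print(equivocation)                        # ~0.081 bits per symbol
print(1000 * (1 - equivocation))           # ~919 bits per second actually transmitted

print(binary_entropy(0.5))                 # 1.0 bit per symbol: pure noise, zero net rate
```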
CONCLUDING REMARKS

This paper has dealt primarily with some of the basic aspects of the statistical theory of information. Very few comments have been made regarding semantic information, not because this subject is unimportant, but rather because there is at present no sound quantitative theory for treating semantic information. In concluding, however, I would like to remark that statistical information theory has relevance to semantics insofar as it tells us what confidence we can place in the accuracy of the information we receive as opposed to the information sent. The significance or value of the information to the recipient does not fall within the domain of the quantitative measures provided by information theory.

With reference to the theme of this symposium, one might say that information theory provides insight for analyzing and improving storage and communication processes, but does not unravel the bewildering complexities associated with significance, meaning, or value judgments. From my personal experience with the problems of physiological signal analysis, this fact lies at the core of the difficulties which the life sciences face in applying information theory to their problems. Finding significant factors in a maze of statistical information is an immensely challenging problem in medicine as well as in many other fields. The problems require both an intelligent application of information theory and a thorough knowledge of the phenomena being studied so that good questions can be asked in the right way to enhance the probability of getting a useful answer.

REFERENCES

1. Bell, D. A.: Information Theory and Its Engineering Applications. New York, Sir Isaac Pitman and Sons, Ltd., 1956.
2. Cherry, Colin: On Human Communication — A Review, A Survey, and A Criticism. Massachusetts Institute of Technology, Technology Press, 1957.
3. Feinstein, Amiel: Foundations of Information Theory. New York, McGraw-Hill, 1958.
4. Gabor, D.: Lectures on Communication Theory. Massachusetts Institute of Technology Research Laboratory of Electronics, Technical Report No. 238, 1952.
5. Goldman, Stanford: Information Theory. Englewood Cliffs, New Jersey, Prentice-Hall, Inc., 1952.
6. Khinchin, A. I.: Mathematical Foundations of Information Theory. New York, Dover Publications, Inc., 1957.
7. Schwartz, Mischa: Information Transmission, Modulation and Noise. New York, McGraw-Hill, 1959.
8. Shannon, Claude E., and Weaver, Warren: The Mathematical Theory of Communication. Urbana, The University of Illinois Press, 1949.
9. Stumpers, F. L. H. M.: Interpretation and Communication Theory. Laboratoria N. V. Philips Gloeilampenfabrieken, Eindhoven, Holland, 1959.
10. Woodward, P. M.: Probability and Information Theory, With Applications to Radar. New York, Pergamon Press, 1953.

DISCUSSION OF CHAPTER I

Heather D. Mayor (Houston, Texas): In case one wishes to draw analogies from physics rather than from thermodynamics, can you clarify something? Could we equate your conditional entropy with, say, the Heisenberg uncertainty principle and your noise ratio with the perturbations introduced in measuring the system?

Bernard Saltzberg (Santa Monica, California): Yes; in terms of information measure, uncertainty or conditional entropy, and the noise which gives rise to the uncertainty (i.e., equivocation) are aspects of essentially equivalent ideas.

Mayor: And the Bohr generalized complementary principle in biological systems — would that fit, too, with your intrinsic concepts? For example, if we can find the exact position of a microorganism, it is difficult at the same time to establish with certainty another parameter, such as its size. In a biological system, would this approach fit with your generalized entropy concept?

Saltzberg: The principles involved in applying generalized entropy concepts or information theory to biological systems and quantum mechanical systems are not altered in essential ways. There are differences in nomenclature which sometimes conceal these basic similarities.
Mayor: But could you treat them the same way?

Saltzberg: Yes. In fact, all of these applications have an amazingly close parallel to the generalized treatment of entropy in thermodynamics.

Robert R. Ivers (Fargo, North Dakota): Would you define a little better the term noise that you used during your discussion? Is this interference with signals, or is it the interposing of random signals in the system, or is it just general inaccuracy of the system?

Saltzberg: You have asked a very basic question. In order to avoid confusion, I should like to refer to noise as a subclass of a larger class of signals called undesired signals. Undesired signals may be placed in three categories: namely, (a) noise, (b) interference, and (c) distortion. Noise signals may be defined as signals which are not coherent with any signals to which meaning is assigned. Interference may be defined as an undesired signal which is a desired signal in some other system or is coherent with desired signals of some other system. Examples are cross-talk and common channel interferences in broadcast programs. Distortion introduces undesired signals due to effects such as non-linearities or non-flat amplitude vs. frequency transmission characteristics. In my discussion I have been referring to the first of these undesired signals, namely, random noise.

Mayor: In actually performing measurements on your system, you no doubt introduce additional perturbations which could be considered as "noise." Would you consider this a valid parallel?

Saltzberg: This sort of parallel seems reasonable to me. If you add an element of indeterminacy to the state of the system, you may consider this as due to noise. There are some deep questions as to what constitutes noise in systems, and these cannot be treated in qualitative terms or in brief comments.

Arthur Shapiro (New York, New York): Along the same line, how would you treat what happens if you read onto a transmission line a page from a table of random numbers and then another page from a table of random numbers? Are you transmitting information, and how would you determine how much?

Saltzberg: Yes. You are always transmitting information whenever you convey a message, unless the noise is so great that the equivocation of the system is equal to the information content of the source. Your question apparently refers to the importance of the information. A table of random numbers may be useless to the receiver, but, nevertheless, statistical information has been communicated.

Shapiro: Then it is not really true, as you started out by saying, that the meaning of what you transmit has nothing to do with how much information is transmitted. The meaning apparently has a great deal to do with how much information is transmitted.

Saltzberg: Apparently I have caused some confusion. Semantic meaning or the importance of a message is subjective and is not part of statistical information theory. The previously used example applies here. A message announcing the birth of a boy conveys one bit of information to an unknowing receiver independent of whether the receiver is the father or not. Thus, whether a number is taken from a table of random numbers or a table of trigonometric functions has no bearing on the information received, providing the receiver has no a priori knowledge of these numbers.
Does this get involved in your semantic implications? Saltzberg: Surprise, as used in this context, does not have any semantic implications. If a datum or a message identifies one of a thousand possible states, then it has surprise value in the sense that you would have been extremely surprised to have guessed the state without receipt of the information provided by the message. If a system had only two possible states, then you would not be so sur- prised to guess the correct state. Mary A. B. Brazier (Los Angeles, California) : I believe that by surprise value Dr. Abbott means an event of low probability. Myron F. Weiner (Dallas, Texas) : How much must be known of the probabilities, or of the number of probabilities of different messages, or of the number of possible different messages to be conveyed before one can get some idea of what a message is, 20 Information Storage and Neural Control providing one has previously had no information about the system? Saltzberg: If you knew nothing" about the probability states of the messages, then, of course, you would have very little engi- neering data upon which to base an optimum design for a receiving system. This question may pertain to the a priori probabilities which are useful in choosing an appropriate code. This is analogous to the Twenty Question game mentioned previously. If the ques- tioner has some a priori knowledge of the probabilities, he can ask questions in a specific order, depending on the probabilities, and, on the average, will ask fewer questions to get a correct answer than will someone who just asks questions at random. E. Roy John (Rochester, New York): It seems to me that there is a large class of messages in which the a priori probability cannot be evaluated by the receiver. One can think of messages in which the rate of convergence of the total information of the message is not linear for the components of the message and in which the rate of convergence would depend upon the sequence of the components. This might, as a matter of fact, be a charac- teristic difference between certain languages. In a situation where you do not have this advantage of being able to stipulate prob- abilities— in which the probability of a given event is affected by the preceding sequence — it seems to me you must modify your treatment to provide an argument for the bit function, recog- nizing that the information content of a specific event depends on preceding events or context. Could you say something about how you treat this kind of situation, since it seems to be much nearer the situation in which we frequently find ourselves in the nervous system than does the starting point from which you began here. Saltzberg: Your question is a good one. It refers to the effects of inter-symbol influence on information content. The fact that there are transitional probabilities which have to be taken into account in determining the information content in language, for example, is included in the mathematics of information theory. These transitional probabilities have the effect of making the information content of a sentence much less than that calculated by assuming that the sequences of letters and words are inde- pendent of their predecessors. I should comment at this point on What is Information Theory? 21 another aspect of information theory which I have not mentioned before. Information theory is concerned with the properties of ensembles of messages or objects. 
One of the properties which is quite important in scientific analysis is known as ergodicity. In analyzing" many problems, the assumption of ergodicity is one that is a practical necessity rather than a statement of fact relative to the nature of things. However, this simplifies analysis in that it allows one to examine a long time sample of one of the mem- bers of an ensemble and to conclude from this that he knows something about the statistics of the ensemble. This is not always true since, for example, it would imply that the statistics associated with the EEG record of a single individual apply equally well to another subject. If this were the case, then an ensemble of messages composed of the EEG's of many subjects would be an ergodic ensemble. In testing engineering components, one ordinarily takes a single component and tests it for a long period of time and then draws implications about the behavior of all similar components. This is an aspect of statistical analysis and information theory which, when applied to the life sciences, creates a great many problems since one may not be aware that this assumption may underly the mathematical formulation of certain problems. Herman Blustein (Chicago, Illinois): How do you determine the validity of the samples when you analyze the EEG's in this manner and make a generalization from them? Saltzberg: The validity of the sample is not the question here. For example, an EEG record may be sufficiently long to give you a good estimate of its properties for a particular individual. How- ever, unless EEG's of different individuals are statistically similar, or ergodic, this does not allow you to draw any conclusions about the properties of another individual's EEG record. Harold W. Shipton (Iowa City, Iowa) : The way the discus- sion is going means, I think, that we have to say a little more about the properties of noise, because when we deal with formal information theory we use "noise" in exactly the way that we used to use the phrase "Brownian movement." This is quite acceptable. However, when we perform an experiment, we are dealing with band limited noise, and we are also probably dealing with nonrandom perturbations in the system. I would like to hear 22 Information Storage and Neural Control from Dr. Saltzberg whether he wishes to introduce a second term — noise in this physical sense — or whether he would also Hke to consider things wliich are not related to the required signal over a short-time epoch. There is a good example of tliis in the field of EEG analysis. If you repeat an experiment in time, you expect the signal-to-noise ratio to go up as -^/N , but in almost any biological system you will find it goes up by rather more than this simply because our noise is not "noisy," so to speak, in the sense that it is not white. Saltzberg: There are many things which people refer to as noise that are quite diff"erent from one another. The different types of noise have considerably different effects on the informa- tion content of systems. For example, there is distortion which, if reversible, does not reduce the information content of a message at all. Although people commonly refer to this type of distortion as noise, it is not noise in the context of information theory. You have mentioned white noise, which is a special type of random noise, and the chscussion on maximum rate of transmission of information in the presence of noise is applicable to this lype of noise. 
It is important to distinguish between this type of noise and interference, which is sometimes referred to as noise. The basic difference is that random noise is not coherent with any signals to which meaning is assigned, while interference is an undesired signal which is coherent with desired signals of some otlier system. The improved signal-to-noise ratios that you men- tioned for biological systems may have something to do with the ability of biological systems to narrow their noise bandwidths by providing certain kinds of adaptive filtering. Blustein: Is this similar to a television signal in which the audio signal is intact and the video is distorted, and yet one can receive and interpret the signal? Saltzberg: I am not sure of the analogy. I have to beg off on this. Blustein: Does the system have to filter the signals? Saltzberg: If the receiver has some a priori knowledge of what it is looking for, then it can do an excellent job of minimizing the effects of noise. One of the simple ways this is accomplished is by means of frequency filters or correlators. If the information signals occupy a narrow bandwidth, then narrowing the accept- What is Information Theory? 23 ance band of the system by employing filters will reduce the amount of random noise, since random noise occupies all parts of the spectrum. The object is to use spectrum space for the signal information, not the noise. Shapiro: Are there actually two kinds of information involved here, only one of which is treated in this way? Perhaps I should not use the word information for the other kind, but a priori knowledge about the nature of the system which the receiver may have, whether it is a person or a machine, must be important. For example, if the machine or the person knows that nothing is coming over this channel except when some kind of an event occurs, then when a lot of noise, i.e., a lot of signals, comes over, this will be interpreted as meaning a lot of activity in the trans- mitter. On the other hand, if the receiver knows from past experi- ence that this system generates its own noise, then when a lot of noise comes over the channel, the receiver says it does not know what is going on. Only when a clear individual signal comes over will it be interpreted as information. I think that there is another set of values, which I suspect is what you mean by filter theory. Saltzberg: I believe your comment relates to filtering in the time domain. It is possible to design a system which simply stops processing information when the signal is too badly corrupted by noise. If cues, which are essentially a priori information, are avail- able, then it is possible to use various types of time filtering. When you do not have any a priori timing cues for determining when you ought to process information, then frequency filtering is sug- gested, providing you know something about the spectral region which the signals occupy. Peter Kellaway (Houston, Texas): It would help if you could tell us something about your ow^n results. I understand you are interested in analyzing the EEG. What sort of information have you obtained by applying information theory and technique to this type of analysis, and what type of information do you hope to obtain? 
Saltzberg: I can comment on some of the analysis of EEG which was conducted in an attempt to establish how much infor- mation is processed by one technic^ue of EEG analysis as compared with the amount of information which is processed using another 24 Information Storage and Neural Control technique of EEG analysis. The objective of this investigation was to evaluate the information handling capability of zero crossing analysis as compared to frequency analysis. Conventional fre- quency analyzers were compared on an information theoretical basis with the zero crossing analyzers which Dr. Burch of Baylor is using in his EEG research. The particular problem treated was concerned with how much information is abstracted by an analyzer which processes only time point data associated with zero crossings of the record and the time positions of its peaks and points of inflec- tion. Knowing the time resolution which could be achieved with an analyzer of this type, it was possible to establish the frequency resolution that would be required of a frequency analyzer in order to provide the same amount of information. I will admit that this does not add much understanding of neurophysiological processes, but it does give a basis for some engineering decisions relative to whether one type is more effective in abstracting information than another type. It turned out that the resolution that could be achieved with practical period analyzers was much greater than that which could be achieved with practical fre- quency analyzers. Kellaway: Would it matter if it were a physiological signal? The signal that you are using could be anything, could it not? Saltzberg: The signal could well be anything. However, what one would like to achieve is a signal representation and a cor- responding form of analysis emphasizing the physiological effects. For example, if one attempts to extract information by analyzing the coefficients of a Fourier series, the problem may be impossible because the interesting infoi^mation is contained in small per- turbations in many amplitudes of many frequencies. Since similar small perturbations can be caused by noise, the presence of noise would invalidate any physiological correlates being sought. How- ever, if some other parameter associated with a different signal representation were measured, it is possible that physiological effects could cause this parameter to change grossly, which would lead to a higher probability of valid pliysiological correlates. Max E. Valentinuzzi (Atlanta, Georgia): Gan you say any- tliing about the relation between information and organization? Gan we say that these two words are equivalent? You have dealt What is Information Theory? 25 with transmission of information from one point to another. Suppose that now we are not transmitting information, but are organizing a set of elements in a particular pattern or configura- tion. What is the amount of information obtained by the system in the transition from one state to the other? Saltzberg: Whether we talk about the entropy of thermo- dynamics or the information in a message, we are in principle talking about organization, or, more precisely, about the prob- abilities of the various arrangements of the component parts of the system. The second law of thermodynamics states that entropy must increase or, at best, remain constant, which is another way of saying that the system is becoming more disorganized or ran- dom. 
In communication systems, however, upon the receipt of information, the disorganized or uncertain state of our knowledge becomes more certain or better organized; therefore, we may consider received information as negative entropy since it increases the organization of the receiver. Gregory Bateson (Palo Alto, California) : You separated rather clearly the notion of measuring cjuantity of information from the notion of ""meaning'' of the information measured, but it appears to me that this becomes difficult when we have a secjuence of items comprising a total message and so related that some of these items reflect upon the significance of other items in the sequence. In this case, the meaning of these meta signals is a very important part of the whole economics of communication. Saltzberg: I think your question refers to the very strong con- straints which may exist between the elements of a signal. These constraints are essentially the transition probabilities and the intersymbol influences which are referred to in information theory. We do not look upon a knowledge of these transition probabilities as implying meaning in the sense that we talk about semantic meaning. In other words, the constraints between a sequence of elements comprising a signal are accounted for in measuring information, but the importance of the signal, i.e., its semantic value, is not. Bernard S. Patrick (Memphis, Tennessee): Can you speak for just a moment about redundancy or the use of redundancy in communication systems? 26 Information Storage and Neural Control Saltzberg: In any communications system which places a very high priority on accuracy, redundancy is frequently employed. In situations where it is not practical to increase the strength of the signal in order to improve accuracy by increasing signal-to- noise ratio, it becomes important to use redundancy to correct the errors due to noise. In digital communication systems, re- dundancy is often employed in the form of error-correcting codes. The accuracy of language communication is greatly enhanced by the redundancy of language. It is easy to get a qualitative feeling for the amount of redundancy in the English language. For example, consider a situation in which a sentence composed of a sequence of symbols is transmitted and the first symbol re- ceived is a t, the second a k, and the third an e. You would have no trouble concluding that the word transmitted is "the" because of the constraints in the language. The language structure tells us that "tke" must be in error since "tke" is not a word. Further, since h very frequently follows t in the English language, there is not much doubt that the first word in the sentence is "the." This is one aspect of how the redundancy of the language increases the accuracy of communication. In fact, in communica- tion systems which are signal power limited, it is necessary to employ redundancy techniques to reduce the errors in com- munication due to noise. Ralph W. Gerard (Ann Arbor, Michigan): It was said that maybe an example would illuminate some of the points that would come up, and I think I can give one that might be helpful. Limiting the spectrum in frequency or the interval of time in which the significant signal is to be expected is helpful. Gregory Bateson referred back to that in speaking of a para-signal, which tells the meaning of the signal itself by indicating where in the total space you must look for the signal. 
I wonder how many in the room will understand the statement, "Cur antrum santrum, ovidum, ovidum." Hands? None. That is because you thought I was talking Latin, whereas I was actually talking German. Now I will repeat it the same way: "Kuh rannt rum, Sand rum, oh wie dumm, oh wie dumm." Knowing that the language is German is the para-signal; locating the message in the total space is what is meant by having the proper set.

CHAPTER II

BINARY REPRESENTATION OF INFORMATION

Robert T. Gregory, Ph.D.

INTRODUCTION

IT IS well known that most modern electronic digital computers use the binary representation of numbers internally rather than the more familiar decimal representation, although sophisticated programming systems may allow the computer user to do almost all of his communicating with the machine in decimal. The reason why binary representation is in common use is explained in the next section. It is the purpose of this paper to review the binary representation of numbers, including the word structure for a typical binary computer, and to demonstrate some typical machine commands that are available for manipulating patterns of binary digits. It is hoped that this will provide some indication of the extreme versatility of the electronic computer as an information processing instrument and will encourage those who have not yet discovered its usefulness to explore its potentialities.

REPRESENTING NUMBERS INSIDE A COMPUTER

It is well known to those who design the basic circuits for digital computers that the optimum number base, B, for representing numbers inside a computer, from the standpoint of economy of the electronic hardware needed, is B = 3. To verify this, let us recall that the number of numbers that can be expressed using n digits, base B, is B^n. For example, in the decimal numeral system, if n = 3 we can express 10^3 numbers: 000, 001, . . ., 999.

Assume that the number of electronic components required to represent a single digit, base B, is approximately proportional to B. Thus, a rough estimate of the number of electronic components required to represent n digits, base B, is N = K1 n B, where K1 is a constant. If B^n = P is held fixed, and we wish to find the value of B which minimizes the number of electronic components, N, needed to represent P numbers, we proceed as follows. Since B^n = P, we can write n ln B = ln P = K2, and so n = K2 / ln B. Thus, the number of electronic components needed is N = K3 B / ln B, where K3 = K1 K2. Differentiating with respect to B yields dN/dB = K3 (ln B − 1) / (ln B)^2. Setting this derivative equal to zero gives ln B = 1, or B = e = 2.71828 . . . . For integer values of B the minimum occurs when B = 3, with slightly greater values for B = 2. Since tristable devices are almost nonexistent, and bistable devices are plentiful, most engineers choose the binary numeral system rather than the ternary numeral system when they design a machine.

BINARY NUMBER REPRESENTATION

If we recall the definition of our standard positional notation, whereby the meaning of a digit depends on its position relative to other digits in the number representation, then we note that any positive integer may be written

d_n . . . d_2 d_1 d_0 = d_0 B^0 + d_1 B^1 + d_2 B^2 + . . . + d_n B^n,

where B > 1 is the base of the number representation and where 0 ≤ d_i < B. For example, if B = 10 the integer "fifty-seven" may be written 57 = 7·10^0 + 5·10^1 = 7 + 50.
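The hardware-economy argument above is easy to check numerically. The following is a minimal sketch; the function name and the use of Python are mine, and the proportionality constants drop out of the comparison.

```python
import math

# Relative hardware cost of representing a fixed quantity P of numbers in base B,
# assuming, as argued above, that the component count per digit is proportional
# to B.  Then N is proportional to B / ln B; the constant factor is irrelevant
# to the comparison and is omitted here.
def relative_cost(base: int) -> float:
    return base / math.log(base)

for base in range(2, 9):
    print(f"B = {base}: relative cost = {relative_cost(base):.3f}")

# The minimum among integers falls at B = 3 (2.731), with B = 2 and B = 4 tied
# slightly higher (2.885), in agreement with the continuous minimum at B = e.
```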
If B = 2 then "fifty-seven" becomes

[111001]two = [2^0 + 2^3 + 2^4 + 2^5]ten = [1 + 8 + 16 + 32]ten,

where the subscript "two" indicates that binary notation is used on the left of the equal sign and the subscript "ten" indicates that decimal notation is used on the right. Similarly, any positive fraction (less than one) may be written

0.d_-1 d_-2 d_-3 . . . d_-m = d_-1 B^-1 + d_-2 B^-2 + d_-3 B^-3 + . . . + d_-m B^-m,

where m does not have to be finite and 0 ≤ d_-i < B. For example, if B = 2 then five-sixteenths becomes

[0.0101]two = [2^-2 + 2^-4]ten = [1/4 + 1/16]ten = [5/16]ten.

Since positive numbers may be decomposed into an integer part and a fraction part we may consider, as a more general example, the binary representation of thirty-seven and nine sixty-fourths:

[100101.001001]two = [2^0 + 2^2 + 2^5]ten + [2^-3 + 2^-6]ten = [1 + 4 + 32]ten + [1/8 + 1/64]ten = [37]ten + [9/64]ten.

It is not our intention to go into great detail at this point and discuss the procedures for converting from one number representation to another. We have merely tried to review, by means of three examples, the well-known fact that positive numbers are easily represented by a pattern of binary digits, that is to say, by a pattern of "zeros" and "ones." (Binary digits are commonly called bits, for short.)

Before continuing, it is necessary to recall that negative numbers also have representations in terms of a pattern of bits. To demonstrate this let us discuss methods for negative number representation inside a computer. The following systems are currently used:

[1] Signed absolute values
[2] Complements with respect to some integral power of the base
[3] Complements with respect to one less than some integral power of the base

The first method is simple — the machine contains the absolute value of each number stored, with an indication of its sign. The second and third methods involve number representation modulo B^k and modulo (B^k − 1), respectively, where B is the number base, and the machine registers are assumed to hold k digits. To illustrate system [2], let k = 9 and assume a decimal machine. Thus, if we use the symbol ≈ to mean "is represented by," then

126 ≈ 000 000 126 and −126 ≈ 999 999 874,

since −126 = 999 999 874 (mod 10^9). Using system [3] we have

−126 ≈ 999 999 873,

since 10^9 − 1 = 999 999 999, and −126 = 999 999 873 (mod 999 999 999). System [3] is sometimes called the "nines complement" system when B is ten. This is motivated by the fact that one merely takes a digitwise complement with respect to nine in forming 999 999 873 as the negative representation of 126. System [2] is called the "tens complement" system when B is ten, since the least significant digit is actually complemented with respect to ten. In a nine-digit binary machine using "ones complements," we would have

[126]ten ≈ 001 111 110, and [−126]ten ≈ 110 000 001.

Thus we have reviewed, by means of examples, how both positive numbers and negative numbers may be represented easily by a pattern of bits.

PROCESSING BINARY INFORMATION INSIDE A COMPUTER

Let us consider a typical modern high-speed computer which is designed to process binary information in blocks of 48 bits. Such blocks are called words, and we describe such a computer as having a 48 bit word length. These words may be bit patterns representing numbers (we discussed binary representations of numbers in the previous section) or the words may be bit patterns having a non-numerical interpretation.
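As an aside, the complement conventions just reviewed are easy to reproduce. The short sketch below is illustrative only; the function names are mine, and it simply prints the representations used in the examples above.

```python
# Complement representations of -126 in a k-digit register, reproducing the
# examples above.  The "radix complement" is the tens (or twos) complement
# system; the "diminished radix complement" is the nines (or ones) complement.
def radix_complement(value: int, base: int, digits: int) -> int:
    return base**digits - value            # complement with respect to B^k

def diminished_radix_complement(value: int, base: int, digits: int) -> int:
    return base**digits - 1 - value        # complement with respect to B^k - 1

print(radix_complement(126, 10, 9))                 # 999999874  (tens complement)
print(diminished_radix_complement(126, 10, 9))      # 999999873  (nines complement)

# The same idea on a nine-digit binary machine:
print(format(126, "09b"))                                     # 001111110
print(format(diminished_radix_complement(126, 2, 9), "09b"))  # 110000001
```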
To the machine this is immaterial. The repertoire of machine commands for carrying out operations on machine words includes commands for performing the basic arithmetic operations of addition, subtraction, multiplication, and division. More complicated mathematical tasks, such as the extraction of square roots, solving algebraic equations, and so on, are accomplished by using a combination of these basic commands. In addition to the commands for performing basic arithmetic operations, the machine is capable of executing commands which perform operations of a non-numerical character. These are the commands that make the modern electronic digital computer a versatile information-processing instrument rather than just a high-speed computing instrument.

We begin our discussion of non-numerical type commands (although some of these may have a numerical interpretation as well) by mentioning the shift commands. Consider the bit pattern consisting of ones in the odd numbered positions and zeros in the even numbered positions. We shall write the word in the form 101010 . . . 101010, where the meaning of the three dots is obvious, and their use enables us to avoid writing all 48 bits. A left shift of n bits causes the individual digits to be shifted to the left n places in an "end-around" fashion, which means that bits shifted off the left end are carried around and introduced into the right end of the word. Thus, if n is an even integer, the pattern displayed above will not appear to have changed following the shift. On the other hand, if n is odd, the pattern will have the appearance 010101 . . . 010101 following the shift.

This shifting operation can be quite useful. To illustrate this we need to mention that the machine is capable of performing branching operations, i.e., it can be made to do one thing if the first bit of a word is a "one" and another thing if the first bit is a "zero." This means that the machine is capable of performing each of two sequences of operations depending on the nature of the first bit of a word. Figure 1 will aid us in this discussion.

Fig. 1

If lines represent sequences of operations then we traverse the path AB if the first bit of our word is a "one" and AC if the first bit is a "zero." If we assume that we shall begin at point A many times and if we desire to traverse the paths AB and AC on alternate occasions, then we can use the word 101010 . . . 101010 and the left shifting operation to do this. All we need to do upon arrival at either of the points B or C is to command the machine to perform an odd number of left shifts. This will cause the first bit of our word to be alternately "one" and "zero."

Other examples of useful commands include commands to perform several logical operations. In order to describe a few of these commands, reference will be made to Table I.

TABLE I
Logical Operations

∧  Logical Product:    1 ∧ 1 = 1;  1 ∧ 0 = 0;  0 ∧ 1 = 0;  0 ∧ 0 = 0
∨  Logical Sum:        1 ∨ 1 = 1;  1 ∨ 0 = 1;  0 ∨ 1 = 1;  0 ∨ 0 = 0
⊕  Exclusive "Or":     1 ⊕ 1 = 0;  1 ⊕ 0 = 1;  0 ⊕ 1 = 1;  0 ⊕ 0 = 0

For example, if we have the two words

A  10101010 . . .
Q  11001100 . . .

we can generate the word

M  10001000 . . .

by performing the bit-by-bit logical product of the two words A and Q; that is to say, we can form A ∧ Q = M. If we have the two words A = 111000111 . . . 000111000 and M = 101010101 . . . 010101010, we can replace A by the exclusive or, M ⊕ A, giving A = 010010010 . . .
010010010, and as a final example, we can replace A by the logical sum, A ∨ M, giving

A = 111010111 . . . 010111010.

These examples merely illustrate the kinds of bit manipulation that are possible, and no attempt has been made to be exhaustive. As one gains experience in using such commands it is possible to discover how versatile this new instrument is as an information processor. Research workers from many diverse fields are constantly finding ways to apply such machines to their problems.

CHAPTER III

INFORMATION PROCESSING THEORY

Robert K. Lindsay, Ph.D.

DESCARTES is usually credited with introducing the mind-body problem to psychology. What he did was introduce the body to psychology. In those days, of course, there were no card-carrying psychologists, but there were many people who were interested in the human thought processes, which were assumed to reside in a mysterious nonentity called the mind. Descartes wished to show how the mind influenced the motions of the body, and in so doing he made some guesses as to how the body itself might have something to do with decision making. The abstracted description of the control mechanism which Descartes provided sounded much like a description of the mechanical statues which were found in the gardens of his day. He described nerves as hollow tubes through which ran bell ropes of the sort used to summon servants. These bell ropes, when stimulated, manipulated valves in the head which directed the flow of animal spirits from the ventricles of the brain to the inflatable muscles. The expansion of the muscles brought about movement. The mind was, in this model, adjoined to the body through the pineal gland, which served as a sort of master control which could override any of the other valves, thus maintaining the integrity of the free will. Descartes' system, though somewhat obsolete today, was, in its time, quite ingenious. Even though men are no longer profitably viewed as garden decorations, Descartes and his notions can be credited with having thrown a great deal of light on the working of the human control system.

Although advances in physiology have shown the preceding model to be inadequate, such knowledge has not eliminated the approach. Students of behavior still exhibit a propensity to describe the human system in terms of the engineer's handiwork. In the first half of this century, and still today, psychological models took a form remarkably similar to the telephone switchboard, with incoming signals being routed through connections, strengthened, by degrees, through use, to trigger a response, with scarcely a "by-your-leave" to their brothers under the skin. Psychology moved back the boundaries of the mind as emphasis withdrew from the mechanical procedures which performed the motions, and moved toward the control procedures which decided what motions would be made. The mechanical monster seemed too clumsy, and an electrical monster was substituted.

As the preceding papers have indicated, the recent years have seen some new tools, both conceptual and actual, added to the engineer's gadget bag. These years have also seen some further friendly borrowing of these tools by students of biology and behavior. The new tool with which I am most impressed and with which I hope to impress you is the digital computer.
A great deal of work has been directed in the last decade toward the understanding of the neural bases of the control processes which interest the psychologist, and a fair number of psychologists have decided that switchboards are perhaps not the best model for neurological processes. So now we hear that people are really like electronic monsters. We are not quite as physiologically naive as was Descartes. We know that humans are not really made up of transistors, resistors, or even electric wire. What, then, do we mean when we say that humans are like computers? Humans and other animals make decisions, behave, solve prob- lems, and learn. Machines — digital computers — also do these things. Superficially, at least, humans are like machines. C'an we be more specific? If a machine does the same things that a human does and fails in the same things in which humans fail, then machine and man are alike at a more basic level of description. The more details which can be replicated by the machine, the closer is the comparison. Once the basic features of a computer are pointed out to some- one who has seriously attempted to analyze the human system, 36 Information Storage and Neural Control some similarities are obvious, as are some points of dissimilarity. A legitimate question yet remains: How does the existence of a potentially remarkable device of this sort aid us in our present work? There are at least two answers to this question. The first answer is exemplified by the work of those who have studied the brain as a computing machine. Turing (1) proved that a very simple device is capable of computing any number which a reasonable man might wish to call computable. In a classic paper, McCulloch and Pitts (2) argued that, since mathe- matical logic has been stated in a form where deductions become a form of computation, a device of no greater complexity than a Turing machine should be capable of performing any logical computation, no matter how complex. They, in fact, proved that elements no more complex than neurons were sufficient for this purpose. That is, they demonstrated that to every logical proposi- tion there corresponds a nerve net which can be constructed from idealized neurons, and that the converse is also true. The brain, thus, is not just in some vague sense like a computing machine; the brain is a computing machine. The important activity of the brain is its inputting, processing, and outputting of information. Although we have had these computing machines — brains — around for a long time, only recently have we had any others of comparable complexity. To biology, the presence of the digital computer has provided, in addition to a new source of interested human talent, a manipulatable device which can be studied in vivo and whose descriptors, as they are discovered, might profitably be applied to the human machine. Studies of electronic systems, and of sys- tems in general, have provided insights into some important biological questions. To mention just one such question which has received a lot of attention: How is it possible to construct a reliable system out of billions of variable, unreliable parts? This question has been attacked profitably by McCulloch (3, 4) and von Neumann (5), among others. The second answer is the one on which I wish to dwell more extensively. It is frequently the case that, although we know the properties of all components of a system, we are unable to predict the behavior of the system if it is composed of many components. 
It is true that not all of the relevant properties of neurons are Information Processing Theory 37 known in sufficient detail. However, some of their basic features — features which undoubtedly are critical in brain function — are well established and can be described accurately. A lot of talent has gone into speculating on the manner in which neurons inter- act to perform the higher functions. One such theory is that of the psychologist Hebb (6), who attempted to explain the phe- nomenon of memory in terms which were physiologically sound and yet psychologically relevant. Hebb proposed three phases in the formation of memory traces. The first is reverberation, the persistence of nervous activity after the termination of the initiating stimulus. The second mechanism, the cell-assembly, consists of a characteristic pattern of firing associated with a particular stim- ulus configuration and comes into being upon adequate repetition of the stimulus. To account for this, Hebb postulated that if one neuron succeeded in firing a second, the synapse, by some un- specified processes, should change so as to make this triggering more probable in the future. The third mechanism, evolving from the second, amounts to the passing of activity from one cell-assem- bly to another as a result of the repeated temporal sequencing of the corresponding stimuli. This mechanism, the phase-sequence, is the primitive basis of expectancy, an important psychological concept. Although the neuronal properties which Hebb assumed are well established, and the "growth" hypothesis is almost certainly correct, it is not an easy task to show that these assumptions are sufficient to cause the reverberation, cell-assembly, phase-sequence organization postulated. Rochester et al. (7), attempted to demon- strate the sufficiency of Hebb's assumptions in a novel way. They instructed a digital computer to behave according to the assump- tions, and then simply observed its behavior. It is interesting to note that they were forced to make some additional minor assump- tions before the theory was specified in sufficient detail to be realized. But more important, they found that, although rever- beration was easily achieved and cell-assemblies formed spon- taneously after some suitable modification of the theory, phase- sequences were not achieved. This work has given some important clues as to what is lacking in the theory, and some specific altera- tions have been proposed. 38 Information Storage and Neural Control We see in the works of McCulloch and Rochester a feature which distinguishes them from many other efforts at describing brain activity and behavior. This new approach may be con- trasted with many behavior theories which describe the product or output of the system rather than the process by which the output is obtained. Akhough it is perfectly reasonable to develop a science of psychology from product models, psychological theories would be more directly useful to neurophysiology if they could be stated as process models. The absence of analytic tech- niques and languages for describing processes has until recently blocked any rigorous development of psychological process models. The development of computer sciences offers hope of removing these blocks. In order to provide a clearer picture of what is meant by the infoimation processing theory approach, I wish to contrast a process model with three other types of theoretical descriptions. All four of the models to be discussed deal with human language production. 
The three non-process theories, in fact, purport to explain exactly the same phenomenon; unfortunately, the information processing model does not. However, I think the exposition will not suffer excessively from this lack of aesthetics.

The phenomenon described by the three non-process models is the strikingly regular statistical distribution of words produced in speech and writing. The data are most often displayed in what is called the standard curve, which is obtained as follows. A passage of text is examined to determine which word occurs with greatest frequency, which with second greatest frequency, and so on. A graph is then made, with this rank plotted on the abscissa and frequency of occurrence on the ordinate. Thus, if "the" is the most frequent word, and if it occurs one thousand times, then the point so determined is (1, 1000). Such graphs, made from a wide variety of sources, are well-approximated by the equation

fr = C,

where f = frequency, r = rank, and C = a constant. If the curve is plotted on log-log coordinates, this rectangular hyperbola becomes a straight line with slope of minus one. The above equation is equivalent to

nf^2 = K,

where f = frequency of occurrence, n = number of words of that frequency, and K = a constant. This form is the more usual representation of a frequency distribution.

The first explanation of this regularity which I wish to discuss is due to Zipf (8), and exemplifies what I will call a mentalistic theory, although I shall try to avoid defense of that term. The basis of Zipf's explanation is the Principle of Least Effort, which itself requires explanation. People, says Zipf, behave so as to minimize effort, and this strategy underlies behavior of all forms. He is at pains to emphasize that effort includes not only actual work but mental effort as well, including the mental effort to decide which path involves the least effort. And here is where the trouble begins. Since a person is unable to predict the future exactly, he must make guesses. The Principle then becomes the statement that a human will behave so as to "minimize the average rate of probable work." At this point it is clear that no problems are solved by the Principle because, in order to make a prediction, we must determine the subject's view of the world and understand his decision process, which of course was the problem with which we began.

Since this is an important point, let me state it somewhat differently. Although Zipf provides elaborate discussion of what he means by effort, he never gets around to telling us how it is to be measured, nor does he ever rigorously state what is meant by this Principle of Least Effort. Since the length of time over which this undefined effort is to be averaged is also unspecified, it is clear that we can adjust the foresightedness of our subjects to obtain whatever results we desire. Since the word "probable" is thrown in, our only recourse is to the subjective probabilities of the subjects in order to apply the principle; and subjective experience is by definition private. The Principle is seen to be an elastic phrase which can be distorted to fit whatever data are presented. Zipf has committed an error which psychologists have been attempting to eliminate for fifty years. This is why I have used the term "mentalistic" to describe Zipf's theory. But let us take one quick look at how the principle is applied to the word-frequency results.
Zipf argues on teleological grounds. From the viewpoint of the speaker's purpose, communication would require least effort if one word could be used to convey every meaning, for then no decision would be necessary. From the auditor's viewpoint, every meaning should have a separate code, for then his effort would be least. Thus, we have the Foice of Unification in opposition to the Force of Diversification, tending to create a vocabulary balance. I wish to quote the following passage from Zipfs book as an example of the application of the Principle: "We obviously do not yet know that there is in fact such a thing as vocabulary balance between our hypothetical Forces of Unification and Diversification, since we do not yet know that man invariably economizes with the expenditure of his effort; for that, after all, is what we are trying to prove. Nevertheless — and we shall enumerate for the sake of clarity — if 1) we assume ex- plicitly that man does invariably economize with his effort, and if 2) the logic of our preceding analysis of a vocabulary balance between the two Forces is sound, then 3) we can test the validity of our explicit assumption of an economy of effort by appealing directly to the objective facts of samples of actual speech that have served satisfactorily in communication. Insofar as 4) we may find therein evidence of a vocabulary balance of some sort in respect of our two Forces, then 5) we shall find ipso facto a con- firmation of our assumption of 1) an economy of effort." (9). The argument which has been presented is scientifically sound: deduce something from theoretical assumptions; if the deduction is empirically verified, the theory has found support. Our only quarrel is with the weakness of the prediction. All that has been predicted is that passages of actual speech or writing will not be repetitions of the same single word, nor will all words be different. The absence of detailed specification of the constructs of the theory leads only to predictions which are trivial, and yet the superficial Information Processing Theory 41 rigor of the statement of the Principle gives the impression that something has been said — something which sounds very reason- able and powerful. The weakness of such explanations would seem not to deserve such extensive comment, and yet the problem has come up so frequently, particularly in psychology, that apparently it does not hurt to point out such fallacies. Not all cases are so obvious, unfortunately, and many theories which appear rigorous to the most competent of scientists are found at times to fail on this same count. I shall return to this point later. The amazing regularity found in the word-frequency data, regularity which seems to be so hard to come by in the field of human behavior, deserves more serious attempts at explanation. Fortunately, other workers have attacked the same problem; and, fortunately, for our purposes of today, one such approach illus- trates a stochastic theory and another illustrates an application of information theoretic concepts. The stochastic model is due to Simon (10). The approach is to postulate probabilistic decision rules and from these to derive the statistical properties of a device which follows the rules. The challenge is to postulate rules which will yield the statistical properties of the observations, in this instance the frequency distribution of words. It should be pointed out that the weaker, i.e., the more general, are the underlying postulates, the better is the theory. 
Thus, as in any theory, we wish to account for as much as possible with as little as necessary.

Simon's basic model rests on only two assumptions. From these he is able to derive a frequency distribution known as the Yule distribution, which has all of the properties required for fitting the word-frequency data. As a matter of fact, slight variations on the assumptions yield slight differences in the resulting distribution. These various forms of the theory can be plausibly associated with various real world situations, and the theory thus accounts for several phenomena, such as the distribution of authors by number of professional papers published, the distribution of incomes, and the distribution of biological species by genera. Furthermore, the steady state statistical properties are fairly insensitive to minor changes in the assumptions.

With regard to word production, Simon argues that an author selects words not only by association with other words he has already put down in this passage but by imitation of the language as well. In other words, the next word which an author will choose is determined by the frequency distribution of his present effort, i.e., the subject under discussion, and by the statistical properties of the language he is using. We thus need to postulate two "birth processes." Further, we will consider a passage of fixed length, as is reasonable if we consider that our sample under analysis is but a segment selected from the author's total word production. Thus, we must postulate a "death process" which specifies which words are dropped from the sample at one end of the passage as we add words at the other end.

Let β be the proportion of words added by imitation, and let f*(i) be the relative frequency of words which have occurred exactly i times each. The following assumptions shall prove to be sufficient.

Birth Processes
i. The probability of adding a word, already having occurred i times, by association is (1 − β) i f*(i).
ii. The probability of adding a word, already having occurred i times, by imitation is β (i − c) f*(i).

Death Process
iii. If a word of frequency i is dropped, all instances of that particular word are dropped; this occurs with probability f*(i).

Note that in both birth processes, the probability of adding a word is proportional to the total number of occurrences of that word and all other words used equally often. The assumption that a word will be chosen with probability proportional to its own frequency is a special case of our assumptions i. and ii.; hence, the assumptions used are more general. The factors involving β are self-explanatory. The constant, c, appearing in assumption ii. may be made plausible by the following considerations. In any given passage, not all of the words in the language will have been used yet. We wish to allow the possibility that
But, furthermore, we are concerned with the so-called steady state of this stochastic mechanism. This is the situation which persists when the sample size is large enough that the statistical distribution remains invariant. Thus, the number of words dropped from the category witli relative frequency f*{i) must be just balanced by the number of words entering that category, which is the number of births in the category with relative frequency /*(/-l). These requirements enable us to write the following equation births in (z — 1) minus births in (z) minus deaths in (0 = 0 (I-/3C-1) j*{i-\) - (i-^c) f*{i) - f*(i) = 0 which may be rewritten as /•«=(:5^)/-o-i) thereby recursively defining the desired quantity. This function has the required properties. In the example of the stochastic theory, then, the assumptions are probabilistic decision rules and the deductions are made analytically. The third description of language production is due to Mandel- brot (11) and exemplifies the application of information theoretic concepts. Superficially, this approach is similar to that of Zipf, for Mandelbrot derives the equation for the standard curve by minimizing the cost of coding the speaker's ideas into words, subject to the constraint of a fixed amount of infoimation trans- mitted per word- However, Mandelbrot is quite specific as to what he means by both information and cost. 44 Information Storage and Neural Control Zipf argued that, while the speaker's effort (cost) would be least if only one word were used, this situation does not persist because the listener's decoding" efforts would be too great. Infor- mation theory allows us to put this notion on a sounder base. You have seen that a message which is always sent can convey no information, and that the larger the vocabulary, or set of alternatives, from which messages are selected, the greater is the information which they convey. On the other hand, the process of deciding which message is next to be sent is also more difficult when the set of messages is larger. Mandelbrot has proposed that the balance of these two factors may be conceived as the basis for word statistics, and in this we see the similarities with Zipf. How- ever, Mandelbrot has employed a specific definition of infoimation, and has rigorously defined the probleni. Let us examine the main features of his derivation for a problem which is formally identical with the one stated above: Given a fixed average cost per word, what will be the frequency distribution of the words to give inaxi- mum information per word? Let Cr be the cost of the r-th most frequent word, which occurs with probability p^. Average cost per word, C, is then C = X VrCr. r Also, r must hold. The problem is then to maximize H = -IZ Vr log Pr T subject to the above conditions. When this is done, it is found that Vr = AAI where A, B, and M are constants which have interpretations in terms of the coding process. One further step is needed in order to complete the derivation, and this involves relating C,- and rank, r. Mandelbrot has managed to show that if words are coded "optimally," the resulting word statistics will be correct. Suppose that words are coded from' some elementary units. These units Information Processing Theory 45 are undefined, being like, perhaps, phonemes, but not necessarily so identified. It is only necessary to assume that words are com- posed of these units and that the cost of a word is equal to the sum of the costs of the units. 
To illustrate Mandelbrot's coding argument, take the special case where each unit has the same cost attached to its use. The least expensive words are then the ones composed of single units. If there are M units, there are M such minimum cost words, M^2 double-unit words (second in cost), and so on. The rank of a word will be determined by the number of words which can be coded with C_r or fewer symbols. For example, if words are coded as binary sequences, M = 2, and there are fourteen codes of three or fewer digits (0, 1, 00, 01, 10, 11, 000, 001, 010, 011, 100, 101, 110, 111). Thus a word of cost 3 will have rank r(3) = 14. In general,

r(C_r) = Σ (from c = 1 to C_r) M^c = M (M^(C_r) − 1) / (M − 1),

so that

M^(C_r) − 1 = r (M − 1) / M.

Now,

C_r = C_r log_M M = log_M M^(C_r) = log_M [(M^(C_r) − 1) + 1]
    = log_M [r (M − 1)/M + 1]
    = log_M [(r + M/(M − 1)) (M − 1)/M]
    = log_M (r + M/(M − 1)) − log_M M + log_M (M − 1),

which is of the form

C_r = log_M (r + m) + j,

where m and j are factors independent of r. Mandelbrot shows that the general form of the expression for C_r is the same no matter what coding rules are assumed, provided that the coding results in a ranking of words by cost. Substituting this last equation in our first result yields

p_r = P (r + m)^B,

which reduces to the standard equation when m = 0 and B = −1. The additional parameters allow closer fit; but since each has a "physical interpretation," we are not really cheating.

My purpose is not to compare the adequacy of these three particular theories — this has been argued elsewhere: Simon (12, 13, 14), Mandelbrot (15, 16), Rapoport (17) — but to contrast the theoretical style. The procedure of Mandelbrot, then, is to start from certain assumptions and to deduce the resulting properties. In his case, the assumptions were stated in information theoretic terms and the deductions were analytic.

My final example illustrates the information processing approach. Unfortunately, as I mentioned earlier, it does not deal with the same data, although it is concerned with verbal production. Hence, we may contrast the underlying notions even if we cannot compare theoretical validity. This description is due to Yngve (18), who has attempted to explain some of the salient features of English grammar. As a starting point, Yngve has pointed out that English often provides several grammatically correct and semantically equivalent ways of saying the same thing, and that some of these ways are quite complicated. On the other hand, the grammars of formal mathematical notations, such as that of algebra, impose severe limits on the number of forms permitted, and yet these restrictions do not hamper expressive power nor limit "sentence" length.

Let us consider just two examples. In English the standard form of modification places modifiers before that which is modified. Thus, we have such phrases as "the big, happy man." But we may also reverse this order — which logically should be completely adequate — in such phrases as "a man as tall as a circus giant." Why do we not avoid such discontinuous constituents (some modifiers in front, some behind) and use the more consistent form "an as tall as a circus giant man" or "an as a circus giant tall man"? Secondly, note that English provides both active and passive voices: "Johnny gave the ball to Billy" and "The ball was given
Surely, since both are equivalent, we are just complicating things by allowing two grammatical forms. In algebra we do not have a symbol \ as in "B\A," which means the same as "A/B", but we may say in English "B divided into A" as well as "A divided by B." To explain these and many other aspects of English grammar, Yngve postulates a mechanism for sentence production. Assume that the brain has a large memory in which are stored rules such as S = NP + VP; NP = T + N; VP = V + A; T = the; T = a; A = away; V = went; V = ran; N = man. Such rules define a grammar in that they can generate sentences if used in the fol- lowing fashion: S = NP + VP -T+N+V+A = the man went away or S = a man ran away By selecting various rules, we may generate various sentences, all grammatical. However, in order to generate sentences in the prescribed left- to-right fashion, it is necessary that we complete the expansion of the left-most phrases while "keeping" our place," i.e., remem- bering the higher order rules which are guiding" the sentence production. If we had a scratch pad on which to keep our place, its contents at various staoes mio'ht look like this: Verbalized On scratch pad S NP VP TN VP the N VP the N VP the man VP the man VP the man V A the man went A the man went A the man went away man went away 48 Information Storage and Neural Control CUearly, the type of grammar rules will determine both how much we have written down at any time and the maximum capacity required of the scratch pad. Since rules may be used recursively, i.e., we permit rules such as S = S + and + S, we might generate grammatical sentences which exceed the capacity of any given scratch pad. This is not such a danger in algebraic notation, which is not generally used as a spoken language except for short expressions, but it could be critical in spoken English. Since human span of attention is quite limited — and we have some pretty consistent evidence as to what this limit is — English has evolved rules of grammar which spare our mental scratch pads. For example, we can see that elaborate phrases which occur at the beginning of a sentence must be expanded while keeping in mind the structure of that which is to follow. Grammarians ad- monish us not to use such "top-heavy" sentences. It is not sur- prising to find that we have been provided with alternate ways of modifying nouns, and that these ways allow us to postpone some of the modifiers until we have gotten rid of the object of modi- fication. Discontinuous constituents are such mechanisms. The same argument accounts for the existence of the passive voice when the active is just as accurate. If the subject of a sentence is greatly elaborated, we can postpone it until later by making it the predicate of a sentence in the passive voice. Note how the following sentence, used as an exaniple by Yngve and taken from a U. S. patent, organizes the information so that one need not expand the middle while keeping in mind other features: "The said rocker lever is operated by means of a pair of opposed fingers which extend from a pitman that is oscillated by nieans of a crank stud which extends eccentrically from a shaft that is rotatably mounted in a bracket and has a worm gear thereon that is driven by a worm pinion which is mounted upon the drive shaft of the motor." The same sentence can be expressed in the active voice, but this requires a memory which is beyond the capabilities of most of us. 
The sentence is ungrammatical for that reason, accord- ing to Yngve's model: "A pair of opposed fingers (that extend from a pitman (which a crank stud (that extends eccentrically from a shaft (which is rotatably mounted in a bracket and which a worm gear (that a worm pinion (which is mounted upon the Information Processing Theory 49 drive shaft that the motor has) drives) is on)) osciUates)) operate the said rocker arm." Tiie preceding examples, tiien, represent four approaches to the same general type of observation. I have called them mentalistic, statistical, information theoretical, and information processing theoretical. The latter consists of postulating" some sort of mechan- istic decision procedure; the operation of the mechanism is then examined and compared with human behavior. Assumptions are stated as processes; the method of deduction is not analytic. For processes more complex than that in the Yngve example, the de- duction often takes the foim of specifying the processes for a digital computer, the running of which then provides the predictions. There are yet several points of this discussion which deserve more elaboration. First, why go to all the trouble and expense to build and instruct this device when we might do better to hire a mathematician, whose services are certainly cheaper, to solve the problem analytically? This is certainly a good suggestion, and many people who have resorted to simulation might better have resorted to mathematics. But the systems which are of major interest to the psychologist and biologist have the property of being complex. Mathematics, although it has earned its place of respect in science, is not a completely developed discipline. The task of writing equations for the human system is far too difficult. Some attempts have been made to describe mathematically cer- tain learning processes, for example. Bush and Mostellar (19), Estes (20); but it has been necessary to limit the complexity of the equations in the interest of getting them solved. Learning processes have pretty well resisted linear descriptions. It is, how- ever, possible to define in computer terms systems which cannot be defined in normal mathematical notation; and if the system can be defined as a computer program, a computer can simulate the behavior of the system. It is important to realize that writing a program is analogous to writing an equation, and running the program is analogous to solving the equation. It is then clear what I meant when I said that the program is a theory: it is a theory in the same sense that a mathematical equation is a theory — it makes some well-defined assumptions and makes some predic- tions which are rigorously deduced from these assumptions. 50 Information Storage and Neural Control With this analogy in mind, it is easier to elaborate on the other points. One may argue that having discovered one set of opera- tions which accounts for the behavior of a computer system does not assure us that the same set of operations is involved in the human system. This is a truism which also applies to mathe- matical theorizing; that is to say, more than one equation can fit the same set of data. Ultimately, we must live with this prob- lem, for if a theory accounts for all data within its domain, then it is as good as a theory can be even though there is no assurance that its underlying assumptions have any basis in reality. 
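As a concrete instance of the point just made (a process model stated as a runnable program), the following is a minimal sketch of the Yngve-style production mechanism described a few paragraphs back. It is an illustrative reconstruction, not Yngve's own program; the grammar is the toy grammar from his example, and the scratch-pad limit is an assumed parameter.

```python
import random

# A toy version of the production mechanism described earlier: rewrite rules
# are expanded left to right while the not-yet-expanded symbols are held on a
# "scratch pad" (a stack).  The grammar is the example grammar from the text;
# the depth bound stands in for the limited capacity of immediate memory.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["T", "N"]],
    "VP": [["V", "A"]],
    "T":  [["the"], ["a"]],
    "N":  [["man"]],
    "V":  [["went"], ["ran"]],
    "A":  [["away"]],
}

def produce(max_pad=7):
    pad = ["S"]                   # contents of the scratch pad, leftmost first
    verbalized = []
    while pad:
        if len(pad) > max_pad:
            raise RuntimeError("scratch pad exceeded: sentence too top-heavy")
        symbol = pad.pop(0)
        if symbol in GRAMMAR:                    # a phrase: expand it in place
            pad = random.choice(GRAMMAR[symbol]) + pad
        else:                                    # a word: verbalize it
            verbalized.append(symbol)
        print(f"{' '.join(verbalized):<20}| {' '.join(pad)}")
    return " ".join(verbalized)

produce()
```

Running it prints the successive scratch-pad states of the table given earlier and returns a grammatical sentence; adding a recursive rule such as S = S + and + S to the grammar can exceed the pad limit, which is the point of Yngve's argument.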
Such considerations have forced philosophers of science to conclude that reality has no meaning; we can only ask if the assumptions work, not if they are real. The job of the scientist is that of the inven- tor who creates descriptions, not of the explorer who discovers reality. Even leaving this ultimate state aside, it is important to con- tinue on this same point, but at a more practical level. If we have a program which accounts for a small segment of human behavior, how have we progressed? Seldom are we satisfied with a theory of small segments of behavior. Let us expand our program until it is more encompassing. If this can be done by making use of some of the same postulated operations, we achieve the parsimony which we seek. Let us look at programs written by other people to describe other things. If they consist of markedly similar por- tions, then we again have made progress. Eventually, when a certain process or feature has turned up frequently enough as an asset, we may forget our philosophy and begin to look within the human system to see if we cannot find independent evidence for the existence of some such process. We have thus generated two types of hypotheses: those which make predictions about similar types of behavior, and those which give us clues about the com- position of the organism. I shall return to some examples of the latter at the conclusion of this paper. The third point on which I wish to elaborate is a matter of practical research strategy. The process of simulation provides an important fringe benefit which becomes apparent only after trial. It has long been a feature of psychological theorizing that would-be theories suffer from chronic vagueness. The result is Information Processing Theory 51 a theory which can be stretched to fit anything. The genesis of this difhcuhy lies in the fact that the theorist knows what he is saying and so does his audience. Hence, it is often possible to put together assumptions which, logically, will not fit, or to make deductions which, logically, do not follow. These unfortunate juxtapositionings may go unnoticed by an intelligent theorist and his informed listeners, who can readily and unwittingly supply the missing pieces, ignore the excesses, and beg the answer which they know is there even if it is not. The computer, though, is a very stupid audience. From one point of view, it may prove more valuable now while it is stupid than later when it is not; for today it will not tolerate vagueness. When a theorist with an idea sits down to convey his idea to a inachine he almost invariably finds that he must first sharpen it up. And when the machine attempts to simulate the idea, the theorist almost invariably finds it will not do what it is supposed to do. These lengthy elaborations on a fairly concise statement point up the similarities between the process of computer simulation and the other techniques of theory construction. The computer has not answered the many problems which were formulated by these other techniques. The computer will not make scientists out of programmers. It is just another way of theorizing which has certain special advantages, certain special disadvantages, and the same old problems. 1 have attempted to show how process models may be stated and why computer simulation is often an appropriate means for their analysis. It is quite legitimate to ask what such efforts to date have implied about information storage and neural control or, to be more classical, neurophysiology. 
When computer sci- entists discover processes which appear to be useful building blocks for explaining human behavior or for constructing artificial intelligences, it is natural to ask if actual mechanisms for per- forming these processes can be found within the central nervous system. The observations of the reflex led Sherrington to inquire as to its basis, with a great deal of benefit to science. Pavlov ex- amined the conditioned reflex and based his psychology on it. The discovery of more complex processes could likewise direct efforts in neurophysiological research. 52 Information Storage and Neural Control Of course, such procedures are dangerous, and I hesitate to make any very strong suggestions. The danger lies in the fact that a theory which embodies an hypothesized mechanism, Hke any otlier theory involving an assumption, can only prove the suf- ficiency of the hypothesis, not its necessity. Anyone who accepts directions from a psychologist runs the risk of getting" lost. None- theless, I will indicate a few possibilities based on mechanisms which have been found useful in psychological and computer theory. One observation which has proved highly important to psy- chological process models has appeared in the preceding discussion of Yngve's hypothesis. I refer to the concept of a limited scratch pad, or immediate memory, as it is called. It has often been recog- nized that permanent memory can persist even after severe dis- turbance of the ongoing cerebral activity, such as that brought about by freezing or electroshock. Since any form of persistent trace must undoubtedly require periods of time, at least on the order of seconds, for establishment, then some temporary form of storage, basically different from the permanent form, must be utilized to maintain the information until it can be permanently stored. Miller (21) has shown that the capacity of this immediate memory, as inferred from a variety of psychological studies, is remarkably constant. This capacity is not measured in bits of information, however, but in terms of the number of symbols which can be temporarily remembered; i.e., a subject may retain about seven binary digits, about seven decimal digits, or about seven monosyllabic adjectives, all of which differ in amount of information as defined by Shannon. Thus, the hunian is capable of conceptually complex activity largely because he is capable of dealing with informationally rich symbols, and he is provided with a capacity which is largely independent of the richness of his thoughts. By measuring a subject's success at discriminating various numbers of stimuli which differ along one diniension, one finds that the capacity of the human communication channel is rela- tively constant at about seven discriminations. If one then gives the subject the task of discriminating stimuli which vary on two dimensions, one discovers that the subject, although unable to Informatioti Processing Theory 53 distinguish forty-nine categories, can do better than in the one- dimensional case. For example, a subject who can discriminate wiiich of ten positions a point occupies on a line cannot place the point in one of one hundred cells of a square, but can manage only twenty-five. This is just what would be predicted if ten cells of immediate memory were divided into two groups of five. In other words, the compound discrimination reduces the accuracy of discrimination for each dimension, but still allows independent examination of each. 
The question arises as to the underlying neurological structure. Is there a single set of pathways which performs this function for all inputs including internal inputs? It seems unlikely, though not impossible, that such a set of pathways is localized in one geo- graphic position in the brain; but even if it is diffusely distributed, as are other memory functions, one may still ask if one set serves in common. Little work of the kind summarized by Miller has been done on cross-modality studies, but one wonders if there is a "final common path" for all sense modalities. Cllosely related to the notion of informationally ricii symbols is the concept of a hierarchically organized memory. It is fairly clear from both logical and psychological considerations that nriemory organization is such that one trace can evoke a number of others, each of which can in turn evoke a number of others, and so forth; i.e., one trace is associated with several others, and any one of them can be elicited without eliciting the otiiers. Such structures liave largely been ignored in classical stimulus-response models, where the theories have been concerned with the forming of a single association between two traces. Neurophysiological theories, perhaps reflecting the concern of the psychologist, have concentrated on exploring the method of single associations. Some meaningful questions might be asked as to the adecjuacy of linear neurological models for explaining hierarchical structures. One such question is related to the concept of set, which has been found extremely useful, if not necessary, in psychological theories, and which has turned up under a variety of names with only minor variations in meaning. It is recognized that a subject can be "set," by instructions or by other experimental manipula- tions, so as to give responses of a certain class, to perform operations 54 hiformation Storage and Neural Control more quickly, or to overlook completely otherwise obvious solution paths in a problem situation. If one were to instruct a computer so that it had this capability, it would be required that the set information, given before the critical task, provide information (or set switches) at a number of different places in the piogram. This is generally accomplished by setting a "flag," which is tested by various subroutines, or by setting several flags, one in each subroutine. The result is a memory structure which might be called diffusely localized. This type of signal must be extremely flexible, and must be controlled by the executive program; i.e., it must be at a higher level in the process structure. To my knowl- edge, no information exists concerning the cerebral mechanism which could explain such a phenomenon, nor has anyone worried much about it. Although perhaps other mechanisms are con- ceivable, it seems necessary that communication channels of some sort must exist between the higher control centers and several lower centers, or that the nerve nets which define processes must be constructed so that they can be rapidly, but temporarily altered by some signal in a higher control center. Finally, I wish to point out a feature underlying all of the computerized brain models which deal with the learning or growth of connections between neurons. 
Such models have been proposed as the basis for such complex functions as pattern recognition (Rosenblatt, 22); yet each rests on fairly simple and standard assumptions of the sort discussed above in connection with Hebb's growth hypothesis: "If neuron B fires immediately after neuron A, the probability increases that A will fire B." Although such a process is quite feasible, no direct physiological evidence defines its mechanism, so the assumption remains a psychological one. It is almost certain to be correct, and yet perhaps we should not give up the search for alternate mechanisms — if not to replace this notion, then to complement it. For example, the firing of neuron A followed by the firing of neuron B might increase the efficiency of all other connections at the A-B synapse as well. Or perhaps the A-B "growth" takes place only if B subsequently fires C, which bears some relation to A. The neurological mechanisms underlying these suggestions are not so plausible as those of the Hebb hypothesis, but if they are true they might have a profound effect on the behavior of a highly interconnected net. Here is where computer simulations might be used to explore new possibilities. By studying the organizing effects of such additional mechanisms, which are just as easily programmed, we might reintroduce some originality into essentially similar models.

The fact that psychologists and biologists are beginning to think in terms of processes in addition to stimulus-response associations and equations provides a more obvious link between their work and that of the physiologist. It is the promise of this new link which has revitalized discussions of cross-fertilization, resulting in conferences with titles like this one. The value of these new conceptions remains to be seen, but it is probably safe to assume that anything which brings our disciplines closer together can do no harm.

REFERENCES

1. Turing, A. M.: On computable numbers, with an application to the Entscheidungsproblem. Proc. London Math. Soc., series 2, 42: 230-265, 1937.
2. McCulloch, W. S., and Pitts, W.: A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophysics, 5: 115-133, 1943.
3. McCulloch, W. S.: Agathe Tyche: of nervous nets — the lucky reckoners. Proc. Symp. on Mechanization of Thought Processes, Teddington, England, 1959.
4. McCulloch, W. S.: The reliability of biological systems, in Self-Organizing Systems, Interdisciplinary Conference on Self-Organizing Systems, ed. by Yovits, M. C. and Cameron, S., New York, Pergamon Press, 1960.
5. von Neumann, J.: Probabilistic logics and the synthesis of reliable organisms from unreliable components, in Automata Studies, ed. by Shannon, C. E. and McCarthy, J., Princeton, Princeton University Press, 1956.
6. Hebb, D. O.: The Organization of Behavior; a Neuropsychological Theory. New York, Wiley & Sons, 1949.
7. Rochester, N., Holland, J. H., Haibt, L. H., and Duda, W. L.: Tests of a cell assembly theory of the action of the brain, using a large digital computer. IRE Trans. on Information Theory, IT-2: 80-93, Sept., 1956.
8. Zipf, G. K.: Human Behavior and the Principle of Least Effort. Cambridge, Addison-Wesley, 1949.
9. Zipf, G. K.: Ibid., p. 22.
10. Simon, H. A.: On a class of skew distribution functions. Biometrika, 42: 425-440, 1955.
11. Mandelbrot, B.: An informational theory of the statistical structure of language, in Information Theory, ed. by Jackson, W., London, Butterworths, 1953.
12. Simon, H. A.: Some further notes on a class of skew distribution functions. Information and Control, 3:80-88, 1960.
13. Simon, H. A.: Reply to "final note" by Benoit Mandelbrot. Information and Control, 4:217-223, 1961.
14. Simon, H. A.: Reply to Dr. Mandelbrot's post scriptum. Information and Control, 4:305-308, 1961.
15. Mandelbrot, B.: A note on a class of skew distribution functions. Analysis and critique of a paper by H. Simon. Information and Control, 2:90-99, 1959.
16. Mandelbrot, B.: Final note on a class of skew distribution functions: analysis and critique of a model due to H. A. Simon. Information and Control, 4:198-216, 1961.
17. Rapoport, A.: Comment: the stochastic and the 'teleological' rationales of certain distributions and the so-called principle of least effort. Behav. Sci., 2:147-161, 1957.
18. Yngve, V.: A model and an hypothesis for language structure. Proc. Am. Phil. Soc., 104:444-466, 1960.
19. Bush, R. R., and Mosteller, F.: Stochastic Models for Learning. New York, Wiley & Sons, 1955.
20. Estes, W. K.: Toward a statistical theory of learning. Psychol. Rev., 57:94-107, 1950.
21. Miller, G. A.: The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychol. Rev., 63:81-97, 1956.
22. Rosenblatt, F.: The perceptron: a probabilistic model for information storage and organization in the brain. Psychol. Rev., 65:386-408, 1958.

PART II — INFORMATION IN BIOLOGICAL SYSTEMS

Moderator: Heather D. Mayor, Ph.D.

CHAPTER IV

GENETIC CONTROL OF PROTEIN SYNTHESIS

Harrison Echols, Ph.D.

INTRODUCTION

Some ten years ago the work of Beadle, Tatum, and Horowitz (1) led to the famous "one gene-one enzyme" hypothesis, which asserted that gene control over cell metabolism is exerted through genetic determination of the structural specificity of enzymes. I would like to discuss our present knowledge and beliefs concerning genetic control of protein synthesis by starting with this concept of the "structural gene" and inquiring into the chemical nature of the gene and into the process by which the gene controls protein specificity. Finally, I shall briefly consider the concept of "regulatory genes" concerned with controlling the rate of action of the structural genes.

CHEMICAL IDENTIFICATION OF GENES

It is now generally accepted that deoxyribonucleic acid (DNA) stores the genetic information of the cell. The evidence for this comes chiefly from work with bacteria and bacterial viruses, and is based primarily on three types of genetic transfer experiments: transformation, virus infection, and bacterial conjugation (2). In transformation experiments purified DNA extracted from one bacterial population has been shown to carry genetic information to another bacterial population. For example, DNA from a strain of Bacillus subtilis which possesses the ability to synthesize the amino acid tryptophan can confer this biosynthetic ability on a strain of B. subtilis which previously could not synthesize tryptophan. Evidence that DNA is the genetic material in a DNA-protein virus comes from studies of the infection of Escherichia coli with bacteriophage T2. Virtually all of the DNA of the virus enters the infected bacterium, and virtually none of the associated protein enters.
Finally, in bacterial conjugation, DNA is transferred from a donor to a recipient strain of E. coli. The amount of DNA transferred is proportional to the number of genes transferred, again suggesting that the DNA carries the genetic information.

There is, then, excellent evidence that DNA is the genetic storage material in bacteria and some viruses (there are ribonucleic acid (RNA) containing viruses in which the RNA has been shown to be the genetic material). The generalization to higher organisms of this picture of DNA as the storehouse of genetic information rests largely upon the observations that the DNA content per cell nucleus is proportional to chromosome number; haploid sperm cells, for example, have one-half the DNA of diploid somatic cells (3). Further, the chromosomal DNA is quite stable metabolically, as befits a genetic storage unit. At present, however, much of our belief in the idea that genes are universally DNA comes from a feeling that nature ought to be universal about such things as the storage and transfer of genetic information, so that what holds true for bacteria should hold true for man.

If we accept DNA as the genetic material, we can then ask how such a molecule stores genetic information. The simplest hypothesis concerning this point follows from a consideration of the chemical structure of DNA. DNA is a polymer of deoxyribonucleotides linked together by phosphate bridges between deoxysugars to give a sugar-phosphate "backbone" with purine and pyrimidine side groups (Fig. 1a). The only topographic feature of this covalent "primary" structure which forms a likely candidate for information storage is the base sequence of the purines and pyrimidines. A consideration of the probable three-dimensional structure of DNA tends to reinforce this view. The Watson-Crick model (4) for DNA structure proposes that the molecule consists of two chains forming a double helix with hydrogen bond pairing between the bases adenine (A) and thymine (T) and between guanine (G) and cytosine (C) (Figs. 1b and 1c). A and T are called complementary bases because of this pairing phenomenon, and, similarly, G and C are complementary. This model is now supported by evidence from a variety of chemical and physical experiments.

Fig. 1. The Structure of DNA. (a) Part of a polynucleotide chain showing the sugar-phosphate backbone with purine (adenine) and pyrimidine (thymine) side groups. (b) Schematic representation of base pairing between the two chains. The sugar-phosphate chains are represented by the parallel vertical lines and the bases by horizontal lines. (c) The double helix. Base pairs are represented by horizontal lines.

Since the double helix model reveals no new irregularities in topography, one feels reasonably confident that the mode of storage of genetic information in DNA is in the linear sequence of the four bases A, G, T, C along the DNA chain. The linear aspect of the information storage mechanism is supported by genetic studies which indicate linearity of the fine-structure genetic map (5), the order of mutations within a genetic region controlling a single metabolic function.

GENES AS DETERMINANTS OF PROTEIN STRUCTURE (The Coding Problem)

We have sketched briefly the evidence that genes are DNA and that the "genetic code" consists chemically of the base sequence of the DNA.
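As a concrete illustration of the pairing rule just described (A with T, G with C), the short sketch below, which is not part of the original text, builds the partner strand of a small DNA sequence position by position and checks that purines equal pyrimidines in the resulting duplex.

```python
# Minimal sketch (illustrative only): Watson-Crick pairing of a DNA strand.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand: str) -> str:
    """Return the base-paired partner of each base, position by position."""
    return "".join(PAIR[base] for base in strand)

strand = "ATGCCGTA"
partner = complement(strand)
print(partner)                                   # TACGGCAT

# In a duplex the totals balance: A+G (purines) equals C+T (pyrimidines).
duplex = strand + partner
assert duplex.count("A") + duplex.count("G") == duplex.count("C") + duplex.count("T")
```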
Let us now discuss how a gene imparts catalytic specificity to an enzyme. Enzymes consist of a linear chain of amino acids (the primary structure), coiled in part into an α-helix (the secondary structure), and folded into a compact and specific three-dimensional structure (the tertiary structure) (Fig. 2).

Fig. 2. The Structure of Protein. (a) Part of a polypeptide chain showing the peptide bonded backbone with side groups characteristic of individual amino acids (here alanine and phenylalanine). (b) Schematic representation of the α-helix showing the hydrogen bonds required to maintain it. (c) The folded polypeptide chain in myoglobin providing the specific three-dimensional structure of the protein (as determined by the x-ray crystallographic work of Kendrew and collaborators) (19).

The working hypothesis for the past few years concerning gene control over protein specificity, usually called the sequence hypothesis (6), states that the base sequence of the DNA specifies the primary structure of the protein — the sequence of amino acids. The original argument was based primarily on two points: first, the base sequence of DNA is linear, and the only corresponding linear object in the protein is the amino acid sequence; second, since proteins differ widely in amino acid composition, it was difficult to see how such differences could arise other than by genetic specificity. The argument is now much stronger. A number of substitutions of one amino acid for another have been found in the abnormal hemoglobins (7), which are presumed to be products of a mutationally altered globin gene. In addition, mutant bacterial strains producing an altered alkaline phosphatase (8) and an altered tryptophan synthetase (9) have been shown to have substituted one amino acid for another. Similarly, a number of substitutions have been described in the tobacco mosaic virus "coat protein" (10).

From the sequence hypothesis, it is a short step to the usual statement of the "genetic coding problem": how the sequence of four bases in DNA specifies the twenty amino acids commonly occurring in protein. The first step toward "solving" the coding problem is really to show that the problem as stated exists — to demonstrate that the base sequence of DNA does specify the amino acid sequence of the protein. On the protein side, evidence that mutations can cause amino acid substitutions has been mentioned. On the DNA side, the determination of nucleotide sequence is not possible at present, but a prediction (or corollary) to the sequence hypothesis has been used to arrive at an experimentally feasible system. This prediction states that the order and relative position of point mutations within the structural gene for a particular protein, presumably reflecting base alterations, should correspond to the order and relative position of amino acid substitutions in proteins produced by these mutated genes. Work on the bacterial enzymes alkaline phosphatase (8) and tryptophan synthetase (9) has shown that two mutations linked genetically affect amino acids in the same region of the respective proteins, so that we can feel some confidence that the sequence hypothesis is correct. Can one determine which bases code which amino acids by this combined genetic and protein chemical approach?
The answer is probably yes, provided that mutagens specific for a single base can be developed and used; but the number of amino acid substitutions which must be accumulated is almost prohibitively large. Recently, a much more direct approach to working out the nature of the genetic code and probably its explicit solution has appeared somewhat unexpectedly on the scene. This approach indicates that the future of the work with mutationally altered proteins probably lies in the realm of 64 Information Storage and Neural Control protein chemistry — in the effect of amino acid substitutions on protein structure and specificity — and in the confirmation of the correctness of the biochemical approach which we shall now consider, rather than in the determination of the code for each amino acid. THE MECHANISM OF PROTEIN SYNTHESIS AND THE BIOCHEMICAL APPROACH TO THE GENETIC CODE The attempt to understand the intei mediate steps by which genetic information is transferred into specific protein structure obviously poses a very interesting biological problem. As recently as a year ago, however, no one would have predicted that a crude cell-free extract of E. coli could be forced, even in principle, to yield precise information about the genetic code. The discovery which revolutionized the coding search and opened what might be called the biochemical approach to the genetic code was the finding of Nirenberg and Matthaei (11) that one could trick the E. coli extract into making a most unnatural protein — the polyamino acid polyphenylalanine — by adding a most unnatural piece of genetic material — the polyribonucleic acid of uridylic acid (poly U). To explain the significance of this experiment, it is necessary first to describe briefly present ideas on the mechanism of protein synthesis. There is believed to be a flow of information from DNA through RNA to protein involving three classes of RNA: ribosomal RNA, transfer RNA, and "messenger" RNA. Chemically, all of these RNA's are polymers with a sugar-phosphate backbone like DNA, but with ribose sugar instead of deoxyribose, and with the base uracil (U) instead of thymine (T). Ribosomal RNA exists in the cell in cytoplasmic ribonucleo- protein particles (ribosomes), which are generally considered to be the cellular sites of protein synthesis (12). Messenger RNA is assumed to carry the genetic information detailing the specific amino acid sequence of the protein from the DNA to the ribosome. Presumably the messenger RNA binds to the non-specific ribosome (probably to ribosomal RNA) and serves as the information bearing "template" for protein synthesis (13). Transfer RNA's bind amino acids specifically (with the aid of enzymes). They are Genetic Cotitrol of Protein Synthesis 65 thought to carry amino acids to the ribosome-messenger complex and to act as an "adapter" to position amino acids in the proper place for their polymerization into specific proteins (12). The transfer RNA presumably "recognizes" the code for a particular amino acid in the messenger RNA in order to provide specific positioning of the amino acid. The evidence that transfer RNA is an intermediate in protein synthesis is very good, at least in in vitro systems, and there is strong experimental support for the idea that ribosomes are the site of protein synthesis from both in vivo and in vitro studies (12). 
The question of whether there is a distinct messenger RNA loosely attached to nonspecific ribosomes, or whether the genetically specific RNA is built into ribosomes as they are synthesized, giving specific ribosomes, is still a matter of some controversy. In the case of virus infected E. coli, there is strong evidence favoring the loosely bound messenger view (14). At present the model described (and shown schematically in Figure 3) is the most adequate to explain existing experimental results.

Fig. 3. Schematic representation of the normal protein synthesizing system (on the left) and the synthetic system (on the right). DNA has been removed from the synthetic system by the enzyme deoxyribonuclease and the synthetic messenger poly U replaces the normal messenger RNA.

One can imagine a simple base-pairing mechanism by which all of this can occur. The messenger RNA may be synthesized with a DNA primer by a base-pairing, enzyme-catalyzed process which produces a "complementary" copy or translation of the DNA in which each base in the RNA is the complement to each base in the DNA. For example, the sequence ATGC in DNA would be translated into UACG in the RNA because U, replacing T in RNA but having similar base pairing properties, will form a hydrogen-bonded base pair with A, A with T, C with G, and G with C. An enzyme has been found which appears to catalyze this process. Messenger RNA may bind to ribosomal RNA by means of rather general regions of base complementarity. Finally, transfer RNA need only have a base sequence complementary to the messenger RNA base code for its particular amino acid to fulfill its function, since pairing of the complementary bases will correctly position the amino acid. If the DNA sequence AAA codes the amino acid phenylalanine, then the messenger RNA will have the complementary sequence UUU and the transfer RNA for phenylalanine a sequence AAA. The UUU sequence in the messenger RNA will pair with the AAA sequence in the transfer RNA to provide for the insertion of phenylalanine into its genetically determined site in the protein (Fig. 4). The gene DNA and its messenger RNA are equivalent in informational content, since one is a direct translation of the other.

Fig. 4. Hypothetical base pairing scheme for protein synthesis. Poly U is shown in its role of messenger. The poly U chain binds loosely to a segment of ribosomal RNA flaring out from the "protein surface" of the ribosome. Transfer RNA for phenylalanine is presumed to contain an AAA sequence complementary to the UUU of the poly U and therefore "positions" a sequence of phenylalanines for polymerization into polyphenylalanine.

The implication of the Nirenberg experiment is that polyuridylic acid is the messenger RNA for polyphenylalanine and that a sequence of U is the messenger RNA code for phenylalanine (or a sequence of A is the DNA code). The "synthetic" and "normal" systems are compared in Figure 3. Since there exists an enzyme, discovered by Ochoa and Grunberg-Manago (15), which will catalyze the random synthesis of ribonucleotides into a polymer, there is now a very powerful tool available for investigating the genetic code.
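The complementary-copying rules just outlined can be summarized in a few lines of code. The sketch is illustrative only; the dictionaries simply encode the pairings stated in the text (A to U, T to A, G to C, C to G for DNA into RNA, and A with U, G with C within RNA).

```python
# Illustrative sketch (not from the original): DNA -> messenger RNA copying,
# and the pairing of a messenger triplet with its transfer-RNA sequence.
DNA_TO_MRNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
RNA_PAIR    = {"A": "U", "U": "A", "G": "C", "C": "G"}

def transcribe(dna: str) -> str:
    """Complementary copy of a DNA strand into messenger RNA."""
    return "".join(DNA_TO_MRNA[b] for b in dna)

def trna_sequence(mrna_triplet: str) -> str:
    """Transfer-RNA sequence complementary to a messenger triplet."""
    return "".join(RNA_PAIR[b] for b in mrna_triplet)

print(transcribe("ATGC"))        # UACG, as in the text's example
print(transcribe("AAA"))         # UUU, the poly U code for phenylalanine
print(trna_sequence("UUU"))      # AAA, the transfer RNA that positions it
```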
For example, a mixed polymer of A and U provides for sequences of AAA, AAU, AUA, UAA, AUU, UAU, UUA, and UUU, choosing only triplets for purposes of illustration. (It should be noted that at least three bases per amino acid are required if four bases are to specify twenty amino acids.) If poly AU is added as a synthetic messenger, then amino acids coded by the above triplets will be incorporated into a polypeptide chain. Even if some of the triplets are "nonsense" in that they do not specify an amino acid, by using a large excess of U some polypeptide formation can be assured by providing a polyphenylalanine "handle" so that those triplets which spell an amino acid will not be lost. This approach has been pursued very successfully by the Ochoa and Nirenberg groups to describe the most probable code letter for fourteen of the twenty amino acids (16, 17). To carry out the synthetic messenger experiment, the coli extract is first treated ("preincubated") to remove existing messenger RNA. Existing DNA is removed by the enzyme deoxyribonuclease so that new messenger RNA cannot be synthesized. Then synthetic messenger RNA is added, and the amount of C¹⁴ amino acid incorporated into protein-like material (insoluble in trichloroacetic acid) is determined by radioactivity measurements. Any significant incorporation of a C¹⁴ amino acid, using the UA polymer as the messenger RNA, implies that the code for that amino acid consists of some combination of A and U or of a sequence of A. One can then hope to separate a 2U1A from a 1U2A or a 3A code by determining the ratio of the observed incorporation of a given amino acid to that of phenylalanine and comparing this ratio with that expected for the calculated number of 3U, 2U1A, 1U2A, and 3A sequences (using a polymer with U in large excess so that the numbers will be quite different, and assuming that 3U is the code for phenylalanine).

The way to complete the determination of the genetic code by discovering the actual sequence of bases is also clear in principle using the biochemical approach. It should be possible to add small, known ribonucleotide sequences to poly U enzymatically and to use these messengers to produce polyphenylalanine plus the amino acids coded by these sequences (if any). Unless there are some large surprises lurking around the corner, the genetic code for E. coli may well be officially solved within the next three years or so. There remains the question of whether the coli code is common to all organisms, although most of the limited information available argues for universality. Even if the code is different in higher organisms, the techniques evolved for the coli system should be generally applicable. All that is needed is a crude, cell-free, protein-synthesizing system plus the proper synthetic messenger to trick the system.
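The incorporation-ratio argument described above can be sketched numerically. The 5:1 U:A input ratio below is an assumption chosen only for illustration, as is the assignment of UUU to phenylalanine; the point is simply that, with U in large excess, the expected frequencies of the 3U, 2U1A, 1U2A, and 3A triplet classes are well separated.

```python
from itertools import product

# Illustrative sketch (not from the original) of the incorporation-ratio
# comparison described above, assuming a random A,U copolymer made with
# U and A in a 5:1 ratio and read as nonoverlapping triplets.
p = {"U": 5 / 6, "A": 1 / 6}      # base frequencies in the copolymer

freq = {}                         # expected frequency of each composition class
for triplet in map("".join, product("UA", repeat=3)):
    prob = p[triplet[0]] * p[triplet[1]] * p[triplet[2]]
    cls = f"{triplet.count('U')}U{triplet.count('A')}A"
    freq[cls] = freq.get(cls, 0.0) + prob

uuu = p["U"] ** 3                 # the single 3U triplet, assumed to code phenylalanine
for cls in ("3U0A", "2U1A", "1U2A", "0U3A"):
    print(cls, "expected frequency relative to UUU =", round(freq[cls] / uuu, 3))
# -> 1.0, 0.6, 0.12, 0.008: the classes are well separated when U is in excess.
```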
CONTROL OF THE RATE OF PROTEIN SYNTHESIS (The Regulatory Problem)

The process by which genetic information is converted into protein specificity is rapidly becoming spelled out, and the complete unraveling of the exact nature of the genetic code providing this specificity of protein structure is on the horizon. However, the genetic control necessary to provide for the adaptive skill of the microorganism and for the much more complicated growth pattern of the differentiated organism cannot be accounted for simply by the ability of genes to control protein structure. The structural gene, structural messenger, and ribosome constitute a protein factory, always working at the same rate for all proteins. It seems obvious that there must exist regulatory genes involved in turning on and off the structural genes and in varying the enzyme complement of the cell.

Recent work with bacteria has shown the existence of genes which serve to control the rate of protein synthesis in response to changes in external conditions. Normally, the production of the lactose-hydrolyzing enzyme β-galactosidase by the bacterium E. coli can vary roughly a thousand-fold, up to a maximum of some 6 per cent of the cellular protein, if a β-galactoside, an "inducer," is present in the growth medium. Mutants have been isolated which have lost this control of the rate of enzyme synthesis. Jacob and Monod (13) have divided these "constitutive" mutants into two genetically and functionally distinct classes designated i⁻ and oᶜ. By studying the dominance properties of these mutations in partially diploid strains carrying both i⁺ and i⁻ and both o⁺ and oᶜ (both inducible and constitutive genetic structures), Jacob and Monod have developed a model of the control process (Fig. 5). This model proposes that a "repressor" material is made under the control of the i gene, which they call a regulator gene, and that this repressor binds to a site near the structural gene, the O or operator gene, preventing formation of the structural messenger.

Fig. 5. The model proposed by Jacob and Monod for the mechanism controlling the rate of action of the structural gene. A repressor material is made under the control of the DNA of the regulator gene. This repressor material acts (after possible metabolite activation) by binding to a DNA site adjacent to the structural gene or genes subject to rate control by the repressor and preventing the formation of structural messengers.

In this model the regulation is negative; genes are normally functional and are turned off by a repressor. Similar analysis of the system controlling alkaline phosphatase synthesis (18) has indicated that a gene involved in the regulation of this enzyme can have a positive effect, i.e., can be involved in turning on a gene to its full capacity. Therefore, the generality of the model proposed for the β-galactosidase system is not at present established; but the essential feature of the model — the proposed existence of specific gene products which exert a controlling influence over the rate of synthesis of the structural messenger RNA — is very appealing. It serves as a valuable guide to future experimental efforts aimed at trying to understand the control process at a chemical level.

CONCLUDING COMMENTS

We have considered: 1) the chemical nature of the gene; 2) the "sequence hypothesis" which serves as the basis for our definition of the genetic coding problem; 3) the evidence supporting the sequence hypothesis from combined genetic and chemical studies; 4) the recent rather dramatic progress of the biochemical approach; and, finally, 5) the problem of regulation. We cannot at present unequivocally separate fact from fancy.
However, the evidence now extant certainly favors our main conclusions: 1) that the genetic information of an organism is contained in the base sequence of its DNA; 2) that the base sequence of the DNA of "structural genes" specifies the amino acid sequence of proteins; 3) that an RNA "messenger" carries the genetic information from the structural gene to the ribosome for protein synthesis; and, finally, 4) that the base sequence of the DNA of certain "regulator genes" specifies a material which exerts a controlling influence over the rate of protein synthesis. It should be emphasized, however, that most of the evidence for these conclusions comes from work with microorganisms and that the generalization to higher organisms is chiefly an act of faith.

REFERENCES

1. Horowitz, N. H.: Biochemical genetics of Neurospora. Advances in Genetics, 3:33, 1950.
2. Levinthal, C.: Coding aspects of protein synthesis. Revs. Mod. Physics, 31:249, 1959.
3. Ris, H.: The Chemical Basis of Heredity, ed. by McElroy and Glass. Baltimore, Johns Hopkins Press, 1957.
4. Watson, J. D., and Crick, F. H. C.: The structure of DNA. Cold Spr. Harb. Symp. Quant. Biol., 18:123, 1953.
5. Benzer, S.: On the topology of the genetic fine structure. Proc. Nat. Acad. Sci., Wash., 45:1607, 1959.
6. Crick, F. H. C.: On protein synthesis. Symp. Soc. Expt. Biol., 12:138, 1958.
7. Ingram, V. M.: Hemoglobin and Its Abnormalities. Springfield, Thomas, 1961.
8. Rothman, F.: Cold Spr. Harb. Symp. Quant. Biol., 26, 1961, in press.
9. Yanofsky, C., Helinski, R., and Maling, B.: Ibid.
10. Wittman, H. G.: Comparison of the tryptic peptides of chemically induced and spontaneous mutants of tobacco mosaic virus. Virology, 12:609, 1960.
11. Nirenberg, M., and Matthaei, J. H.: The dependence of cell-free protein synthesis in E. coli upon naturally occurring or synthetic polyribonucleotides. Proc. Nat. Acad. Sci., Wash., 47:1588, 1961.
12. Berg, P.: Specificity in protein synthesis. Ann. Rev. Biochem., 30:293, 1961.
13. Jacob, F., and Monod, J.: Genetic regulatory mechanisms in the synthesis of proteins. J. Mol. Biol., 3:318, 1961.
14. Brenner, S., Jacob, F., and Meselson, M.: An unstable intermediate carrying information from genes to ribosomes for protein synthesis. Nature, 190:576, 1961.
15. Grunberg-Manago, M., and Ochoa, S.: Enzymatic synthesis and breakdown of polynucleotides; polynucleotide phosphorylase. J. Am. Chem. Soc., 77:3165, 1955.
16. Lengyel, P., Speyer, J. F., Basilio, C., and Ochoa, S.: Synthetic polynucleotides and the amino acid code, IV. Proc. Nat. Acad. Sci., Wash., 48:282, 1962.
17. Martin, R. G., Matthaei, J. H., Jones, O. W., and Nirenberg, M. W.: Ribonucleotide composition of the genetic code. Biochem. and Biophys. Research Comm., 6:410, 1962.
18. Garen, A., and Echols, H.: Properties of two regulating genes for alkaline phosphatase. J. Bact., 83:291, 1962.
19. Kendrew, J. C., Dickerson, R. E., Strandberg, B. E., Hart, R. G., Davies, D. R., Phillips, D. C., and Shore, V. C.: Structure of myoglobin. A three-dimensional Fourier synthesis at 2 Å resolution. Nature, 185:422, 1960.

DISCUSSION OF CHAPTER IV

Mike McGlothlen (Houston, Texas): What about suppressor genes, where you have a mutation of the structural gene and then a counter-mutation of the type that causes the still mutated structural gene to produce normal enzymes?
Harrison Echols (Madison, Wisconsin): The theory which is now usually advanced to explain these suppressor mutations is that a suppressor is a mutation which has affected the translation mechanism; i.e., it has perhaps affected the ability of the soluble RNA to bind the correct amino acid. The soluble RNA then makes mistakes which partially rectify the mutational mistake. For example, suppose that the original change in the protein was a substitution of the amino acid alanine for glycine and that the suppressor mutation is such that some of the time, in protein synthesis, glycine is put back in place of alanine. In this case, you would now get a reduced level of the original premutation type of protein. Certain suppressors may also involve a change in the concentration of some cell constituents, which leads to the activation of a mutationally altered protein. McGlothlen: Would you care to say anything about the origin of the secondary and tertiary structure of proteins? Presumably, the sequence of amino acids is controlled by sequences in DNA, but what about the folding, etc., that produces the active un- denatured form of an enzyme? Echols: We think that this comes about purely from a deter- mination of the primary structure. The secondary structure is a matter of solution thermodynamics. A repeating chain of amino acids forms an alpha helix if the solvent is not too hard on hydrogen bonds. To get the specific three-dimensional structure is a tougher problem. However, we can imagine that as the newly synthesized protein comes off the ribosome, there are regions of the protein which are capable of bonding and are in very close proximity to each other. There is actually some evidence that the primary structure does determine the three-dimensional structure of the protein ribonuclease. This derives from experiments in which the protein is unfolded and then caused to fold again. One can break all four of the disulfide bonds by reduction to SH, and unwind the protein into a completely random coil. One would expect, Genetic Control of Protein Synthesis 73 just by considering random re-formation of four disulfide bonds, tliat 105 possible alternative forms of the protein should exist. However, what is found is that oxidation to bring back the disulfide bonds produces something like 90 per cent enzymatically active protein. So even though the system is far from physiological, with this protein, at least, one appears to get the total three-dimensional structure purely from the primary structure. Arthur Shapiro (New York, New York): A protein is, of course, made up primarily of amino acid chains; but most pro- teins, particularly the specific ones— enzyme proteins — do contain polysaccharides and do contain very specific, important, and critical terminal groups. Is it the general notion now that the DNA chain contains the information that determines these non- amino acid fractions in the protein molecule as well, or is this supposed to be a different kind of thing? If so, is there any clue as to what? Echols; In general, as proteins and enzymes have been purified more carefully and more successfully, they have been found in most cases to contain nothing except amino acids. I would feel that the appearance of a sugar group or some other moiety attached to a protein would either be a nonspecific accident or the result of a specific site built into the amino-acid-determined structure of the protein. 
In other words, I think the complete specificity of the protein comes about because the DNA specifies the amino acid sequence. Frank Morrell (Palo Alto, California): We know of some agents that can alter DNA, such as x-ray, etc. Could you elaborate on the sorts of agents that can selectively alter base sequence in RNA? Echols: One which is widely used is nitrous acid. This removes the amino groups from bases, producing a change of the base cytosine into the base uracil in RNA. Morrell: What is the consequence of alteration of the base sequence in RNA without simultaneous alteration of DNA? Echols: Mutagenic agents which aff'ect RNA but not DNA are not known. Thus, in treating a bacterium with a mutagenic chemical, both the DNA and the RNA are involved. To my knowledge, no one has been able to purify a messenger RNA. 74 Information Storage and Neural Control Until this is done, it will not be possible to study the effects of specific agents on the RNA and on the resulting proteins. The only kinds of messenger RNA's which are available are the syn- thetic ones. James E. Darnell, Jr. (Cambridge, Massachusetts): The viral RNA's were the first to be treated with deaminating agents. Schuster's work with tobacco mosaic virus (TMV), which showed that deamination of cytosine resulted in mutation, confirmed the fact that the chemical change is preserved in the progeny particles. The deamination, which results in mutation of TMV particles and change in the protein code, is assumed to be preserved from the initial change in the RNA of the virus. If one considers viral RNA's as messengers, which they are, then this type of messenger, at least, can be treated with mutagens, and the residual damage, if you will, can be preserved. Echols: I suppose I am being unfair to the viral RNA's, although it has not been clearly established that the viral RNA functions directly as a messenger. Heather D. Mayor (Houston, Texas): There are clear in- dications from viral RNA that the seat of genetic information can be RNA as well as DNA. I think there is quite good evidence that an RNA virus can act as its own messenger. I should like to ask Dr. Echols if he has any information from mutations on closely positioned bases in DNA. Is there any evidence that the code is indeed a triplet sequence rather than, say, a multiple of six? In fact, are there any data indicating that six bases could represent the fundamental unit of the code? Echols: I think that there is no compelling evidence, even taking Crick's work into consideration, defining the size of the coding unit. However, the work of Nirenberg and Ochoa cer- tainly suggests that the number of bases which code an amino acid cannot be an exceedingly large number. If a large number were required, the only thing which should promote incorporation of most amino acids would be a polymer containing all four bases. Mayor: If you have four bases and a triplet code, you could possibly get codes for sixty-four different amino acids instead of the twenty we know to be involved. Do you think that dif- ferent combinations of bases may code the same amino acids? Genetic Control of Protein Synthesis 75 Echols: The work which Nirenberg and Ochoa have done suggests that there may be a degeneracy in the code; i.e., there may be more than one coding unit which specifies a given amino acid. They only use polymers containing uracil, but they have already worked up to nineteen or twenty amino acids. 
This suggests that there is either a degeneracy in the code or a rather surprising selection for "sense" codes containing uracil. Also, incorporation of at least one amino acid is promoted by polymers containing different coding units.

CHAPTER V

CODING BY PURINE AND PYRIMIDINE MOIETIES IN ANIMALS, PLANTS, AND BACTERIA*

Saul Kit, Ph.D.

*This investigation was aided in part by grants from the American Cancer Society, The Leukemia Society, Inc., and The National Cancer Institute (CY-4064, CY-4238).

INTRODUCTION

The transfer of genetic information in biological systems may be represented by the following schematic diagram:

[Figure 1: schematic diagram showing the DNA of a parental cell replicated and distributed to the filial cells.]

Implicit in this schematic diagram are three concepts: 1) that genetic information is coded in deoxyribonucleic acids (DNA); 2) that the transfer of genetic information from parental to filial cells involves the replication of the DNA and distribution of "equal" amounts to the daughter cells; and 3) that the expression of genetic potential within a cell involves the transcription of information from DNA to "Informational" ribonucleic acids (RNA) and the translation of the Informational-RNA for specific protein synthesis.

The key to the understanding of these concepts is the model for the structure of DNA proposed in 1953 by Watson and Crick (77). Watson and Crick suggested that DNA consists of two helical polynucleotide chains of opposite polarity which are twined round one another (Fig. 2). Each chain is built from four mononucleotide units which are joined together by 3', 5' phosphodiester bonds. Each of the four mononucleotide units consists of either a purine or a pyrimidine base connected in nucleoside linkage to deoxyribose-5'-phosphate (Fig. 3). The purine and pyrimidine bases are adenine (A), guanine (G), cytosine (C), and thymine (T) (Fig. 4). In addition, methyl cytosine (MC) may partly replace cytosine in the DNA of certain plant and animal cells, and glucosylated hydroxymethylcytosine may replace cytosine in the DNA of the T-even bacteriophages.

Fig. 2. Simplified model of the DNA double helix showing hydrogen bonding and the DNA strands of opposite polarity. P = orthophosphate; S = deoxyribose; A = adenine; T = thymine; G = guanine; C = cytosine.

Fig. 3. Chemical formulas for the pentose phosphates (ribose phosphate and deoxyribose phosphate) found in DNA and RNA.

Fig. 4. Chemical formulas for the purine and pyrimidine bases found in DNA and RNA: adenine (A), guanine (G), cytosine (C), uracil (U), thymine (T), and methyl cytosine (MC).

The four bases A, G, C, and T are the symbols of the genetic alphabet, just as dot and dash are the symbols of the Morse code. Triplets of bases, such as TTT or TGC, may be the letters of the genetic alphabet, and each triplet may specify a particular amino acid of a protein chain. Thus, the sequence of triplets along a polynucleotide chain would determine the amino acid sequence of a protein.
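A trivial enumeration, added here for illustration and not part of the original chapter, shows that a four-letter alphabet read in triplets supplies sixty-four possible code letters, comfortably more than the twenty amino acids to be specified.

```python
# Illustrative sketch: with four base symbols read in triplets there are
# 4**3 = 64 possible "letters" of the genetic alphabet.
from itertools import product

bases = "AGCT"
triplets = ["".join(t) for t in product(bases, repeat=3)]
print(len(triplets))        # 64
print(triplets[:5])         # ['AAA', 'AAG', 'AAC', 'AAT', 'AGA']
```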
The two DNA chains are held together by hydrogen bonds between the bases, each base being joined to a companion base on the other chain (Fig. 5). The pairing of the bases is specific, adenine (A) going with thymine (T) and guanine (G) going with cytosine (C). The phosphate groups of the DNA chains are accessible to hydrogen or hydroxyl ions and to dyes and are, therefore, on the outside, whereas the bases occur opposite one another on the inside. From x-ray diffraction studies, it has been deduced that there is a succession of flat nucleotides spaced 3.36 Å apart and standing out perpendicular to the fiber axis. The structure is relatively rigid and serves as a template for either its own replication or for the replication of "Informational" RNA.

Fig. 5. Hydrogen bonding between deoxyadenylic acid and thymidylic acid and between deoxyguanylic acid and deoxycytidylic acid.

Plausible mechanisms for DNA replication and for spontaneous mutations were embodied in the proposals of Watson and Crick (77). These mechanisms are strongly supported by a large number of experiments. According to the proposal for DNA replication, a twin stranded DNA molecule partially unwinds, and each base attracts a complementary free nucleotide already available for polymerization within the cell. These free nucleotides, whose phosphate groups already possess the free energy necessary for polyesterification, then link up with one another, after being held in place by the parental template chains, to form a new polynucleotide molecule of the required nucleotide sequence. Thus, each DNA strand serves as a template for the synthesis of a complementary strand. The replication process can be schematically represented as follows:

    Parental DNA duplex      DNA chains after unwinding      New DNA duplexes

    -A-C-T-G-                -A-C-T-G-              -->      -A-C-T-G-
     : : : :                                                  : : : :
    -T-G-A-C-                                                -T-G-A-C-

                             -T-G-A-C-              -->      -A-C-T-G-
                                                              : : : :
                                                             -T-G-A-C-

It is a corollary of the Watson-Crick hypothesis that a change of one or a few nucleotides in the DNA sequence will be mutagenic. Mechanisms for spontaneous mutation and for experimentally induced mutations have been suggested on the basis of this concept. Watson and Crick (77) pointed out that the specificity in DNA structure (adenine pairing with thymine and guanine with cytosine) results from the assumption that each of the bases possesses one tautomeric form which is very much more stable than any of the other possibilities. The fact that the compound is tautomeric, however, means that the hydrogen atoms can occasionally change their location. Thus, a spontaneous mutation might be caused by a base occurring very occasionally in one of the less likely tautomeric forms at the moment when the complementary chain is being formed. For example, whereas adenine will normally pair with thymine, if there is a tautomeric shift of one of its hydrogen atoms, it can pair with cytosine. The next time pairing occurs, the adenine (having resumed its more usual tautomeric form) will pair with thymine, but the cytosine will pair with guanine, and so a change in the sequence of bases will have occurred.

Mutations may also be induced by chemical agents. Let us consider two categories of chemically induced mutations: 1) those which result from the conversion of one nucleotide base in a DNA chain to another nucleotide base (transition), and 2) those which result from the deletion of a base from the chain.
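Before turning to specific mutagens, the template-copying rule diagrammed above can be stated as a short sketch (illustrative only, not part of the original text): each parental strand attracts its complementary bases, so a single duplex gives rise to two daughter duplexes of identical sequence.

```python
# Minimal sketch of the template-copying rule shown in the schematic above.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def replicate(duplex):
    """Unwind a (strand, strand) duplex and copy each strand by base pairing."""
    top, bottom = duplex
    return [(top, "".join(PAIR[b] for b in top)),
            ("".join(PAIR[b] for b in bottom), bottom)]

parent = ("ACTG", "TGAC")
for daughter in replicate(parent):
    print(daughter)          # both daughters read ('ACTG', 'TGAC')
```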
The conversion of cytosine to thymine may be effected by adding nitrous acid to cells (58). Cytosine is deaminated by nitrous acid to uracil. When DNA which has been deaminated in this way undergoes replication, the uracil will attract adenine during the complementary base pairing. The next time pairing occurs (second cycle of replication), the adenine which had paired with uracil will now pair with thymine. Hence, following two cycles of DNA replication, the original cytosine-guanine base pair will have been converted to a thymine-adenine base pair. A mutagenic change from thymine to cytosine may be induced in cells by the use of the thymidine analog, bromodeoxyuridine (21). A nucleotide base change in the DNA chain expresses itself as an amino acid change in the protein chain whose synthesis is controlled by the altered DNA.

A third mutagenic agent, nitrogen mustard, may alkylate some of the guanine groups of the DNA chain at the N(7) position. Two stages of degradation follow: 7-alkylguanine splits off, and a slow fission of the sugar phosphate chain follows (41). If, in the process of replication of the DNA which has been exposed to nitrogen mustard, the guanine of the template is skipped, the resulting sequence of nucleotide bases in the daughter chain will be altered. Another mutagen which apparently acts by deleting a base from the DNA chain is proflavin, an acridine derivative. A series of T-4 bacteriophage mutants induced by proflavin have been studied by Crick et al. (13). The mutants in almost all cases manifest a complete inactivation of the function of the gene.

Equations 1 through 3 depict schematically some current ideas about genetic coding (13). Each chain is read as nonoverlapping triplets from a fixed starting point at the left:

    [1]  TGC TGC TGC TGC TGC TGC TGC - - -
         Normal DNA chain.

    [2]  TGC TGT TGC TGC TGC TGC TGC - - -
         Conversion of C to T by nitrous acid (gene may still be functional; one amino acid in the protein chain is changed).

    [3]  TGC TCT GCT GCT GCT GCT GC - - -
         G deleted from the chain by proflavin or nitrogen mustard (gene inactivated).

It is assumed that a triplet of three nucleotide bases (for example, TGC) codes each particular amino acid in the protein chain. It is further assumed that the DNA chain is translated from a fixed starting point and that the genetic code is nonoverlapping. If one base in the chain is altered (for example, the change of C in the second triplet of Equation 1 to T in the second triplet of Equation 2), only one amino acid in the resulting protein chain will be altered; that is, the amino acid coded by TGT replaces the amino acid coded by TGC. The protein with the changed amino acid may still be functional or partly functional. If a nucleotide base is deleted following nitrogen mustard or proflavin treatment, the gene is inactivated. In Equation 3, it is seen that G has been deleted from the second triplet of Equation 1. With a nonoverlapping code, the second triplet now becomes TCT, the third triplet becomes GCT, etc. In other words, all triplets from TCT on are changed. Thus, all amino acids in the protein chain which are coded by the second triplet to the last triplet are changed, and the new protein chain cannot function. As a result, the gene controlling the synthesis of that protein has been inactivated.
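The contrast drawn in Equations 1 through 3 between a base transition and a base deletion can be reproduced with a short sketch; it is not part of the original text, and the repeating TGC chain is used only because it matches the triplets discussed above.

```python
# Illustrative sketch: a single-base transition alters one triplet, while a
# single-base deletion shifts every triplet downstream of the change.
def triplets(chain):
    """Read a DNA chain as nonoverlapping triplets from a fixed start."""
    return [chain[i:i + 3] for i in range(0, len(chain) - 2, 3)]

normal     = "TGC" * 7                        # Equation 1: a repeating chain
transition = normal[:5] + "T" + normal[6:]    # Equation 2: C of triplet 2 -> T
deletion   = normal[:4] + normal[5:]          # Equation 3: G of triplet 2 removed

print(triplets(normal))      # ['TGC', 'TGC', 'TGC', ...]
print(triplets(transition))  # only the second triplet (now TGT) differs
print(triplets(deletion))    # ['TGC', 'TCT', 'GCT', ...]: all later triplets shift
```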
For a more detailed discussion of mutation mechanisms at the chemical level, the reader is referred to the papers of Freese (21), Lawley (41) and Crick et al. (13).

It is apparent that a knowledge of the number of DNA molecules in a given cell and of the entire nucleotide sequence of each molecule, along with the code by which DNA and RNA sequences are translated to the amino acid sequences of proteins, would suffice as a "blue print" for describing any organism. Such a total description is, of course, not available to us as yet. We do, however, know certain characteristics of the DNA of many species of bacteria, plants, animals, and viruses. The amount of DNA per cell is known in many instances. This, in a sense, tells us how "thick" each "genetic book of instructions" is. We also have knowledge of the average nucleotide composition of the DNA of different species, that is, of how frequently the "alphabet symbols" are repeated in each book. There is also some knowledge of the range of composition within a particular cell. These topics will be discussed in the second section of this paper. In the third section of this paper, I shall consider the composition of RNA molecules and the proposed mechanisms for transcribing information from DNA to RNA. Finally, I shall briefly touch on the translation of the DNA-RNA code to amino acid sequences of proteins.

THE DEOXYRIBONUCLEIC ACIDS (DNA)

Amount of DNA per Cell

The amount of DNA per cell varies greatly from the simplest to the most complex organisms. Some representative values are shown in Table I. Mammals, reptiles and amphibians often contain about six picograms of DNA per cell; but many fish and birds contain only about a third of this amount. Bacteria and fungi cells have about 1/100 the amount of DNA found in the higher animals. Larger viruses such as vaccinia and T-even bacteriophage have about 1/10,000 the amount of DNA per particle, and the smallest viruses, such as bacteriophage ΦX174 and the Shope papilloma virus, have one millionth the amount of DNA per particle found in the cells of higher animals.

TABLE I
DNA Content Per Cell of Various Species
(All Values Expressed as Picograms DNA Per Cell or Virus Particle)

                                        DNA per cell     Reference
    ΦX174 phage                         2.6 x 10⁻⁶       (65, 66)
    T-2 phage                           2 x 10⁻⁴         (66)
    Rabbit papilloma virus (Shope)      6.6 x 10⁻⁶       (56)
    Vaccinia virus                      3 x 10⁻⁴         (56)
    E. coli B (log phase)               0.0137           (26)
    Clostridium                         0.0245           (75)
    Yeast (diploid)                     0.05             (50)
    Neurospora                          0.017            (47)
    Fish
      Shark                             5.46             (75)
      Sturgeon                          3.2              (75)
      Carp                              3.49             (75)
      Perch                             1.9              (75)
      Catfish                           1.89             (75)
      Barracuda                         1.37             (75)
    Amphibians
      Frog                              15.0             (75)
      Toad                              7.33             (75)
    Reptiles
      Green turtle                      5.27             (75)
      Wood turtle                       4.92             (75)
      Alligator                         4.98             (75)
    Birds
      Domestic fowl                     2.34             (75)
      Guinea hen                        2.27             (75)
    Mammals
      Man                               6.8              (75)
      Rabbit                            6.5              (75)
      Rat                               5.7              (75)
      Mouse                             5.0              (75)

Since the molecular weight of Shope papilloma virus is about 4 x 10⁶ (78) and the molecular weight of an average nucleotide base pair is about 600, it is obvious that Shope papilloma virus has a total of about 4 x 10⁶/6 x 10², or 6600, nucleotide base pairs in its DNA. Vaccinia virus has roughly 300,000 base pairs; bacteria have roughly 20 x 10⁶ base pairs, and mammalian cells a total of about 7 x 10⁹ base pairs.
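The base-pair figures quoted above follow from simple arithmetic, sketched below for illustration. The 600 molecular weight per base pair is the value given in the text; the picogram conversion uses Avogadro's number, and the results land close to the rounded numbers quoted.

```python
# Quick check (illustrative only) of the base-pair counts quoted above.
AVOGADRO = 6.02e23
MW_BASE_PAIR = 600          # g/mol per base pair, as given in the text

def base_pairs_from_mw(molecular_weight):
    """Base pairs in a DNA molecule of the given molecular weight."""
    return molecular_weight / MW_BASE_PAIR

def base_pairs_from_picograms(pg_dna):
    """Base pairs corresponding to a DNA mass in picograms."""
    grams = pg_dna * 1e-12
    return grams * AVOGADRO / MW_BASE_PAIR

print(round(base_pairs_from_mw(4e6)))             # Shope papilloma: ~6,700
print(f"{base_pairs_from_picograms(0.0137):.2e}") # E. coli, 0.0137 pg: ~1.4e7
print(f"{base_pairs_from_picograms(6.8):.2e}")    # man, 6.8 pg: ~6.8e9
```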
If there are no restrictions as to the proportion in which base pairs occur or as to the sequence in which they occur, the number of different DNA molecules possible is 4ⁿ, where n is the number of base pairs. Thus it is clear that DNA provides an adequate basis for gene specificity. The increase in the relative amount of DNA from the lowest to the highest forms of life reflects the need for an increasing number of genetic units for embryogenesis and differentiation and for various regulatory mechanisms.

How Large are DNA Molecules?

The molecular weight of Shope papilloma virus is about 4 x 10⁶ (78). The weight average molecular weights of most of the DNA preparations which have been studied are about 5-14 x 10⁶. However, the molecular weight of DNA may actually be much greater than this. Very high molecular weight DNA has been isolated from the T-even bacteriophages, and it is possible that the entire genome of the T-even bacteriophages consists of one long DNA chain having a molecular weight of 90 x 10⁶ to 150 x 10⁶ (15). There is reason to believe that very long DNA molecules are partially fragmented to smaller pieces when they are isolated from tissues and viruses.

Average Composition of DNA

The average nucleotide base composition of DNA molecules may be measured by hydrolyzing the DNA and then measuring the nucleotide bases after they have been resolved by paper chromatography, ion exchange chromatography, or paper electrophoresis. To the extent that a mixture of DNA molecules can be partially separated, the distribution of base compositions among the molecules of the mixture can also be estimated. Two other important methods are available for measuring the molar nucleotide composition of DNA: 1) equilibrium sedimentation in CsCl density gradients; and 2) the estimation of the melting temperature (Tm) of DNA by a study of the change in DNA absorption as a function of temperature (16).

[Table II, giving the molar (G+C) content of the DNA of individual bacterial species, is not legible in this copy and is omitted.]
The latter two methods have the advantages that less material is required to analyze the DNA and that an estimate of the variation from the mean of nucleotide base composition can be made. Of the three methods, density gradient centrifugation has the highest accuracy, requires the least DNA per experiment, and permits the detection of DNA molecules of unusual base composition even where the latter comprise less than 5 per cent of the total DNA. A further discussion of density gradient centrifugation will be presented later.

Since DNA is double stranded* and the guanine and adenine of each chain are paired, respectively, with the cytosine and thymine of the complementary chains, the total purine bases (A+G) are equal to the total pyrimidine bases (C+T) and the total 6-amino bases (C+A) are equal to the total 6-keto bases (G+T). However, the ratios of guanine plus cytosine to adenine plus thymine (G+C)/(A+T) are not the same in different organisms and provide a parameter by which organisms can be characterized.

The molar (G+C) content of the DNA of seventy-two different bacterial species, thirteen species of higher plants, ten species of algae, four species of fungi, two species of protozoa, sixteen species of invertebrates, twenty-three species of animals, twelve bacterial viruses, six animal viruses and rickettsiae, and twelve insect viruses have been measured and are shown in Tables II through IX. The molar per cent (G+C) varies from 26.5 per cent in the protozoan, Tetrahymena, to 73 molar per cent (G+C) for the bacterium, Mycobacterium phlei. Of the 170 species listed in Tables II through IX, 112 have DNA molecules whose average molar (G+C) contents are 40 to 60 per cent. This is not surprising. It is probable that mutagenic transitions of adenine-thymine to guanine-cytosine base pairs in one part of a DNA molecule are compensated for by transitions from guanine-cytosine to adenine-thymine base pairs in other parts of the molecule, so that the molar per cent (G+C) remains, on the average, close to 50 per cent.

*Exceptions to this statement are the DNA of the bacteriophages, ΦX174 and S13, which are single stranded (65, 66).

[Tables III, IV, and V, giving the molar (G+C) content of the DNA of higher plants, algae, fungi, protozoa, and invertebrates, are not legible in this copy and are omitted.]
[Table VI, giving the molar (G+C) content of vertebrate DNA, is likewise not legible in this copy and is omitted.]

TABLE VII
The Densities and Base Compositions of Bacteriophage DNA

                        Density gradient centrifugation        Chemical analysis
    Phage               Density g cm⁻³   % G+C   Reference     % G+C   Reference
    T2+                 1.700                                   35      (66)
    T4+                 1.698                                   34.4    (66)
    T6+                 1.707                                   34.2    (66)
    T5                  1.702            43      (60)           39      (66)
    ΦX174+++            1.72             43      (65)           44      (66)
    Salmonella A1++                                             43.4    (66)
    Phage alpha         1.704            44      (11)           42.5    (11)
    T1                  1.705            46      (60)           48      (66)
    T7                  1.710            51      (60)           48      (66)
    T3                  1.712            53      (60)           49.6    (66)
    λvir                1.710            51      (60)           50      (66)
    P22                                                         50      (66)

    + T2, T4, T6 contain hydroxymethylcytosine
    ++ Preliminary or tentative data
    +++ Single stranded DNA

TABLE VIII
DNA Base Composition of Animal Viruses and Rickettsia

                            Density gradient centrifugation    Chemical analysis
    Virus                   Density g cm⁻³   % G+C  Reference  % G+C   Reference
    Shope papilloma         1.714            50     (78)       49      (78)
    Vaccinia                1.698            39     (38)       40.6    (10)
    Fowl pox                                                   38      (52)
    Rickettsia burnetti     1.704            45     (60)       45      (10)
    Rickettsia prowazeki                                       30.8    (10)
    Rickettsia rickettsii                                      37.5    (10)

TABLE IX
Base Composition of DNA of Insect Viruses (3)

    Host Species                            % G+C
    Polyhedral viruses
      Porthetria dispar                     58.7
      Lymantria monacha                     51.5
      Choristoneura fumiferana              51.2
      Ptychopoda seriata                    47.6
      Malacosoma americanum                 42.7
      Malacosoma disstria                   42.2
      Bombyx mori                           42.7
      Colias philodice eurytheme            42.5
      Neodiprion sertifer                   37.3
      Tipula paludosa Meig. (T.I.V.)*       31.5
    Capsular viruses
      Cacoecia muriana                      37.6
      Choristoneura fumiferana              34.8

    *Reference (74).

The variation in composition of DNA molecules among different species of microorganisms is very great. Clostridium perfringens has only 32 molar per cent (G+C), while at the other extreme of the distribution Mycobacterium phlei has 73 molar per cent (G+C) (Table II). Fungi vary from 36 to 54 per cent (G+C) in the four species investigated, and the two protozoan strains so far measured (Tetrahymena and Euglena) contain 26.5 and 47 molar per cent (G+C), respectively (Table IV). The range of average DNA values among algae is also rather great — 36.9 molar per cent (G+C) for diatomic algae to 64 molar per cent (G+C) for the green alga, Chlamydomonas reinhardi (Table III). The distribution of average values among different species of higher plants (Table III) and invertebrates (Table V) is much narrower: plant DNA composition varies from 35 molar per cent (G+C) for tobacco leaves to 48.4 molar per cent for Triticum vulgare, and among invertebrate species the values vary from 34.9 molar per cent for the echinoderm, Echinocardium cordatum, to 44 molar per cent (G+C) for the crab, Carcinus maenas. The range of values is very narrow indeed for the average composition of the twenty-three vertebrate DNA species so far examined. The values range from 40 to 44 molar per cent (G+C) (Table VI).

The range for DNA animal viruses appears to be greater than that for the host animal species: 38 molar per cent (G+C) for fowl pox virus and 50 molar per cent (G+C) for the Shope papilloma virus of rabbits (Table VIII).
Various insect viruses manifest in their DNA average (G+C) contents of 31.5 to 58.7 per cent (Table IX), a range which is also somewhat broader than the compositions of the few insect DNA's so far studied. The T-even bacteriophage DNA's have about 35 molar per cent (G+C), a value outside the range of the bacterial host in which the viruses grow (E. coli, 51 per cent (G+C)) (Table VII). The DNA from a number of other bacteriophages (T-1, T-3, T-7) and the lysogenic phages (λ, P22, Salmonella A1) have average molar (G+C) contents which are very similar to those of the bacterial host cells (E. coli, Shigella, Salmonella).

Base Composition of DNA Strands

Since a DNA molecule consists of two complementary nucleotide strands of opposite polarity, the question arises as to whether the two strands have similar or grossly dissimilar average nucleotide compositions. Only fragmentary data are presently available. There are indications that the DNA strands of animal cells differ slightly in thymine, and hence, in adenine content (14). The DNA strands of the bacteriophage, alpha, have been separated and shown to differ in density (11). The density differences could reflect differences in the base compositions of the strands. There are alternative explanations, however. For example, one strand might contain more glucose residues attached to hydroxymethylcytosine than the other strand.

Fig. 6. Ultraviolet photograph of the banding of mouse DNA and Streptomyces viridochromogenes DNA in a CsCl density gradient. The photograph was taken after centrifugation for twenty-four hours at 25° at 44,770 rev. per minute. The Streptomyces band (ρ25° = 1.729 g cm⁻³) appears at the left. Two bands with mean densities of 1.701 and 1.690 g cm⁻³ were obtained with mouse DNA. The narrow band at the right is due to the meniscus of the solution. (Panels: C3H lung; C3H brain.)

Equilibrium Sedimentation of DNA in CsCl Density Gradients

Equilibrium sedimentation of DNA in CsCl density gradients provides an elegant method for characterizing DNA with respect to average nucleotide composition and heterogeneity of composition (46). Following the centrifugation of a DNA solution for twenty-four hours in a CsCl density gradient, the DNA tends to form a band at a position in the cell corresponding to its effective buoyant density. A photograph, taken with ultraviolet light, of the banding of bacterial and mouse DNA is shown in Figure 6, and a microphotodensitometer tracing of the photograph is presented in Figure 7. The white (ultraviolet light absorbing) areas in the center part of the photograph correspond to Streptomyces viridochromogenes DNA and to two mouse DNA bands, respectively. The effective mean buoyant density of the Streptomyces viridochromogenes DNA band is 1.729 g cm⁻³. The corresponding values for the principal and minor mouse DNA bands are 1.701 and 1.690 g cm⁻³ (Table II and Table VI). The mean densities of the bands can be measured with an accuracy of ±0.001 g cm⁻³.

Fig. 7. Microdensitometer tracing of photograph of the banding of mouse DNA and Streptomyces DNA after density gradient centrifugation. See Figure 6. (Panel annotations: 24 hours, 25°C, CsCl ρ = 1.7208; 24 hours, 25°C, CsCl ρ = 1.7165.)

It has been shown by Rolfe and Meselson (54) and by Schildkraut et al.
(60) that the mean effective buoyant densities of double stranded DNA bands vary linearly with the molar per cent (G+C) content of the DNA. For example, Streptomyces viridochromogenes DNA (density = 1.729 g cm⁻³) contains 73 molar per cent (G+C), Escherichia coli DNA (density = 1.710 g cm⁻³) contains 51 molar per cent (G+C), and mouse DNA (density = 1.701 g cm⁻³) contains 42 molar per cent (G+C). Thus, if the density of DNA is measured by equilibrium sedimentation in CsCl, the molar per cent (G+C) can be calculated. Tables II to VIII show the densities of DNA preparations from various sources and the agreement between molar per cent (G+C) as calculated from the densities of the bands and as determined directly by chemical analyses.

The standard deviations of the DNA bands expressed in density units (σ) depend upon at least two factors: 1) the molecular size of the DNA, and 2) the heterogeneity of DNA composition within the sample. This follows from the following considerations. The centrifugal field tends to drive the DNA into a region where the sum of the forces acting on a given molecule is zero. This concentrating tendency is opposed by Brownian motion, with the result that, at equilibrium, the macromolecules are distributed with respect to concentration in a band of width inversely related to their molecular weight (46). When a DNA population consists of molecules which differ considerably in density and in molar per cent (G+C), discrete and nonoverlapping DNA bands may be formed. For example, two discrete DNA bands are formed with mouse DNA (Figs. 6, 7), and there is no overlapping between the mouse DNA (ρ = 1.701 g cm⁻³) and the Streptomyces (ρ = 1.729 g cm⁻³) DNA bands. On the other hand, if a DNA preparation consists of a heterogeneous population of molecules differing only slightly in density and in molar per cent (G+C), the DNA bands will overlap, and there will be an increase in the overall standard deviation of the band. So long as the total variance of a band and the number average molecular weight of a DNA preparation are known, it is possible to calculate the contribution which heterogeneity of composition makes to the band width (16, 36, 54). An independent estimate of the heterogeneity of composition of a DNA preparation may also be made from the DNA thermal denaturation curves. These independent estimates agree satisfactorily.

Nonoverlapping Bands

Density gradient centrifugation experiments have permitted a number of interesting conclusions concerning DNA. First, as already mentioned, the DNA obtained from many bacterial species forms discrete bands which do not overlap. These observations indicate that the respective organisms have no DNA molecules with common density and, by inference, that they have no DNA molecules with common nucleotide base composition (16). Since the metabolism and replication of bacteria do have much in common, it is generally thought that many of their proteins should be identical or very similar. Yet all current discussions of the way in which DNA can control the sequence of amino acids in proteins require a direct correlation between the composition of the DNA and of the protein. There are a number of ways in which this dilemma can be resolved. The simplest approach is to assume that some amino acids are coded by more than one nucleotide triplet. For example, each of the triplets UCC and UAC might specify the amino acid threonine.
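The point of this example can be illustrated numerically. The sketch below is not part of the original chapter; it uses an invented two-triplet assignment (mirroring the hypothetical example above, not the actual genetic code, which had not yet been established when this chapter was written) to show that two sequences of very different (G+C) content can specify the same string of amino acids:

```python
# Illustrative sketch only: an invented degenerate assignment in which a single
# amino acid, "Thr", is specified by either of two triplets, as in the example
# above.  Two encodings of the same peptide then differ widely in %(G+C).

SYNONYMS = {"Thr": ["UCC", "UAC"]}          # invented assignment for illustration

def encode(peptide, choice):
    """Encode a peptide using synonym index `choice` (0 or 1) for every residue."""
    return "".join(SYNONYMS[res][choice] for res in peptide)

def molar_percent_gc(seq):
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)

peptide = ["Thr"] * 10
low, high = encode(peptide, 1), encode(peptide, 0)
print(round(molar_percent_gc(low), 1), round(molar_percent_gc(high), 1))  # 33.3 66.7
```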
Thus, the dependence of the amino acid composition of proteins on the DNA nucleotide composition would not be exacting and DNA molecules having different compositions could code very similar proteins. A genetic code in which two or more nucleotide triplets are used to code a particular amino acid is called a '"degenerate" code (13). There are a number of arguments which can be advanced in favor of this hypothesis. By studying nitrous acid induced mutants of bacteriophage, Tessman (73) demonstrated that each of the complementary DNA strands is functional. The experiments "* of 98 Information Storage and Neural Control Chamberlin and Berg (7) suggest that genetic information can be transcribed from either of the two DNA strands to infor- mational-RNA. Thus, when single stranded 0X174 DNA was used as a template for RNA polymerase, an RNA strand having a composition complementary to the 0X174 DNA was formed. However, when single stranded 0X174 DNA was used as a tem- plate for DNA polymerase, double stranded DNA was formed. The double stranded 0X174 DNA could then be used as a primer for RNA polymerase and, in this case, RNA was formed having a composition identical to that of double stranded 0X174 DNA. If both of the complementary DNA strands are ultimately trans- lated from the same fixed starting point, the foregoing experi- ments would suggest that complementary nucleotide triplets code the same amino acid. On the other hand, if the complementary DNA strands are translated from opposite starting points, the results would point to one of two possibilities: 1) that two dif- ferent nucleotide triplets code the same amino acid, or 2) that the complementary DNA strands are identical even though they have opposite polarities. The second of these possibilities would further require that one half of a given DNA strand be complementary to the opposite half. The latter restrictions do not seem to apply to the 0X174 DNA studied by Chamberlin and Berg (7, 65). Davern's experiments (14) also suggest that these restrictions are unlikely as a general proposition. Hence, some form of a "degen- erate" genetic code seems to be the most appealing hypothesis at this time. Unimodal and Bimodal Distributions The DNA from almost every species so far examined forms one discrete band after density gradient centrifugation. Streptomyces viridochrornogenes DNA (Figs. 6, 7) illustrates this unimodal dis- tribution. Several examples have now been found in which DNA manifests a bimodal distribution. Mouse DNA illustrates the bimodal DNA distribution (36). Mouse DNA manifests a major component having a density of 1.701 gcm~^ and a second minor component, comprising about 8 per cent of the total DNA, having a density of 1.690 gcm~^ (Figs. 6, 7). A bimodal DNA distribution is also observed with guinea pig DNA. In this case, the major component Pyrimidine Moieties in Animals, Plants, and Bacteria 99 is the lighter one (1.697 gcm~^) and the minor component is the heavier one (1.703 gcm"^^) (36). Since the density of DNA is linearly related to the molar per cent (G+C), it is possible that the mouse and guinea pig minor components consist of a population of DNA molecules differing in molar per cent (G+C:). However, alternate explanations for the minor component may be offered. This point will be resolved when the guinea pig and mouse minor DNA components are isolated in pure forin. An interesting bimocial distribution has been found by Sueoka (72) in crab testes DNA (Table V). 
The major DNA component has a density of approximately 1.705 g cm⁻³. However, a very light minor component also occurs which is indistinguishable from a double stranded polynucleotide in which only two of the four nucleotide bases are present. The bases involved are adenine and thymine, and the polymer is called the (deoxy-A-T) polynucleotide. The function of this unusual polynucleotide, deoxy-A-T, is unknown.

Heterogeneity of Composition of DNA

Since the DNA of any species is quite heterogeneous, it is of interest to compute an upper bound on the standard deviation (σ_GC) of the distribution of the guanine-cytosine base pairs over the population of DNA molecules. The upper bound of σ_GC is given as:

[4] σ_GC(max) = 10 σ_density

where σ_density is the standard deviation of the DNA distribution in the CsCl density gradient (54). It should be pointed out that the actual value of σ_GC lies considerably below the calculated upper bound because thermal motion of the DNA molecules contributes significantly to band width.

The DNA's of nine bacterial species form bands in the density gradient with σ_density in no case greater than 0.003 g cm⁻³. The corresponding upper bound on the standard deviation, σ_GC, of the molecular content of guanine plus cytosine is therefore in no case greater than 0.03. It is remarkable that the standard deviation of guanine-cytosine content within the molecular population of any one bacterial species covers less than one tenth of the range over which the mean guanine-cytosine content varies among the various species.

Doty and co-workers (16) have shown that the total variance of the DNA bands equals the sum of the variance due to molecular weight (σ²_MW) and that due to density heterogeneity (σ²_density):

[5] σ²_T = σ²_MW + σ²_density

Since the variance due to molecular weight can be estimated, σ²_density can be calculated. The latter value can be expressed in units of molar per cent (G+C), that is, in terms of σ_GC. For bacterial DNA, Doty et al. (16) have shown that σ_GC is actually equal to about ±1.7 molar per cent (G+C). The values for animal tissues are considerably greater (Table VI). The standard deviation in units of density (σ_density) ranges from 0.0037 to 0.0047 g cm⁻³. The latter value is equivalent to a standard deviation corresponding to the molecular content of guanine plus cytosine of about 3 molar per cent (G+C).

DNA obtained from various adult tissues of mice and from mouse tumors do not differ significantly in their effective buoyant densities or in the standard deviations of the density gradient bands. Although all differentiated tissues of a given organism are presumed to have identical genomes, there is evidence that normal tissues and tumors differ genetically. The fact that no significant differences between the DNA of normal and of malignant tissues have been found does not necessarily contradict the latter concept. Instead, it reflects the fact that existing techniques are insufficiently sensitive to detect such differences. It is quite possible that thousands of point mutations exist in the genomes of cancer cells. These could not be detected by the relatively gross physical methods so far employed. Although significant differences between the DNA of adult animal tissues and the DNA of tumors have not been found, it has been possible to recognize specific differences between the DNA of various species of higher animals (37).
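The conversions used above can be sketched in a few lines. This is not from the original chapter: the numerical fit ρ = 1.660 + 0.098 f_GC (g cm⁻³, with f_GC the mole fraction of G+C) is the commonly cited linear relation for buoyant density and is taken here as an assumption, since the chapter itself quotes only example density and composition pairs together with the factor of 10 in equation [4]:

```python
# Minimal sketch of the density-to-composition conversions discussed above.
# The linear fit rho = 1.660 + 0.098 * f_GC is an assumed, commonly cited form.

def percent_gc_from_density(rho):
    """Invert rho = 1.660 + 0.098 * f_GC to give molar per cent (G+C)."""
    return 100.0 * (rho - 1.660) / 0.098

def sigma_gc_upper_bound(sigma_density):
    """Equation [4]: sigma_GC(max), in mole-fraction units, ~ 10 * sigma_density."""
    return 10.0 * sigma_density

print(round(percent_gc_from_density(1.710)))          # ~51, the E. coli value quoted earlier
print(round(percent_gc_from_density(1.701)))          # ~42, the principal mouse band
print(round(sigma_gc_upper_bound(0.003), 3))          # 0.03, as in the bacterial example
```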
As shown in Table VI, frog, turtle, and alligator DNA are slightly heavier than other vertebrate DNA's. Chinese hamster and frog DNA have relatively low standard deviations for the DNA bands. Also, mouse and guinea pig DNA manifest bimodal distributions.

Another point of interest is the fact that the small heterogeneity of base composition among the DNA molecules of an organism seems to be true for smaller regions within molecules (72). This indicates that the intramolecular distribution of (G+C) and (A+T) pairs is fairly uniform, although in short regions (for example, tri- or tetranucleotides) nonrandomness has been demonstrated.

The Formation of Hybrid DNA Molecules and Their Use in Studies of DNA Homologies

DNA molecules having the same average molar nucleotide composition do not necessarily have the same nucleotide sequence along the DNA chains. Since techniques for measuring the nucleotide sequence are not currently available, possible sequence homologies between DNA molecules from different sources must be investigated by indirect methods. There have been two general approaches to this problem so far: 1) an analysis of the distribution of oligonucleotides in partial hydrolysates of DNA; and 2) a study of DNA hybrids.

Burton (6) has measured the distribution of short chain oligonucleotides in acid degradation products of DNA. Differences could be detected in the distribution of dinucleotides and trinucleotides of four animal and four bacterial species.

The formation of hybrid DNA molecules has been investigated by Schildkraut and co-workers (61). These studies depend upon the fact that each DNA molecule consists of two complementary strands which can be separated in solution. Strand separation can be accomplished by heating the DNA to a temperature which will "melt" the hydrogen bonds which hold together the double stranded helix. One of the DNA preparations to be tested is labeled with N¹⁵, C¹³, or deuterium, so that it will form a heavy band when it is centrifuged in a density gradient. The second DNA preparation is of normal density. The heavy and the light DNA molecules are mixed and the strands are separated by heating. When the DNA is slowly cooled, the complementary strands attract each other and the hydrogen bonds are reformed (renaturation). Thus, the DNA duplexes are reconstituted.

Let us consider the situation when two DNA molecules from the same species are renatured, but where one molecule is labeled with heavy isotopes and the second is of normal density. Upon renaturation, hybrid molecules of intermediate density will be formed:

[6] Normal density DNA + heavy density DNA (N¹⁵, C¹³) → (heated) single strands → (upon renaturation) hybrid DNA of intermediate density

The normal density DNA, "heavy" density DNA, and "intermediate" density DNA can be resolved as discrete bands by density gradient centrifugation. Hybrid formation, then, is detected by the presence of a new DNA band of intermediate density. It should be emphasized that hybrid formation can take place only when long regions of the nucleotide sequences of DNA molecules are identical or very nearly so. Molecules having the same average (G+C) content but differing in the sequence of G, C, T, and A along the polynucleotide chain will not form hybrids.
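The expected three-band pattern of scheme [6] can be sketched under one simplifying assumption, namely that the buoyant density of a duplex is roughly the average of the densities of its two strands; the strand densities below are invented for illustration and are not measurements from the chapter:

```python
# Illustrative sketch of scheme [6] under an assumed strand-averaging rule:
# random reannealing of a heavy-labeled and a normal preparation of the same
# sequence yields light/light, hybrid, and heavy/heavy duplexes.

from itertools import combinations_with_replacement

STRAND_DENSITY = {"normal": 1.710, "heavy": 1.724}   # g cm^-3, invented values

for pair in combinations_with_replacement(("normal", "heavy"), 2):
    duplex_density = sum(STRAND_DENSITY[s] for s in pair) / 2.0
    print(pair, round(duplex_density, 3))
# ('normal', 'normal') 1.71   ('normal', 'heavy') 1.717   ('heavy', 'heavy') 1.724
```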
The possibility that renaturation and hybrid formation might take place between the DNA of bacterial strains with close taxo- nomic, physiological, and genetic relationships was investigated by Schildkraut et al. (61). Hybrid formation was readily demon- strated between the DNA of E. coli and of six other E. coli strains. Interspecies hybridization of DNA was also demonstrated in certain instances for bacteria having the same nucleotide content of (G + C). Thus, the DNA from B. subtilis and B. natto formed hybrids, and in addition, the DNA from E. coli K-12 formed hybrids with those from E. coli B. and Shigella clysenterioe. Sig- nificantly, these are instances where genetic exchange has been Pyrimidine Moieties in Aiiimals, Plants, and Bacteria 103 demonstrated by conjugation or transduction. No hybrid formation was detected between the DNA oi E. coli K-12 and that oi Salmonella typhimurium. The latter bacteria mate but transduction from one to the other occurs only to a very limited extent, if at all. Aside from the taxonomic importance of this technique, it offers a rational approach to the study of genetic compatibility where genetic exchanges have not been demonstrated. In addition, the technique has found application in connection with the problem of information transfer between DNA and RNA. The latter experiments will be discussed in the next section of this paper. THE RIBONUCLEIC ACIDS (RNA) General Characteristics The ribonucleic acids (RNA), which mediate the transfer of genetic information between DNA and proteins (Fig. 1), differ chemically from DNA in several ways (35): 1) The sugar com- ponent of RNA is ribose, instead of deoxyribose (Fig. 2); 2) Uracil (U), instead of thymine, is the 6-keto-pyrimidine base in RNA (Fig. 3); 3) RNA is a single stranded, flexible polynucleotide coil unlike DNA which is rather stiff and double-stranded; and 4) Most RNA molecules are much shorter in length than DNA. Also, RNA is less stable in alkaline solutions than is DNA. Four classes of RNA are known: l)transfer-RNA, 2) ribosomal- RNA, 3) messenger or informational-RNA, and 4) virus-RNA. Transfer-RNA Transfer-RNA (T-RNA or S-RNA) consists of a family of molecules which function in the activation of amino acids and in the transfer of the activated amino acids to the ribosomal tem- plates so that they can be linked together to form proteins. Prob- ably, a different and characteristic transfer-RNA molecule is required for each of the twenty amino acids. Yeast T-RNA specific for the activation of the amino acid, valine, has recently been obtained by Stephenson and Zamecnik (70) in highly purified form (65-80 per cent). Holley et al., (31) have partially purified the alanine, valine, and tyrosine T-RNA of yeast and have studied the oligonucleotide content of ribonuclease digests. 104 Information Storage and Neural Control T-RNA molecules have a molecular weight of about 25 to 30,000 (80 to 100 nucleotide chain length). The sedimentation constant of T-RNA is about 4S. Three additional characteristics are of interest: 1) It has been shown that guanine mononucleotide terminates one end of the T-RNA chain, 2) cytidylic acid- cytidylic acid-adenylic acid is the trinucleotide which terminates the other end of the T-RNA chain, and 3) each T-RNA chain contains an unusual mononucleotide, pseudouridylic acid (PsU). The function of pseudouridylic acid is unknown. 
An amino acid can be attached to the adenylic acid end of the chain as shown schematically in Equations 7 through 9 (30):

[7] Amino acid + ATP → adenyl~amino acid + pyrophosphate

[8] Adenyl~amino acid + G . . . PsU . . . C-C-A (T-RNA) → adenylic acid + G . . . PsU . . . C-C-A~amino acid (T-RNA with activated amino acid)

[9] T-RNA~amino acid + ribosomes → T-RNA + ribosomes (amino acid)

Transfer-RNA comprises approximately 10 per cent of the total cellular RNA. The mononucleotide content of the T-RNA from a number of different sources has been determined (Table X). T-RNA molecules from all sources have a high content of guanine and cytosine (53-61 molar per cent (G+C)).

TABLE X
%(G+C) Content of Transfer-RNA
[The body of this table is garbled in this copy, and several pages are evidently missing here; the recoverable text resumes below, partway through the next chapter's discussion of species diversity.]

. . . , referred to as community diversity, is

[4] D = −Σ_i N_i log P(s_i).

Although the diversity problem is not of specific concern here, it should be mentioned that there has been considerable development of this subject along informational lines (13, 14, 15), representing the most extensive application of information theory which has so far been made to an ecological problem. MacArthur (16) has proposed that community stability be equated to the complexity of the food web as given by an entropy measure:

[5] S = −Σ_j P(q_j) log P(q_j),

where S is stability and P(q_j) the probability of energy traversing a particular path q_j. The rationale of this suggestion is that removal of a species and consequent destruction of the pathways leading to and from it would be less disruptive to a community with a high value for S than to one with a lower value. Since the P(q_j)'s are obviously functions of the N_i's, it follows that stability and diversity are related, the more diverse systems being the more stable. Since greater stability implies greater success in meeting the imperative of Shannon's Theorem 10, and since stability is a function of compositional complexity, it follows that a natural tendency of ecological communities should be to develop to maximum proportions within the limitations imposed by particular environments. This conclusion is consistent with empirical observations.

One measure of the extent to which a given community has expanded to fill a physical space is the total quantity of organic matter contained in that space. This variable will be referred to here as the community's biomass. Because community ontogeny (ecological succession) proceeds by means of niche (17) proliferation (more species make more species possible), a reasonable way to assess, in a quantitative sense, the extent of organization of a community might be to oxidize a suitable sample in a calorimeter and to equate heat evolution with intrinsic complexity. Though admittedly crude, such an approach would not be entirely without basis, since all information, even that which is abstract, is understood to be physically based and is therefore referable to thermodynamic negative entropy (18, 19). This broaches the problem of the relationship between information and energy — the reason why information theory is of interest to energy ecologists.

ENERGY AS CURRENCY

The connection between energy and information has been well established in the context of macroscopic thermodynamics (19), where adiabatically accessible system states are generally regarded as informationally equivalent, while those attainable only non-adiabatically are not (20).
In the usual Boltzmann-Gibbs treatments, the role of matter in determining a system's entropy is obscure; however, the recently introduced formalism of Jaynes (21) and Tribus (22) offers considerable clarification, as follows.

Consider a system of n_a, n_b, . . . particles of matter of kinds a, b, . . . in a phase space with coordinates x_1, x_2, . . . . When the coordinates are prescribed and the number of particles known, the system consists of a finite number of discrete quantum states, i, with energies ε_i:

[6] ε_i = ε(i; n_a, n_b, . . . ; x_1, x_2, . . .).

If p_i is the probability for a particular small subsystem to be in state i, then

[7] Σ_i p_i = 1

[8] Σ_i p_i ε_i = ⟨ε⟩

[9] Σ_i p_i n_a,i = ⟨n_a⟩, (a, b, c, . . .)

and

[10] S = −k Σ_i p_i ln p_i,

where S is the entropy of the system and ⟨ ⟩ denotes expected values. The maximum uncertainty of selecting a subsystem in state i is obtained when the p_i's are all equal (10). Maximizing [10],

[11] Σ_i (ln p_i + 1) dp_i = 0;

differentiating [7], [8] and [9] and introducing the undetermined Lagrangian multipliers Ω_0, β, α_a, α_b, . . . , we obtain

[12] (Ω_0 − 1) Σ_i dp_i = 0

[13] β Σ_i ε_i dp_i = 0

[14] α_a Σ_i n_a,i dp_i = 0. (a, b, c, . . .)

Adding [11]-[14] and collecting terms:

[15] Σ_i (ln p_i + Ω_0 + βε_i + α_a n_a,i + α_b n_b,i + . . .) dp_i = 0,

from which

[16] p_i = exp(−Ω_0 − βε_i − α_a n_a,i − α_b n_b,i − . . .).

Thus, a system's entropy is maximal when the ε_i's, n_a,i's, n_b,i's, etc., of all its subsystems are identical, signifying a homogeneous distribution of matter and energy throughout the system. It can be shown (22) that the rate of entropy change is

[17] dS = k (β d⟨ε⟩ + α_a d⟨n_a⟩ + α_b d⟨n_b⟩ + . . .).

If two systems with different values of β, α_a, α_b, . . . , and with total energy and matter constant between them, are allowed to interact irreversibly, then the energy gain of one must be equivalent to the loss of the other, and the gain in n_a, n_b, . . . by one corresponds to that lost by the other. In the language of game theory (23) such a relationship is zero-sum. If dx_1 = dx_2 = . . . = 0, then the connected system's entropy change is, from [17],

[18] dS = k [(β − β′) d⟨ε⟩ + (α_a − α_a′) d⟨n_a⟩ + (α_b − α_b′) d⟨n_b⟩ + . . .].

Hence, it is possible for one of the systems (call it community) to decrease its entropy at the expense of the other (environment), since the only requirement is that dS > 0 overall. Details of energy-matter exchange between such systems are very complex because the parameters β, α_a, α_b, . . . may become reciprocally coupled within a system:

[19] ∂⟨n_a⟩/∂β = −∂²Ω_0/(∂β ∂α_a) = ∂⟨ε⟩/∂α_a, (a, b, c, . . .)

so that

[20] d⟨ε⟩ = −(∂²Ω_0/∂β²) dβ − (∂²Ω_0/(∂β ∂α_a)) dα_a (a, b, c, . . .)

and

[21] d⟨n_a⟩ = −(∂²Ω_0/(∂α_a ∂β)) dβ − (∂²Ω_0/∂α_a²) dα_a. (a, b, c, . . .)

The important point to distinguish for our present purpose is that for one system to diminish its entropy with respect to another with which it is coupled in communication, it must establish and maintain physical barriers to the free exchange of energy and matter. In short, it must proliferate structural heterogeneity by maximizing the inequality of the p_i's in [10].
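Equation [16] is easy to exercise numerically. The sketch below is not part of the original paper; the state energies, particle numbers, and multiplier values are invented purely for illustration, and NumPy is used for brevity:

```python
# Minimal numerical sketch of equation [16]: for chosen multipliers beta and
# alpha, p_i ~ exp(-beta*e_i - alpha*n_i) (exp(-Omega_0) is fixed by
# normalization); the entropy of eq. [10] and the constrained averages of
# eqs. [8]-[9] then follow directly.

import numpy as np

e = np.array([0.0, 1.0, 2.0, 3.0])      # energies of four hypothetical states
n = np.array([0, 1, 1, 2])              # particles of kind "a" in each state
beta, alpha = 0.8, 0.3                  # illustrative multiplier values

w = np.exp(-beta * e - alpha * n)
p = w / w.sum()                         # normalization supplies exp(-Omega_0)

S_over_k = -np.sum(p * np.log(p))       # entropy per unit k, eq. [10]
print(p.round(3), round(S_over_k, 3), round(float(p @ e), 3), round(float(p @ n), 3))
```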
At the community level, such heterogeneities are maintained by a graded series of discrete, functional barriers ranging from, at Information Concept in Ecology 147 the lower end of the scale, quantum states, atoms, molecules, membranes, cells and their ultrastructural components, tissues and organs, to, at the upper end, individual organisms, species, popu- lations, multi-specific evolutionary units (supraorganisms) (24) and finally the community itself. The construction, maintenance and operation of such barriers (with all the morphology and physiology that this implies) are achieved by physical and chem- ical processes which, in net, are endergonic. Without, therefore, a continuous input of energy, the barricades would fail to function and would ultimately be eroded away, with the result that com- munity and environment would become one. This is a trivial conclusion, of course. After all, it is one of the most obvious statements which could be made regarding bio-systems. Yet its articulation seeins necessary to provide a basis for the follow- ing restatement of the Schrodinger-Brillouin proposition: Energy may be regarded as a universal currency with which organisms pur- chase utility, as negative entropy, from the environment. COMMUNITY BIOENERGETICS In view of this proposition, the ultimate source of negativt entropy to an ecological community may be regarded to be photons. When a photon strikes an atom an electron is lifted from ground state to a higher empty orbital (vertical arrow ^~ -^^* in Fig. 2). For most molecules excited electrons usually drop back to ground state immediately, dissipating the excess energy as electromagnetic radiation (broken arrow, Fig. 2). Living systems to paraphrase Szent-Gyorgi (25), have shoved themselves between these two processes by shunting the excited electrons into different downhill pathways in which their energy can be released slowly and put to useful work. The first step in the process is excitation (by photons, or indirectly via accessory plant pigments) of pi electrons in the conjugated portion of chlorophyll a. In cyclic photosynthetic phosphorylation (26) the chlorophyll provides these electrons directly, thereby acting both as electron donor and acceptor (Fig. 2). In noncyclic photophosphorylation the electrons come from H2O, which the excitation energy decomposes to oxygen, freed as O2, and H atoms. The hydrogen electrons sub- 148 Information Storage and Neural Control Fig. 2. Schematic diagram of the energy cycle of an ecosystem, modified and expanded after Szent-Gyorgi (25) and Arnon (26). Anaerobic and chemosyn- thetic processes are not indicated. sequently reduce one of two pyridine nucleotides {PN -^ PNHi, Fig. 2). Concurrently, ATP is synthesized, incorporating into its terminal "high energy" phosphate bond some of the original photon energy. Neither ATP, DPN, nor TPN is stable enough to function in energy storage. This is accomplished by reducing CO 2 to carbohydrates and water, then to lipids (Fig. 2). Energy so stored may be utiHzed directly by the primary pro- ducer, or it may be transmitted to other organisms in the food chain. The retrieval of energy from storage is accomplished by transferring electrons (in H atoms) to PN, releasing the carbon as CO 2. The PNHi then transfers electrons to flavin mononucleo- tide (FMN), whence they cascade down the oxidative chain of cytochromes, generating heat at every step. 
Most of the energy remaining is converted (in oxidative phosphorylation) to ATP, in which form it is available for the performance of cellular work. Finally, the electrons are transferred to O2, which then binds protons to form H2O. Water represents ground state, where the cycle e⁻ → e* → e⁻ is completed.

If, at a specified time, the system contains more free energy than it did at a prior time, we say that a favorable balance between inputs and expenditures has been achieved. This enables the system to maintain or further diminish its entropy. If the system possesses less free energy after a passage of time, we say the balance is unfavorable and the system is less able to forestall an entropy gain. These relationships can be summarized by a simple transfer function which will be termed cost. This variable represents the amount of energy which must be expended to gain a unit of energy from the environment:

[22] ρπ⁻¹ < 1 (biomass gain)
     ρπ⁻¹ = 1 (steady state)
     ρπ⁻¹ > 1 (biomass loss)

where π denotes total energy gained by the community and ρ represents total energy lost.

Let us now consider some specific behavioral and organizational attributes of planktonic systems which exemplify goal-adaptability, the goal being biomass maximization.

PROCEDURES

The plankton communities under consideration occupied the York River, Virginia, during the summer of 1960. For ten consecutive weeks, from June 23 to August 25, in situ dark and light bottle differential oxygen studies (27) were performed weekly to assess energy flux through the community. The sampling station was located about 300 yds. off the end of the Virginia Institute of Marine Science pier, where the approximate depth of mean low water was thirty feet.

Hydrographic determinations included vertical profiles of chlorinity, temperature, dissolved oxygen, total dissolved phosphorus and total nitrate. Temperature was recorded with a thermistor unit. Chlorinity was titrated with silver nitrate. Dissolved oxygen was measured by the unmodified Winkler method. Dissolved organic and inorganic phosphorus were obtained as follows: fractionation into dissolved and adsorbed inorganic and dissolved and particulate organic components was achieved by Millipore (type HA) filtration. Inorganic fractions were assayed directly; organic fractions were obtained by digesting samples for twelve hours at 20 psi; the molybdate method (corrected for salt interference) was employed to estimate the orthophosphate in both cases.

For energy flux determinations, paired dark and light bottles containing water samples from two, six and ten feet were suspended at various depths in the water column for twenty-four hours (beginning 0730 EST), and then fixed for Winkler titration. The suspension depths included all combinations of the collection depths: (2,2), (2,6), (2,10); (6,2), (6,6), (6,10); (10,2), (10,6), (10,10), where the left member of each pair designates collection depth and the right member suspension depth. Additional dark bottles for (14,14) and (18,18) were also included. Production variables were determined from the initial and final dissolved oxygen concentrations in the bottles:

[23] π = l − d (photosynthesis)
     ρ = i − d (respiration)
     π − ρ = l − i (net production)

where l, d and i are, respectively, the light bottle, dark bottle, and initial oxygen concentrations.
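A minimal sketch of this bookkeeping follows. It is not part of the original paper; the bottle readings are invented, and everything is left in oxygen-concentration units so that no caloric conversion factor need be assumed (the cost ratio of equation [22] is dimensionless in any case):

```python
# Sketch of equations [22] and [23] with invented bottle readings (mg O2 per liter).
# i = initial, l = light-bottle final, d = dark-bottle final oxygen concentration.

def production_variables(i, l, d):
    pi  = l - d          # photosynthesis, eq. [23]
    rho = i - d          # respiration
    net = l - i          # net production = pi - rho
    return pi, rho, net

i, l, d = 7.0, 9.4, 6.1                       # hypothetical 24-hour readings
pi, rho, net = production_variables(i, l, d)
cost = rho / pi                               # eq. [22]: <1 gain, =1 steady state, >1 loss
print(round(pi, 2), round(rho, 2), round(net, 2), round(cost, 2))   # 3.3 0.9 2.4 0.27
```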
The differential oxygen concentrations were converted to gram calories (gcal) using suitable conversion factors derived from the stoichiometry of the photosynthesis and respiration reactions. Incident solar radiation at the water surface was measured in gcal cm⁻² by an Eppley 10-junction pyrheliometer installed a few hundred yards from the station, the output of the thermopile being electronically integrated and automatically printed out every thirty minutes.

Extinction coefficients for "white" light were determined on samples obtained from the various depths at the beginning and end of each experiment. The optical densities were measured with a Klett-Summerson colorimeter, using a neutral filter. From these values a mean was obtained for the upper ten feet and was employed to estimate the light intensity at any depth.

Total chlorophyll was assayed by Millipore-filtering samples from different depths, grinding filters and residues with sand and extracting the pigment in 90 per cent acetone (MgCO3-saturated), then Seitz-filtering to remove sand and undissolved Millipore fragments. Absorbancies were determined with a red filter, and were converted to chlorophyll concentrations by comparison with a standard curve prepared from chlorophyll a.

Biomass was estimated as ash-free dry weight of suspended solids. The method involved filtering water through tared Millipore filters (type HA), desiccating filters plus residues, weighing for total solids, then ashing at 600°C, rehydrating the ash, desiccating, and weighing again to obtain the ash weight.

Counts of phytoplankton units (chains, colonies or individual cells) were made from Sedgwick-Rafter mounts of fresh samples obtained from the various depths. All flagellates and diatoms were counted; ciliates and other animals, when present, were excluded.

COMMUNITY ADAPTATIONS FOR MAXIMUM BIOMASS

The York River water column at the station sampled was comparatively unstratified from surface to bottom during the summer of 1960. This is illustrated by the graphs in Figure 3.

Fig. 3. Mean vertical distribution of chlorinity, temperature, and the dissolved substances oxygen, phosphorus and nitrate.

Chlorinity varied from 8.54 to 12.60 parts per thousand, with the surface water generally a little less saline than that near the bottom. The mean gradient for the ten experiments was only 0.75 parts per thousand. Temperature ranged from 24°C in June to over 27°C in mid-August, the surface waters being somewhat warmer than the lower strata. The mean temperature gradient was but 0.29°C. These two variables, chlorinity and temperature, are determinants of water density. In this case, they indicate a very small gradient of increasing density with depth, thus assuring a fair amount of vertical mixing in the water column. This conclusion is underscored by the vertical distribution patterns of other dissolved substances for which data were obtained — dissolved oxygen diminished only slightly with depth, and dissolved phosphorus and nitrate increased slightly (Fig. 3).

Fig. 4. Mean concentrations of living phytoplankters, in counting units per ml, at various depths.
In sharp contrast to these physical relationships, the living organisms of the phytoplankton were markedly stratified in the upper layers (Fig. 4). To account for this we note that the dominant organisms of the summer flora were motile flagellates closely related to forms which are known to be positively phototactic. Thus, swimming is the probable primary mechanism involved. Other factors of possible influence include rapid cell division in the lighted surface layers, and manufacture of low specific gravity (lipid) storage products.

The mean daily vertical distribution of light in the ten experiments is graphed in Figure 5, and shows typical exponential extinction with virtually complete absence of light at the bottom.

Fig. 5. Mean vertical distribution of light in the ten experiments (mean extinction coefficient = 0.97; mean submarine illumination in gcal cm⁻² day⁻¹).

Fig. 6. Mean photosynthesis and mean respiration in the water column (gcal cm⁻² day⁻¹).

Mean photosynthesis and respiration are depicted in Figure 6. As expected, production was highest near the surface and attenuated in exponential fashion with depth. Respiration was about equal throughout the upper ten feet, but was only half as great below this level.

Fig. 7. Mean ash-free solids ("biomass," mg cm⁻²) and mean total chlorophyll concentration (µg cm⁻²) at various depths.

The concentration of ash-free solids (Fig. 7) was observed to increase markedly with depth. This variable may be equated to community biomass, since even the non-living detrital material which it includes represents a source of energy to certain heterotrophic components of the living plankton. The inverse relationship between the vertical distribution of these ash-free solids and that of living cells is a consequence of the detrital rain from the zone of production at the top of the water column, and also of the upwelling of bottom materials. Since even dead organic material of this type has an oxygen demand, a significant (though unspecifiable) fraction of what was represented in Figure 6 as community "respiration" is a product of non-biological oxidations attending decomposition. Since such oxidations cost the community biomass energy, it is proper that they be included in determinations of energy loss.

As in the case of ash-free seston, the vertical distribution of total chlorophyll was different from that which would be anticipated on the basis of the cell-count data (Fig. 7). Chlorophyll concentration increased gradually with depth. The explanation is that large quantities of chlorophyll and its degradation products (many of which would be included in this assay) are associated with non-living detritus (28, 29) and sediments (30, 31, 32). The two curves of Figure 7 strengthen the conclusion that we are dealing with a fairly well-mixed water mass, since a certain amount of upwelling is indicated.

Let us now consider some of the photosynthetic characteristics of the plankton community at various depths. Recall from the description of procedures that water samples for the measurement of photosynthesis were collected at depths of 2, 6 and 10 ft. and were resuspended so that data for all combinations of collection and suspension depths could be obtained.
The graphs in Figure 6 are for results from the particular com- binations (2,2), (6,6) and (lO,10). In Figure 8, ail of the combina- tions are graphed in 3-space with coordinates (collection depth, sus- pension depth, mean photosynthesis). The surface depicted is concave upward, slopes downward toward the viewer, and curves markedly upward on the left. Consider, first, photosynthesis as a function of suspension depth, by looking at the surface from back 156 Information Storage and Neural Control to front. The downward sloping represents the attenuation of photosynthesis with increased depth of suspension due to decreased illumination. The curve of Figure 6 is the locus obtained on this surface by connecting the points (2,2,4.61), (6,6,1.46), and (lO,10, 0.85). Now view the surface from left to right. This gives photosyn- thesis as a function of collection depth. Regardless of the depth of suspension, the populations collected at 2 ft. always photosyn- thesized more than those obtained from 6 and 10 ft.; the latter samples appear to give very similar results. These relationships indicate that the organisms taken from deeper layers of the water column have less capacity for photosynthesis than those which normally occupy the surface waters. This may be a reflection of the fact that the deeper plankters are senescent and sinking; microscopic examination usually revealed the surface organisms to be far more active in swimming than their counterparts from below. Consider now the thermodynamic efficiency of photosynthesis as reflected by the ratio of mean photosynthesis per mean illumi- nance at each depth. These data are presented in Figure 9. The surface generated is concave "upward, slants upward approaching the viewer and toward the left, and rises sharply on the right in front. Studying from back to front first, we observe that photo- synthetic efficiency increases with depth of suspension, hence with diminished light intensity. This result is in accord with the photo- synthesis literature (33). Now studying the surface from left to right, we observe that plankters living nearer the surface are generally more efficient in light utilization when compared at the same suspension depths with those from deeper layers, except that organisms collected from 10 ft. appear to be almost as efficient as those from 2 ft. when both are suspended at the deeper level. In general, then, the relationships of Figure 9 are consistent with those of Figure 8 in denoting greater productive capacity of surface populations compared to those from farther down. The observation that efficiency increases as light decreases can be interpreted to be adaptively significant in respect to the goal of biomass maxi- mization. The extent of this dark-adaptability under natural conditions is emphasized by comparing efficiencies of the popu- lations at the depths they naturally occupy. Thus the points (2,2, Information Concept in Ecology 157 MEAN PHOTOSYNTHESIS (gcal cm"2day"') 5-1 (10,2,257) (2,10,1.06) COLLECTION DEPTH (ft) Fig. 8. Photosynthesis as a function of sample collection and_suspension depths (means for ten experiments). 18.5), (6,6,19.5), and (10,10,44.7) in Figure 9 indicate a 2.4-fold efficiency increase at 10 ft. compared to 2 ft. 
Two phenomena may be involved in this increased efficiency of the deeper populations: 1) the purely numerical "swamping" effect of more photons in the upper layers of the water column than can possibly be absorbed by the plant pigments (34), and 2) actual increase in the thermodynamic efficiency of chlorophyll with depth. The latter is illustrated in Figure 10 in which mean photosynthesis per unit mean illumination per unit mean initial concentration of total chlorophyll in the ten experiments is graphed. 158 Information Storage and Neural Control MEAN PHOTOSYNTHESIS PER UNIT ILLUMINATION (gcal kcal'i] 10,53.1) •50 -28 (2,2,18.5) --I0 6 10 COLLECTION DEPTH (ft) Fig, 9. Photosynthesis per unit illumination as a function of sample collection and suspension deptlis (means for ten experiments). This graph represents chlorophyll efficiency. Although the surface shown is fairly similar to that of Figure 9, the greatest similarities are on the left (2 ft. collection depth) and in the rear (2 ft. sus- pension depth). The forward part of the Figure 10 surface (10 ft. suspension depth) and the rigiit-hand side (10 ft. collection depth), however, are considerably more elevated than those of Figure 9. These relationships appear at a glance by noting that much more of the underside of the Figure 10 surface is visible than that of Figure 9. Information Concept in Ecology 159 MEAN PHOTOSYNTHESIS PER UNIT ILLUMINATION 8 CHLOROPHYLL ( gcal kcal"' ug"') (2,10,10.9) f-IO -5 (2,2,3.9) (10,10,9.2) ( 10,6,6.1) 10,2,2.6) COLLECTION DEPTH (ft) Fig. 10. Photosyntliesis per unit illumination and chlorophyll as a function of sample collection and suspension depths (means for ten experiments). To test the significance of these relationships, the vertical co- ordinate of the Figure 10 points was divided into that of the Figure 9 points to obtain the ixiean chlorophyll concentrations required to give unit efficiency: 4.86, 4.41 and 4.05 /xg", respec- tively, for samples collected at 2, 6 and 10 ft. These values indicate less chlorophyll to be required as sample depth is increased. Only the first two means were significantly difTerent, however, estab- lishing that the chlorophyll at 2 ft. was less efficient that that at 6 ft. Because of the high variance associated with the 10 ft. samples, 160 Information Storage and Neural Control the 2 — 10 ft. and 6 — 10 ft. means could not be distinguished. There- fore, the tendency toward increased chlorophyll efficiency with depth of collection cannot be formally accepted as a generalization. We may accept it on intuitive grounds, however, noting the high likelihood for a Type II (35) biometrical error due to small sample size, i.e., an error such that the null hypothesis is accepted when in fact it is false. Although physiological mechanisms {e.g., photoinhibition) are doubtlessly involved in the observed increase of chlorophyll efficiency with depth, ecological factors are also implicated. One of the striking features about the vertical organization of summer estuarine plankton communities is variability in species com- position and in cell concentrations. The numerical stratification of the York River phytoplankton has already been described (Fig. 4). Table I is provided to illustrate the nature of species changes with depth. It is a list of phytoplankton species and their concentrations obtained from the third experiment (July 6) of the series under consideration. There is nothing especially atypical about this particular list; it is fairly representative. 
The table shows that two flagellates (in decreasing order of importance: Massartia, Chilojrtorms) were dominant at the surface. Two feet below, three species dominated in a diff'erent order of abundance {Alassartio, Gjrodimum, Chilomonas). These forms are all highly motile; Massartia and Gyrodinium are dinoflagellates, Chilomonas is a yellow-green flagellate. Both of these groups typically photosynthesize at maximal rates under conditions of high light intensity (36). At 6 ft. the surface forms were no longer of sig- nificance (dominants being Eutreptia, Gyrodinium), and at 10 ft. they were entirely absent (dominants: Eutrepfia, Pyramimonas, Leplocylindricus) . Eutreptia is a euglenoid, Pyramimonas a flagellated green alga, and Leplocylindricus an immotile diatom. The latter two groups, in estuaries, are generally adapted to photosynthesize maximally under conditions of low or medium illumination (36). It would seem from these few general observations that main- tenance of a suitable vertical diversity structure might constitute a significant segment of community strategy in implementing the goal of biomass maximization. That a very definite vertical diversity pattern is maintained in summer is illustrated in Figure 1 1 Information Concept in Ecology 161 5 ^ <^ [2 w z >■ OP Z H §1 §0 < s CN o CN o o I o I r-- m r~- r- ■--, m cn 00 CN o m ■* CO \o -^ en ■r- r^ \o c?\ CN I ^ in o o 1 ■^ un (N CM 00 00 so -^ M c^ O CM t-^ O CO CO in o CM -^ M CM t~- I 1 -- I I I I I r~- I r-- I I o I 1 rv] CM \o <^] \o ro CO ^-c •^ ^-' CM CM T-H ,-. I I i 00 00 00 00 CN) CM CM V3 O C^] C^I C^l C^l CV) r^cM'*-<^rOcOtOT-H.^T-.'rH.^.^T-< I OOCncOcOCMCMCMt-ht-i I I I I I I 1 T-H ^ ^ — br. t« g _. o u i-, ^ -o - c tj: U TO ft w §:! • — o en -o g o - S3-? i« «j O 1^ i- — » -— u 03 J2 « ^ O ^ S ■a, C"':= 2 5 OCi 5 5^ c; O h2 a. ^ . ^^ :^ s ^ G. -S a. o ■~ 0^ _0 '■p. o c3 -G -2 .5 j^ a — ft T3 o u >^ Q T3 ■*^ -a dT3 1 ^V s jj M U ■ S S E: C ft G ~ c 'Z lj "> 5J Q -a ^ c -0 ^:a 162 Information Storage and Neural Control SURFACE DEPTH 10- (ft) BOTTOM 5 10 MEAN COWMUNITY DIVERSITY I bits X lO-yml ) Fig. 11. Mean community diversity, D = — ^ Nj log P(Si), at various depths. in which mean community diversity for the ten experiments, as defined in Equation [4], is plotted against depth. The figure shows maximum diversity at the 2 ft. leveL From the foregoing data (Figs. 4 and 11) it is concluded that the living organisins of the York River summer plankton are vertically stratified in excess of the extent attributable to cor- responding heterogeneity of the physical environment (Fig. 3). Maintenance of such a concentration gradient against the mixing forces of the environment inust therefore be endergonic — the organisms must expend biomass energy to reduce the entropy of their distribution in space. Injormatwn Concept in Ecology 163 In oceanography, there is a prominent and widely accepted theory that energy accrual by a planktonic system cannot exceed respiratory losses in a uniformly mixed water column (37). Vertical stratification of the organisms is therefore necessary for positive energy balance. This theory has never been rigorously developed, however, and as a matter of fact has recently (38) been invalidated by proof for a countertheorem: Vertically homogeneous plankton communities are energetically feasible. Stratification is therefore not essential to positive energy balance. This conclusion makes the foregoing York River observations difficult to understand. 
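The diversity measure of equation [4], plotted in Figure 11, can be computed directly from counts of the kind listed in Table I. The sketch below is not part of the original paper; the genus names are reused from the discussion above, but the counts are invented:

```python
# Sketch of equation [4]: D = -sum_i N_i * log2 P(s_i), in bits per ml,
# computed from invented phytoplankton counts (cells per ml).

from math import log2

counts_per_ml = {"Massartia": 4200, "Gyrodinium": 1300,
                 "Chilomonas": 900, "Eutreptia": 150}   # hypothetical sample

N_total = sum(counts_per_ml.values())
D = -sum(N_i * log2(N_i / N_total) for N_i in counts_per_ml.values())
print(round(D), "bits per ml")   # total community diversity in the sense of eq. [4]
```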
Why should a community expend energy to achieve and maintain pro- nounced vertical stratification if it is not thermodynamically essential for it to do so? Consider the following. The important variable relating to community energy balance is the cost as defined in Equation [22]. In Figure 12 mean cost data are graphed as a function of collection and suspension depths; (6,10,2.671 (10,10,2.99), SUSPENSION DEPTH (ft) Fig. 12. Cost, p TT-i as a function of collection and suspension deptlis (means for ten experiments). 164 Information Storage and Neural Control this particular grapii is rotated 90° clockwise around the vertical axis (compared to previous figures of this type) to improve the perspective in which the surface is viewed. Studying the surface from back to front first, we see that cost increases in a generally hyperbolic or logarithmic fashion with depth of collection; the surface is saddle-shaped, being convex upward from back to front. The fact that it rises toward the viewer supports the previous conclusion that the deeper populations are less viable than those nearer the surface — their cost of operation is higher. The ribbon- shaped segment in the figure denotes the loci on the surface and on the horizontal plane where the ratio p7r~^ is unity, i.e., where an exact balance between energy inputs and expenditures is achieved. For any specified collection depth, this ribbon indicates the depth at which the sample must be suspended to achieve a steady state between inputs and losses. This depth is seen to become shallower as the collection depth increases — another indication of the intrinsically higher vitality of populations found nearer the surface. Viewing the surface of Figuie 12 from right to left, cost is shown to increase as suspension depth is increased. In this direction the surface is concave upward. Thus, despite the measure of dark- adaptability demonstrated earlier, the price to a population of inhabiting deeper layers in the water mass is unequivocally in- creased cost of operation. This datum appears to provide an economically logical reason for stratification. A well-known doc- trine from marginal analysis in economics (39) states that the scale of an activity should be expanded so long as marginal profitability (increase in net utility gain) is a positive value, and carried to a point where marginal yield is zero. This corresponds to the procedure in calculus of maximizing a function by setting its first derivative to vanish. Applied to the plankton, this law demands, in view of observed depth-cost relationships, that the community should invest biomass energy to concentrate its com- ponent organisms near the surface up to the point where additional return becomes zero. It would appear that the stratification behavioi of the York River plankton is consistent with sound economic policy. The converse of the marginal profitability law would be: If marginal gains are negative, the scale of an activity should be Information Concept in Ecology 165 reduced at least until a point of no further loss (zero return) is reached. Let us examine the behavior of the York plankton in respect to this proposition. Referring to Figure 6, mean photo- synthesis is observed to exceed mean respiration in the upper water column {pir'~^ < 1) but not in the lower (ptt"^' > 1)- This relationship is so typical in aquatic communities that the depth at which the photosynthesis and respiration curves cross (ptt"^ = 1) is a standard variable — the compensation depth. 
The mean depth of compensation at the York sampling station during the summer of 1960 was 6.5 ft.; this level is denoted by broken lines in Figures 3-7 and in Figure 1 1 . When phytoplankters drift beneath the instantaneous compensation depth they experience, on the average, a shift fi'om positive to negative energy balance. If a net positive balance is to be achieved for the whole water columii it is necessary that the community reduce energy losses in the lower part of the column. This implies, by the mathematical nature of the cost variable, increasing the rate of photosynthesis and /or depressing the rate of respiration. Community behavior in accordance with tlie former imperative has already been described as dark-adap- tability. We consider now the attenuation of respiration. The data which have been presented indicate that althougli photosynthetic capacity of the plankters was irreversibly (in 24 hours) less at the 6 and 10 ft. levels than at 2 ft. (Fig. 9), vigorous respiration equivalent to that of surface populations persisted down to 10 ft. (Fig. 6). Below 10 ft., however, oxygen uptake was sharply reduced. The extent of actual metabolic failure must be even gi^eater than indicated by Figure 6 since the concentration of oxidizable detritus increased with depth (Fig. 7) producing a continually increasing oxygen demand (reflected in the oxygen curve of Fig. 3). This underscores the conclusion that metabolism is sharply curtailed soon after the organisms drift beneath the compensation depth. This phenomenon constitutes a pei'fect response on the part of tlie community to the converse marginal profitability principle, and is an example of ''beneficial death" (24) at the community level. Beneficial death is u.sually thought of in connection with individuals {e.g., dead cells forming" the matrix of a functional tissue, as in plant xylem or some insect wings) or populations {e.g., annual plants, some social insect castes, genetic 166 Information Storage and Neural Control lethals). That the comparatively loosely organized coinmunity may also derive profit through death of its constituent organisms at an appropriate time is an interesting speculation. First of all, planktonic systems such as these have, of course, evolved. One of the important taxonomic characteristics of the algal phyla is the nature of food storage products. It would seem that with such a capability already well developed generally in these groups there could have evolved, in the time available, species able to maintain robust metabolic activity right down to the bottom if it were consistent with community design. This would be especially adaptive in water masses where expectation for return to the trophogenic zone (above the compensation depth) through vertical turbulence would be good. Indeed, evolutionary theory asserts that such foims would enjoy a selective advantage over more labile ones. In phytoplankton, the smaller motile species typically possess rapid dynamics and short generation times, but lack the capacity for sustained yields characteristic of larger forms with more conservative dynamics (15). Clearly, from the standpoint of the York system in summer with its slight vertical density gradient, the latter type of organism would not be nearly so satisfactory a component as the former. Their production rates per unit of biomass would be slower in the upper water column, and their collective respiration higher in the depths; the result might be a net energy loss to the conmiunity. 
Under winter con- ditions when hydrography is such that vertical turbulence is extreme and the water column thoroughly mixed, larger species with longer generation times, lower light optima, and greater capabilities for food storage niight be more serviceable con- stituents. Perhaps, therefore, the seasonal replacement of summer flagellate floras by diatomaceous communities in winter and spring may be taken to reflect community adaptability in response to a changing environment. In view of this, it does not seem unreason- able that particular species may be selected for occupancy in a com- munity under a specific environmental regime, not only for their Darwinian competence in competition, but as well for their compati- bility as functional components of a goal-adapted "machine" (40). This possibility would seem to add another dimension to the classical concept of ecological community because of its implicit Information Concept in Ecology 167 demand that the success or failure of species be related to and interpreted in a broader sociological context. Acceptance of such a context carries with it the important advantage of making some of the elegant formalisms (9, 23) developed in connection with the study of situations of conflict available for ecological analysis. The theory of games and decisions is, however, notoriously teleological in basis: litigants come to odds through mutual impairment of purposive behavior. This objection can be ameliorated to a very large extent by regarding community goal-adapted behavior in a teleonomic, not teleological, sense; i.e., the community is "programmed" for goal achievement though possessing no "con- scious" knowledge of the goal. This kind of thinking is widely accepted in connection with the problem of DNA coding, and it has been formalized in Bellman's (40) concept of information pattern. In such a framework, the mechanism of natural selection may still be construed to operate at an infraspecies level; for example, by acknowledging" that the information pattern of a species (a program containing the accumulated history of its past and rules for decision making) can enable the latter to make, in a completely mechanistic manner, a choice between alternative strategies such as those embodied in a recent theorem (41) due to Rashevsky: If two individuals work on the production of some object of satisfaction (utility) and if their cooperative efforts result in an increased overall productivity, then each individual will have less of the object of satisfaction if each adopts a strategy of maximizing his own satisfaction (egoism, competition) than if each tries to maximize the sum of the satisfactions of both individuals (altruisin, cooperation). The importance of epistemological bearing in determining the character of questions which one may ask of biosystems and, consequently, that of the answers elicited can be illustrated as follows. Consider a proposition of the form, "The organism (species, community) is adapted to . . . ." This is completely acceptable biological rhetoric. Constructed in the passive voice, the statement carries the implication that it is the fortuitous environment which does the selecting. If we go to the active form, "The organism adapts to . . .," we provide the biological sub- ject with a degree of initiative in the process. This is still quite 1 68 Information Storage and Neural Control acceptable. If now we change the verb akogether and posit, "The organism adopts a strategy for . . 
.," we pass for many readers rather too abruptly into the realm of purpose. Thus, it might be more suitable to say instead, "The organism is pro- grammed for a strategy of . . .." The important point to im- press here is that all of these statements mean essentially the same thing mechanistically, though epistemologically they are poles apart. Consequently, they give rise to very different ways of asking questions, therefore to divergent investigational approaches, and finally to quite different classes of answers. To illustrate, if in the present instance the hrst-mentioned point of view is adopted, then only the empirical sections of this paper would have relevance, and its content might be summarized by saying: The York River plankton community appears to be eminently adapted to its environment as indicated by 1) stratifica- tion of organisms near the surface where there is more light, 2) increase of chlorophyll efficiency with depth due to both physiological and species compositional reasons, and 3) sharp curtailment of respiration in the lower part of the water column as the organisms die and sink, making possible a positive balance between energy gains and losses in the community. This is a descriptive approach, and it yields purely descriptive answers with limited power to provide real insight into the marvel of organiza- tion and behavior which is the community. Contrast this with the summary which might result from acceptance of the last point of view: Based on the above-mentioned observations, the York River community appears to be pro- grammed for a strategy of maximizing its biomass, therefore its energy content, therefore its ability to purchase utility and increase its information reserves, therefore its diversity or richness of form, and therefore its stability in a variable environment. In the process, the community, inchoate a biological system as it is, meets some fundamental thermodynamic and economic impera- tives, as well as the dictum of Shannon's Theorem 10. The new level of abstraction so attained may or may not qualify the community as a Wienerian (42) machina ratiocinatrix, but if there is a distinction, it would seem to lie largely in the realm of logic and semantics, not of biology. Information Concept in Ecology 169 REFERENCES 1. Lindeman, R. L.: The trophic-dynamic aspect of ecology. Ecology, 2J.-399-418, 1942. 2. MacFadyen, A.: The meaning of productivity in biological systems. J. Anim. EcoL, 77.-75-80, 1948. 3. Schrodinger, E.: What is Life? Cambridge, England, University Press, 1945. 4. Brillouin, L.: Life, thermodynamics, and cybernetics. Sci. Amer., 37: 554-568, 1949. 5. Branson, H. R. : A definition of information from the thermodynamics of irreversible processes and its application to chemical communi- cation in biological systems, in Quastler, H. (ed.), Information Theory in Biology. Urbana, Univ. Illinois Press, 1953, p. 25. 6. Linschitz, H. : The information content of a bacterial cell, in Quastler, H (ed.), Information Theory in Biology. Urbana, Univ. Illinois Press, 1953, p. 251. 7. Patten, B. C: An introduction to the cybernetics of the ecosystem: the trophic-dynamic aspect. Ecology, 40:22\-23\, 1959. 8. Patten, B. C: Negentropy flow in communities of plankton. Limnol. Oceanogr., 6.-26-30, 1961. 9. Thrall, R. M., Coombs, C. H. and Davis, R. L. (eds.): Decision Processes. New York, Wiley, 1954. 10. Shannon, C. E.: A mathematical theory of communication. Bell Syst. Tech. J., 27.-379-423, 623-656, 1948. 11. Ashby, W. 
R.: An Introduction to Cybernetics. New York, Wiley, 1956.
12. Ashby, W. R.: Design for a Brain, ed. 2. New York, Wiley, 1960.
13. Margalef, D. R.: Informacion y diversidad especifica en las comunidades de organismos. Invest. Pesquera, 3:99-106, 1956.
14. Margalef, D. R.: La teoria de la informacion en ecologia. Mem. R. Acad. Cien. Artes, Barcelona, 32:373-449, 1957.
15. Margalef, D. R.: Temporal succession and spatial heterogeneity in phytoplankton, in Buzzati-Traverso, A. A. (ed.), Perspectives in Marine Biology. Berkeley, Univ. California Press, 1958, p. 323.
16. MacArthur, R.: Fluctuations of animal populations, and a measure of community stability. Ecology, 36:533-536, 1955.
17. Hutchinson, G. E.: Concluding remarks. Cold Spring Harbor Symp. Quant. Biol., 22:415-427, 1957.
18. Wiener, N.: The Human Use of Human Beings. New York, Doubleday, 1950.
19. Brillouin, L.: Science and Information Theory. New York, Academic Press, 1956.
20. Rothstein, J.: Communication, Organization, and Science. Indian Hills, Colorado, Falcon's Wing Press, 1958.
21. Jaynes, E. T.: Information theory and statistical mechanics. Physical Rev., 106:620-630; 108:171-190, 1957.
22. Tribus, M.: Information theory as the basis for thermostatics and thermodynamics. Gen. Syst., 6:127-138, 1961.
23. Von Neumann, J. and Morgenstern, O.: Theory of Games and Economic Behavior. Princeton, N. J., Princeton Univ. Press, 1947.
24. Allee, W. C., Emerson, A. E., Park, O., Park, T. and Schmidt, K. P.: Principles of Animal Ecology. Philadelphia, Saunders, 1949.
25. Szent-Gyorgyi, A.: Submolecular biology. Radiation Res., 1960, supp. 2, p. 4.
26. Arnon, D. I.: The role of light in photosynthesis. Sci. Amer., 203:105-118, 1960.
27. Gaarder, T. and Gran, H. H.: Investigation on the production of plankton in the Oslo Fjord. Rapp. et Proc.-Verb., Cons. Internat. Explor. Mer., 42:1-48, 1927.
28. Gillbricht, M.: Untersuchungen zur Produktions-Biologie des Planktons in der Kieler Bucht. I. Kieler Meeresf., 8:173-191, 1951.
29. Krey, J.: Untersuchungen zum Sestongehalt des Meerwassers. I. Der Sestongehalt in der westlichen Ostsee und unter Helgoland. Ber. Deutsch. Komm. Meeresf., 12:431-456, 1952.
30. Vallentyne, J. R.: Sedimentary chlorophyll determination as a paleobotanical method. Canadian J. Bot., 33:304-313, 1955.
31. Vallentyne, J. R., and Bidwell, R. G. S.: The relation between free sugars and sedimentary chlorophyll in lake muds. Ecology, 37:495-500, 1956.
32. Vallentyne, J. R., and Craston, D. F.: Sedimentary chlorophyll degradation products in surface muds from Connecticut lakes. Canadian J. Bot., 35:35-42, 1957.
33. Rabinowitch, E. I.: Photosynthesis and Related Processes. New York, Interscience, 1951, vol. 2, pt. 1, p. 603.
34. Odum, H. T., McConnell, W., and Abbott, W.: The chlorophyll "A" of communities. Publ. Inst. Mar. Sci., Univ. Texas, 5:65-96, 1958.
35. Snedecor, G. W.: Statistical Methods Applied to Experiments in Agriculture and Biology, ed. 5. Ames, Iowa, Iowa State College Press, 1956.
36. Ryther, J. H.: Photosynthesis in the ocean as a function of light intensity. Limnol. Oceanogr., 1:61-70, 1956.
37. Sverdrup, H. U.: On conditions for the vernal blooming of phytoplankton. J. du Conseil, 18:287-295, 1953.
38. Patten, B. C.: Energy-depth relationships in plankton. In manuscript.
39. Baumol, W. J.: Economic Theory and Operations Analysis. Englewood Cliffs, N. J., Prentice-Hall, 1961.
40.
Bellman, R.: Adaptive Control Processes: A Guided Tour. Princeton, N. J., Princeton University Press, 1961. 41. Rashevsky, N.: A contribution to the theory of egoistic and altruistic interactions. Bull. Math. Biophys., 2J.-115-134, 1961. 42. Wiener, N.: Cybernetics. New York, Wiley, 1948. DISCUSSION OF CHAPTER VII Walter Abbott (Houston, Texas) : You are undoubtedly aware of the criticisms of the light-dark bottle technique, mainly because of the reduction in turbulence. Does this mean that your data represent minimum estimates? Bernard C. Patten (Gloucester Point, Virginia): I do not know what it means, actually. There are four or five classes of criticism of the light and dark bottle method. Some of them work in opposition, i.e., an error in one direction may be cancelled or partially mollified by another error in the reverse direction. I think the technique is marginal, at best, for obtaining absolute measures of energy flux, but quite adequate for relative com- parisons of the activities of different populations, which is how we used it. If you perform enough experiments and observe that a fairly consistent pattern emerges, you can begin to feel confident of the reality of the pattern even though the data may represent minimal estimates. Heather D. Mayor (Houston, Texas): Would you have to allow for extra energy gain to your system, brought about by the process of measurement? Is there, for example, additional "noise" added to the respiration term because of the measurements? Patten: I am not quite certain what you mean, but we do make corrections of the type you suggest. We make a correction for the fact that the photosynthetic quotient is generally greater than unity in marine phytoplankton, but I do not believe that this is the kind of thing to which you are referring. 1 72 Injormatwn Storage and Neural Control Mayor: Using a quantum analogy, I know you would be introducing an additional perturbation by measuring your param- eters. I was wondering" whether you have considered this problem, or whether you are working at a level where this type correction is not necessary. Patten: If there is an analogous problem here, I am not aware of it. I believe we may be thinking as well as working at different levels. CHAPTER VIII EXCHANGE OF INFORMATION ABOUT PATTERNS OF HUMAN BEHAVIOR Gregory Bateson, M.A, A .T THE outset I wish to make two acknowledgments. First, I would like to credit Attneave's work (1) in which he points out the synonymy between what in information theory is called "redundancy" and what in popular parlance is called "pattern." You will see, as I develop what I have to say, that this synonymy is basic. Second, I want to acknowledge a less definable debt to conversations with Alex Bavelas about his experiments involving varieties of contingency in learning contexts. I had hoped that the outcome of these conversations would be a paper in which his name would be included as joint author. Since our diverse professional commitments have prevented our getting together on this, I must take responsibility for the thoughts which his work has stimulated in me. A major part of this paper will be devoted to defining that order of information which I regard as "infor- mation about patterns of human behavior." This involves a restructuring of learning theory. Let us assume that all receipt of information is "learning." 
This will bring within a single theoretical spectrum the whole range of phenomena, beginning with the receipt of a pip by a receiving machine at the end of a wire, up to and including such complex phenomena as the development of neurosis or psychosis under environmental stress. Notice first of all that the receipt of a bit, a yes or no answer to a question, is not usually called "learning" if the receiver already knows to what question the bit is an answer. Psychologists who perform what are usually called learning experi- 173 1 74 Information Storage and Neural Control ments generally ignore phenomena of this order. Their experiments are concentrated upon a change in the way the receiving entity responds to what is supposed to be the same bit when this bit is presented on successive occasions. "Learning," as the word is used by psychologists, denotes the receipt of a meta-bit, i.e., a piece of information which will change the subject's response to some bit. Over and over the psychological experimenter presents the stimulus, a buzzer, followed by meat powder. He observes that, after a number of trials, the animal which formerly did not salivate when it heard the buzzer now does salivate. This change is called "learning." But when this process has approached completion, if the psychologist again presents the buzzer and the animal salivates, this receipt of infor- mation— the receipt of the sound of the buzzer as a yes or no answer to a question which the animal can now identify — is not usually regarded as learning. By including this simplest phenom- enon, i.e., the receipt of the single bit, within the spectrum of learning, the question as to whether a computer is or is not learning when it receives appropriate input is answered out of hand. This is learning of the simplest order. Second order learning arises when the subject changes his ability to receive the yes or no answer to a question. This is, in fact, the phenomenon which psychologists have studied maximally in learn- ing experiments; the dog learns that the buzzer means future meat powder. But beyond this, there is obviously a third order of learning called the acquisition of "test wisdom," or "set learning." Here the subject learns that he is to be on the lookout for sequences of a certain sort in his universe, which include both external events and his own behaviors. For example, he learns to behave instrumentally in order to solve the problems presented by stimuli. If the laboratory is Pavlovian, he learns to expect the stimuli to be direct predictions of future reinforcements which will come regardless of his action. I shall speak of this as third order learning, referring to those changes whereby the subject who encounters and solves repeated problems of a certain sort comes to expect his universe to be structured in ways related to the formal struc- turing of these previous problems. Patterns of Human Behavior 175 To this formal structuring of contexts, we can apply the language which invokes contingency. We shall then say, for example, that the animal which has undergone recurrent classical Pavlovian experimentation will expect his universe to be so structured that reinforcements are contingent only upon stimuli, not upon his re- sponses. If his universe is totally structured in this way, all he can do is to prepare for the coming reinforcement, e.g., by autonomic measures such as salivation. He can predict but he cannot control. 
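The contrast between a Pavlovian and an instrumental universe can be stated as a contingency structure and simulated in a few lines. The sketch below is a toy illustration and is not drawn from the chapter: the 70 per cent reinforcement rate and the coin-flip responding are arbitrary. Its only point is that in the Pavlovian world the probability of reinforcement is the same whether or not the subject responds, whereas in the instrumental world it is not.

```python
import random
random.seed(1)

def run(world, trials=2000):
    """Estimate P(reward | response) and P(reward | no response) in a toy world.

    world == "pavlovian":    reward depends only on the stimulus (70% of trials).
    world == "instrumental": reward is given only if the subject responds.
    """
    tally = {True: [0, 0], False: [0, 0]}      # responded -> [rewards, occasions]
    for _ in range(trials):
        responded = random.random() < 0.5      # the subject tries both courses of action
        if world == "pavlovian":
            rewarded = random.random() < 0.7   # contingent on the stimulus alone
        else:
            rewarded = responded               # contingent on the response
        tally[responded][0] += rewarded
        tally[responded][1] += 1
    return {k: v[0] / v[1] for k, v in tally.items()}

for world in ("pavlovian", "instrumental"):
    rates = run(world)
    print(world, "P(reward | response) =", round(rates[True], 2),
          " P(reward | no response) =", round(rates[False], 2))
```

An animal estimating these two conditional probabilities from its own history would, in the Pavlovian case, find that responding makes no difference; it can prepare for what is coming but cannot control it.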
Note that a subject, acting in terms of this philosophy or in terms of any philosophy of this order, will, in general, have such experience of his universe as will validate his philosophy. If he does not believe it is worthwhile to behave instrumentally, he will never engage in behavior which would disprove or test the philosophy. And, conversely, if he has had past experience only of instrumental contexts, he will have learned to behave instrumentally and will encounter, as it seems to him, a universe in which instrumental behavior is appropriate. Attempting to make a reinforcement come, he will try out various courses of action; and when the reinforcement does come, he will believe that the action which immediately preceded it was an effective instrumental action. His experience of his universe will validate his theory of instrumental magic, even though the causal contingencies assumed by this magic may be mythological or delusory. Let me now extend what I have said about individual learning to what would superficially seem to be much more complex phenomena — those of interpersonal exchange. To do this, we have only to personify the experimenter as well as the learning subject and to see the learning experiment as a small segment of an interchange between two persons: A, the experimenter, provides the stimulus; B, the subject, responds to the stimulus; and A follows B's response with a reinforcement. Notice that these categories (stimulus, response, and reinforcement) which we are putting upon the behaviors cannot be empty. If the experimenter does not provide a reinforcement, this in itself is a reinforcement; and, if the subject does not respond to the stimulus, this failure to respond represents the subject's response to the stimulus. Notice also that if there were no stimulus, this in itself would be the stimulus to which the subject responded. In the world of communication, a message does not have to be an event or an object in order to be a message. As my friend Ray Birdwhistell says, "Nothing never happens." If we look at an on-going interchange between persons who behave alternately, they can never "not behave." The interchange has been going: . . . A, B, A, B, A, B, . . . From this we cut out, for our analysis, any triad: a sequence A, B, A, or, if you like, a sequence B, A, B. Within any such triad, we can now recognize that the third item is necessarily a reinforcement because, in this triad, if the third item had been something other than what it was, or if it had been something, for example, which made the second item inappropriate, it would obviously have been a negative reinforcement. So if the third item is appropriate, it is, in fact, a positive reinforcement of the second item. By the same token, the second item can always be regarded as a "response" since it follows a first item and is reinforced by a third. Correspondingly, the first item is necessarily a stimulus since it precedes the second, which is reinforced by the third. These are purely formal relations between items and must necessarily obtain in any triad of an interchange between learning entities. It follows that, in a long interchange of this kind, any behavior of B is necessarily simultaneously a stimulus, a response, and a reinforcement, according to how we slide our identification of the triad up and down the series. The same is true for any behavior of A.
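The positional argument just made lends itself to a literal sliding-window rendering. The snippet below is only a schematic illustration (the interchange is a made-up alternation of A's and B's items, not a transcript): it labels every consecutive triad and shows that any middle item is at once reinforcement, response, and stimulus as the window slides along the series.

```python
# A literal rendering of the trigram scheme described above: slide a window of
# three items along an alternating interchange and label the three positions.
# The interchange itself is schematic (A1, B2, A3, ...), not observed material.

interchange = ["A1", "B2", "A3", "B4", "A5", "B6", "A7"]

def trigrams(items):
    """Yield each consecutive triad with its positional labels."""
    for i in range(len(items) - 2):
        stimulus, response, reinforcement = items[i:i + 3]
        yield stimulus, response, reinforcement

for s, r, f in trigrams(interchange):
    print(f"stimulus={s}  response={r}  reinforcement={f}")

# Any middle item (B4, say) appears as reinforcement in one triad, as response
# in the next, and as stimulus in the one after, which is the point made in the text.
```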
Such a scheme has the advantage of presenting to the scien- tist all the possibilities for punctuating a sequence of interchange at the level of complexity of the triad. It is, however, arbitrary in that it excludes the simpler (dyadic) units of interchange and also the more complex (polyadic) units. The arbitrary selection of the trigram, however, does raise a number of interesting problems. Note that each item in any tri- gram is also a member of two other trigrams. Clonsider such a sequence as the following: A . . . 23 25 27 29 . . . B ... 22 24 26 28 30 . . . In the sequence, the odd numbers represent items of A's behavior while the even numbers represent those of B. The sequence is Patterns of Human Behavior \11 deliberately imagined to be far from the beginning and from the end of the total interchange. It will be observed that B's item 26 is a response in the trigram 25-26-27, but it is also a reinforce- ment in the trigram 24-25-26 and a stimulus in the trigram 26-27-28. The formal truth, however, may not represent the natural history of the relationship as it is perceived by the par- ticipants. They are busy putting their labels, imposing their Gestalten, on the items and on the trigrams. It is perfectly possible, for example, for A to punctuate this interchange in such a way that he will see only the trigrams 23-24-25 and 27-28-29 and ignore or brush off B's items 22 and 26, creating a picture in which A always provides the stimuli and reinforcements while B provides only the responses. If A succeeds in maintaining this system and in making B see the relationship in the same way, we may say that A is, in this particular sense, the dominant participant in the relationship. On the other hand, B, by pulling his punches on items 22 and 26, may succeed in forcing A to think that he (A) has the initiative. It may then be difficult to decide who is "dominant." At this point it is not appropriate to go into all the possible details of the punctuation of such sequences. However, a part of this matter has been explored in earlier publications (2) in which the formal resemblances and differences between dominance, dependency, and spectatorship were discussed. It was pointed out that these themes of relationship could be reduced to paradigms of learning and that various types of "end-linkage"' could occur. For example. A, in his relationsliip to B, could take the dominant end of a dominance-submission relationship and the succoring end of a succoring-dependence one. These patterns could also be reversed, in which case A would combine dominance with de- pendency. Very basic differences between cultures, e.g., between the cultures of England and America, might be expressed as contrasts of end-linkage in parent-child relationships. But, if it is true of human natural history that people punctuate their interchanges into sequences which are, in fact, contexts of learning, it follows that in interpersonal interchange we must also face at least the three levels of learning which have already been defined in the learning experiments. That is, each person is 178 Information Storage and Neural Control receiving bits of information, and these bits are already falling into place as yes or no answers to questions of which the person already has understanding. 
But the second order learning must also be occurring, i.e., he must be changing his identification and understanding of the questions to which the bits are answers; and third order learning must also be going on, namely, he must be learning the characteristic patterns of contingency in this re- lationship. The reality of these three levels of learning, especially the reality of the third level and perhaps of higher levels, can only be demonstrated convincingly from phenomena of pathology. Wlien all is going smoothly, it is not possible to get a clear picture of what orders of learning are operating. It is when certain orders of learning are disturbed that it becomes possible to analyze and recognize these orders. For a long time psychologists have been performing various experiments which amply demonstrate what I am trying to say. Unfortunately, the conventional phrasings used in the psycho- logical laboratories are not along the lines I am advocating here. The experiments to which I refer are those called experiments in "experimental neurosis." Traditionally, these are described with- out invoking any theory of levels of learning. For example, we are told that the dog starts to exhibit psychotic or other sympto- matology when his "discrimination breaks down." Let me dissect a typical experiment for you so that you may see that what happens is not necessarily a matter of breakdown of discrimination but can be seen as a matter of disruption of the learning process at what I am calling the third level. Classically, the animal is presented with an ellipse, which means x, and with a circle, which means y. If the dog performs X in response to the ellipse and y in response to the circle, it either gets its reward or avoids its punishment. But, if the dog fails to "discriminate" between these stimulus objects, it receives punish- ment or fails to get a reward. Having taught the dog this dis- crimination, the experimenter begins to fatten the ellipse and to flatten the circle. The dog responds by exerting greater effort to tell the difference between the symbols, and at first these eff"orts will be successful. As a further stage is reached and the discrimina- Patterns of Human Behavior 1 79 tion becomes more difficult, the psychologist makes a pencil mark on the back of the ellipse in order to distinguish it from the "circle." He also uses a coin or some other randomizing device to decide which of the stimulus objects he is going to administer next. He cannot afford to administer them in any patterned order which the dog might learn. Finally, these two objects become indis- tinguishable; i.e., from the point of view of the dog they are one object or, rather, they would be one object if the dog had not been told previously, "This is a context for discrimination." This message was underlined during the period when discrimination was difficult but still possible. The message, "This is a context for discrimination," is carried partly by the earlier training and partly by every circumstance of the laboratory, the harness, the smell of the experimenter, and so forth. All these ancillary stimuli are, in fact, indications to the dog that he is now in a context for discrimination. At this point, the dog starts to show grossly disturbed behavior; it may bite its keeper, refuse food, become comatose, etc. If the experiment is started with a naive dog and the preliminary training in discrimination is omitted, the dog does not go crazy. 
If you start with a dog untrained in discrimination and present a single stimulus object (flipping a coin to decide what this object shall mean), the dog has to guess and will do the appropriate thing; it will gamble on the difference. The dog cannot toss a coin, but it settles, in general, to approximately the probabilities which it experiences. If the stimulus object means \ 70 per cent of the times and )' 30 per cent of the times, the dog will settle to guessing at .V 70 per cent of the time and guessing at y 30 per cent of the time. This is not the ideal course which the sophisticated gambler would follow; he, of course, would bet on x 100 per cent of the times be- cause it gives more frequently the positive reinforcement. What happens, it seems to me, in the pathogenic experiment is that the experimenter succeeds in communicating to the dog a message about the contingency patterns in which it is to find itself, and this message happens to be an untrue message. The dog is in a probabilistic situation, but the experimenter has con- vinced the dog that it is in a discrimination situation, at which point very severe pathological changes start to appear. 1 80 Information Storage and Neural Control These are the situations which, in our work on schizophrenia, have come to be called "double-binds." These may now be defined very simply as pathological alterations of communication at the third level. Let me illustrate this pathogenic pattern, or perhaps I should say broken pattern, rather briefly with an excerpt from a book entitled Mary Poppins (3). This is an English children's book by P. L. Travers about an English nanny Mary Poppins. She has taken the two children to a little old gingerbread shop owned by Mrs. Corry, a tiny old woman with two large "sad" daughters: "I suppose you've come for some gingerbread?" "That's right, Mrs. Corry," said Mary Poppins politely. "Good. Have Fannie and Annie given you any?" She looked at Jane and Michael as she said this. "No, Mother," said Miss Fannie meekly. "We were just going to, Mother — -" began Miss Annie in a frightened whisper. At that Mrs. Corry drew herself up to her full height and regarded her gigantic daughters furiously. Then she said in a soft, fierce, terrifying voice, "Just going to? Oh, indeed! That is very interesting. And who, may I ask, Annie, gave you permission to give away my gingerbread — ?" "Nobody, Mother. And I didn't give it away. I only thought — " "You only thought! That is very kind of you. But I will thank you not to think. I can do all the thinking that is necessary here!'' said Mrs. Corry in her soft, terrible voice. Then she burst into a harsh cackle of laughter. "Look at her! Just look at her! Cowardy-custard ! Cry-baby!" she shrieked, pointing her knotty finger at her daughter. Jane and Michael turned and saw a large tear coursing down Miss Annie's huge, sad face, but they did not say anything, for, in spite of her tininess, Mrs. Corry made them feel rather small and frightened . . . In this episode Mrs. Corry indicates that this is a context in which to have given gingerbread to the children would be re- warded and not to have given gingerbread might be punished. Patterns of Human Behavior 181 The daughter Annie tries to alibi for not giving gingerbread, and Mrs. Corry promptly punishes her. This is not, was not, that sort of context at all; it was one in which the daughter had no right to give away gingerbread and was wicked to even think of doing so. 
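Returning for a moment to the untrained dog that settles to guessing x about 70 per cent of the time: the difference between that behavior and the sophisticated gambler's policy is a small piece of arithmetic, sketched below purely as an illustration. The percentages are those given in the text; nothing else here comes from the chapter.

```python
# Expected success rates for the 70/30 situation described above. "Matching"
# guesses x with the same frequency it occurs; the sophisticated gambler bets
# on x every time. The probabilities follow the text; the arithmetic is only
# an illustration.

p_x = 0.7                  # the stimulus object means x on 70 per cent of trials
p_y = 1.0 - p_x

matching   = p_x * p_x + p_y * p_y   # guess x 70% of the time, y 30% of the time
maximizing = max(p_x, p_y)           # always guess the more frequent meaning

print(f"probability matching : {matching:.2f}")    # about 0.58
print(f"always betting on x  : {maximizing:.2f}")  # 0.70
```

Matching the observed frequencies succeeds on about 58 per cent of trials; always betting on x succeeds on 70 per cent.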
The problem, then, for every individual in every interchange is to maintain an up-to-the-minute grasp of understanding of the state of the contingency patterns between himself and his vis a vis. Consciously or unconsciously, he has to be able to recognize what sorts of trigrams, or more complex sequences, should characterize the relationship at every moment and to act in terms of these recognitions. The individual has to predict from what occurred previously which pattern is appropriate at the moment. This is what we call understanding between persons. Without it or when such understanding is traumatized or punished, very severe patho- logical behavior may follow. But such understanding is only possible because we are able to predict, to guess correctly at a given moment, within what pattern we are operating and within what pattern the other person is operating. Prediction is the essence of the matter, and it is at this point that double-bind theory links up with information theory. Redundancy, as the term is technically used, is that charac- teristic of the sequence of events that enables an observing subject to make a better than probable guess at the next item in the sequence, so that this next item, when it actually occurs, does not provide 100 per cent new information. It is rather unfortunate that the word redundancy has been used in this sense, because coinfortable communication between people (we may even say efficient communication between people) depends entirely upon such ability to predict. It might have been happier to describe the phenomenon of redundancy as a necessary condition of efficiency rather than as a characteristic excess since it is economical to deal with patterns rather than with multiple bits. It is now appropriate to think for a moment about the place in human natural history of patterns of this order. Bavelas (per- sonal communication) has shown that these orders of learning are singularly difficult to modify when erroneous learning has occurred. The experimental material is somewhat as follows: The subject is presented with a board on which there are a number of buttons 1 82 Information Storage and Neural Control and is told to find the correct way to press these buttons. He is told that when he presses them correctly, a bell will ring. The subject proceeds to press buttons, and after he has pressed, say, fifty buttons, the bell rings. The experimenter now asks him if he knows how to do it and if he will do it again. The subject again presses buttons, and after he has pressed about forty-five buttons, the bell rings. He is again asked to repeat the task, and this time after about forty pressings the bell rings. The subject is doing better and better. When the subject has reduced the number of pressings to about twenty, Bavelas stops the experi- ment and tells him that there is no connection between the buttons and the bell, that the bell is only geared probabilistically to a hypothetical learning curve. The subject will then look Bavelas firmly in the eye and tell him he is lying. This, of course, is true except that the subject is wrong as to which lie he is attributing to Bavelas. The truth is that Bavelas was lying initially when he told the subject there was a connection between the bell and the buttons, but he is now telling the truth. 
The subject, however, cannot be convinced of this and will reassert his theory of the interrelation between the buttons, usually quite a complex theory with a lot of paren- thetical cautions in it: "At this part of the sequence you should not go too fast"; "If you go too fast, you can only correct it by going back to the beginning of the sequence," etc. The subject is perfectly certain that what he was doing was related to the theory he built up and that his experience has validated this theory. He has been, after all, well reinforced in this belief by his steadily increasing success. There is, I understand from Bavelas, only one way of dis- illusioning the subject in regard to his theories about these buttons. This is by asking him to perform the experiment upon a second sub- ject. As he does this and sees the second subject develop analogous but dissimilar illusions, he realizes the nature of the situation and the process through which he has gone. The point I want to make is that these impressions, illusions at the third level, are held very deeply and are exceedingly difficult to disturb; the same must be true of knowledge and wisdom at the third level. I have mentioned that the subject trained in an Patterns oj Human Behavior 183 instrumental philosophy will, of course, encounter a universe which will seem to him to validate that philosophy and that a subject trained in a Pavlovian universe will correspondingly, as it seems to him, encounter a universe in which the Pavlovian philosophy is appropriate. It is a formal characteristic of this level that opinions about it are, in general, self-validating, and, of course, a great deal of the difficulty in psychotherapy occurs in wrestling with this particular fact. The interchange between therapist and patient always seems to the patient to validate those third level premises with which he entered the therapy room. This is the phenomenon of "trans- ference." The therapist's task is to endeavor to break up those learnings at the third level for which the patient has been deeply reinforced in the past, those learnings of which he is, in general, almost unconscious and which necessarily have this characteristic of being self validating ... no mean task. At this point, we are approaching a fourth learning level: the problem of changes at the third level. I have said that this is no mean task for the therapist, and I think it is worth noting that this is a task in which considerable meanness, in another sense of the word, may be a necessary ingredient. To change one's basic premises at this third level is always in some degree painful and always difficult, and the therapist may be compared, if you will, to Mrs. Clorry. He must, of necessity, put the patient in the wrong at the third level. It is therefore essential that psychotherapy shall be double-binding in the sense in which the word is defined here. Mrs. Corry is pathogenic because she goes on doing this without mercy. The therapist is curative insofar as he does it with wisdom and with consistency. After all, Mrs. Corry, is inconsistent even in her inconsistency and can, therefore, always surprise her victim; whereas, the therapist must instruct his patient, albeit by implicit methods, so that new expectations may replace the old and may be rewarded. This problem of fourth-level learning, of changes at the third level, is a necessary part of human life. 
It obtains in courtship; it obtains in initiation; it obtains in psychotherapy; it obtains, in fact, wherever important reconstruction of relationship must occur. We know very little about such phenomena, and I cannot tell 1 84 Information Storage and Neural Control you much today. Certain aspects, however, are conspicuous enough to be worth mentioning. First, it seems that such deep clianges and the processes by which they occur are almost in- variably cloaked with unconsciousness and with amnesia. The ability of any couple to tell you what it really was that they went through in courtship is approximately zero. They can tell you dates, times, and places. They may be able to identify a single striking episode, something that he did or she did which struck the other with a moment's flash; but, in general, such processes are not subject to recall and have not been investigated. Wliile there is a great deal of fantasy about courtship, there is, as a matter of fact, no recorded data regarding it in any culture of the world. Similarly, the patient and the therapist are both virtually unable to tell you what happened that led to psychotherapeutic change. Theories are many; fantasies are many; recipes are many and are always unsatisfactory. It is not too much to say that this is a region of almost total scientific ignorance. I believe, however, that it has to be analyzed, has to be studied, and will be studied in the next twenty years, and that in this study, the branch of information theory dealing with patterns of patterns, redundancies about redundancies, will be a central tool. REFERENCES 1. Attneave, Fred: Applications of Information Theory to Psychology, New York, Henry Holt and Co., 1959. 2. Bateson, Gregory: Morale and national character, in, Civilian Morale, 2nd Yearbook of the Society for the Psychological Study of Social Issues, edited by Goodwin Watson, New York, Houghton Mifflin and Co., 1942, p. 71. 3. Travers, P. L.: Mary Poppins, New York, Reynal and Hitchcock, 1934, p. 121. DISCUSSION OF CHAPTER VIII Herman Blustein (Chicago, Illinois): Doesn't an adequate communication system actually preclude the knowledge of the rules of the game by both communicators and receptors of the communication system? Patterns of Human Behavior 1 85 Gregory Bateson (Palo Alto, California): I think there are two questions combined here. One concerns the case where com- munication is going along "smoothly," as I called it earlier. Is a knowledge of the rules necessary? Obviously it is not. The rules are provided; they are built in, and that is all we ask. To be able to cough them up and inspect them is not necessary. So far I think we are in agreement; however, behind this is the question of "rules about rules" and "rules about rules about rules." I think we always walk around wishing to be in the state of "things going along smoothly," and wishing, therefore, not to turn over all this disturbing stuff, i.e., unwilling to raise questions about the rules. We may be forced to do this when things go wrong. We want some of the rules to be steady. We hope we can operate on the common assumptions of the culture which we share, and we hope to try to get mutual understanding at that level. If we cannot, we may be pushed into reexamining blemishes of the culture, but this will be painful and always at an upper level which we do not want to disturb. 
Yasuhiko Taketomo (New York, New York): In the com- ments on expectation in relationships, were you referring to something like role-taking in psychiatric communication? Bateson: I was doing so in a terribly loose context. I think the evidence is going to come from such work as that of Birdwhistell, studying expressive movement and expressive posture. This is not a study of those movements which are quasi-linguistic, such as thumbing a ride, but the study of those much less conscious and much less voluntary elements in our movements. I think it is going to appear that, while we talk with words, mathematical equations, and other highly sophisticated devices, we are, in fact, either leaning forward on the rostrum or scratching in our pockets looking for a cigarette or some other object. All these movements can be interpreted and handled, and are going to be interpreted and handled, at this third level as sequence markers or signals about the relationship. But when I lean forward or draw back from you, these movements indicate to you whether I want you to come forward and shoot me with questions or whether you should beware of my defenses, and so on. I think the implementa- tion is going to come from this area and from the field of micro- 1 86 Information Storage and Neural Control linguistics in which modulations of loudness of voice, emphasis, rasp, etc., are going to be the key signals. Myron F. Weiner (Dallas, Texas) : Assuming that somebody comes to you because he has had a breakdown of relationships because, in turn, his metacommunications or metaconcepts, or what he expects of the world, are somewhat different from what he says he expects, do you think it would be of some value in correcting his behavior to bring to his consciousness the fact that his metaperception is quite different from what he thinks he perceives? Bateson: This is a problem of technique of psychotherapy. Let nie reword Dr. Weiner's question: "Does it help to give him insight?" I would not agree that insight is necessary and sufficient. It may be sufficient, but I do not think it is necessary. I think that experiences of effect in communication at these levels probably are therapeutically necessary, but I do not think it is necessary that these communications take the form of providing a guide to conscious insight into the mechanics of these levels. Surely it never happens. I do not know of any school of psychotherapy that, as yet, has enough language for talking about these levels to even attempt to give insight at these levels. We just do not have the language to give that insight. I think we know that psychotherapy occurs; but since it occurs in a culture which does not have sufficient language to say what is happening, it follows that linguistic insight is certainly not necessary. W. R. Beavers (Dallas, Texas) : These remarks about the con- text, or metalanguage, reminded me of Bion and his primitive group concepts. He felt that in working with groups, he saw and began to communicate with them, not about their intrapsychic assumptions, but about the primitive group assumptions. As I recall, there was an assumption of the fight-or-flight and of the pairing group. This sounds very much like your ideas concerning the basic mammalian assumptions underneath that which is con- ventional conversation. Bateson: I am slightly familiar with the ideas, but have not worked with them. PART III— NEUROPHYSIOLOGICAL ASPECTS OF INFORMATION STORAGE AND TRANSFER Moderator: Hebbel E. Hoff, M.D., Ph.D. 
CHAPTER IX
INFORMATION STORAGE IN NERVE CELLS
Frank Morrell, M.D.

BEHAVIORAL observations have generally supported the notion that (aside from genetic information) there are two qualitatively different forms of information storage in the nervous system. So-called "recent" memory is made of particularly labile stuff. A cerebral concussion produces an amnesia not only for the injury itself but also for the events immediately leading up to the injury, a circumstance of which many lawyers are painfully aware. The impact of experience requires time for fixation. If neural activity is interfered with during this fixation or consolidation period by electro-shock (13, 14, 53, 54), trauma (50), severe cold (44), or rapid induction of ether or barbiturate anesthesia (1), subsequent recall of the experience may be seriously compromised. For example, Duncan (13) and Gerard (14, 44) have shown that rats or hamsters trained in an avoidance situation or in maze-learning have a normal learning curve if a maximal electro-shock is delivered four hours after each training session. If the shock follows the training by one hour there is slight deterioration; at a fifteen minute interval there is major interference with retention, and at five minutes or less, learning is completely prevented. Acute anoxia introduced at similar time intervals has the same effect (53). Since all of the agencies known to produce amnesia or loss of recent memory are also known to alter electrical activity of the central nervous system, the mechanisms subserving the initial stage of memory recording are inferred to be electrical in nature. Other evidence supporting a clear distinction between short-term and durable memory mechanisms is the finding that focal epileptogenic lesions prevent new learning, i.e., impair memory recording, but do not disturb behavior learned before establishment of the epileptic lesion (31, 34, 35, 42, 51). A limiting case in the requirement for a finite fixation time is the classical example of one-trial learning. However, even in this instance, it has been supposed (Hebb) (24) that the neural consequences of the single experience persist in the form of reverberating impulses for a considerable time after the environmental signal has ceased. Although all or none impulses circulating in closed neuronal chains represent one possible mechanism for the initial imprinting or short-term memory, the actual kind or kinds of electrical activity involved remain unknown. In fact, there is nothing in the experimental evidence concerned with manipulation of the consolidation process which affords compelling proof that consolidation depends upon reverberating impulses of the all or none type (41). Other kinds of electrical activity, that is, other than the classical axon spike or even the conventionally recorded EEG, may well be equally important. I should like to present some evidence which suggests that cortical steady potential gradients may have a determining influence in the process wherein a sequence of impinging impulses is transformed into structural change in the nervous system. This portion of the paper, therefore, is concerned with the initial or electro-sensitive stage of memory recording. Significant shifts of the cortical steady potential have been shown to occur consequent to stimulation of peripheral receptors (19) as well as when stimulating electrodes are applied directly to brain substance (2, 7, 8, 9, 16, 17, 18).
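Before turning to the steady potential experiments, the graded amnesia cited above (complete block with shock at five minutes, major interference at fifteen, slight at one hour, none at four) is the pattern any consolidation account must reproduce. The sketch below is a deliberately crude caricature, not a model taken from the literature cited: it assumes that the fixed fraction of a trace grows exponentially with time, with an invented thirty-minute time constant, and that electro-shock erases whatever is not yet fixed.

```python
import math

# Caricature of the consolidation gradient cited above (Duncan, Gerard):
# assume the fraction of a memory trace already fixed t minutes after training
# grows as 1 - exp(-t / TAU_MINUTES), and that electro-shock erases whatever is
# not yet fixed. TAU_MINUTES is invented, not a value from those experiments.

TAU_MINUTES = 30.0

def retained_fraction(ecs_delay_minutes):
    """Fraction of the trace surviving an electro-shock given at the stated delay."""
    return 1.0 - math.exp(-ecs_delay_minutes / TAU_MINUTES)

for delay in (5, 15, 60, 240):
    print(f"ECS at {delay:4d} min after training -> "
          f"about {retained_fraction(delay):.0%} of the trace retained")
```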
Some years ago, we found (37) that the surface negative DC shift resulting from low frequency stimulation of nucleus centrum medianum in the thalamus would appear as a conditioned response to a pure tone after thirty to forty paired trials. Figure 1 illustrates the first paired trial. The tone elicited no response. Upon onset of the thalamic stimulus, a pronounced negative shift of the base line of the EEG occurred, which was confined to the hemisphere ipsilateral to the stimulated site. After about forty paired trials (Fig. 2) a similar DC shift was regularly induced by the tone alone. Note particularly that this conditioned DC shift was also restricted to the previously stimulated hemisphere although the tone was presented to both ears equally in open field conditions.

Fig. 1. Initial trial in which a low intensity 500 cycle per second tone lasting ten seconds is paired with four per second shocks (6 volts, 1 millisecond duration) delivered through bipolar stimulating electrodes in the left centre median. The tracing is from an unanesthetized rabbit. Electrodes derived from the somatosensory regions of both hemispheres and recorded monopolarly to a reference on the pinna. A and B are a continuous sequence, (A) indicating the pronounced negative shift with slight after-positivity on the onset of thalamic stimulation and (B) the reversal of steady potential shift at the cessation of thalamic stimulation. Note that the steady potential change is limited to the ipsilateral hemisphere. Calibration: 50 microvolts and one second (37).

Fig. 2. Same experiment as in Figure 1. After forty trials the tone alone elicited the same ipsilateral negative-positive steady potential shift. This gradually disappeared over a series of six unreinforced trials but was restored by a single subsequent reinforcement with thalamic stimulation. Calibration: 50 microvolts and one second (37).

In addition to such direct conditioning of a cerebral electrical event, the increasing availability of DC amplifiers made it possible for Rusinov (49) and, more recently, Rowland (47) to identify steady potential shifts occurring regularly in the course of classical behavioral training. Rusinov (48) also discovered a most intriguing behavioral effect when low-level surface positive polarizing currents were applied to a part of the motor cortex. The current levels employed were sub-threshold with respect to direct production of limb movement. But during the period of current flow (and for some minutes afterward) any ambient sound, light or touch would produce the limb movement to be expected from adequate (supra-threshold) stimulation of the motor area to which the current was applied. Rusinov felt that the anodal polarization produced a "dominant focus" of excitation which facilitated the development of a temporary connection between, for example, the auditory and motor systems. We have been able to confirm the Rusinov experiment in our own laboratory and, in addition, have made some observations on the activity of single nerve cells in such polarized regions (40). Single cells in motor cortex did not respond to acoustic stimulation before polarization (Fig. 3A). During the passage of anodal current (10 microamps) cells of several different types (Figs.
3B, C and D) were easily triggered by the same acoustic signal.

Fig. 3. Patterns of response in single units to an acoustic stimulus. Duration of the tone of 200 cycles per second is indicated by the two upward deflections in the second channel of the oscilloscope. Before polarization (A) there was no effect on the discharge frequency of a unit in motor cortex. During polarization responses to sound appeared either in the form of a single high frequency burst (B), a sudden cessation of firing (C), or high-frequency bursts at the "on" and at the "off" of the tone (D). Calibration: 5 millivolts and one second (40).

Since we were interested in mechanisms for information storage we performed the experiment in a slightly different way, a way which allowed observation of a selective sensitivity with respect to signals differing in their history of exposure to polarizing current. A group of stimuli was chosen and all members of the group were presented repeatedly to the animal until habituation (as judged by lack of behavioral or EEG response) was complete. A polarizing electrode together with a fine microelectrode was placed on the motor cortex and the current was turned on. One member of the previously habituated stimulus group was selected (in this case a 200 cycle per second tone) and was presented to the animal about thirty times in the course of forty-five minutes. A burst of unit activity occurred with each stimulation (Fig. 4A). Another member of the habituated stimulus group (500 cycle per second tone) was presented once in the polarization period and also elicited unit discharge (Fig. 4B). The current was then discontinued, and in the following twenty minutes the 200 cycle per second tone consistently provoked unit activity (Fig. 4C) while the 500 cycle per second tone (Fig. 4D) invariably failed to induce a change in the pattern of unit firing. About forty minutes after cessation of polarization neither signal was effective (Figs. 4E and F).

Fig. 4. "Generalization and differentiation" in single unit responses. During the passage of anodal current (A & B) the critical tone (A) and a single presentation of the indifferent tone (B) were equally effective in provoking high frequency bursts. Twenty minutes after cessation of current flow (C & D) the critical tone (C) continued to elicit the response while the indifferent tone (D) did not. Forty minutes after discontinuing polarization (E & F) neither signal produced any change in unit discharge frequency. Calibration: 2 millivolts and 500 milliseconds (40).

Fig. 5. Conditioning of a rhythmic burst response to a single flash. Anodal polarization was applied to the visual receiving area. Single flash elicited a single burst in a quiescent (A) and in a randomly firing cell (B). Three per second stroboscopic stimulation (C) produced driving of unit discharge at that frequency. A single flash (D) delivered thirty seconds after termination of the rhythmic stimulus resulted in repetitive unit discharge at about three per second. Unit potentials are seen in the upper channel of the oscilloscope; stimulus artifacts in the lower channel. Amplitude calibration: 2 millivolts. Time calibration: 500 milliseconds (A & B) and one second (C & D) (40).
Under these circumstances it seemed evident that the polarized cell population had retained some stipulation of signal characteristics so that for a brief period in the post-polarization interval the cells behaved differentially with respect to the two signals. Short-term storage of a temporal pattern has also been observed in cells of the visual cortex. Figure 5A illustrates the response of a quiescent cell to a brief flash of light (the flash artifact is recorded on the second beam of the oscilloscope). Figure 5B shows a similar burst response in a spontaneously active cell. During anodal polarization it was extraordinarily easy to "drive" such cells with low frequency intermittent light (Fig. 5C). After a few minutes of stimulation the three per second flash was discontinued and thirty seconds later a single flash resulted in a series of bursts having a three per second frequency (Fig. 5D). Single flashes delivered at intervals longer than thirty seconds were less and less likely to provoke such a rhythmic response, but occasional rhythmic responses to single flash were noted as long as twenty minutes after the end of the conditioning train (Fig. 6). This seems a particularly clear illustration of the capacity of the polarized cells to retain some representation of an imposed stimulus pattern for a relatively long period of time. Indeed the order of magnitude of this time interval is itself significant. It correlates well with the data of Gerard (14), Duncan (13) and others (44, 53, 54) on the abolition of learned responses consequent to massive electro-shock delivered at various intervals following the training session.

Fig. 6. Time course for "decay" of conditioned rhythmic response of single cortical unit (abscissa: time in minutes after the "conditioning" train of 3 per second flicker).

Fig. 7. Oscillographic tracings from a deeply anesthetized (A) and an unanesthetized (B) cat. Derivations are from implanted bipolar electrodes (R) arranged as indicated in the diagram. Recording electrodes (R) are situated between the stimulating electrodes (S) on one side and the tetanizing electrode pair (T) on the other. Explanation in text. Calibration: 100 microvolts and 100 milliseconds. Negativity at the recording electrode produces a downward deflection of the beam in this and the two succeeding figures. (Chow, K. L. and Dewson, J.: unpublished data.)

Short-term storage of a temporal pattern may also be demonstrated in another way. Following a technique originally described by Roitbak (45), Doctors K. L. Chow and James Dewson (10) have used tetanization of a local cortical region to produce an effect similar to that of the "dominant focus." Three pairs of implanted electrodes were arranged as shown in the diagram of Figure 7 so that the stimulating pair was at one end of the array, the tetanizing pair at the other end, and the recording electrodes in between. In the deeply anesthetized animal (Fig. 7A) nine per second shocks delivered before tetanization produced only a small direct cortical response (DCR) arising almost immediately out of the shock artifact. The nine per second stimulus was continued throughout the fifty per second tetanus and into the post-tetanization period.
Immediately following the tetanus the response to the shock was altered and could be distinguished from the DCR by

Fig. 10. Unconditioned (A) and conditioned (B) avoidance response in the rabbit as recorded by electromyogram of the right forelimb. The upper six channels are EEG tracings obtained at the same time. Electrodes 1, 3, and 5 derive respectively from left motor, somatic sensory and visual cortex. Electrodes 2, 4, and 6 are corresponding placements on the right hemisphere. Calibration: 50 microvolts and one second. Further explanation in the text.

Fig. 12. Effect of cathodal (A) and anodal (B) polarization on learning curves in two different animals (curves distinguish polarization of the motor cortex, the visual cortex, and the ear; ordinate: per cent of conditioned responses; abscissa: time in days). See text.

ation, although on the second and third occasions when training was omitted for four and three days, the usual decrease did occur. It seems unlikely that the impaired performance on the day following cathodal polarization was due to a long persisting effect of the cathodal current. Indeed, as we shall see later, the level of decrement on the day after cathodal polarization of visual cortex corresponds closely to that occurring after a lapse in training. It would appear that training under conditions of cathodal polarization of visual cortex did not result in registration or retention of that day's experience. The animal behaved as though there had been no training at all on that day.

Fig. 13. Effect of cathodal (A) and anodal (B) polarization on learning curves in two additional animals. Explanation in text.

Figures 12B and 13B present curves of the typical response pattern in two animals subjected to bilateral anodal polarization of motor cortex, visual cortex and the ear. There is no evidence that performance was significantly altered on the day on which polarization was carried out at any of the three sites. It is interesting that in every instance (see also Fig. 14) there was an abrupt rise in the response per cent on the day following anodal polarization of visual cortex.

Fig. 14. Effect of cathodal and anodal polarization applied on different occasions in the same animal. Note again the marked decrease in performance following a six day break in the training schedule.

Figure 14 presents the data on one of the two animals receiving both anodal and cathodal sequences. Note again the depression of performance during passage of cathodal current in visual cortex and the maintenance of depression on the day following.
The usual performance decay following a lapse in training is also apparent. In the same animal the application of anodal current to the same three areas resulted in no significant change. Yet on the day following the visual anodal polarization performance reached its all-time peak for this animal.

Figure 15 illustrates sample records of conditioned responses obtained under the various conditions in this experiment. Figure 15A represents cathodal polarization of the motor cortex. Note the characteristically long latency of the CR, although the animal responded correctly as many times per session as it would without polarization.

Fig. 15. EEG tracings and conditioned responses performed during various conditions of polarization. A. Cathodal polarization of motor cortex. B. Anodal (motor). C. Cathodal (visual). D. Anodal (visual). Calibration: 50 microvolts and one second. Explanation in text.

On the other hand, during anodal polarization of the motor cortex (Fig. 15B) the latency of the CR was very short and the amplitude of photic "driving" was much reduced. A similar EEG pattern was observed during application of cathodal current to the visual areas (Fig. 15C). Reversing the direction of current flow (Fig. 15D) resulted in considerable augmentation of photic "driving."

A summary of data in critical training sessions is given in Table I. The number of observations and the median and range for the number of conditioned responses per session are listed for the following conditions: polarization of ears; motor cortex, anodal and cathodal; visual cortex, anodal and cathodal; session before anodal and cathodal (visual) polarization; session after cathodal (visual) polarization; session after anodal (visual) polarization; session after break in training. Statistical analysis of these findings leads to the following conclusions:

1) Performance under the condition of visual cathodal polarization differs from that of visual anodal, motor anodal and cathodal, and ear (anodal and cathodal) at better than the 1 per cent level of confidence.

2) There is no significant difference in performance between any one of the polarization conditions (except visual cathodal) and any other.

3) Performance on the day following visual cathodal polarization differs from that on the day preceding it at the 1 per cent level.

4) There was no significant difference between performance on the day following visual cathodal polarization and the day after a break in training.

5) Comparison of performance on the day after visual anodal polarization with that on the day preceding (visual anodal) polarization yields a difference significant at better than the 1 per cent level of confidence.

With respect to the last "conclusion" listed, one must hasten to add that there is no justification for attributing the improvement to an effect of anodal polarization. The difference may simply reflect the rising learning curve, or the gain normally expected in two days of practice. Only if the experiment were performed at a point on the learning curve where the gain to be expected in two days of practice was negligible could one attribute the change to the neurological intervention.
Such was not the case in these experiments, and therefore the influence (if any) of anodal polarization of the cortical receiving area for the conditional signal remains uncertain.

In summary, it is clear that the imposition of a surface negative potential gradient along the axis of the main neural elements of the cortical receiving area for the conditional signal interferes with conditioned performance and prevents retention of the experience acquired during such polarization. Although the evidence is much less conclusive, it seems possible that surface positive currents, while not producing any improvement in performance, may lead to increased retention of the information transmitted during the period of current flow. These last experiments lend some support to the notion that the electrophysiological changes secondary to imposed potential gradients, illustrated in the previous studies, may have behavioral significance and may be relevant to the manner in which the central nervous system achieves short-term information storage.

Despite these brief and uncertain insights, it is most tantalizing to realize that even at the single unit level of analysis the nature of the neural code has still strangely escaped detection. Thus in Figure 5D the segment of record preceding the application of the single test flash revealed no trace of the information which the subsequent stimulus demonstrated had been retained in that particular cell. This negative evidence argues against the notion that the short-term memory trace is preserved by means of nerve impulses continuously circulating in more or less closed neuronal chains. The recording systems employed should have been adequate to discern such activity had it been present. Perhaps the relevant electrical signs are more likely to be found in the slow local oscillations of synaptic potential or other sources of slowly varying voltage. Such oscillations would have been missed by the short time-constant recording system. It is also possible, of course, that the encoding process took place in cells penultimate to the one monitored.

I should now like to leave the question of electrical mechanisms for information storage in the nervous system and turn to some other approaches which have recently gained prominence. While it seems reasonable to postulate an electrical basis for the labile short-term storage mechanism, it is certainly difficult to assume that the relatively permanent memory trace, which remains undisturbed by the drastic perturbations of cerebral function produced by convulsions, electro-shock, concussion, or anesthesia so deep as to cause electrical silence, can be based upon continuously circulating nerve impulses (6, 41). Most workers, therefore, have tended to think more in terms of morphological or chemical alterations. The essential thesis argues that recurrent impulse impingement or synaptic bombardment results in a durable morphological or chemical change which renders that particular junction or cell more easily susceptible to subsequent activation via the same pathway. Recently, hypotheses implicating ribonucleic acid (RNA) in the molecular organization responsible for long-term information storage have been proposed quite independently by a number of workers (14, 26, 27, 29, 30, 32, 39, 43).
To my knowledge the first statement of this general hypothesis in the English literature was that of Katz and Halstead in 1950 (30). Most recently, in a series of lectures and articles, Hyden (26, 27) has forcefully argued for the implication of RNA in the molecular mechanism of memory. Although the concrete evidence is sparse, some is now available. Kreps (32, 56), in the Soviet Union, is reported to have demonstrated an alteration in RNA synthesis, in regions of the nervous system related to the conditioned stimulus, after establishment of the conditioned response. In cats, John, Wenzel and Tschirgi (29) noted that intraventricular injection of ribonuclease was followed by deterioration of pattern discrimination lasting about four days. Avoidance CRs in the same animal were unaffected. Unfortunately no control data were presented with regard to local or general changes in brain RNA content or turnover. Therefore, since other substances such as calcium or potassium ion also reversibly impair CR performance, it is not certain that the disturbance was related to an alteration of the RNA substrate rather than to another, more nonspecific action of ribonuclease.

Our own explorations in this area were derived from an incidental observation made in the course of investigation of an entirely separate problem. We had been studying some physiological properties of the chronic focal epileptogenic lesion produced in animals by local freezing of a small area of the cortical surface (36). Following this procedure, the gradual establishment of an epileptogenic lesion may be verified by recording the paroxysmal electrical activity which appears in the cortical tissue immediately adjacent to the frozen zone. We were interested in studying the ontogenesis of an epileptic lesion from a chemical as well as an electrical point of view, and among a number of findings was the observation that nerve cells in the area of epileptic discharge stained densely with methyl green pyronin (39). Methyl green pyronin is one of the substances generally used for the histochemical demonstration of RNA.

It was not particularly surprising to find increased concentrations of RNA in cells discharging at abnormally high rates. Hyden and co-workers (4, 5, 20, 21, 22, 23, 25), using much more elegant techniques, had already demonstrated increases in cellular RNA consequent to prolonged stimulation. Recently Iizuka et al. (28), in Japan, have confirmed our own observations specifically with respect to cells undergoing convulsive discharge. However, there was another phenomenon noted in the animals with chronic experimental epilepsy which made it possible to probe more deeply into the relationship between ribonucleic acid and cellular memory (43). Freezing a small segment of the surface of one cerebral hemisphere results within a few hours in the appearance of high voltage epileptiform spikes confined to the site of the primary lesion.

Fig. 16. Electroencephalogram of an unanesthetized rabbit twenty-four hours (A) and three days (B) after production of an ethyl chloride lesion. The site and extent of the lesion are indicated by the cross-hatched area on the diagram. Derivations are bipolar from implanted electrodes over the indicated regions. Calibrations: 50 microvolts and one second (39).
These are illustrated in the upper part of Figure 16. Simultaneous recordings from the opposite hemisphere and from other portions of the same hemisphere did not reveal any abnormality. After a time varying from a few days to three weeks, one may observe (Fig. 16B) the development of similar paroxysmal activity in an area of the opposite hemisphere homotopic with that of the primary lesion. The contralateral hemisphere had not been exposed or damaged in any way during the original operative intervention. The paroxysmal discharge in the contralateral hemisphere results directly from massive synaptic bombardment, over known anatomical pathways, from cells of a primary epileptogenic lesion. Consequently the electrical abnormality in the contralateral focus is considered to represent a secondary epileptogenic lesion (38).

Fig. 17. Characteristics of the dependent mirror focus. In the ink-written tracing the upper two channels record the primary focus and the lower two channels the mirror focus. The ethyl chloride lesion is indicated by cross hatching. Calibration: 100 microvolts and one second. In the oscillographic tracing, the upper channel records the primary region and the lower one the secondary region. Calibration: 100 milliseconds (39).

At first the secondary discharge was clearly dependent upon the primary, in the sense that spikes occurred only in temporal conjunction with those in the primary lesion, had a measurable latency following the primary spike (Fig. 17), and disappeared altogether after excision or neuronal isolation of the original focus. The pattern of activity in the secondary area looks like a "reflection" of that in the primary and thus has earned the colloquial name of "mirror focus." If the primary lesion was not excised or isolated, the mirror focus eventually became independent. Secondary spikes were then unrelated in time to those in the primary focus (Fig. 18) and did not subside if the original lesion was subsequently ablated.

Fig. 18. Electrographic characteristics of the independent mirror focus. Electroencephalogram of an unanesthetized rabbit taken three weeks after production of an ethyl chloride lesion in the area designated by crosshatching. Discharges originating in the primary lesion (electrode 2) and in the secondary region (electrode 7) are unrelated in time of occurrence. Note also that there is some depression of activity in electrodes just posterior to the primary lesion, while this is not true in electrodes posterior to the mirror focus. Calibration: 50 microvolts and one second (39).

We have demonstrated that the functional characteristics of the cell network within the mirror focus are more or less permanently altered and that the alteration is manifested both by the spontaneous behavior of these cells and by their response to stimulation (38, 39, 43). The sequence just described may be prevented by section of the corpus callosum either before production of the primary lesion or within twenty-four hours afterward. In addition, the development of independent secondary discharge may also be prevented if the callosal connections remain intact but a sub-pial partial isolation of the contralateral cortex is carried out within the same time interval. Figure 19 illustrates such a preparation.
The isolation deprives the cortex of all of its subcortical connections as well as those relating it to other cortical areas in the same hemisphere.

Fig. 19. Dissection of rabbit brain to illustrate features of the extracallosal isolation. A slab of cerebral cortex in the hemisphere opposite the ethyl chloride lesion is dissected so that the cortex is separated from all subcortical connections and from the surrounding intracortical regions as well. The callosal pathway remains intact and is the only connection through which input is available to the dissected region. For photography, the cortex was lifted to demonstrate the underlying white matter, but the operative procedure, of course, is done in such a way as to preserve the pial circulation to the cortical slab (38).

Fig. 20. A Weil-stained cross section of the extracallosal isolation. The integrity of the callosal pathway is well visualized (38).

Dependent secondary discharge does occur in such a preparation, as do electrically evoked trans-callosal potentials, indicating that the undercut region is viable and the callosal pathway intact. A Weil-stained section is shown in Figure 20. It appears, then, that the enduring changes in synaptic function which form the basis of the independent mirror focus require that at least two forms of input be available to the cortical region concerned.

It seemed appropriate to inquire whether the change in excitability or irritability of the mirror focus was dependent upon impulses circulating in closed chains of neurones or whether it was based upon structural alterations of cells within the network. As a first step, neuronal isolation of the region of primary discharge was carried out according to the technique of Kristiansen and Courtois (33). Figure 21A illustrates persistent, perhaps even augmented, activity in the mirror focus after isolation of the primary lesion. There was cessation of paroxysmal discharge in the isolated primary lesion. The mirror region was then similarly isolated (Fig. 21B and C). Some residual spiking sometimes persisted for several minutes in the isolated mirror region (Fig. 21B) but soon disappeared, to be replaced by electrical silence (Fig. 21C). After these isolations were performed the calvarium was replaced and the animal returned to its cage for several months. Surface recording during that period indicated no return of paroxysmal discharge. The lack of grossly recordable spontaneous paroxysmal activity was associated with a corresponding absence of spontaneous unit discharge when, at a later date, single cells of the isolated epileptic zone were probed with microelectrodes. The last two observations afford reasonably compelling proof that self-re-exciting impulse chains do not persist after the isolation procedure. If the increased excitability characteristic of the epileptic focus is dependent upon continuous self re-excitation, the isolation procedure should abolish the abnormal excitability.

A direct test of this prediction was then undertaken. The animals which had been subjected to complete neuronal isolation of both primary and secondary epileptogenic regions were prepared for an acute experiment. Several non-epileptic animals had had a comparable isolated cortical slab prepared at the same time as those in the epileptic group.
In a third group of animals, neuronal isolation of normal cortical tissue in one hemisphere was accomplished prior to the introduction of an epileptogenic lesion in the opposite hemisphere, at a point exactly contralateral to the center of the isolated slab.

Fig. 21. Bilateral cortical isolation in an animal with a well developed independent mirror focus. This is the same animal shown in Figure 18. Recordings were made with the animal under 40 milligrams per kilogram of nembutal anesthesia. Derivations are from the electrodes indicated in the diagrams. Isolation of the primary lesion is first carried out (21A) and demonstrates loss of paroxysmal spike discharge in the primary region while the secondary area continues to discharge actively. Isolation of the secondary region is then undertaken (21B & C). For a few moments abnormal discharge persists in the isolated secondary region (B) but soon disappears (C), to be replaced by almost complete electrical silence. Note that the electrode positions have been changed in B and C. Calibration: 50 microvolts and one second (39).

Once the epileptic lesion had begun to discharge actively it too was isolated in the same way. It was thus possible to compare the properties of neurally isolated non-epileptic tissue in one hemisphere with similarly isolated but epileptic tissue in a comparable region of the opposite hemisphere in the same animal. Although many different test situations were investigated, only one will be discussed at this time.

Approximately three months after the cortical isolations were made, the animals were prepared for an acute experiment. Wide exposure of both cerebral hemispheres and a tracheotomy were performed under ether anesthesia, after which the ether was allowed to dissipate and the animals were maintained under Flaxedil and artificial respiration. The pial surface was covered with warm mineral oil or saline. Epileptiform after-discharges were induced in the intact normal cortex outside the isolated zones either by direct electrical stimulation or by placement of small pledgets of filter paper soaked in Metrazol. Propagation of these after-discharges was monitored by means of recording electrodes distributed throughout the intact cortex and within the isolated area. The extent to which high voltage discharge originating externally spreads across the solution of neural continuity to excite cells within the isolated zone is considered to be a measure of the excitability of those cells. In our experience it was rare indeed for paroxysmal discharge to cross the neural gap and excite non-epileptic isolated cortex (Fig. 22). As may be seen in Figure 22, this was true even when the epileptiform activity was of extremely high voltage and long duration and spread quite readily to the opposite hemisphere. On the other hand, the isolated epileptic tissue of the mirror focus was quite easily invaded by epileptiform activity arising externally (Fig. 23). In the experiment illustrated in Figure 23, tungsten microelectrodes having tip diameters of 1-5 micra were inserted to a depth of 500-1000 micra into the isolated slab. Since a search for spontaneously firing units was rarely successful, it was necessary to rely upon multiple placements at a depth where unit discharge might reasonably be expected in connection with surface electrographic paroxysms.
Microelectrode recording was employed in order to avoid the ambiguity engendered when high amplitude potentials arising externally are conducted in volume to the large electrodes resting upon the surface of the isolated segment. Thus in the first part of the tracings in the two experiments illustrated in Figures 23A and B, the large electrodes on the surface of the isolated cortex (Channel 3, Fig. 23A, and Channel 2, Fig. 23B) record potential variations precisely concordant in time with those in the surrounding normal cortex where the seizure was initiated. Not until seconds later did the microelectrode tracing reveal that single elements within the slab had developed high frequency self-sustained discharge. Since the high impedance of the microelectrode tip precludes recording at any distance, we cannot escape the conclusion that nonsynaptic activation of ganglionic elements within the isolated region had occurred.

Fig. 22. Failure of epileptiform after-discharge to invade a non-epileptic neuronally isolated region (channels 2-4 and 7-M). Unanesthetized rabbit. Implanted electrodes at sites indicated on diagram. Channel designations refer to the correspondingly numbered electrodes and denote grid 1 and grid 2 respectively. Electrical stimulus had been applied to the cortical surface at the site of electrode 4. The electrode within the isolated region (7) is connected to a reference (M) sewed into the cervical muscles. Calibration: 50 microvolts and one second.

Fig. 23. Propagation of epileptiform discharge into an isolated epileptic region. Two examples with both surface and simultaneous microelectrode recording (channel labels: normal cortex 1-2, normal cortex 2-3, isolated cortex 4-M, and microelectrode). A pledget of filter paper soaked in Metrazol was placed on normal cortex outside the isolated slab at 2 cm. distance. The electrographic discharge so induced spread slowly across the cortex and after some delay invaded the isolated zone. Single units within the slab were recorded through tungsten microwires having a tip resistance of 10-40 megohms. Calibration: 50 microvolts and one second for the ink-writer tracings and one second for the cathode ray oscilloscope (39).

Although only negative evidence can be presented for the case of non-epileptic isolated cortex, the contrast between that and the ease with which invasion of epileptic zones can be demonstrated has led to the conclusion that abnormal excitability persists in the secondary epileptogenic focus for several months after an isolation procedure which eliminated a self-re-excitation mechanism. Presumably, therefore, the persistence of abnormal behavior in these cells depends upon structural or biochemical alterations rather than upon continuing electrical input.

On the basis of the reasoning discussed earlier, the ribonucleic acid distribution in the mirror focus was examined histochemically, first with the methyl green pyronin method and subsequently (with concordant results) with Azure B and Gallocyanin at acid pH. After preliminary electrical studies had clearly indicated the extent and distribution of both primary and secondary discharging areas, the animals were sacrificed and the brains perfused in situ. Serial sections were prepared, and those from primary and secondary foci were compared with those from electrically uninvolved areas of brain.
Figure 24 demonstrates a small nest of darkly stained cells in a section taken from the electrically defined mirror focus. The border of the densely stained region is fairly sharp, and to the left is the adjacent normal cortex, so that one may compare the dye-binding property of normal cortical tissue with that of the electrically abnormal zone on the right.

Fig. 24. Sections through the region of the mirror focus. Note the collection of densely stained cells to the right of the photomicrograph compared with the characteristic staining of normal cortex to the left. Methyl green pyronin stain. Magnification x75 (39).

A slightly higher power photomicrograph (Fig. 25) illustrates the penetration of the pyronin-positive material into the dendrite and also indicates the wedgelike distribution of the stained cell system. Pigmented cells extend throughout the depth of the cortex. At still higher magnification the extent of penetration into the dendrite is clearer (Fig. 26), and one may observe a concentration of the pyronin-positive material in a dense layer along the inner surface of the cell membrane. The altered tinctorial properties of these cells were abolished by pre-treatment of the slide with ribonuclease and were unaffected by similar treatment with deoxyribonuclease and other enzymes. Although the histochemical picture was somewhat obscured by surgical artifacts, the cells in the isolated mirror focus exhibited the same pyronin-dense pattern as had the intact secondary region. Further controls may be found in the original report (39).

Fig. 25. Slightly higher power photomicrograph through the region of the mirror focus. The appearance of normal cortical cells with this method is seen in the lower right and upper left hand corners. Methyl green pyronin stain. Magnification x85 (39).

Interpretation of the histochemical results is still an entirely open question. The evidence is not sufficient to conclude that the alteration in RNA is specifically related to afferent bombardment since, although the general areas coincide, there is no way to know whether a given pyronin-dense cell has participated in the epileptiform activity. Furthermore, the nature of the nucleotide-dye molecular interrelation is still incompletely understood (52). Increased staining with basic dyes does not necessarily indicate an increase in the absolute amount of RNA. It is also possible that changes in polymerization, and possibly submolecular factors affecting charge distribution, may influence dye-binding.

Despite many areas of uncertainty, the bulk of experimental evidence is consistent with the notion that, except for certain plant viruses, the nucleotide sequence in RNA is specified by genetic information in DNA.

Fig. 26. Higher power photomicrograph demonstrating the characteristic concentration of pyronin-positive material along the inner surface of the membrane. The stained material extends far into the dendrite. Note also the appearance of a bilobed nucleus. Methyl green pyronin stain. Magnification x840 (39).

If the DNA-RNA specification system were susceptible in a random way to ionic fluxes induced by nerve impulses, the ordinary metabolic machinery of the cell would be rapidly undone.
This could be avoided only if the nerve cell represented a special case of uncoupling of the DNA-RNA specification system, thus allowing a degree of freedom for the nucleotide sequence in RNA to be influenced by environmental factors. Or, alternatively, one might assume that only certain pre-selected molecules are available to influence by ionic flux. One may entertain the view that all possible RNA nucleotide sequences and their correspondingly coded proteins are already available within the cell. An incoming pattern of electrical impulses might select or re-orient some of these molecules at the expense of others. Availability of the stored information might be based upon a cellular "recognition" of the same pattern of impulse impingement or synaptic activation which established the original alteration. Such "recognition" may be similar in mechanism (still unknown) to those occurring in morphogenesis and in antigen-antibody reactions. However, it is well to be aware that when we substitute electric currents (whether or not generated by chemical transmitters) for the "antigen," we enter a realm of biological phenomena not based upon the classical chemistry of atoms and molecules, one in which electron or charge transfer reactions afford the more crucial energizing mechanisms. It is also apparent that those who would consider a role for the nucleic acids in the molecular basis of memory must also explain how an electrical current could induce a molecular rearrangement which is thereafter irreversible and immune to further perturbations of its electrical surround. Perhaps the binding of an appropriately modified RNA-protein complex to phospholipid would not only protect it from further electrical influence but also fix it to the cell membrane, where the function of "recognition" is most likely to take place.

Finally I should like to return to the beginning and add one more note of complexity to an already complex story. We have mentioned the retrograde amnesia produced by a cerebral concussion. Clinical experience gives clear evidence that immediately following an injury the memory loss may extend backwards in time for weeks, months, or even years, so that the patient reports his age as several years younger than is actually the case. During recovery the memory gap decreases gradually, with recall of more distant events first and recent events last. Russell and Nathan, in an extensive review (50), have emphasized that the pattern of recovery shows no relationship to the importance of the events remembered. Thus one patient remained amnesic for his marriage, which had occurred three weeks prior to the injury, but recalled perfectly reading a trivial newspaper story six weeks earlier. It is clear that memory returns not in order of importance but only in order of time. To be sure, even under the best of circumstances recovery is never complete; it is almost always possible to demonstrate a complete and permanent loss of memory for the events immediately preceding an injury. Perhaps it is this last, brief, blank interval which is relevant to the electrical aspects of the memory mechanism discussed earlier.
Yet the total recovery pattern in retrograde amnesia stresses the lability and vulnerability of the most recently acquired experience and suggests that it is not only the electrical or short-term aspects of memory which consolidate; some form of consolidation must also occur in the structural or "permanent" stage of information storage (12). A molecular mechanism for information storage must embrace all these features. As a provisional target we might envision a molecular species which may be altered by ionic flux but, once altered, is immune to other electrical interventions; which has nothing to do with basic metabolic processes; which can replicate itself within a cell; and which can alter the output of that cell so as to disseminate its "spoor" to the next cell along the pathway. But to envision is not to identify. The target promises to be elusive.

The analogy of the mirror focus may be rough indeed, but it is pertinent to recall that the alterations observed in electrical and chemical properties are brought about through the same neural pathways available to physiological stimulation. No quantitative relationship between these data and the events responsible for behavior is implied. Perhaps there is no relationship at all. Nevertheless, used as an experimental tool, this model and the observations it has yielded so far indicate that we have in hand, to see and to investigate, clear-cut and permanent changes in cellular and synaptic properties related to the past history of that cell or synapse.

ACKNOWLEDGMENTS

These studies were supported by U.S.P.H.S. grant B-3543. I wish to express my gratitude to the many individuals who have helped in various aspects of these investigations. Special appreciation is due to Dr. K. L. Chow and Mr. Paul Naitoh for help in some of the experiments and to Professor Lincoln Moses for the statistical analysis. Gratitude is hardly the word to express the indebtedness to my wife, Dr. Lenore Morrell, whose forbearance with dinners grown cold and evenings in the laboratory made this work possible.

REFERENCES

1. Abt, J. P., Essman, W. B. and Jarvik, M. E.: Ether-induced retrograde amnesia for one-trial conditioning in mice. Science, 133: 1477-1478, 1961.
2. Arduini, A.: Enduring potential changes evoked in the cerebral cortex by stimulation of brain stem reticular formation and thalamus. In: Reticular Formation of the Brain, edited by Jasper, H. H., Proctor, L. D., Knighton, R. S., Noshay, W. C. and Costello, R. T. Boston, Little, Brown and Co., 1958, 333 pp.
3. Bishop, G. H. and O'Leary, J. L.: The effect of polarizing currents on cell potentials and their significance in the interpretation of central nervous system activity. Electroenceph. Clin. Neurophysiol., 2: 401-416, 1950.
4. Brattgard, S.: The importance of adequate stimulation for the chemical composition of retinal ganglion cells during early postnatal development. Acta Radiologica Supp. 96, 1952.
5. Brattgard, S. and Hyden, H.: Mass, lipids, pentose nucleoproteins and proteins determined in nerve cells by x-ray microradiography. Acta Radiologica Supp. 94, 1952.
6. Brazier, M. A. B.: Long-persisting electrical traces in the brain of man and their possible relationship to higher nervous activity. In: The Moscow Colloquium on Electroencephalography of Higher Nervous Activity, edited by Jasper, H. H. and Smirnov, G. D. Electroenceph. Clin. Neurophysiol., Supp. 13, 1960.
7. Burns, B. D.: Some properties of the cat's isolated cerebral cortex. J. Physiol., 111: 50-68, 1951.
8. Burns, B. D.: The production of after-bursts in isolated unanesthetized cerebral cortex. J. Physiol., 125: 427-446, 1954.
9. Burns, B. D.: The mechanism of after-bursts in cerebral cortex. J. Physiol., 127: 168-188, 1955.
10. Chow, K. L. and Dewson, J.: Unpublished data.
11. Chow, K. L.: Brain functions. Ann. Rev. Psychol., 12: 281-310, 1961.
12. Deutsch, J. A.: Higher nervous function: the physiological bases of memory. Ann. Rev. Physiol., 24: 259-286, 1962.
13. Duncan, C. P.: The retroactive effect of electroshock on learning. J. Comp. Physiol. Psychol., 42: 32-44, 1949.
14. Gerard, R. W.: The fixation of experience. In: CIOMS Symposium on Brain Mechanisms and Learning, edited by Delafresnaye, J. F., Fessard, A. and Konorski, J. Oxford, Blackwell Scientific Pub., 1961.
15. Gerard, R. W. and Libet, B.: The control of normal and convulsive brain potentials. Amer. J. Psychiat., 96: 1125-1152, 1940.
16. Goldring, S. and O'Leary, J. L.: Experimentally derived correlates between ECG and steady cortical potential. J. Neurophysiol., 14: 275-288, 1951.
17. Goldring, S. and O'Leary, J. L.: Summation of certain enduring sequelae of cortical activation in the rabbit. Electroenceph. Clin. Neurophysiol., 3: 329-340, 1951.
18. Goldring, S. and O'Leary, J. L.: Cortical D.C. changes incident to midline thalamic stimulation. Electroenceph. Clin. Neurophysiol., 9: 577, 1957.
19. Gumnit, R. J.: The distribution of direct current responses evoked by sounds in the auditory cortex of the cat. Electroenceph. Clin. Neurophysiol., 13: 889-895, 1961.
20. Hamberger, C. A.: Cytochemical investigations on N. vestibularis. Acta Otolaryng. Supp., 78: 53-62, 1949.
21. Hamberger, C. A. and Hyden, H.: Production of nucleoproteins in the vestibular ganglion. Acta Otolaryng. Supp. 75: 53, 1949.
22. Hamberger, C. A. and Hyden, H.: Transneuronal chemical changes in Deiter's nucleus. Acta Otolaryng. Supp. 75: 82, 1949.
23. Hammer, G.: On quantitative cytochemical changes in spiral ganglion cells after acoustic trauma. Acta Otolaryng. Supp. 140: 137-144, 1958.
24. Hebb, D. O.: The Organization of Behavior. New York, Wiley, 1949.
25. Hyden, H.: Determination of mass of nerve-cell components. J. Embryol. Exp. Morph., 1: 315-317, 1953.
26. Hyden, H.: Biochemical changes in glial cells and nerve cells at varying activity. In: Proc. Fourth International Congress of Biochemistry, Vol. III, Biochemistry of the Central Nervous System. London, Pergamon Press, 1960.
27. Hyden, H. and Pigon, A.: A cytophysiological study of the functional relationship between oligodendroglial cells and nerve cells of Deiter's nucleus. J. Neurochem., 6: 57-72, 1960.
28. Iizuka, R., Takeda, T., Tanabe, M., Takahata, N. and Suwa, N.: Histochemical studies on the nucleic acid and thiamin of nerve cells under electroconvulsion, insulin coma and malononitrile injection. Folia Psychiat. Neurol. Jap., 13: 1-14, 1959.
29. John, E. R.: Higher nervous functions (brain functions and learning). Ann. Rev. Physiol., 23: 451-484, 1961.
30. Katz, J. J. and Halstead, W. C.: Protein organization and mental function. Comp. Psychol. Monogr., 20: 1-58, 1950.
31. Kraft, M. S., Obrist, W. D. and Pribram, K. H.: The effect of irritative lesions of the striate cortex on learning of visual discriminations in monkeys. J. Comp. Physiol. Psychol., 53: 17-22, 1960.
32. Kreps, E.: Cited in The Use of Radioactive Isotopes in the Study of Functional Biochemistry of the Brain, by Palladin, A. V. and Vladimirov, G. E. Proc. International Conference on the Peaceful Uses of Atomic Energy, Vol. 12, United Nations, New York, 1955.
33. Kristiansen, K. and Courtois, G.: Rhythmic electric activity from isolated cerebral cortex. Electroenceph. Clin. Neurophysiol., 1: 265-272, 1949.
34. Morrell, F.: Interseizure disturbances in focal epilepsy. Neurology, 6: 327-334, 1956.
35. Morrell, F.: Effects of experimental epilepsy on conditioned electrical potentials. Univ. of Minn. Med. Bull., 29: 82-102, 1957.
36. Morrell, F.: Experimental focal epilepsy in animals. AMA Arch. Neurol., 1: 141-147, 1959.
37. Morrell, F.: Microelectrode and steady potential studies suggesting a dendritic locus of closure. In: The Moscow Colloquium on Electroencephalography of Higher Nervous Activity, edited by Jasper, H. H. and Smirnov, G. D. Electroenceph. Clin. Neurophysiol., Supp. 13, 1960.
38. Morrell, F.: Secondary epileptogenic lesions. Epilepsia, 1: 538-560, 1960.
39. Morrell, F.: Lasting changes in synaptic organization produced by continuous neuronal bombardment. In: CIOMS Symposium on Brain Mechanisms and Learning, edited by Delafresnaye, J. F., Fessard, A. and Konorski, J. Oxford, Blackwell Scientific Pub., 1961.
40. Morrell, F.: Effect of anodal polarization on the firing pattern of single cortical cells. Ann. New York Acad. Sci., 92: 860-876, 1961.
41. Morrell, F.: Electrophysiological contributions to the neural basis of learning. Physiol. Rev., 41: 443-494, 1961.
42. Morrell, F., Roberts, L. and Jasper, H.: Effect of focal epileptogenic lesions and their ablation upon conditioned electrical responses of the brain in the monkey. Electroenceph. Clin. Neurophysiol., 8: 217-236, 1956.
43. Morrell, F., Sandler, B. and Ross, G.: The "mirror focus" as a model of neural learning. Proc. XXI Internat. Physiol. Cong., Buenos Aires, 1959, 193 pp.
44. Ransmeier, R. E. and Gerard, R. W.: Effects of temperature, convulsion and metabolic factors on rodent memory and EEG. Amer. J. Physiol., 179: 663-664, 1954.
45. Roitbak, A. I.: Bioelectric Phenomena in the Cortex of the Cerebral Hemispheres, Part 1. Acad. Sci. Georgian SSR, Tiflis, 1955.
46. Roitbak, A. I.: Electrical phenomena in the cerebral cortex during the extinction of orientation and conditioned reflexes. In: The Moscow Colloquium on Electroencephalography of Higher Nervous Activity, edited by Jasper, H. H. and Smirnov, G. D. Electroenceph. Clin. Neurophysiol., Supp. 13, 1960.
47. Rowland, V.: Second Conference on Brain and Behavior, edited by Brazier, M. A. B. In press.
48. Rusinov, V. S.: An electrophysiological analysis of the connecting function in the cerebral cortex in the presence of a dominant area. Communications at the XIX International Physiological Congress, Montreal, 1953.
49. Rusinov, V. S.: General and localized alterations in the electroencephalogram during the formation of conditioned reflexes in man. In: The Moscow Colloquium on Electroencephalography of Higher Nervous Activity, edited by Jasper, H. H. and Smirnov, G. D. Electroenceph. Clin. Neurophysiol., Supp. 13, 1960.
50. Russell, W. R. and Nathan, P. W.: Traumatic amnesia. Brain, 69: 280-300, 1946.
51. Stamm, J. S. and Pribram, K. H.: Effects of epileptogenic lesions in frontal cortex on learning and retention in monkeys. J. Neurophysiol., 23: 552-563, 1960.
52. Steiner, R. F. and Beers, R. F., Jr.: Polynucleotides. Natural and Synthetic Nucleic Acids. Amsterdam, Elsevier Pub. Co., 1961.
53. Thompson, R.: The comparative effects of ECS and anoxia on memory. J. Comp. Physiol. Psychol., 50: 397-400, 1957.
54. Thompson, R. and Dean, W.: A further study on the retroactive effect of ECS. J. Comp. Physiol. Psychol., 48: 488-491, 1955.
55. Tobias, Julian M.: Experimentally altered structure related to function in the lobster axon, with an extrapolation to molecular mechanisms in excitation. J. Cell. Comp. Physiol., 52: 89-126, 1958.
56. Vladimirov, G. E.: The influence of excitation of the central nervous system on some aspects of metabolism in the cerebral hemispheres of animals. Proc. XIX Internat. Physiol. Cong., Montreal, pp. 856-857, 1953.

CHAPTER X

HOW CAN MODELS FROM INFORMATION THEORY BE USED IN NEUROPHYSIOLOGY?*

Mary A. B. Brazier

WHY is it that information theory has had such an attraction for neurophysiologists? From the earliest dissemination among scientists of Shannon's information theory (9), developed in the context of communications technology, and of Wiener's communication theory (14), which expanded its frontiers, neurophysiologists have been prominent among those who wished to explore the potentialities for their field. There were many reasons for this, but I would suggest that there were three major ones, namely: 1) de-emphasis on energy-coupling within systems and emphasis on informational coupling; 2) the formulation of models for dealing with signals-in-noise; and 3) the exploration of probabilistic models rather than deterministic ones. These are, of course, all interrelated. I would like to discuss the first two items briefly, together with some other facets of information theory that impinge on neurophysiology, and then give more detailed attention to the subject of probabilistic models of nervous system activity.

*The work reported here was supported by USPHS Grant NB-03160 and Contract Nonr 233(69) from the Office of Naval Research.

Let us look first at the difference between energy-based concepts and information-based concepts. All down the ages the nerves have been recognized as message-carriers, and as late as the last century the most distinguished physiologist of the time, Johannes Müller, was using the term "nerve energy." "We are," he wrote, "compelled to ascribe, with Aristotle, peculiar energies to each nerve, energies which are vital qualities of the nerve." Even the later 19th century neurophysiology, dominated by du Bois-Reymond, was primarily focussed on the concept of the conservation of energy. You will remember that it was because of his adherence to energy concepts that Sherrington (10) found himself unable to envisage a physiological basis for mental processes. In Man on His Nature he wrote: "No attributes of 'energy' seem findable in the process of mind. That absence hampers explanation of the tie between cerebral and mental." He goes on to write of the brain being "a physiological entity held together by energy-relations" and expresses his despair of being able to correlate such a physiological entity with a mental experience. "The two for all I can do," he wrote, "remain refractorily apart. They seem to me disparate; not mutually convertible; untranslatable the one into the other." Coming to our own times, we have seen a great deal of investigative effort go into a search for energy correlates of brain function.
An example is the search for a metabolic change underlying the sleep state. Anesthesiology is another field in which one finds many studies centering around alterations of brain metabolism as the major factor of importance in the changing levels of consciousness. It is only within recent years that we find attention being diverted from the question "What is the level of activity in the brain as a whole?" to "Which system within the brain is now dominantly active?" The latter question contains the implication that it is a re-routing of nerve impulses, a change in the informational coupling rather than in the general metabolic level of the brain's activity, that may yield the clue to functional changes. In order to effect a coupling of parts within the nervous system there does not have to be a great interchange of energy, only the infinitesimal transfer concomitant with the passage of the nerve impulse. Taking the examples just given, neurophysiologists are finding closer correlates with the states of sleep and anesthesia from studies of the coupling between the cortex, the thalamus and the brain stem than they have found in their measurements of energy-transfer reflected in arterio-venous differences between carotid and jugular blood. In the limbic system there is now evidence for re-routing taking place during the learning process in animals being trained in a T-box (1). Many other examples could be quoted.

The second attraction that I mentioned was the way information theory handles the problem of signals-in-noise; but here neurophysiologists generally use this term in the vernacular rather than in its critically defined sense. This is because we do not usually apply the criteria for randomness when speaking of biological noise. As a matter of fact, many use the term 'noise' in quite the opposite sense from that defined by mathematical theory. In the neurophysiological journals we frequently find 'noise' used to describe disorderly, unpredictable activity in which no regularity can be detected. On the other hand, the mathematical approach (5, 11) has a very clear-cut set of criteria for random processes, criteria based on probability distributions that effectively result in statistical regularity, statistical orderliness and statistical predictability. The whole gamut of criteria for a mathematical model of random processes would be very difficult to apply to the nervous system, but already some consideration has been given to this problem (6). The probability functions that have seemed the least difficult to carry from the mathematical model into the 'real' nervous system have been those of means, spectra and correlation functions. These comparatively simple factors bring us only to a limited and fractional descriptive usefulness, and hence an increasing number of neurophysiologists are exploring this approach.

The statistical regularity of a random process bears considerable interest for the neurophysiologist because of his familiarity with the concept of the statistically steady state that has earned itself the name of homeostasis. The fact that the brain, in its evolution, has reached a stage in man where the neuronal mechanisms for homeostatic control of his milieu intérieur are handled by his medullary brain stem frees the cortex from these concerns and reserves it for
higher functions, thus giving man what Claude Bernard, in his famous phrase, called "la condition de la vie libre."

This concept of a statistically regular, predictable randomness of 'noise', against which the neurophysiologist emphasizes his 'signal' when averaging by computers, has a close relationship to one of the most basic principles of information theory. This principle is that information is carried by departure from orderliness or, in other words, by departure from the predictable. Even the intuitive concept of information is a change from what you already know and can predict. Several neurophysiologists have now invoked this principle to explain such phenomena as "attention," "habituation" and "the orienting reflex," together with their attendant electrical concomitants. One such example is the model proposed by Sokolov (12), which envisages novelty, i.e., departure from the statistically expected state, as being the factor that evokes activity in the brain stem and the resultant orienting reflex. The concept of "attention" being related to matching the probability of a neuronal event against the expected distribution of possible events will be found in the work of many neurophysiologists.*

*For early examples see references (2) and (7).

This brings us to the third major attraction of information theory for the neurophysiologist: the use of a probabilistic model for the nervous system rather than a deterministic one. I would like to approach this from the neurophysiologist's angle. I have spoken earlier about Johannes Müller, and you will remember his famous "Law of Specific Nerve Energies," by which each of the myriad facets of sensation was assigned its special nerve: how uneconomical, but how simple. A single output would be obtained from a single input. Nothing could be more deterministic. One could design no simpler code. But it was too good to be true.

In the earlier part of this century, the belief in a ubiquitous all-or-nothing law for the nervous system and the demonstrations of the coding of intensity by frequency of discharge in single fibers of the peripheral nervous system led eventually to exploration of single cell discharges within the brain itself. Immediately it became clear that coding was no simple problem. Müller would have been chagrined to see how many different peripheral loci could fire an individual cell in the brain. Many examples drawn from the somatic and other sensory systems have been published, but Müller would have been even more dismayed had he been shown that there are cells within the brain that are no respecters of sense modalities. Proof has been given of convergence of sensory modalities onto individual neurons of the midbrain and thalamic brain stem and onto units in the limbic system. Even cortical neurons are not simon-pure. The complexities do not cease there, for such coding as can be established for individual presynaptic fibers is found to be transformed at the synapse and to send on a different pattern of discharge in the postsynaptic output. Some of the exquisite response patterns that investigators have been able to identify in the primary neurons from the receptors are therefore only one link in a chain of sequential codes. Moreover, the evidence that recoding in postsynaptic fibers varies in different neuronal aggregates is overwhelming, for there is great variation in the degree of convergence and divergence.
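The principle stated a little earlier, that information is carried by departure from the predictable, can be given a simple quantitative form using the standard Shannon surprisal, -log2 p. The sketch below is an illustrative aside under that assumption; it is not a calculation taken from this chapter.

```python
import math

def surprisal_bits(p: float) -> float:
    """Information conveyed by an event of probability p, in bits (-log2 p)."""
    return -math.log2(p)

# A monotonously repeated stimulus quickly becomes highly probable and so
# carries little information; a rare, novel stimulus carries much more.
for p in (0.99, 0.5, 0.01):
    print(f"p = {p}: {surprisal_bits(p):.2f} bits")
```

On this view, an expected event (p near 1) contributes almost no information, which is one way of restating the intuition that habituated signals cease to be "news" to the nervous system.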
The biologist has long known how rare is a one-to-one relationship between input and output of a synapse, but he is now beginning to realize that the relationships may not even be linear. He may well have to wait for the mathematicians to progress further with their analyses of nonlinear systems before he himself can master the transformation characteristics of the code as it passes seriatim through a chain of synaptic relays. As an example of changing code, the findings of Whitfield (13) in the auditory system may be quoted. Whitfield found that the rate of firing becomes progressively less as the impulses proceed through the serial synaptic stations on their way to the cortex. Moreover, the rate of unit firing becomes less and less dependent on the strength of the stimulus at each successive relay station. In other words, the intensity of the stimulus is no longer being signalled simply by frequency of discharge. The coding has changed, and some clues to its nature are already known. These point to the distribution of excitation and inhibition among the fibers of the pathways as being a crucial factor.

Although in this outline, which poses the problem, all the facets that must enter into any consideration of neural coding cannot be enumerated, the phenomenon of lateral interaction among members of a neuronal population should not escape attention. This is seen, perhaps most strikingly, in the zone of inhibition that develops around a locus of excitation. The "inhibitory surround" has now been demonstrated for the visual, auditory, and somatic afferent systems and emphasizes complex patterns of interaction rather than conduction over isolated paths. There is also the interference with direct routing of impulses from the receptor to the cortex, mediated by recurrent collaterals and centrifugal feed-back control over afferent pathways.

Of two more contributions to knowledge which have added to the neurophysiologist's task, one is the realization that the all-or-nothing discharge is a comparatively rare event in the central nervous system, graded responses (which may or may not lead to cell discharges) being all important. What of these graded changes? How do they affect the code? One aspect of the problem has been approached by the analysis of the intervals between nerve discharges, as sketched below. Although the action potential of an axon is all-or-nothing and hence digital, the graded, analog character of the receptor's action can be preserved in the code by the intervals between discharges, for intervals between spikes are continuously variable and therefore can transmit graded input. In fact, a great deal of work in many laboratories is currently being devoted to pulse-interval analysis of the message set up by stimulation of receptors. More difficult is the problem of graded delivery of the message at the higher cerebral level, where its result may be effector cell discharge, passage into association areas, passage into storage neurons changing their cellular function, modulation of other currently incoming messages, or dissipation of a type about which we have, as yet, no knowledge. Graded responses in dendrites do not necessarily induce discharges of their cell bodies; nevertheless, their influence as modulators may be critical for the "meaning" of the message.
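In its simplest computer form, the pulse-interval analysis mentioned above amounts to collecting the intervals between successive spikes and examining their distribution. The following is a minimal sketch of that bookkeeping; the spike times and the bin width are arbitrary illustrative values, not data from any of the studies cited.

```python
# Build an inter-spike interval histogram from a list of spike times.
# Spike times (seconds) and bin width are hypothetical illustrative values.
from collections import Counter

spike_times = [0.012, 0.045, 0.081, 0.090, 0.154, 0.160, 0.231, 0.300]
bin_width = 0.020  # 20-millisecond bins

intervals = [t2 - t1 for t1, t2 in zip(spike_times, spike_times[1:])]
histogram = Counter(int(interval // bin_width) for interval in intervals)

for bin_index in sorted(histogram):
    lo = bin_index * bin_width
    print(f"{lo*1000:5.0f}-{(lo + bin_width)*1000:.0f} ms : {histogram[bin_index]} intervals")
```

The shape of such an interval distribution, rather than a mean firing rate alone, is what carries the graded information referred to in the text.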
Last, but not least in importance, is the evidence for a great deal of endogenous discharge of many neurons of the central nervous system in the absence of overt external stimuli. How is the brain to select those discharges that are evoked by messages initiated in its environment from those that form its background activity? 236 Information Storage and Neural Control If this selection can be made, on what grounds is a resukant efferent discharge determined? No one yet knows what the mechanism effecting these decisions may be, but it has occurred to many that discriminations may be made by the brain on a statistical basis, i.e., on the probability that the afferent patterns are significantly different from those which are currently taking place in the brain or which its past experience has set its neurons to "expect" by a change in their cellular function. This statistical viewpoint may be defined as the "probabilistic" model in contrast to a "deterministic" one in which a given stimulus elicits a stereotyped response irrespective of the likelihood of its occurrence. The probabilistic approach recognizes the need for the brain to assign iinportance to those signals which require effector action and suggests that this assay of importance must be on a basis of the probability of the signal not being a chance variation. With all the on-going discharges of cerebral neurons that workers with microelectrodes have so convincingly demonstrated, some pro- cedure must surely take place before a 'meaningful' signal can be selected from this incessant activity. No assessment of probability can be made without averaging. Therefore, those who have begun to explore a statistical model for coding in the nervous system have turned to techniques for averag- ing neuroelectric activity over the passage of time as well as over space as represented by neuronal aggregates. To aid in this task many have adopted a prosthesis Just as the microanatomist has adopted the microscope as a prosthesis to enrich his visual ability, so has the neurophysiologist begun to use the computer as a pros- thesis for his calculating ability. The statistical characteristics of spontaneously discharging neurons must be known to the brain before it can react appropri- ately to an unexpected, meaningful signal requiring action. One might even speculate that the nonresponding but spontaneously discharging neurons that so many observers have found with their microelectrodes, are "comparison" generators and the responding neurons "information" generators. If this were so, only when the normally expected difference between the two categories of genera- How Can Models from InJormatio7i Theory be Used in Neurophysiology? 237 tor activity was exceeded by a statistically significant amount would the incoming" signal be meaningful. To analyze the myriad complexities of the brain's function by nonstatistical description of unit discharges is too gigantic a task to be conceived, but exploration in terms of probability theory is both practical and rational. In characterizing nervous activity, therefore, one would not attempt the precise definition that arithmetic demands but would seek the statistical characteristics of the phenomena that appear to be relevant. The margin of safety that the brain has for appropri- ate reaction is thus much greater than a deterministic, arithmetic- ally precise operation would impose. 
Chaos would result from the least slip-up of the latter, whereas only a major divergence from the mean would disturb a system working on a probabilistic basis. The rigidity of arithmetic is not for the brain, and a search for a deterministic code based on arithmetical precision is surely doomed to disappointment.

Turning now to the scanty data which are all that today's neurophysiologist has as yet: in terms of actual data culled from the brain I propose to mention only two categories here: 1) the averaging, over time, of intervals between unit discharges in the brain; 2) the averaging, over time and space, of activity in neuronal aggregates.

An example of averaging units over time, the first category, is the work of Mountcastle (8), in which he has been designing experiments to test the hypothesis that an intracortical mechanism exists which integrates frequency over short periods of time and responds only when intervals of sufficient brevity occur. These experiments have revealed a striking change in pulse-interval distribution in circumstances that give support to this hypothesis.

This investigation is alluded to so briefly at this time because it is being quoted solely as an example of the first category of statistical approach, i.e., averaging over time only. But the central nervous system must have some mechanism for dealing with multiple complex inflow, and it would seem more profitable to expand this approach to the second category of study that I mentioned; namely, averaging not only over time but over neural aggregates, in order to get the profile of a population of neurons. This is of particular importance in the brain because of the demonstrated interaction of units within populations. This second approach necessitates the use of electrodes large enough to record from populations of neurons and thus able to average over space as well as time.

I will illustrate this approach by brief mention of some examples drawn from our laboratory. Suppose we take the response of an unanesthetized cat to a flash of light that is repeated monotonously without any change in timing, or intensity, or in any other of its parameters. At the beginning of a train of such stimuli, the message the brain will receive will contain at least three major components: 1) the stimulus is visual, 2) the stimulus is repetitive, 3) the stimulus is novel. On prolonged repetition, however, the third of these messages (that the stimulus is novel) is no longer being sent. The probability of its arrival is now very high.

If the hypothesis is to be regarded as tenable, one of the tests the neurophysiologist must make is a demonstration that the response of the brain to a novel stimulus is different on the average from its response to a familiar one. What would be demanded by the hypothesis under discussion? Averages of a sample of responses late in the series would be expected to carry two of the same components of the message as are carried by the first set of flashes, namely, that the stimulus is visual and that it is repetitive; but the third component, i.e., that the stimulus is novel, would need some change of signal.

When the responses to a repetitive train of flashes are recorded from the visual cortex of an unanesthetized cat with permanently implanted electrodes, one finds that the short latency responses that have been identified with transmission in the specific afferent systems persist for the whole duration of the train.
They apparently carry the first two components of the message (that the stimulus is visual and that it is being repeated).

At the beginning of such a train of flashes there are also long latency responses in the visual cortex. These have been shown to reach the cortex by the nonspecific afferent systems of the midline brain stem and thalamus. It would seem possible that the third major component of the message, the one signalling 'novelty' in the stimulus, may be carried by these nonspecific afferents, for as repetition continues, this sequence of later waves fades out. Averaging of the first sixty responses to arrive, then the second sixty, the third sixty, and so on, shows this late component of the multiple response to be dropping out as the novelty wears off. The effect can be fractionated even further in the nonspecific system by actually recording in a nucleus of this midline nonspecific system (the centre median), where one of the most prominent of its average electrical responses to flash (the late wave) can be seen to fail with repetition of the unchanging stimulus, while the earlier components persist. The serial change in the late component of the multiple response is very marked. Whatever the mechanism for this depressed responsiveness may prove to be, it is tempting to propose that this forms part of the mechanism that conveys presence or absence of novelty. This work has been described in detail and illustrated elsewhere (3, 4), so it will not be given more space here.

However, lest these examples appear to suggest too simple a picture of the brain's message-receiving systems, let me add that not only does one find presence or absence of a component of the response as novelty wears off, as in the foregoing example, but one also finds situations in which the time-relationships of the components of the brain's electrical responses may change. Possibly it is in the time domain that the neurophysiologist will find the most clues for the solution of this problem.

I make only a brief allusion here to the laboratory work. This is not intended as the report of a research, but as an example of work carried out with a probabilistic model in mind, and to illustrate the point that the response probabilities of the nervous system are influenced by the past events it has experienced.

Returning now to the more general topic of the utilization by neurophysiology of information theory, let us not forget that one of the innovations of information theory was the axiom that information is measurable, and that, in fact, Shannon in his classical paper gave a precise mathematical definition of information. It is so difficult to define information measures for ensembles in biology that most biologists who use information theory do not attempt to do so in a quantitative way. Generally, they do not actually measure the information; and hence, they fail to exploit the full potentialities of the theory. Yet many feel that someday, somehow, more exactly defined information measures may be brought into neurophysiology. I need only mention as an example Shannon's formulation of the problem of channel capacity and his solution for dealing with equivocation. Channel capacity is surely a basic factor in the communication functions of the nervous system. It is so tempting to think of information transfer in the brain as being simply a matter of transmission in specific nerve tracts.
If this simple-minded concept could for one moment be defended, one would then begin to study such communication channels in terms of the finite set of signals that can be initiated in the channel, the set of signals that arrives, and the probability of the reception of any given signal. If only the brain worked like a simple telegraph system, we would immediately be able to make precise statements about such things as channel noise and would be able to calculate channel capacity.

In contrast, all the work that the neurophysiologists have pursued has revealed to us the enormity of interaction within the brain — the correlations, couplings, linkages, and statistically interdependent elements that contribute to its organization and make any measurement of its interacting ensembles or any mathematical statement of its entropy conditions formidable in the extreme.

In closing, let me say that the application of quantitative information theory to neurophysiology lies largely in the future. Possibly a partial answer to the question in the title of this paper is that if information theory has not led to the uncovering of many new facts in neurophysiology, it may have led to many new ideas.

REFERENCES

1. Adey, W. R., Dunlop, C. W., Hendrix, C. E.: Hippocampal slow waves; distribution and phase relations in the course of approach learning. AMA Arch. Neurol., 3:74-90, 1960.
2. Bates, J. A. V.: Significance of information theory to neurophysiology. In: Information Theory Symposium. London, 1950, p. 137.
3. Brazier, M. A. B.: Responses in non-specific systems as studied by averaging techniques. In: Specific and Unspecific Mechanisms of Sensory-Motor Integration, Ed. G. Moruzzi (in press).
4. Brazier, M. A. B.: Information carrying characteristics of brain responses. In: The Physiological Basis of Mental Activity, Ed. R. Hernandez-Peon (in press).
5. Davenport, W. B., Root, W. L.: An Introduction to the Theory of Random Signals and Noise. New York, McGraw-Hill, 1958.
6. Goldstein, M. H.: Averaging techniques applied to evoked responses. In: Computer Techniques in EEG Analysis, Ed. M. A. B. Brazier, EEG Clin. Neurophysiol., Suppl. 20, 1962, p. 59.
7. Grey Walter, W.: In: Brain Mechanisms and Consciousness, Ed. J. F. Delafresnaye, Oxford, Blackwell Scientific Publications, 1954, p. 372.
8. Mountcastle, V. B.: Duality of function in the somatic afferent system. In: Brain and Behavior, Ed. M. A. B. Brazier, Washington, D. C., American Institute of Biological Sciences, 1961, p. 67.
9. Shannon, C. E.: A mathematical theory of communication. Bell System Tech. J., 27:379-424; 623-657, 1948.
10. Sherrington, C. S.: Man on His Nature. London, Cambridge University Press, 1951.
11. Siebert, W. M.: The description of random processes. In: Processing of Neuroelectric Data, Tech. Report 351, Communications Biophysics, RLE, MIT, 1959, p. 66.
12. Sokolov, E. N.: Neuronal model and the orienting reflex. In: Central Nervous System and Behavior, Ed. M. A. B. Brazier, New York, Josiah Macy, Jr. Foundation, 1960, p. 187.
13. Whitfield, I. C.: The physiology of hearing. In: Progress in Biophysics and Biophysical Chemistry, 8:1-47, 1957.
14. Wiener, N.: Cybernetics. New York, John Wiley, 1948.

DISCUSSION OF CHAPTER X

Harold W. Shipton (Iowa City, Iowa): In the averaged record that you showed, were the stimuli being delivered randomly, or were they as regular as you could make them?
Would you care to comment on whether the time characteristics of the external drive signal appear to you to be important in the con- struction of the model? 242 Information Storage and Neural Control Mary A. B. Brazier (Los Angeles, California) : They were certainly not absolutely random. On the contrary, the intervals between stimuli were as constant as we could make tliem. In an experiment such as tliis, there is very great difficulty for the neurophysiologist because the responses depend so much on the state of the animal. Although one would like to have a longer interval between stimuli, it is, in my experience, almost impossible to hold an animal in the same stage of the sleep-wakefulness continuum for as many stimuli as we use, if the interval between flashes is longer than one second. L. M. N. Bach (New Orleans, Louisiana) : I am curious about the disappearance of the second component in tlie centrum medianum response with repetitive stimulation as a possible inverted index of post-tetanic potentiation. Do you consider it a testable proposition that the disappearance of the second com- ponent could be correlated with post-tetanic potentiation, or do you consider tliat there is no relationship at all? Brazier: It should be testable, but it is rather difficult to design an experiment in wliich to test this. Gregory Bateson (Palo Alto, California): Would you have expected the part of the signal which denotes novelty to follow the other two components? Would it not have been a better arrangement to have the system, when it had diagnosed novelty, transmit the information ahead of the other components of the signal? Brazier: I had no "expectation," tliough now that you raise the question, would you not expect the brain to need to receive the signal before it could assess its novelty? What you have sug- gested would make a very good design for a communication system, although the nervous system does not appear to be de- signed in this manner. T. CHAPTER XI NEURAL MECHANISMS OF DECISION MAKING* E. Roy John, Ph.D.** GENERAL CONSIDERATIONS ABOUT MEMORY HIS paper is largely concerned with the rather specialized decision-making involved when a cat decides which of two previ- ously experienced frequencies of flickering light is being pre- sented. Since the constituent flashes of the two flicker frec[uencies are identical, such decision-making, or differential discrimination, would seem difficult to perform on the basis of the instantaneous quality of the stimulus. In contrast to existential discriminations, based on the presence or absence of a stimulus, differential dis- crimination of this sort logically would seem to require the nervous system to analyze the temporal sequence, or pattern, of stimulation. Although one can conceive of possible alternate niechanisms for the mediation of such behavior by time-measuring devices or filter networks, a plausible mechanism for the analysis of sequential stimuli would be a coincidence detector which compared patterns of incoming activity with patterns generated by a stored representa- tion of previously experienced sequences — a memory. This hy- pothesis, with some relevant electrophysiological evidence, has been presented in detail elsewhere (7, 9). My purpose here is to review *The work described in this paper was supported in part by Research Grant MY-2972 from the National Institute of Mental Health, and Grant G21831 from the National Science Foundation. 
**The author takes pleasure in acknowledging the kindness of Marc Weiss for making available the data shown in Figures 6, 7. 8, and 9, and the assistance of Arnold L. Leiman and Anthony L. F. Gorman in acquisition of portions of the data here reported. 243 244 Information Storage and Neural Control that evidence and to supplement it with a number of recent findings in our laboratories which will also serve to illustrate some technical innovations we were utilizing for these purposes. Before I undertake this task, I wish to emphasize that the hy- pothesis stated does not imply the mediation of memory by regenerative electrical activity. The large literature on the con- solidation process reviewed recently (4) shows that there are at least two phases of memory storage: 1) An early, labile consolida- tion phase, in which the representation of a recent experience is susceptible to severe interference or destruction by numerous chemical or electrical perturbations, and during which memory may well consist of persisting electrical patterns of a reverberatory sort; and, 2) a later stable phase in which such perturbations have no effect, and during which memory is stored in some other fashion, perhaps as a structural modification. This necessitates a coupling mechanism whereby the reverberatory electrical activity main- tained during the consolidation phase gradually stipulates the structural change which will serve to represent it. A number of workers have discussed the possibility that such structural changes might be the specification of macromolecular configurations (5, 6, 25); and, as Dr. Morrell has told you, a number of labora- tories, including his and mine, have presented data suggesting that ribonucleic acid (RNA) may play a role in this function (1, 2, 3, 11, 18). Whether or not RNA does participate in long-terin memory storage, it seems reasonable at present to assume that some form of long-term structurally mediated storage does exist. Various data seem to require, further, that the postulated coupling between electrical patterns and the long-term storage device be reversible — that the pattern of iterated or sustained electrical activity stipulate some representational structural modification, and that this structural modification be able to generate an elec- trical pattern identical to the one which established it. Time does not permit detailed review here of the evidence which I believe is relevant to the dynamics by which such a representa- tional system is built, but such a detailed discussion has been pre- sented elsewhere (7). For our present purposes, I hope it will suflSce to summarize what I consider to be the salient characteristics of these representational systems: 1) The repeated occurrence of as- Neural Mechanisms of Decision Making 245 sociated neural activity in anatomically extensive regions of the nervous system causes a functional relationship to become estab- lished between these regions. 
Subsequent to such association, stimulation of one region causes a response to occur in other regions, although this response did not occur before the associated activity; 2) such altered response relationship cannot be interpreted as merely a reflection of altered threshold, since Morrell has shown that the new response is differential, and is displayed only to the stimulus to which the association was established and not to closely similar stimuli; and 3) discharge can occur from such a representa- tional system with a temporal pattern which reflects the pattern of stimulation while it was established. TRACER STIMULI, LABELED POTENTIALS, AND INFORMATION The bulk of the data which I wish to present here was obtained in studies of changes in the electrophysiological response to inter- mittent stimuli during the establishment of conditioned responses. The technique, used most profitably before us by Livanov and Polyakov (14), was applied by Killam and me in our studies of conditioned avoidance and approach responses in cats (8, 9). We reasoned that, whatever the nature of the new responses established in the brain during conditioning, such responses should appear fairly reliably whenever the stimulus was presented. We selected an intermittent light flash, which we called a "tracer conditioned stimulus" (TCS), and searched the electrical activity of the brain for the appearance of waveforms at the frequency of the TCS, which were called "labeled potentials." Such a procedure greatly en- hances one's ability to detect stimulus-bound signal in the midst of the tremendous amount of ongoing business in the nervous system. The appearance of labeled potentials in a structure during the presentation of a TCS is sufficient evidence to conclude that in- formation about the TCS is reaching that structure. It is clear that such labeled potentials are not necessary for a structure to be in- fluenced. A structure which shows no labeled potentials can be receiving information about a TCS. 246 Itiformation Storage and Neural Control In recent reviews, both Morrell (19) and I (6) have summarized the large amount of data obtained in many laboratories from many different species of experimental animals, showing that striking- changes in the amplitude and distribution of labeled potentials take place during the establishment of conditioned responses to intermittent stimuli. Although the appearance of labeled potentials in a structure justifies the conclusion that information about the presentation of a TCS is reaching that structure, one cannot assume that such labeled potentials actually are the neural coding of information about stimulus frequency. Labeled potentials may simply be nonfunctional correlates of the actual processing by nerve cells of otherwise coded information about the TCS. Con- versely, one cannot prove on the basis of present evidence that labeled potentials are not the effective neural code for stimulus frequency. ASSIMILATION AND MEMORY TRACE Many phenomena observed in earlier work directed my at- tention to this problem because they suggested a functional role for labeled potentials. The first of these phenomena was called "assimilation of the rhythm" by Livanov (14), who first observed it. It has since been described by many workers utilizing various species in diverse experimental situations (6, 19). 
If one studies the resting electrical activity of various brain regions in an animal learning a conditioned response to an intermittent stimulus, one observes that during the intertrial intervals a marked hypersynchrony appears at the stimulus frequency, or at a harmonic thereof. This spontaneous, frequency-specific activity can dominate the resting electrical activity in early stages of learning. In our experience, it tends to diminish and disappear as the conditioned response becomes well established, but will return briefly following performance of an erroneous response. Figure 1 illustrates assimilation and is taken from a paper by Killam and me (8). Note that the slow hypersynchrony, in this case at one-half the stimulus frequency, appears in the reticular formation, fornix, and septum in close relationship. Assimilated rhythms, in our experience, appear earliest, are most marked, and persist longest in nonspecific regions.

Fig. 1. Spontaneous activity at different stages of training. CON — Bipolar transcortical (visual) derivation; IPSI — Bipolar derivation from the same optic gyrus; RF — Midbrain reticular formation; SUP COLL — Superior colliculus; FX — Fornix; SEP — Septum; AUD — Auditory cortex; AMYG — Lateral amygdaloid complex; POST HIPP — Dorsal hippocampus. "Assimilation of rhythm" during avoidance training using ten per second flicker as conditioned stimulus. (From John, E. R. and Killam, K. F.: J. Pharmacol. Exp. Therap., 125:252-274, 1959.)

Other workers have reported that assimilation of the rhythm appears only in the training situation and is not observed when the animal is in his home cage. Such observations demonstrate that the brain has the capacity to generate temporal patterns of potentials which are substantially the same as those elicited by a previously experienced stimulus. Although obtained under different experimental conditions, the phenomena described by Morrell and his colleagues in studies of cortical conditioning (17, 20, 21) and by Stern et al. (24) in their studies of trace conditioning provide additional evidence of this capacity. It is of interest that one can see these assimilated rhythms appearing with apparent simultaneity in regions relatively distant from each other, as if an anatomically extensive system were activated. One might reasonably ask whether such sustained patterns in the absence of a previously experienced stimulus do not reflect neural processes which represent that experience. These endogenously generated potentials may be a manifestation of the elusive "memory trace."

ASSIMILATED RHYTHMS AND GENERALIZATION

Some evidence presented earlier by Killam and me is compatible with a functional role for such endogenously generated frequency-specific patterns. We observed that an animal trained to perform a conditioned avoidance response to a ten per second flickering light characteristically displayed twenty per second potentials in the visual cortex, as seen in Figure 2. On presentation of a seven per second flicker after the animal reached criterion to the ten per second flicker, the animal showed evidence of generalization by repeatedly performing the conditioned response to the new stimulus frequency. Examination of the electrical records showed that the response of visual cortex to the seven per second flicker was a twenty per second potential, as is visible in Figure 3A.
The arrow denotes the beginning of the behavioral response. After repeated presentation of the seven per second flicker, the animal no longer performed the generalized response but sat quietly. At this time, the seven per second flicker elicited predominantly seven per second activity in the visual cortex. Presentation of the original ten per second conditioned stimulus at this point failed to elicit performance of the conditioned response for several trials, during which a slow wave at about seven per second could be observed in cortex, as shown in Figure 3B. When performance reappeared to the ten per second stimulus, twenty-per-second potentials again were elicited in cortex.

Fig. 2. LG — Lateral geniculate; VA — Nucleus ventralis anterior. Abbreviations otherwise as in Figure 1. Characteristic electrical response on presentation of the ten per second flicker conditioned stimulus to the fully trained animal (100% performance). (From John, E. R. and Killam, K. F.: J. Pharmacol. Exp. Therap., 125:252-274, 1959.)

[Figure 3]  [Figure 4]

Another example of this is provided by Figure 4, which illustrates recordings obtained during generalization to a ten per second flicker by a cat previously trained to perform a conditioned avoidance response to a four per second flicker. Note that upon presentation of the ten per second flicker, potentials at the same frequency clearly appear in visual cortex and lateral geniculate, with less marked evidence of response in the intralaminar nuclei and the reticular formation. The cortical response suddenly shifts to a hypersynchronous slow wave at a frequency between four and five cycles per second, while the animal shows a startled movement and four seconds later performs the conditioned lever press established to a four per second flicker. Notice that the lateral geniculate maintains ten per second potentials during this period, although potentials at lower frequency are visible in the reticular formation and occasionally in the intralaminar nuclei.

A comparable observation has been reported by Majkowski (16). After a rabbit was trained using a three per second light, generalization was obtained upon presentation of a five per second light. As can be seen in Figure 5, during such a generalization a three per second wave can be observed in motor cortex, although the response of visual cortex is at five per second. Related findings have been described by other workers (6).

Fig. 5. L. MOTOR — Left motor cortex; R. MOTOR — Right motor cortex; L. VISUAL — Left visual cortex; L. EMG — EMG of left hind limb; R. EMG — EMG of right hind limb. Electrical responses to five per second flicker during generalization of right hind leg flexion response after training with a three per second flicker (rabbit). (From Majkowski, J.: Acta Physiologica Polonica, IX(5):565-581, 1958.)
Data of this sort suggest that during generalization a neural system, which has become established as a consequence of experience with the intermittent conditioned stimulus (CS), is somehow released by the new stimulus, but discharges with the characteristic temporal pattern of the original conditioned stimulus. This system seems to include the mesencephalic reticular formation and the intralaminar nuclei in association with the visual cortex. It is interesting that during generalization phenomena of the sort described, regions of cortex other than the region of the conditioned stimulus, such as ectosylvian or medial suprasylvian, tend to display potentials at the frequency of the peripheral stimulus.

Additional data on this phenomenon have recently been obtained in our laboratory by Marc Weiss (26), who trained a cat to perform a conditioned avoidance response to a four per second flickering light. After establishment of this response, the cat generalized readily to a ten per second flicker. Figure 6 (Top) shows the EEGs obtained from various brain regions during such generalization. Note the irregular slow activity in the visual cortex contrasted with the regular ten per second response in the lateral geniculate. Figure 6 (Bottom) shows records obtained after differentiation of the conditioned response, during which the animal was taught to discriminate between ten and four per second flicker. Note that the visual cortex now displays markedly increased regularity of ten per second potentials during the ten per second flicker. As is evident from the stimulus trace, these two records were obtained using a "limp circuit" which periodically deleted a flash from the flicker train. The purpose of this technique was to attempt to evaluate the extent to which endogenously generated potentials would "fill in" the period of the deleted flash. Sufficient data have not yet been obtained to warrant discussion of this aspect of the records, and it is not central to our present purpose.
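Since the records that follow are average response waveforms, it may help to state the averaging operation itself in concrete terms: the record is cut into epochs time-locked to the flash and the epochs are averaged point by point, so that activity phase-locked to the stimulus survives while unrelated background activity tends toward its mean. A schematic sketch in modern notation, with an entirely synthetic waveform standing in for laboratory data:

    import numpy as np

    rng = np.random.default_rng(0)

    fs = 1000          # samples per second
    epoch = 250        # samples per 250 ms epoch following each flash
    sweeps = 100       # number of repetitions averaged, as in the records below

    t = np.arange(epoch) / fs
    # A synthetic "evoked" waveform standing in for the true stimulus-locked response.
    evoked = 20e-6 * np.exp(-t / 0.05) * np.sin(2.0 * np.pi * 10.0 * t)

    # Each single sweep = evoked response + background activity unrelated to the flash.
    noise = 50e-6 * rng.standard_normal((sweeps, epoch))
    trials = evoked + noise

    average_response = trials.mean(axis=0)   # point-by-point average over 100 sweeps

    print("rms background in a single sweep  :", noise.std())
    print("rms background left in the average:", (average_response - evoked).std())

With 100 sweeps the unlocked background is reduced roughly tenfold, which is why components too small to be seen in any single sweep emerge in the computed waveforms.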
Fig. 6. R. NUC. RET. — Right nucleus reticularis; L. MSS CX — Left medial suprasylvian cortex; R. VIS CX — Right visual cortex; L. VIS CX — Left visual cortex; L. LAT. GEN. — Left lateral geniculate; L. DORS. HIPP. — Left dorsal hippocampus; L. RF — Left mesencephalic reticular formation; L. CM — Left centre median. (Histological verification not yet available.) (Top) Electrical responses to ten per second flicker during generalization, after avoidance training using a four per second flicker tracer conditioned stimulus. (Bottom) Electrical responses to ten per second flicker following differentiation of avoidance response. (4 per second — S^D; 10 per second — S^Δ.) (From Weiss, Marc: Unpublished master's thesis, University of Rochester, 1962.)

Figure 7 illustrates an average response waveform obtained from the lateral geniculate body of this animal during generalized performance of the conditioned response to the ten per second flicker. This computation was obtained using a Mnemotron average response computer and is based on 100 periods of ten per second flicker, each period beginning at a deleted flash and lasting for 625 milliseconds. Note the regularity of the computed waveform. Similar regularities were observed in average responses computed during generalization to ten per second flicker from dorsal hippocampus, centre median, nucleus reticularis, and medial suprasylvian cortex.*

*Histological verification of electrode placements has not yet been obtained.

Fig. 7. Average response computed from lateral geniculate during generalization to ten per second flicker, after avoidance training using a four per second flicker tracer conditioned stimulus. (From Weiss, Marc: Unpublished data.)

Figure 8A shows the average response waveform computed from the visual cortex at this stage of training during correct performance to a four per second flicker.

Figure 8B shows a comparable average response waveform computed from potentials recorded from the visual cortex during generalized performance to a ten per second flicker. Note the complex, irregular waveform.

Figure 8C shows the average response waveform computed from the visual cortex during correct performance to the ten per second flicker after differentiation. In contrast to Figure 8B, note the markedly increased simplicity and regularity of the waveform.

Fig. 8. Average response computed from visual cortex: (A) In response to four per second flicker after avoidance training using a four per second flicker tracer conditioned stimulus. (B) During generalization to ten per second flicker. (C) In response to ten per second flicker after differentiation training. (D) Comparison of generalization waveform with calculated interference pattern. (A, B, C from Marc Weiss: Unpublished data.)

I was impressed by the fact that these data might provide the basis to test the hypothesis that the waveforms observed during the generalization represented the interaction between an endogenously generated representation, or memory, of the stimulus frequency used during training and the exogenously derived neural response to the new stimulus eliciting generalization. Therefore, I explored the interference patterns which could be constructed by algebraic addition or subtraction of the waveforms (Figs.
8A and 8C) obtained from the visual cortex during behaviorally appropriate response to ten per second and four per second flicker.

Figure 8D shows the approximation to the generalization waveform which can be produced by these simple algebraic manipulations. At each point of the curve, the manipulation which gave the better approximation (10+4 or 10−4) was selected. It is not clear what the physiological basis might be for the particular sequence of algebraic operations used to achieve this approximation.

Fig. 9. As Figure 8, but data derived from mesencephalic reticular formation.

Figure 9A shows the average response waveform obtained from the mesencephalic reticular formation at this stage of training during correct performance to a four per second flicker.

Figure 9B shows the average response waveform obtained from the mesencephalic reticular formation during generalization to the ten per second flicker. Note the highly complex and irregular waveform.

Figure 9C shows the average response waveform obtained from the mesencephalic reticular formation during correct performance to the ten per second flicker following differentiation. In contrast to Figure 9B, note the increased simplicity and regularity of the waveform.

Figure 9D shows the fit to the generalization waveform of the interference pattern which can be obtained by arbitrary algebraic addition or subtraction of the two waveforms elicited from the reticular formation during behaviorally appropriate performance to four per second and ten per second flicker, as shown in Figures 9A and 9C. Again, that manipulation (10+4 or 10−4) which gave the better fit was selected.

Thus, one can synthesize interference patterns from average response waveforms computed during behaviorally appropriate response to two different stimuli and can approximate closely the actual average response waveform obtained when an animal responds to one of these stimuli by a previously learned behavior appropriate to the other. This demonstration provides striking evidence in support of the suggestion that the neural response to the ten per second stimulus actually presented was modified during generalization by an electrical influence identical with the response to the four per second conditioned stimulus repeatedly experienced during the earlier establishment of the conditioned response. At the moment I see no way to evade the conclusion that the consequence of experience with the four per second flicker during learning somehow caused a modification of neural structure which thereby gained the capacity to generate electrical activity like that which established it. These data support the interpretation that such patterns of potentials are of functional significance and are closely related to the actual processing of information.

At our present stage of knowledge, no mechanisms come to mind which might serve to generate and mediate an interaction of the sort described.
Yet some insight may be offered from the fact that only visual cortex and reticular formation, among the structures studied in this animal, displayed these peculiar waveforms during generalization. Lateral geniculate was notably regular in its response. This configuration suggests that somehow an interaction between visual cortex and reticular formation may be central in the mediation of phenomena of this sort. Further work is obviously necessary before the interpretations offered here can be accepted as accurate. CHARACTERISTICS OF MISTAKES DURING DIFFERENTIATION Although the data presented in the preceding section are of a different sort from those which Killam and I described previously, they are in accordance with observations we made while studying the difference in electrical recordings obtained during correct and erroneous performance of flicker discriminations in a differential approach-avoidance situation (9). In those animals, we observed that among the most marked changes in labeled potentials during differential conditioning were those which occurred in the reticular formation, intralaminar nuclei, and hippocampus. A particular relationship between the configuration of potentials in these struc- tures and in visual cortex seemed to be closely related to appropriate performance. During signal presentation, potentials in the non- specific structures could often be observed at either of the two flicker frequencies between which differential response had been established. Wlien behavioral performance was appropriate to the peripheral OS, the frequencies of potentials in visual cortex and in nonspecific regions were in good correspondence to the OS. However, when behavioral performance was inappropriate, the cor- respondence of labeled potentials to tlie CIS diininished and periods of hypersynchrony appeared at the frequency of the stimulus appropriate to the behavior actually performed, particularly in centralis lateralis, dorsal hippocampus and reticular forma- tion. In Figure 10 are presented recordings obtained from a cat trained to perform a lever press to obtain milk during a ten per second flicker without reinforceinent during six per second flicker. 260 Information Storage and Neural Control After Operant Conditioning to IO/5 "$<=', 6/s -5^ LG AA«'^,VWvVw''^*^i/^^ 1 Fx ^''.A^^vv^^\AAV^\^4^f*^(^^M i 10* ; , SIG W/WVVWVVWWVWWWWVWVWVVW/WWWWWVWV'JVW^ VH ^>/\^ht^\f^^/^j>f^^ I Correct FX \\\hY¥4^'^'*^^^^ s iG /vwwwv ;;;;M\/ww\/'//w'^vw/w\/'yw^AWM/vv;MWAVvvwwvMVWvwv^^ Error Fig. 10. MG — Medial geniculate VC— Visual cortex LG — Lateral geniculate FX — Fornix VH — Ventral hippocampus CL — Centralis lateralis MSS — Medial suprasylvian cortex Records obtained during differential approach conditioning (10 per second — S , 6 per second — S^). (Top) Correct response to ten per second flicker. (Bottom) Error of omission to ten per second flicker. (From John, E. R. and Killam,K. F.: J. Nerv. Merit. Dis., 73/.-183-201, 1960.) The top records were taken during correct performance to ten per second flicker. Note in particular the marked frequency- specific response in fornix and centraHs lateralis. The bottom records were obtained during an error of omission when the cat failed to press the lever in response to the ten per second signal. 
Note the diminished ten per second labeled potentials and, in particular, Neural Mechanisms of Decision Making 261 After Operant Conditioning to ICVs-S*^, 6/i-S^ s'G — ^ s;f-o^.^^^vA\\^^^^^\J\J^^^\^^J\,\^J\^ Correct siQ — ^v:vVvV^.\\Vv\\\^,■"^^^^^^^^J\\ tnror Fig. 11. MG — Medial geniculate VC — Visual cortex LG — Lateral geniculate FX — Fornix VH — Ventral hippocampus CL — Centralis lateralis MSS — Medial suprasylvian cortex Records obtained during differential conditioning (10 per second — S^, 6 per second — S ). (Top) Correct response to six per second flicker. (Bottom) Error of commission to six per second flicker. (From John, E. R. and Killam, K. F.: J. Nerv. Merit. Dis., 737.- 183-201, 1960.) the slow potential at about six per second seen most clearly in centralis lateralis. Figure 11 shows the converse phenomenon in the same cat. The top record shows correct performance to the non-reinforced six per second flicker. The bottom record shows an error of commission to the six per second flicker. Note the lessened frequency specificity 262 Information Storage and Neural Control After CAR to 6/S Fig. 12. Records obtained during lever press to 10 per second flicker after avoid- ance training to the 6 per second flicker. Arrow indicates conditioned response. (From John, E. R. and Killam, K. F.: J. Mrv. Merit. Dis., 7J7.-183-201, 1960.) of potentials in the lower record as contrasted with the upper; in particular, observe the period of approximately ten per second po- tentials in centralis lateralis. Figure 12 shows the potential configuration reliably obtained in this cat in response to ten per second flicker following the establish- ment of a conditioned avoidance response to the six per second flicker, while the conditioned lever pressing" response to ten per second flicker was maintained. At this stage in this animal, presenta- tion of the ten per second flicker elicited clear labeled potentials in visual cortex and several other structures, while an initial slow wave at about six per second appeared in centralis lateralis and fornix. Superimposed on this slow potential, almost as a modulation, is a ten per second potential which gradually becomes clearer and eventually dominates the record. Wlien lever press occurred to the ten per second flicker, it almost invariably took place during a period when the ten per second labeled potential dominated the activity of centralis lateralis. Characteristically, as this correspond- ence between the frequency of the dominant activity in the non- specific structures and in the visual cortex occurred, a change Neural Mechanisms of Decision Makirrg 263 During Blockc3dc of CAR After Rcserpinc vc -*v^A^AM^^MMM'V^^A^^MAA/W\A^ -^o^/v i LG ^H^aT'''^^^ I Fig. 13. Records obtained in response to ten per second flicker after performance of the avoidance response to six per second flicker was blocked by injection of reserpine (1007/kg). (From John, E. R. and Killam, K. F.: J. Nerv. Ment. Dis., 737.- 183-201, 1960.) was observed in the recorded waveforms. This change was a shift from rounded "waves" to more sharply peaked spikes and was foUowed one or two seconds later by performance of the conditioned response. Some indication of the possible functional relevance of the slow six per second centralis lateralis waves seen during the approach signal after avoidance training is provided by the data in Figure 13. When performance of the avoidance response to the six per second TCS was completely blocked after administration of 100 7/kg. 
of reserpine, presentation of the ten per second TCS no longer elicited the previously marked slow potentials in centralis lateralis and elsewhere, but instead resulted in the appearance of massive labeled responses at ten per second frequency. When 0.5 mg/kg of amphetamine was administered to this cat, the reserpine blockade of the conditioned avoidance response performance to the six per second TCS was completely reversed in a few minutes. Presentation of the ten per second TCS for the lever pressing response to obtain milk once again elicited the same slow potentials in centralis lateralis and elsewhere as seen previously in Figure 12.

These observations seemed to support the interpretation that the labeled potentials reflected some aspect of information processing and might be of functional significance. Such an interpretation would also be in agreement with the findings of Livanov et al. (13) and Liberson et al. (12), who have reported that direct electrical stimulation of various brain structures at frequencies like those of the intermittent conditioned stimuli used in establishing a conditioned response resulted in performance of the learned behavior.

Nonspecific structures seem to play a central role in the processing of information during differentiation. Evidence of differential suppression of potentials after habituation, of the major signs of assimilation, of the most marked increments in labeled potentials during differential training, and of shifts in the frequency of labeled potentials during behaviorally inappropriate response have all been observed in these structures. The particular configuration of potential patterns during differential response suggested several hypotheses: 1) The role of specific sensory systems may be conceived of as the central propagation of information representing the present state of the environment to a particular cortical region; 2) this information may be compared, via the diffuse projection system, with a representation of past experiences activated in the rhinencephalon and the reticular formation by the similarity between past and present environment, modified by the state of the organism in terms of affect and drive level; and, 3) the appropriate selective performance of adaptive behavioral responses may depend upon achievement of a sufficient congruence, via some unknown coincidence detection mechanism, of the potentials reflecting present and past experience.

CONCURRENT PERIPHERAL AND CENTRAL STIMULATION

These various considerations led our group to investigate further the question of whether temporal patterns of potentials might be information. When animals are trained to perform a differential discrimination between two flicker stimuli differing in frequency,
We studied the behavioral effects of direct electrical stimulation of the brain at frequencies concordant or discordant with the frequency of the peripheral conditioned stimuli presented simul- taneously. After pilot studies showed that low frequency electrical stimulation was not effective, a modulation technique was devised. A standard "carrier" waveform, consisting of a 100 cycle per second biphasic square wave with a 2 millisecond pulse duration, was modulated at the frequency of the peripheral TCS. This pro- duced trains of bursts of 100 cycle per second square waves, with the burst frequency identical with the flicker frequencies to which the animals were conditioned. Trains at different frequencies could be manipulated to achieve equal duration of constituent bursts or to equate total electrical energy by selection of appropriate burst durations. Most structures were explored both unilaterally and bilaterally. For each structure, we determined the current level at which central stimulation at both the reinforced (S ) and the non- reinforced (S^) frequency blocked performance to concurrent photic stimulation at the S frequency. This current level was defined as the occlusion threshold, or cut-off. The current intensity at which conditioned response perfoimance returned to concurrent photic and central stimulation at one central frequency but not the other was defined as the differential threshold. If a differential threshold was observed, a series of trials was carried out to de- termine the reliability of such an effect. Throughout such stimula- tion sessions, central stimuli were presented in counterbalanced frequency sequence, and each sequence was bracketed by trials 266 Information Storage and Neural Control using only the peripheral conditioned stimuli. Only central se- quences bracketed by correct performance to the peripheral stimulus alone were considered acceptable. Intensive studies of the effects of concurrent central and peri- pheral stimulation have been carried out in two cats. One of these animals (Cat 4) was conditioned to press a lever to avoid shock within fifteen seconds after the onset of a four per second flicker, but was punished if lever press was performed during" a ten per second flicker. The other animal (Cat 10) was trained to the opposite significance of flicker frequency, pressing" the lever to ten per second flicker but not to four per second. Results of the concurrent stimulation studies on these two animals are sum- marized in Table I. Note that the data show, at a very high significance level, that a four per second electrical stimulation of the visual cortex is much more effective than a ten per second input in achieving inhibition of conditioned avoidance response performance to a simultaneously presented TCS in both Cat 4 and Cat 10, although the meaning of a four per second flicker was opposite for these two animals. Since this was true both for central stimuli of equal burst duration and for those of equal energy, the severe disruption can be at- tributed to the frequency of the simulated input. Four per second central stimulation was much more inhibitory than ten per second. This effect was not observed in auditory or medial suprasylvian cortex, but appeared to be rather specific for the cortex of the CS modality. This suggests that the input in some way interferes with activity in the visual system and that the visual cortex or regions to which it projects are involved in the mediation of the conditioned response. 
Such conclusions would be consonant with those of Zuckermann (27), who observed interference with per- formance of conditioned responses to visual stimuli during after- discharge following stimulation of visual cortex but not of motor cortex or reticular formation. Such a conclusion is difficult to reconcile with the remarkable ability of Cat 4 to sustain appropriate behavioral response to a four per second flicker when 2.5 times more electrical energy was applied to the same visual cortex at ten per second. In contrast to the cortical current values for dis- ruption, note the exceedingly low current required in subcortical Neural Mechanisms of Decismi Making 267 ;^ o o o VVV o o o o o A A A A A -^ -* ^ -^ •* o o VV V V AAII AAAAAAAAAII All A ■*0 TfOO-^OOOOO o o i O (M O I I I I I I I I I I I . Tj-oO-^O[X. (X-)-X(X)>LaJ (X)X(;^)> x' — [x — X Figure 4 B Figure 5 sions for a net of three neurons whose output neuron goes through X and requires interaction, as does the left-hand neuron. Obviously no more is possible, for the output would always or never fire. One more trick served by interaction is the use of separate shifts in d that are produced by feedback to secure flexibility of function. Consider the net of Figure 5 in which the feathered arrows indicate feedback affecting 0's. This net can be made to Anastomotic Nets Combating Noise 289 compute fifteen out of the sixteen possible functions. Had I drawn it for neurons with three inputs each, it could have been switched so as to compute each of 253 out of the 256 logical functions of three arguments. I strongly suspect that this is why we have in the eye some 100 million receptors and only approximately one million ganglion cells, but note that it depends upon interaction of afferents. Finally, Manuel Blum has recently proved that this interaction enables him to design nets that will compute any one specified function of any finite number of inputs with a fixed threshold of the neuron at a small, absolute value, say, 1 or 0. This prevents the neuron from having to detect the small difference of two large numbers, thus allowing the brain a far greater precision of response to many inputs per neuron, despite a fluctuation of a given per cent of the threshold 6. This fluctuation of 6 is the first source of noise which I wish to consider. The effective threshold of a neuron cannot be more constant than that of the spot at which its propagated impulse is initiated. This trigger point is a small area of membrane, with a high resistance, and it operates at body temperature. It is, therefore, a source of thermal noise. The best model for such a trigger is the Node of Ranvier, and the most precise measurements of its value are those of Verveen. For axons '■^A^ in diameter, he finds it to be ^^ ±1 per cent of 0; it is larger for small axons. Moreover, his analysis of his data proves that the fluctuations have the random distribution expected of thermal noise. There are, of course, no equally good chances to measure it in the central nervous system, for one cannot tell how much of a fluctuation is due to signals or to stray currents from other cells. Our own crude attempt on the dorsal column of the spinal cord indicates far greater noise, but not its source. What goes for thresholds goes, of course, for signal strength; and for fine fibers, say, 0.1^, the root mean-square value of the fluctuation calculated by the equation of Fatt and Katz is —0.5 mv. If we accept a threshold value of 15 mv., this is several per cent. It may be much larger. 
Moreover, it is impossible that the details of synapsis are perfectly specified by our genes, preserved in our growth, or perfected by adaptation. They are certainly disordered by disease and injury.

[Figure 6]

Nevertheless, it is possible to cope with these three kinds of noise — θ, signal, and synapsis — as long as the output of a neuron depends, in some fashion, on its input, by an anastomotic net to yield an error-free capacity of computation. This is completely impossible with neurons having only two inputs each. The best we can do is to decrease the probability of error. Consider, for example, a net like that of Figure 6 to compute [X], where each neuron is supposed to have θ = 3, but each drops independently to 2 with a frequency p. As long as p is less than 0.5, the net improves rapidly as the product of the p's of successive ranks decreases. The trick here is to segregate the errors.

The moment we look at neurons with three inputs, the picture changes completely; but to describe this change we need to increase the complexity of our logical symbols by putting a circle on the X, so that inside it is C, outside not C, as in Figure 7(a). Now consider a net to compute some function, say, all or else none. We can schematize this, as in Figure 7(b). The dash is a "don't-care" condition; it may be a 1, or 0, or any p that you choose. This net makes no mistakes.

[Figure 7 (a), (b), (c)]

Let us suppose that each of the first rank exerts +2 excitation on the third. Then its threshold can vary harmlessly: 3 < θ < 6, or nearly 50 per cent. Moreover, if the threshold is better controlled, then the strength of the signals can vary. Finally, if both are fairly well controlled, the connections can be wrong, as in Figure 7(c), and the input-output function is still undisturbed.

If we want to extend our symbols to four arguments, then the pattern becomes that of Figure 8, and for five arguments it becomes more complex. In general, each new line must divide all existing areas into two; thus for N inputs there are 2^N spaces. Oliver Selfridge and Marvin Minsky have worked out simple ways of making such symbols, with sine waves, for any finite number of inputs. Eugene Prange has invented a way of devising the distribution of don't-care conditions so that there are as many as possible for a net of N neurons in the first rank and one in the output rank, each rank having N inputs per neuron. The number of don't-care conditions, or dashes, depends upon the number of ones in the spaces for the function to be computed. The dashes are fewest when the function to be computed has exactly one-half its spaces filled with ones.

Manuel Blum has solved the following questions: 1) Suppose that there are no don't-care conditions, or dashes, in the symbol for the output neuron; what fraction of the spaces for each neuron of the first rank can have dashes and the calculation be error-free for the toughest function (half-filled with ones), all as a function of N? 2) With all those dashes in the first rank, what fraction of the spaces in the symbol of the output neuron can harmlessly be dashes?

[Figure 8] [Figure 9]

Figure 9 is Blum's diagram, based on equations that are exact if N is a perfect square and fairly good approximations for the rest.
You will notice that for N less than 40, the output neuron (the solid line) has fewer dashes. At about 40 they are equal, being about 80 per cent of all spaces. For larger N, the output neuron has the larger fraction; and, for N = 100, 90 per cent of the spaces in the input rank are dashes, and 98 per cent in the output neuron are dashes. From this outcome, it is very clear that the output neuron cannot be a majority organ like the one for N = 3.

We all know that real nervous systems and real neurons have many other useful properties. But I hope I have said enough to convince you that these impoverished formal nets of formal neurons can compute with an error-free capacity despite limited perturbations of thresholds, of signal strength, and even of local synapsis, provided the net is sufficiently anastomotic. If I have convinced you, it has been in terms of a logic in which the functions, not merely the arguments, are only probable. But even this probabilistic logic, for all its don't-care conditions, is inadequate to cope fully with noise of other kinds. Our neurons die — thousands per day. Neurons, when diseased, often emit long strings of impulses spontaneously and cannot be stopped by impulses from any other neurons. And, finally, axons themselves become noisy, transmitting a spike when none should have arisen or failing to transmit one that they should have transmitted.

To handle these problems, in which the output of a neuron has ceased to be any function of its input, von Neumann proposed what is called "bundling." In the simplest case, one replaces a simple axon from A by two axons in parallel. This alters the logic, for now if all fibers in the bundle fire, A is regarded as certainly true; if none fire, as certainly false; but between these limits there is a region of uncertainty — call it a set of values between true and false. In the simplest case, there are two such intermediate values. Von Neumann found that if there are only two inputs per neuron, the neurons had to be too good and the bundles too big. To compute, say, X or Y, with a net constructed like the net of Figure 10, we find that, given a probability of an error on the axon of, say, ε = 0.5 per cent, to have the bundle usably correct all but once in one million times, he needed 5000 neurons and two more ranks of 5000 to restore his signal so that it was usable. His difficulty was chiefly the poverty of the anastomosis. We have found that, with the same ε and the requirement that the bundle be usably correct all but once in one million times, if each axon is connected to every neuron, we need only one rank of 10 neurons.

[Figure 10]

Leo Verbeek has looked into the problem of the death and fits of neurons, and has found that again the probability of an erroneous output decreases as the number of inputs per neuron and the width of the first rank (both δ in number) increase, at least for probabilities of death and fits reasonably under 50 per cent. Figure 11 shows his graph, where δ is the number of inputs, p the probability of error in the input neurons, and α_δ(p) the probability of erroneous output. Even for a small δ, these calculations are enormously laborious. We are all much indebted to Jack Cowan for our knowledge of many-valued logic for handling bundling, and for conclusive evidence that this is not the cleverest way to obtain reliability.
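The arithmetic behind the contrast between von Neumann's bundles and a richly connected restoring rank can be suggested with a simplified model. The sketch below is not von Neumann's multiplexing scheme, nor Verbeek's calculation; it merely assumes a single restoring element that takes a majority vote over n redundant lines, each wrong independently with probability ε = 0.5 per cent, and asks how large n must be to meet the criterion quoted above, correctness in all but one of a million trials.

```python
from math import comb

def majority_error(n, eps):
    """Probability that a strict majority of n independent lines (n odd),
    each wrong with probability eps, is wrong."""
    return sum(comb(n, k) * eps**k * (1 - eps)**(n - k)
               for k in range(n // 2 + 1, n + 1))

EPS = 0.005      # 0.5 per cent chance of error on any one axon, as in the text
TARGET = 1e-6    # "usably correct all but once in one million times"

for n in (3, 5, 7, 9, 11):
    p = majority_error(n, EPS)
    verdict = "meets the criterion" if p < TARGET else "falls short"
    print(f"majority over {n:2d} lines: error probability {p:.2e} ({verdict})")
```

Under this crude model a vote over seven or more lines already satisfies the one-in-a-million requirement, which is at least consistent in order of magnitude with the single rank of ten neurons mentioned above; the poorly connected two-input construction, by contrast, required thousands.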
He and Sam Winograd have made a much greater contribution, which I could not expound to you if I wanted to, and I do not because it will probably be communicated in full by Professor Gabor for publication in the Philosophical Transactions. Vaguely, its purport is this:

[Figure 11]

In the theory of information concerned with communication, there is a theorem, due to Shannon, that, by proper encoding and decoding, if one transmits at something less than the capacity of a noisy channel, one can do so with as small a finite error as one desires by using sufficiently long latencies. Except for things like X and not-X, no one before Cowan and Winograd was able to show a similar information-theoretic capacity in computation. They have succeeded for any computation and for any depth of net, limited only by the reliability of the output neurons. The trick lay in a diversification of function in a net that was sufficiently richly interconnected. Their fundamental supposition is that with real neurons the probability of error on any one axon does not increase with the complexity of its neuron's connections. The recipients of most connections are the largest and, consequently, the most stable neurons. Again, it is the richest anastomosis that combats noise best.

ACKNOWLEDGMENT

I wish to acknowledge the contributions of those who have worked with me in this endeavor, namely: Anthony Aldrich, Michael Arbib, Manuel Blum, Jack Cowan, Nello Onesto, Leo Verbeek, Sam Winograd, and Bert Verveen.

DISCUSSION OF CHAPTER XII

Bernard Saltzberg (Santa Monica, California): In the headphone experiment, I assume you used a single noise source which divided its power between the earphones. Was an experiment attempted with two independent noise sources? How did the results come out?

Warren S. McCulloch (Cambridge, Massachusetts): It does not help much. It has to be the same noise. Different noise is no good. What they were trying when I was last involved was lagging one earphone a little behind the other to see what phase difference they could make in it and still have it work. As far as I know, this has not been cleaned up yet.

Gregory Bateson (Palo Alto, California): What is the price of this increased reliability in terms of loss of educability? Obviously, to obtain a new function — a new relationship — out of this net, you have to alter a large number of connections. In a sense, I suspect that the more reliable your new constructions, the more non-educable the net becomes; but I am not a good enough logician to know that this is so.

McCulloch: Look at the flexibility end of it. We have here a neuron with a couple of inputs (A and B) and one output neuron. Let us take the case of three neurons. Incidentally, I cannot build this without the interaction of afferents. I have one output neuron. Now I can send signals back from the central nervous system and tell my eye what it is to look for, what it is to see. You get 256 possible logical functions. You can calculate 253 of them by giving these first rank neurons a nudge on the threshold. Reliability does not mean that the net is inflexible. This is a remarkably flexible device. The flexibility goes up with the anastomosis; it does not go down. That is one of the beautiful things about it. If it were simple majority logic, the situation would be impossible.
The stupidest thing to do, so to speak, if you want to get the maximum life out of a rope is to use it until it breaks and then replace it with another one. No mountain climber that I know takes such a chance. The next worse thing is always to stretch two ropes from man to man. What you want is the richness of connections. The dynamics of the picture is beginning to show up, but the mathematics is too complicated for us as yet.

Eugene Pautler (Akron, Ohio): What type of detector would be required to recognize the results of this output — the computations inherent in this output neuron?

McCulloch: I think it is probably all done in the eye. Suppose you tell your eye to look for four-leaf clovers. You simply send out the message, "Find a particular pattern in those leaves"; and when you have found it signal, "Here is one! Here is another!" You knew what you were looking for, so you set your filter accordingly. A frog, when he jumps, sends back impulses to his eye to give as great a response as possible to an affair of lesser curvature or greater radius of curvature, which informs his eye. This works during the first part of the jump while his eyes are open. One tells one's eye what to see, what to look for. It would be almost unthinkable that otherwise one could go into, say, Grand Central Station, look off across the hall, and, knowing that there is a chance of so-and-so being there, find him, unless one has in some manner set a filter. Just how much of that matching is done in the eye, I do not know. The mouse, which does not turn its eyes and keeps them open, is another nice animal to work on. His retina is the same all over, and whether you get a response from a particular ganglion cell or from a particular axon depends upon whether the mouse is hungry or whether it has smelled its cheese. If it has, then it bothers to look, but it will not look the rest of the time. The mouse shows very little response to any visual stimulus. The situation is far too complicated to be solved with a set of electrodes.

Homer F. Weir (Houston, Texas): In the use of the injured neuron, you are apparently producing noise from non-input sources. Is it correct to say that your injured neuron is putting out output without input?

McCulloch: Yes. Either it is doing that or it is dead.

Weir: At what level would this have to occur, relatively speaking, before it would override this protective error mechanism that you were speaking of?

McCulloch: I have not seen my own cerebellum, but I have seen that of many a man my age. I am in my second century, and I expect that at least 10 per cent of the Purkinje cells in my cerebellum are replaced by nice holes at my age, but I can still touch my nose. It is incredible how little brain has to be left in order for it to function.

PART IV — THE HUMAN NERVOUS SYSTEM

Moderator: Wayne H. Holtzman, Ph.D.

CHAPTER XIII

THE INDIVIDUAL AS AN INFORMATION PROCESSING SYSTEM

James G. Miller, M.D.

Considering human beings as information processing systems has in the last decade proved useful in both experiment and theory. Some of the hoary old problems of behavior and learning theory have received a new form or have been bypassed, and some fruitful approaches to human individual, group, and social behavior have arisen.

It has been estimated (1) that in fifty years of waking life an individual may process 10^16 (ten thousand trillion) bits of information.
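The order of magnitude of that estimate is easy to check. The figure of 10^16 bits comes from the text; the sixteen-hour waking day assumed below is only a convenient round number for the arithmetic.

```python
# Rough arithmetic behind the fifty-year estimate quoted above.
total_bits = 1e16                        # "ten thousand trillion" bits (from the text)
waking_seconds = 50 * 365 * 16 * 3600    # fifty years of assumed 16-hour waking days

print(f"waking seconds in fifty years: {waking_seconds:.2e}")
print(f"implied average input rate:    {total_bits / waking_seconds:.1e} bits per second")
```

The implied rate, roughly ten million bits per second, presumably refers to information arriving at the receptors, since it is many orders of magnitude above the few tens of bits per second of transmission through the whole individual discussed later in this chapter.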
A person may be looked upon as a component in an interpersonal system in which messages are sent from one node to another along channels and through nets. As an individual, he may be studied as a "black box'' whose input-output relation- ships can be detei mined, or as a system of interrelated components whose performance and capacities are increasingly available to experimental investigation. At the Mental Health Research Institute of The University of Michigan some of us work within the general systems orientation which regards all life as a part of the physical space-time continuum. We consider this continuum to be organized into a hierarchy of levels of systems, all of which have subsystems and are themselves subsystems of larger organizations or supersystems. The smallest living system, the cell, is composed of nonliving molecules. These may be free-living or may be components of organs, which in turn are organized into more complex individual systems. These 301 302 Injormaiion Storage and Neural Control may band into face-to-face groups or larger social organizations and societies. There is continuity and there are cross-level similarities in structure and process at all levels of this hierarchy, even though there are, of course, at the same time, many specific differences among individual systems, species, and levels. We have sought for and found cross-level "formal identities" which can be studied experimentally. All living systems are open systems. That is, they maintain steady states of several variables and counteract entropic dis- integration by means of inputs and outputs. Living systems at all levels process both energy and information. These always flow together. For example, energic inputs such as food convey infor- mation in the patterning of their molecular structures, and coded verbal communications are carried on the energy of sound waves. Energic and informational inputs are distinguished by whether the receiver responds to their energic or their informational aspects. Sometimes the response is to both. SUBSYSTEMS Living systems at any level require certain crucial subsystein functions in order to survive, unless they exist in a relationship of parasitism or symbiosis with another system which supplies them. Free-living cells, for example, may be shown to have subsystems that accomplish all the essential functions, while cells which are part of organs may lack some of them. Groups which survive over time isolated from other people have all these subsystems while groups which are parts of organized societies almost never do. Subsystems may be either local, like the eye, or dispersed, like the reticuloendothelial system. There are essential subsystems which deal with the processing of energy and others which process information. The essential energy-processing subsystems in the general order of their operation are: boundary, ingestor, distributor, decomposer, producer, energy storage, excretor, and mover or output transducer. The essential subsystems in information processing, listed in the general order of flow in information processing are: The Individual as an Information Processing System 303 1) Boundary. This may be the limits of the sense organ of a cell or animal or the mechanisms of a group or society which receive information from outside the system. 2) Input Transducer. A transducer changes energy from one form to another. The sense organ of an animal transduces patterned energic inputs to nerve impulses. 
There are analogs at the society level in the translaters that receive and recode information from outside the society. 3) Internal Transducer. This subsystem receives and passes on information from within the system, as the input transducer does from without. In an animal there is the system of internal sense organs and chemical sensitivities which activate control mechanisms. There are analogs at the group and society levels. 4) Channel and Net. The channel is the route — neuron, wire, air or ether — over which a message is sent from a transmitter to one or more receivers. In the individual the sensory nerves are channels over which the input is transmitted to the central nervous system. Channels may intersect at points called nodes and may be interconnected to form a net. The nervous system of individuals is an information processing net. The blood and lymph of the individual also act as information carrying channels as well as energy distributors. There are two distinct common uses of the word "channel."' The more restricted meaning includes only the flow route for the information, without intervening subsystems of any other sort (such as transducers, decoders, or encoders). The other, broader meaning includes such components together with the intervening flow routes. "Channel" is employed in both these senses in electronics and little confusion appears to result. We follow the second usage. 5) Decoder. The decoder alters input information into a code or language which can be transmitted and "interpreted" inside the systein. 6) Learner. This subsystem establishes a reliable and enduring association between certain information inputs and other infor- mation from outside or inside the system. Thereafter, the system will make an altered output to an input which previously elicited 304 Information Storage and Neural Control another response, or no response, or make the same output to a different input. 7) Memory. This subsystem stores information over time. 8) Decider. A given set of inputs may ehcit two or more alternate outputs. The decider selects the one that is put into action. Each of the subsystems of a system is also a system at its own level and must make its own decisions, as well as carry out other critical functions. The neuron has the binary decision to fire or not to fire, which is based upon the strength and charac- teristics of its inputs and the present state of the neuron. The individual has a central decision-making subsystem which deter- mines output for the whole system. 9) Encoder. This prepares information for output by putting it into a code which can be transmitted to and interpreted by other systems in the environment. 10) Motor or Output Transducer. The motor in an animal is the same for both energy and information outputs. Nervous impulses trigger activities like gross physical movements, speech, ingestion, or excretion. 11) Reproducer. This is capable of giving rise to other systems similar to the one in which it is found. We consider it an infor- mation processing subsystem because its primary activity is transmission of information or patterning. The reproducer, while not essential for the survival of the individual, is necessary for the continuation of the species and all social organizations which endure for more than one generation. Each of these subsystem functions is carried out within the individual, but as we have seen in this symposium, it is not possible at present to show the precise localization of all of them. 
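The chapter goes on to stress that information is lost cumulatively as it passes from subsystem to subsystem. A toy sketch of that idea follows; only the subsystem names are taken from the list above, while the retention fractions and the starting quantity are invented purely for illustration.

```python
# Toy pipeline: each information-processing subsystem passes on only part of
# the information it receives. Retention figures are invented for illustration.
STAGES = [
    ("boundary / input transducer", 0.50),
    ("channel and net",             0.90),
    ("decoder",                     0.80),
    ("learner / memory",            0.70),
    ("decider",                     0.30),
    ("encoder / output transducer", 0.90),
]

bits = 1000.0    # information (in bits) arriving at the boundary, arbitrary
for name, retained in STAGES:
    bits *= retained
    print(f"after {name:28s}: {bits:7.1f} bits")
```

However the individual fractions are chosen, the product shrinks at every stage, which is the point the following paragraphs make about amplification, selection, and noise along the route of information flow.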
The specific neural arrangements for decoding, learning, memory, perception, deciding, and encoding, for example, are all being studied but are not yet understood. We have emphasized the use of standard centimeter-gram- second or information theory units, or units which are derivatives of these, rather than the welter of unrelated measures which have been used in the different fields of behavioral science. Since we are looking for cross-level measurable uniformities or dif- The Individual as an Information Processing System 305 ferences, the quantitative study of tiiese requires the use of com- parable measures at different levels, and the units of the natural sciences seem best suited, though, of course, all sorts of phenoinena cannot yet be expressed in them. THE ROUTE OF INFORMATION FLOW Each one of a person's subsystems may participate in the preparation of the output. Input of appropriate kind and strength crosses the individual boundary and is transduced into the proper form for nervous transmission. If a language or code is involved, it is translated by the decoder and classified by the perceiver in terms of a perceptual schema which represents the world as the individual has experienced it. Reference may be made to stored memories. There may be some recoding or other preparation of all or part of the input for storage in the memory. On the output side, a decision is made from among the alternate possible outputs; encoding for external transmission is carried on, and the nervous message is transduced into physical response, through either the speech mechanism or other musculature. There is a large literature on each of these functions and it is impossible to do more than give a brief review of some of the material on some of the sub- systems. Not all input, of course, is channeled through all the subsystems. A reflex response to an input may involve only a small number of subsystems. Complex decisions may make use of the whole range of individual subsystems. Throughout the system there is a continual and cumulative loss of information. One important aspect of the response of biological systems, as both Gerard (2) and Piatt (3) have recog- nized, is amplification. That is, the energy in the signal is very small compared to the energy in the response. At the same time there is a loss of dimensionality from input to output in all am- plifiers, which must select in order to amplify, since they have limited power available. There is distortion of information at each boundary that is crossed, and furthermore, noise alters the signal. The sense organ reacts only to part of the information present in the environment. The perceiver screens and organizes the input further, and in the process ignores that part of it which 306 Information Storage and Neural Control seems irrelevant. Channel capacity may be lower than the capacity of the components. When the behavior is organized the central decision represents only a small part of the original input in- formation. OVERVIEW OF RESEARCH ON SUBSYSTEM FUNCTIONS For an input to cross the boundary into the system its energy must be great enough to cause the external transducer to fire. The signal-to-noise ratio also must be sufficiently high. Other environmental conditions which may influence the permeability of a boundary to an input are competing signals, and the simi- larity of the background to the signal — for example, a white stimulus on a white ground may not be detected. 
McCulloch gave an example of this in the experiment he mentioned in his paper on detecting a signal against monaural and binaural background noise. Classical psychophysics in its study of the threshold has tended to ignore some of the important aspects of signal detectability or to assume, sometimes incorrectly, that these other things are held constant. Swets, Tanner and Birdsall (4) have pointed out that this classical concept of the threshold is unreasonable because it ignores the control which is exerted by sensory and psychological variables. That is, it neglects the participation of subsystems other than the boundary. The characteristics of human sense organs as input transducers or internal transducers may be specified just as the characteristics of electronic transducers: by transfer function, band width, phase shift, and signal-to-noise ratio. In these respects various sensory subsystems or modalities perform quite differently. The transfer function of a transducer refers to its ratio of output to input. In the visual system this is the relationship between intensity of light and reported brightness. In the auditory system it is the relationship between intensity of sound and reported loudness. Some engineers in designing apparatus for man-machine systems have mistakenly assumed that the cu'-ve of perceived in- tensity rises linearly with the increase in strength of the stimulus. As Stevens has pointed out, the subjective intensity increases as a The Individual as an Information Processing System 307 power function of the stimulus magnitude. The exponent of this function for loudness is about 0.3, while it is about 3.5 for the apparent intensity of electric current applied to the fingers. Stevens (5) notes that: "In three modalities investigated . . . transducers . . . have three radically different operating charac- teristics. The slow growth of loudness (exponent less than one) suggests that the ear behaves as a 'compressor' . . . This com- pressor action probably helps to make it possible for the ear to respond to an enormous range of sound pressures — range of millions to one. The apparent intensity of vibration on the finger tip grows almost linearly with vibration amplitude — as though the transducer were approximately linear. The effective range of vibration amplitudes to which the finger is sensitive is of the order of hundreds to one. (Incidentally, vibration on the arm does not follow a simple power law.) The steep operating characteristic for electric shock suggests the action of an 'expander' of some sort; doubling the current increases the sensation about tenfold. And correlated with this rapid expansion is a narrow operating range of stimuli of the order of only tens to one." In input transducers the output signal usually differs from the input signal in bandwidth characteristics. For instance, light of different wave lengths and sound of different frequencies are subjectively reported as various colors and pitches. Sensitivity over the range of light waves and sound waves is not uniform. Also input transducers are active over only a limited range. There are light waves above and below the visible spectrum and sounds which the human ear cannot hear. Phase shift refers to the lag in phase of the output signal over the input signal. Input transducers differ in speed of transmission. 
For example, sound waves travel through the atmosphere quite slowly but are transmitted rapidly through the auditory organ, while light waves, which reach the eye very speedily, are processed through a slow input transducer. Input transducers also differ in the amounts and kinds of noise they insert into the signal. Channel and Net Broadbent (6) suggests that the whole individual may be re- garded as a single channel which performs a selective operation 308 Information Storage and Neural Control upon the input, stores part of it, filters it, and transmits it over a limited-capacity channel to long-term storage, to the output trans- ducer, or to both. Here Broadbent includes all the components of the individual system in one channel, which is one possible way to view the system. This results, however, in ascribing to channel activity some things which we have analyzed as subsystem functions. Within the channel he analyzes components which filter, store, decide, and so forth. Quastler (7) analyzes the activity of specific channels in terms of speed, diversity, order of complexity, range, and other factors. Electronics engineers measure in channels the variables of process- ing time, channel capacity, bandwidth, signal-to-noise ratio, and phase shift or lag. These can all be usefully applied to the animal or human being. The processing time through neurons is brief compared to the total response time. The duration of neural propagation of an impulse differs with the length of the channel and the type and size of the neuron. Longer transmission delays occur at the per- ceiver and the decider. Channel capacity is a valuable concept in behavior theory. Broadbent (8) says: "... perhaps the point of permanent value which will remain in psychology if the fashion for communication theory wanes, will be the emphasis on problems of capacity. The latter, in communication theory, is a term representing the limiting quantity of information which can be transmitted through a given channel in a given time . . . the fact that any given channel has a limit is a matter of central importance to communication engi- neers, and it is correspondingly forced on the attention of psy- chologists who use their terms." Quastler (9) was interested in finding how much information man can process at best. His research, therefore, was designed so that neither the visual input nor the muscular output were in any way hampered. In these tasks all inputs came from a single source, all output choices were mechanical, and all displays and operations were thoroughly familiar. He studied rates at which information is transmitted by reading, typing, playing the piano, doing mental arithmetic, or assimilating by glancing at displays The Individual as an Information Processing System 309 of letters, playing cards, scales, or dials. His research was designed to establish the principal factors limiting performance. With these experimental conditions, the performances which were obtained were at peak rates which could have been achieved only under favorable conditions. Quastler (10) says: "We find that people can make up to five to six successful associations per second, can transmit about twenty-five bits per second, can operate efficiently over a range of about thirty possible values and can assimilate some fifteen bits at a glance. 
We do not expect that they will reach such perfonnance levels with every kind of activity; in fact, we know that they usually do not.'' In bits per second, he and his colleagues found peak performances for piano playing of twenty-two bits; for reading aloud, twenty-four bits; and for mental arithmetic, twenty-four bits. They concluded that the peripheral input mechanisms were not responsible for limitations upon information processing. Quastler (11) notes: "The capacity of the optic nerve is many orders of magnitude higher than twenty or forty bits per second ; a much wider range of symbols could be accommodated with the resolving power of the retina. As to speed limitations, it is known that about three symbols are grouped in the act of reading, and that about four such groups can be assimi- lated in a second; this gives twelve syinbols per second, con- siderably more than the highest useful speed in typing or piano playing. On the output side, it is easy to see that the limitations of the actual speed, both alone and in combination with precision, cannot be attributed to mechanical difficulties. In all tests, ob- served speeds would have been much improved by rehearsing. Thus the mechanisms which limit the observed performance must be connected with the speed of processing information." Signal-to-noise ratio can be important in the specification of channels where minimal energies are involved. Barlow (12) has shown that the limiting factor in the absolute threshold for vision is fluctuation in the noise in the visual pathways. The Decoder If information is to be used by the individual, it must be suitably coded. That is, it must be in a language or signal system which he can understand. Deininger and Fitts (13) have experimented 310 Information Storage and Neural Control upon the relationship between the input code and performance. They found that an inefficient or inadequate code can retard the transmission of information in perceptual-motor performance. Decoding time, therefore, can make a measurable difference in information-processing rate. Coding, of course, is important in the formation of concepts since this involves the classification of various things under categories which ignore differences among them and emphasize similarities. Brown and Lenneberg (14) found that when subjects were asked to name colors as quickly as possible, the average reaction time was shorter and the degree of agree- ment among subjects was higher when there was a word which described the color. When the color had no special name but had to be called "greenish-yellow," or something like that, there was hesitation and inconsistency. Their matrix of intercorrelations yielded a general factor which they called codability. There is a large literature on semantic problems of coding. The contributions of the learner to information processing are both more familiar and less easy to distinguish from other functions than the more peripheral processes. Some have tried to make learning theory cover nearly all of psychology. There has been much research on learning, but little strictly in terms of infor- mation theory, in which it should be viewed as the associating of two or more signals. Competing theories about the memory have been treated in detail by other speakers in this symposium. Just how information is stored over time, and how it is searched for, still is not known. Deciding Deciding, as we have said, goes on in each subsystem, as well as at the system level. 
Much of psychology concerns choices and judgments of various sorts — psychophysical judgments, sociometric choices, economic and social decisions, and so forth. Recent work in game theory, utility theory, statistical decision theory, and group effects on judgments of their members is clarifying the processes of complex decisions. In complex reaction-time experiments it is possible to calculate accurately the amount of time which is added to the response time when a choice of behaviors is involved. This time falls to zero as the task is better practiced and the choice becomes automatic (15). The Individual as an Information Processing System 311 We have been interested in one aspect of channel capacity which can be studied at five levels of living systems. What happens at each level when a channel is overloaded? INFORMATION INPUT OVERLOAD From a review of the literature we were able to draw a curve which appeared to apply at each level. The general shape of this performance curve shows the output (in bits per second) rising as a more or less linear function of input until channel capacity is reached, then leveling off and finally decreasing in the con- fusional state. This cross-level generality appeared fairly con- vincingly in the empirical work of others, even though it was not recognized as such by them. At the same time, we also found suggestions as to hierarchical differences among the levels. The overall impression of the findings is that channel capacity decreases from cells to organs, to individuals, to groups, to social organi- zations. Processes of adjustment appear to be comparable at different levels. We have hypothesized that there are limited numbers of such adjustment processes which behaving systems can enlist as stresses on them increase. The following adjustment processes, or mech- anisms of defense, seein to be used by living systems against the stresses of information input overload. Not all living systems have all these inechanisms. The smaller systems, like neurons, appear to have fewer than the larger systems, like societies, which not only have all of them but also have complicated variations of them. These appear to be the fundamental mechanisms, but this may not be an exhaustive list: 1 ) Omission, which is simply not processing information whenever there is an extreme of overload; 2) Error, which is processing incorrectly, then not making the necessary adjustment; 3) Queuifig, which is delaying responses during peak load periods and then catching up during lulls; 4) Filtering, which is systematic omission of certain categories of information, according to soine priority scheme; 312 Information Storage and Neural Control 5) Approximation, which is an output mechanism whereby a less precise or less accurate response is given because there is no time to be precise; 6) Multiple channels, which parallel transmission subsystems that can do comparable tasks at tiie same time and consequently to- gether can handle more information than a single channel can transmit alone; 6a) Decentralization, which is a special case of this; and, finally there is 7) Escape, which is leaving a situation entirely or taking any other steps that effectively cut off the flow of information. 
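The performance curve described above, rising with input, leveling off at channel capacity, and finally falling in the confusional state, can be mimicked with a deliberately invented model. Nothing below is taken from the authors' data except the six bits per second capacity, which echoes the individual-level figure reported later; the degradation law is an arbitrary choice made only to reproduce the qualitative shape.

```python
import numpy as np

def toy_output(input_rate, capacity=6.0):
    """Invented performance model: up to capacity the channel keeps pace; beyond
    it, processing stays at capacity but a growing share is lost to error and
    confusion, so measured output eventually falls."""
    processed = min(input_rate, capacity)
    overload = max(0.0, input_rate - capacity)
    usable_fraction = 1.0 / (1.0 + 0.05 * overload**2)   # arbitrary degradation law
    return processed * usable_fraction

for rate in np.arange(0.0, 16.0, 2.0):
    out = toy_output(rate)
    bar = "#" * int(round(4 * out))
    print(f"input {rate:4.1f} bits/s | output {out:4.2f} bits/s {bar}")
```

The printed curve rises linearly, flattens near six bits per second, and then declines, which is the cross-level shape the chapter describes; the adjustment processes listed above are, in effect, strategies for postponing the decline.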
Thus we have searched for quantitative similarities and differences among living systems at all levels in the way they react to information input overload, and have given special attention to a) performance characteristics of a system as an information processing channel, and b) associated adjustment processes used to relieve stress on the information processing subsystem and maintain performance.

Our original intention in approaching the problem of overloading living systems with information was to study a single variable — the input-output rate relationship — postulating a formal identity of this function in channels at all levels of living systems. But this proposition turned out to involve numerous others about many variables representing other aspects of systems. The whole problem ramified in a fascinating way.

We built apparatuses and designed procedures which we hoped would provide stable conditions for collecting performance data from the systems we selected for study, attempting to hold constant as many of the variables not under investigation as possible. We were not concerned primarily with obtaining the maximum possible transmission rates from our systems, but rather attempted to create a stable situation in which we could test our overload proposition and be sure we knew when overload occurred. Later we could study as independent variables those functions which change a given system's maximum channel capacity.

Since information bits per second had been used by others in researches at all five levels, we believed this to be a suitable measure of performance. We realized that at each level we would encounter a complex statistical problem if we used limited sequences of inputs. We also met other problems in calculating bits, particularly in knowing what code was employed at the cell and organ levels, and in knowing the exact size of the implicit ensemble at all levels. We hope our methods at least begin to cope with these issues.

Cellular Research

A stimulator was constructed which could administer pulses to a neuron at various average rates, and at various intensities at each of these. A single fiber in the sciatic nerve of the frog was isolated by microdissection, and was stimulated at the rates of 100, 200, 400, 600, 800, and 1,000 pulses per second, using four different values of stimulus voltage (1.5, 2.0, 2.5, and 3.0 times the threshold value). We recorded the output of the fiber thus stimulated from microelectrodes in the same cell and across a synapse in the next cell. As the input rate was increased, the fiber eventually ceased to follow every input and started missing some.

Among the fibers which we have studied, three different types of responses have been observed. Some fibers, when they reach the point at which they can no longer follow every stimulus, start skipping every other stimulus. As the rate is further increased they respond only to every third or fourth stimulus in a regular fashion. Other fibers skip in a perfectly random manner, so that at a given rate the number of pulses skipped will have a Poisson distribution. Still other fibers transmit several adjacent stimuli and then fail to transmit any stimuli at all for a long period, after which they again fire repeatedly. Sometimes all three types of functions are found in the same fiber at different times and at different rates of stimulation. Two other phenomena were also noted.
As the rate of stimulation was increased, there was a fall in the amplitude of the response and a decrease in the lag between the occurrence of the input and the start of the response pulse. The amplitude decrease is probably related to the energetics of membrane recovery; the lower recovery time leads to a lower potential. The decrease in latency must have a similar explanation; it makes the fiber able to cope with a greater overload, enabling it to follow at much higher rates than would otherwise be possible. Our findings are in harmony with those of others who have worked in this field.

In order to measure the maximum information transmission capacity of a nerve fiber which employs pulse-interval coding, we must be able to stimulate the neuron with an input source which can deliver trains of two or more pulses at different intervals. This follows from information theory, since in an evenly spaced pulse train there is no uncertainty about the time of arrival of the next pulse, and hence no information. Maximum uncertainty is available only in a random source in which the pulse is equally likely to occur at any time. It is also necessary to determine the minimum interpulse interval which can be discriminated by the neural system. We can determine this by measuring the standard deviation of the latent period in a fiber, or its "jitter." We proceeded, therefore, to use an electronic timer, accurate to 1 microsecond, to measure the jitter of single sciatic nerve fibers of the frog, studying variation as the time between two adjacent pulses was reduced. This turned out to be of the order of 2-5 microseconds, and adding a third pulse before the other two did not affect this value. Using a mathematical model developed by Rapoport and Horvath (16), we were able to calculate the curve of maximum channel capacity of such a neuron at various input rates. We found that the output increased as a function of the input up to 4,000 bits per second (an astonishingly high capacity for such a small system — assuming optimal pulse-interval coding); then leveled off and decreased, thereafter, as the input rate increased. This performance curve is shown in Figure 1.

As for adjustment processes, the skipping of pulses which we found at the higher input rates was, of course, omission. The lower output intensities could be called erroneous processing if they were not intense enough to cross the threshold of the neuron on the other side of the synapse. That threshold, incidentally, can be considered a sort of filtering. For other neuronal adjustment processes to information overload, we have no evidence.

Organ Research

We used the same electronic timer to stimulate the optic nerve of the white rat, recording the output from a macroelectrode on …

Fig. 1. The channel capacity of a model neuron calculated by continuous information theory for a Gaussian noise distribution, plotted against average input rate in pulses per second.

Fig. 10. Average utilization of adjustment processes (per cent filtering and average seconds of queuing) by teams in the Social Institution Experiment, plotted against input rate in bits per second.

… research and had greatly improved their transmission rates since their earlier trials. Four adjustment processes were used by the teams in these studies — omission, error, queuing, and filtering.
The experimental instructions prevented use of approximation and multiple channels. Utilization of all adjustment processes was measured in percentages, except for queuing, which was measured in average number of seconds of delay (Fig. 10).

An associated study directed by Meier (17) dealt with the effects of overloads of demands upon the Undergraduate Library of The University of Michigan at periods of peak use. The inflow of students and faculty into this library, each person with special needs, is not an overload of energy or matter, for the library is never actually physically unable to hold them. The demands upon members of the library staff for service, however, can constitute what is essentially an information overload. Participant observation and other operations research procedures were employed to find how much the library was used at top load periods and what changes occurred in library functions at such times. Since no significant difference in average time of getting a book was found between periods of light and of heavy use, the library may not have been under real performance overload at any time. Rough efforts were made to calculate the number of bits of information flowing through the library. It was determined that the average book title in the card catalog contains about 135 bits of information, and that the average reader processes between 50,000 and 90,000 bits per hour of reading.

Perhaps the most significant finding by Meier and his colleagues was that a series of adjustment processes occurred, or could occur, in the library to cope with the overload. He recognizes the similarity of his list to the one presented earlier in this chapter. However, he found more complex forms of these adjustment processes, or "policies" as he calls them, in this complicated social institution with its many subsystems carrying out numerous activities. His list follows: queuing; priorities in queues and backlogs; destruction of low priority inputs (filtering); omission; reduction of processing standards (approximation); decentralization (a special case of use of multiple channels); formation of independent organizations near the periphery (multiple channels); mobile reserve (multiple channels); rethinking procedures; redefinition of boundaries of the system; escape; retreat to formal, ritualistic behavior; and dissolution of the system with salvage of its assets. Whether there are new adjustment processes here, or simply special cases of those we have listed, is a question for debate; but that such adjustment policies are used, there can be no question.

Summary of Our Research

For five levels of organization, or systems, viewed as information processing channels, the following propositions appear to have support:

a) When information input in bits per second is increased, the output at first follows the input more or less as a linear function, then levels off at a channel capacity, and finally falls off toward zero. We have yet to determine whether the larger systems have a cut-off mechanism which prevents the final fall in output. Though such a mechanism may delay this fall, the weight of evidence suggests that it must finally occur.
This decrease of information output rate in living systems is not the result of destruction of the system by an overload of the energy which conveys the information, because 1) the process is reversible — decrease of input rate immediately raising output rate back to channel capacity, and 2) final irreversible change of such systems by energy input undoubtedly occurs when the energy is orders of magnitude greater than that involved in informational overload.

b) There is a hierarchical, cross-level difference in maximum channel capacity. Assuming pulse-interval coding, we found this to be of the order of 4,000 bits per second for neurons in the frog sciatic nerve, and about fifty bits per second for a single channel in the visual nervous system of the rat. It was six bits per second for the individual, three bits per second for a single-channel group, and three to four bits per second per channel in a small social institution with about the same number of components in each channel as there were in the group. Apparently the more components there are in an information processing system, the lower is its channel capacity. There are several reasons for this. Two of the most obvious are that recoding of information is necessary at the border between each component and the next, and that such recoding always results in loss of a certain amount of information. Moreover, if there are n components in any system, one must have a lower channel capacity than the others, and the statistical probability of there being such a slow component is always greater as n increases. This sluggish component constitutes a bottleneck, since no channel is faster than its slowest component.

c) Several of the adjustment processes are used by all of these systems, the use increasing as input rate rises.

d) Fewer adjustment processes seem to be available to the systems at the lower levels. Those employed at the higher levels appear to be more complex as well as more numerous, although their fundamental similarity to the lower level processes is clear.

Of course the findings for other types of systems at each of the levels might be different in significant ways from our findings in the particular systems we chose to study. The goal of these projects was to determine whether a cross-level formal identity could be confirmed for any examples of systems at different levels. It is apparent that interesting insights arise when not only individuals, but all living organisms and organizations are viewed as information processing systems.

REFERENCES

1. Barlow, H. B.: Sensory mechanisms, the reduction of redundancy and intelligence. In: Mechanisation of Thought Processes, Proceedings of Symposium at National Physical Laboratory, Teddington, England. London, Her Majesty's Stationery Office, 1959, p. 542.
2. Gerard, R. W.: Organism, society and science. Sci. Monthly, 50:340-350, 403-412, 530-535, 1940.
3. Platt, J. R.: Amplification aspects of biological response and mental activity. Amer. Sci., 44:180-197, 1956.
4. Swets, J. A., Tanner, W. P., Jr., and Birdsall, T. G.: The evidence for a decision-making theory of visual detection. Technical Report No. 40, Electronic Defense Group, University of Michigan, Ann Arbor, April, 1955.
5. Stevens, S. S.: Cross-modality validation of subjective scales for loudness, vibration, and electric shock. J. Exp. Psychol., 57:201-209, 1959.
6. Broadbent, D. E.: Perception and Communication. New York, Pergamon Press, 1958, p. 297.
7. Quastler, H.: In Human Performance in Information Transmission. Control Systems Laboratory, Report No. R-62. Urbana, University of Illinois, 1955.
8. Broadbent: op. cit., p. 5.
9. Quastler: op. cit.
10. Quastler: ibid., p. 62.
11. Quastler: ibid., pp. 62-63.
12. Barlow, H. B.: Increment thresholds at low intensities considered as signal/noise discriminations. J. Physiol. (London), 141:337-350, 1958.
13. Deininger, R. L. and Fitts, P. M.: Stimulus-response compatibility, information theory, and perceptual-motor performance. In: H. Quastler (Ed.), Information Theory in Psychology. Glencoe, The Free Press, 1955.
14. Brown, R. and Lenneberg, E.: A study in language and cognition. J. Abnorm. Soc. Psychol., 49:454-462, 1954.
15. Mowbray, G. H. and Rhoades, M. V.: On the reduction of choice reaction times with practice. Quart. J. Exp. Psychol., 11:16-22, 1959.
16. Rapoport, A. and Horvath, W. J.: The theoretical channel capacity of a single neuron as determined by various coding systems. Inform. and Control, 3:335-350, 1960.
17. Meier, R. L.: Social change in communications-oriented institutions. Mental Health Research Institute, Preprint No. 10, March, 1961.

CHAPTER XIV

INFORMATION PROCESSING IN THE TIME DOMAIN

Neil R. Burch, M.D. and Harold E. Childers

This paper briefly outlines the work we are conducting in the Department of Psychiatry, Baylor University College of Medicine, and in the laboratories of the Houston State Psychiatric Institute. The basis for this research is the theory that a special case of analysis in the time domain has something to offer, both in time resolution and in economy of information processing, that cannot be readily obtained from frequency analysis or from more conventional time sampling procedures. The analytical process to be described we have called period analysis (1).

Given an amplitude function distributed in time, there are a limited number of questions that may be asked of the function to yield an analysis or to undertake data reduction. Consider the following four cases: 1) one may focus on the amplitude and ask the question "how much" over a time, T; one may focus on both time and amplitude and ask the question "how much" at particular points in time, either 2) points at fixed intervals or 3) points related to an event; finally, 4) one may focus on selected events and ask the question "when."

A theorem in information theory tells us that if we take this amplitude distribution in time and sample it every so often, we will retain complete information about the signal. Presented more formally, the theorem reads, "If a function G(t) contains no frequencies higher than W cycles per second, it is completely determined by giving its ordinates at a series of points spaced 1/2W seconds apart, the series extending throughout the time domain" (2) (Case 2). A corresponding theorem for sampling in the frequency domain* requires exactly the same number of sampling points plus one, or 2TW + 1, where T is the duration of a signal, W is the spectral bandwidth, and the sampling points are spaced at fixed intervals.

*If Ω(ω) represents the spectrum of a function G(t), which is zero everywhere except in the range T1 < t < T2, then Ω(ω) is exactly determined for all values of ω by giving its values at a series of points 1/(T2 - T1) cycles per second apart in frequency, the series extending throughout the frequency domain (3) (Case 1).
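The sampling theorem just quoted can be demonstrated numerically. The band limit of 100 cycles per second anticipates the EEG example that follows; the test components at 7 and 60 cycles per second and the one-second record are arbitrary choices for this illustration.

```python
import numpy as np

W = 100.0    # assumed band limit, cycles per second
T = 1.0      # record length, seconds

def g(t):
    """A band-limited test signal: components at 7 and 60 cycles per second."""
    return np.sin(2 * np.pi * 7 * t) + 0.5 * np.cos(2 * np.pi * 60 * t)

for fs in (80.0, 2 * W):                          # below and at the 2W sampling rate
    t = np.arange(0.0, T, 1.0 / fs)
    spectrum = np.abs(np.fft.rfft(g(t)))
    freqs = np.fft.rfftfreq(len(t), 1.0 / fs)
    top_two = sorted(float(freqs[i]) for i in np.argsort(spectrum)[-2:])
    print(f"sampling at {fs:5.0f}/s -> strongest components appear at {top_two} c/s")
```

Sampled at 2W ordinates per second the 60 cycle component is recovered where it belongs; sampled below that rate it masquerades as a 20 cycle component, and the information in the record is no longer complete.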
If we can say in the electroencephalographic (EEG) signal, as an example, that the highest frequency which carries neurophysiological information is 100 cycles per second, we know that we must sample at least 200 times a second — perhaps much more often if there is considerable noise in the system — in order to retain all of the information. While we are now satisfied that in both the time and frequency domain we may ask of the EEG signal "how much" at 200 points per second, we have also posed ourselves a massive problem in data handling. Further, we have not learned anything of the optimum analytic procedure, since to retain all information in the original signal "defeats the very purpose of analysis, which is to abstract and emphasize only significant changes" (4).

There remain the case of coding "how much" related to a selected event and the even simpler case of asking only "when" the event occurs. In both of these remaining cases, the coding event must be defined and the assumption made that all 200 points per second in the EEG do not contain the same amount of information. Let us for a moment suppose that some of these points contain ten times the information of other points. Then we may drop the low information points, retain the high information points, and sacrifice a unique characterization of the wave for a good approximation. Such a process would be highly economical in terms of handling the data. The critical problem is, of course, the generation of the coding event which acts as a "metasignal" in the sense in which Gregory Bateson used the term for us earlier.

In the first paper of this symposium, Bernard Saltzberg introduced you to Maxwell's demon. I would like to propose another hypothetical information demon, one that might look at each of our points and say, "We'll take those," or "No, that's low information, drop that." If we speculate that our demon is extremely conservative and expects the signal to be linear as a function of time, a straight line determined by two or more points, then any point that agrees with this assumption is a low information point. The demon has predicted that the signal will not change from positive to negative values, will not change its sense of positive-negative direction, will not even change its sense of curvature. The high information points now become zero points, minimax points, and points of inflection in the primary signal. The coding points generated are at the baseline cross of the primary signal, of its first derivative, and of its second derivative.
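A minimal sketch of the coding events just defined, using discrete differences in place of the analog differentiating networks. The test mixture of a 9.5 cycle per second wave with a weaker 36 cycle per second component echoes the synthetic function used later in this paper; the sampling rate is an arbitrary choice for the illustration.

```python
import numpy as np

fs = 1000                                   # assumed sampling rate, samples per second
t = np.arange(0, 1.0, 1 / fs)
signal = np.sin(2 * np.pi * 9.5 * t) + 0.3 * np.sin(2 * np.pi * 36 * t)

def zero_crossings(x):
    """Indices where the waveform crosses the baseline (sign change)."""
    s = np.sign(x)
    s[s == 0] = 1                           # treat exact zeros as positive
    return np.where(np.diff(s) != 0)[0]

d1 = np.diff(signal)                        # discrete stand-in for the first derivative
d2 = np.diff(signal, n=2)                   # and for the second derivative

major = zero_crossings(signal)              # zero points of the primary ("major period")
minor1 = zero_crossings(d1)                 # extrema of the primary (minimax points)
minor2 = zero_crossings(d2)                 # inflection points of the primary

print(f"coding events in one second: primary {len(major)}, "
      f"first derivative {len(minor1)}, second derivative {len(minor2)}")
print(f"versus {len(signal)} raw amplitude samples in the same second")
```

The three event trains together contain far fewer points than the raw amplitude record, which is the economy the demon is after, while still marking where the wave changes sign, direction, and curvature.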
We would also conclude that the wave shape of our EEG signal is rich in semantic information as compared to characterizations in the frequency domain such as frequency or power spectra. Defining the coding points for amplitude sampling as the baseline cross of the primary and its first and second derivatives allows us to take discrete data in a definite but not uniformly spaced pattern (Case 3). This, on the average, should result in fewer sampling points than the folding, or Nyquist, frequency requirement (5) discussed previously. Period analysis is a further simplification of this general process in that the amplitude of the function is not sampled at all. The theoretical justification for this approach has been developed in terms of the Gram-Charlier series (6). The remainder of this paper will explore period analysis as a special case (Case 4) of information processing in the time domain. The questions to be asked concern retention of both statistical and semantic information during period analysis of several bioelectronic signals.

Figure 1 illustrates the characteristics of the first and second derivatives. The function f(x) in the upper right hand corner of the figure represents an evoked potential which feeds into a differentiating network to yield the first derivative, f'(x). The first derivative, through an identical differentiating network, gives the first derivative of the first derivative, or second derivative, f''(x), of the primary evoked potential. These functions, after Lorente de No (7), illustrate the external action potential of bullfrog alpha fibers and its first two derivatives. It is clear that the high frequency components of the primary evoked potential are greatly accentuated by double differentiation.

Fig. 1. Electronic Parameters of the Mathematical Derivative. The 90° phase shift and linear doubling of amplitude per octave is illustrated as the electronic definition of a first derivative. The sharply increasing amplitude of the second derivative with increase in frequency emphasizes the accentuation of high frequency components. The three functions on the right of the figure graphically illustrate the effect of derivative processing.

The electronic definition of a derivative is the same as the mathematical definition except that it is couched in different parameters. The phase shift required in a sine wave is 90° for the first derivative and 180° for the second derivative. The important parameter for our purpose is the amplitude characteristic as illustrated in Figure 1. Given a mixed sine wave made up of equal amplitude twenty cycle per second and forty cycle per second components, the first derivative will yield twice as much amplitude for the forty cycle per second component because it is twice the frequency of the twenty cycle per second component. This linear relationship holds throughout the band pass range. The second derivative multiplies the forty cycle component by a factor of 4, the eighty cycle component by a factor of 16, etc., in this example. Figure 2 illustrates this derivative processing as it is applied to the electroencephalogram. The faster frequency components present in a complex primary wave become full-fledged baseline crosses because of the relative accentuation of the faster frequencies.
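The frequency weighting just described can be checked numerically. In the sketch below (Python with NumPy; the sample rate and the equal-amplitude 20 plus 40 cycle per second mixture are assumed for illustration), the 40-cycle component comes out of a numerical first derivative with roughly twice the amplitude of the 20-cycle component, and out of the second derivative with roughly four times.

    import numpy as np

    fs = 2000.0                               # assumed sampling rate, samples per second
    t = np.arange(0, 1.0, 1.0 / fs)
    x = np.sin(2 * np.pi * 20 * t) + np.sin(2 * np.pi * 40 * t)   # equal-amplitude mix

    d1 = np.gradient(x, 1.0 / fs)             # numerical first derivative
    d2 = np.gradient(d1, 1.0 / fs)            # numerical second derivative

    def component_amplitude(y, f):
        """Amplitude of the f-cycle-per-second component from the discrete Fourier transform."""
        spectrum = np.fft.rfft(y) / (len(y) / 2.0)
        freqs = np.fft.rfftfreq(len(y), 1.0 / fs)
        return abs(spectrum[np.argmin(abs(freqs - f))])

    for y, label in [(x, "primary"), (d1, "first derivative"), (d2, "second derivative")]:
        ratio = component_amplitude(y, 40) / component_amplitude(y, 20)
        print(label, "40/20 amplitude ratio ~", round(ratio, 2))   # ~1, ~2, ~4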
Period analysis proceeds by generating square waves at the baseline cross of the primary, the first derivative and the second derivative. As can be seen in Figure 2, the square wave train designated as major period reflects the dominant rhythm of the analog primary. The second derivative square wave train, referred to as the minor period, carries information reflecting superimposed fast activity, desynchrony, and waveshape.

Fig. 2. Pulse Width Conversion: EEG. The process of period analysis applied to the left parieto-occipital electroencephalogram. The 60 cycle per second artifact superimposed on the original primary trace is markedly reduced by the rejection notch of the selective frequency amplifier, as seen in the filtered primary. The "fragmented" appearance of the second derivative minor period results from the high inertia pen system which cannot follow a true square wave at these frequencies. Paper speed 60 millimeters per second.

It is of particular importance to know how much wave shape information is retained or lost in this processing, because it is probably the wave shape which triggers recognition in the human computer in clinical electroencephalography. We propose that much of the wave shape information is retained in the three square wave trains as they relate to one another in time, as we have attempted to illustrate in Figure 3. The top trace is a synthetic function made up of a "dominant" nine and one-half cycle per second sine wave mixed with a lower amplitude "superimposed" fast frequency of approximately thirty-six cycles per second. Again we see the primary square wave, or major period, reflecting the "dominant rhythm" and the second derivative square wave, or minor period, rather clearly reflecting the fast component. If these three trains of square waves are smoothed individually by an integration filtering operation, mixed, and smoothed once more, the reconstituted analog signal may be written out as shown on the bottom trace. In a way, this reconstitution is an inversion of the operations which generated the square waves in the first place. There is a rather striking resemblance between the reconstituted primary and the original signal, although careful inspection will reveal some discrepancies in both phase and amplitude. However, the wave shape, by and large, has been retained.

Fig. 3. Mixed Sine Function. A 9½ cycles per second sine wave mixed with a lower amplitude sine wave of approximately 36 cycles per second simulates a "dominant alpha" with "superimposed fast frequency components." Smoothing, mixing, and smoothing of the three square wave trains result in the reconstituted signal of the bottom trace. Similarity between reconstituted and original signal suggests that wave shape information is retained by the processing.

Fig. 4. Left Parieto-Occipital EEG. Period analysis of the electroencephalogram yields the three square wave trains shown. The three square wave trains are smoothed, mixed and smoothed to reconstitute the wave forms in the bottom trace. The complex EEG signal retains enough information in the processing to allow clinical interpretation.
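The generate-smooth-mix-smooth loop of Figures 2 through 4 can be imitated digitally. The sketch below (Python with NumPy) is only an approximation under assumed smoothing windows and mixing weights; the authors' system was analog circuitry, and none of these constants comes from the original text.

    import numpy as np

    fs = 500.0
    t = np.arange(0, 4.0, 1.0 / fs)
    primary = np.sin(2 * np.pi * 9.5 * t) + 0.3 * np.sin(2 * np.pi * 36.0 * t)

    def square_train(y):
        """Unit square wave that follows the baseline crosses of y."""
        return np.sign(y)

    major = square_train(primary)                              # baseline cross of the primary
    first = square_train(np.gradient(primary))                 # baseline cross of the first derivative
    minor = square_train(np.gradient(np.gradient(primary)))    # second derivative (minor period)

    def smooth(y, n):
        """Simple moving average, standing in for the 'integration filtering operation'."""
        return np.convolve(y, np.ones(n) / n, mode="same")

    # Smooth each train, mix them, and smooth again (assumed weights and window lengths).
    mixed = 1.0 * smooth(major, 25) + 0.5 * smooth(first, 13) + 0.3 * smooth(minor, 7)
    reconstituted = smooth(mixed, 25)

    # Crude similarity check between original and reconstituted wave shape.
    print("correlation with original:", round(np.corrcoef(primary, reconstituted)[0, 1], 2))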
If the process does this well on a simple wave, what may be expected from the rather more complex signal of the electroencephalogram? Figure 4 shows that the electroencephalogram is not reconstituted as successfully as the simple mixed sine waves. Some of the amplitude modulation features are lost, the envelope is not as clearly evident on the reconstituted signal, and some phase shift is apparent as distortion in a number of the waves. Here again, however, the resemblance of the reconstituted wave to the original one is rather striking. The clinical electroencephalographer would probably interpret the reconstituted EEG in much the same way as he would the original, and would render much the same clinical impression after reading the reconstituted forms.

Interpretation of the major and minor periods may be accomplished in the same way as interpretation of an EEG but with less equivocation. Anyone who has attempted to reduce, quantitatively, long stretches of EEG record by any form of hand analysis will appreciate the significance of this. The square waves may be further processed and displayed in several different ways, depending on the physiological event under investigation. One system we have used quite extensively distributes the major and minor periods in a ten second epoch over ten bands in the major period and ten bands in the minor period. Table I defines the bands we are currently using in terms of equivalent frequency. A square wave of the same duration as the square wave generated by an eight cycle per second sine wave falls into band 4 of the major period and band 1 of the minor period. The major period bands approximate the frequency breakdown employed in clinical electroencephalography. Major period band 1 covers delta activity, band 2 is a "slow" theta, band 3 is a "fast" theta, etc. Minor period is distributed as ten cycle increments in each of the ten bands.

TABLE I
Band Distribution (as Equivalent Frequency) Currently Being Used in "Spectral Display" of the Square Wave Trains Generated by the Process of Period Analysis

           Major Period               Minor Period
           (Equivalent Frequency)     (Equivalent Frequency)
   Band    in cps                     in cps
     1     1.5-3.5                    1.5-10
     2     3.5-5                      10-20
     3     5-7.5                      20-30
     4     7.5-10.5                   30-40
     5     10.5-13.5                  40-50
     6     13.5-18.5                  50-60
     7     18.5-30                    60-70
     8     30-50                      70-80
     9     50-80                      80-90
    10     80-100                     90-100

The histogram-like display resulting from this band breakdown process, which we call spectral analysis, is illustrated in Figure 5. In both traces the small downward spikes are ten second epoch markers. The upward spikes are proportional in their amplitude to the percentage time occupied during the previous ten seconds by square waves which fell into a particular band. Reading from left to right between epoch markers, the first band of the minor period is the percentage time of all minor period square waves from one and one-half to ten cycles per second (equivalent frequency).

Fig. 5. Histogram-like Display Resulting from "Band Breakdown" of Square Wave Trains (Spectral Analysis: Epinephrine Effect). Small downward spikes indicate ten second epochs. The upward spikes are proportional in their amplitude to the percentage time occupied in the previous ten seconds by square waves which fell into a particular band (see Table I).
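Applying Table I amounts to converting each square wave's duration into an equivalent frequency (one over twice the duration), assigning it a band, and accumulating the per cent of each ten second epoch occupied by each band. A minimal sketch in Python; only the major period band edges from Table I are carried over, and the list-of-durations data format is an assumption.

    # Major period band edges in equivalent cycles per second, from Table I.
    MAJOR_EDGES = [1.5, 3.5, 5, 7.5, 10.5, 13.5, 18.5, 30, 50, 80, 100]

    def band_of(duration_s, edges=MAJOR_EDGES):
        """Band number (1-10) for a square wave of the given duration."""
        eq_freq = 1.0 / (2.0 * duration_s)       # an 8-cps half-cycle lasts 62.5 ms
        for band in range(1, len(edges)):
            if edges[band - 1] <= eq_freq < edges[band]:
                return band
        return None                               # outside the tabulated range

    def spectral_epoch(durations_s, epoch_s=10.0):
        """Per cent of a ten second epoch occupied by square waves in each band."""
        pct = dict.fromkeys(range(1, 11), 0.0)
        for d in durations_s:
            band = band_of(d)
            if band is not None:
                pct[band] += 100.0 * d / epoch_s
        return pct

    # An 8-cps square wave (62.5 ms) falls into major period band 4, as in the text.
    print(band_of(0.0625))                        # -> 4
    print(spectral_epoch([0.0625] * 80))          # band 4 occupies 50 per cent of the epoch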
The second band of ten to twenty cycles per second has been reduced from as high as 30 per cent time (full scale equals 50 per cent of full time) in portions of the pre-drug record to an insignificant percentage during the epinephrine effect. The higher frequency bands on the right of the spectrum have increased in amplitude some 30 to 50 per cent. This sort of change we refer to as a "shift to the right." It is characteristic of mild arousal such as may be simulated by five micrograms intravenous epinephrine per minute. The major period clearly shows a decrease in band 4 as the alpha activity is suppressed and replaced by higher frequencies and some delta activity.

We suspect that level of sleep can be followed quantitatively by a simple measure of the percentage delta time as well as more precisely by the spectral epochs. We say "suspect," since to prove that sleep can be fractionated into, say, fifty distinct levels would require an independent measure of the state of consciousness having the same order of resolution as the variable we are trying to demonstrate. Unfortunately, we are unaware of a performance measure or any other measure that allows quantitation of the state of consciousness or state of arousal with as high resolution as we think is possible with period analysis of the EEG.

Several bioelectronic measures other than EEG may be amenable to period analysis or to some modification of the process. Figure 6 shows the first derivative of the galvanic skin response (GSR) signal as it is employed to generate square waves coincident with the onset-to-peak-amplitude time in the primary wave. We regard the onset-to-peak-amplitude time as "active GSR time" since it is the time of depolarization of the membrane which is the effector site of this phenomenon. Automatic analysis of the GSR produces two parameters of real importance in psychophysiological interpretation. The number of square waves generated per epoch, perhaps ten seconds or perhaps five minutes, and the duration of the active GSR time for the given epoch are partially independent parameters which seem worth considering in the interpretation of states of arousal.

Fig. 6. Period Analysis of the Galvanic Skin Response. The first derivative of the primary GSR clearly shows the accentuation of "noisy" fast components. The effective response level or threshold for square wave generation "filters" out slow components of insufficient amplitude. Square waves of less than one second are filtered out by the "period filter" (see text).

Figure 6 also illustrates the use of a "period filter" (digital filter), which is analogous to a resonant frequency filter in the frequency domain, but which does not have the inherent disadvantages of time lag for energy buildup and decay. If the square wave is of less than five milliseconds duration in the case of EEG and of less than one second duration for the GSR, the period filter will not "pass" it for further processing. The GSR recording is often plagued by relatively high frequency noise from movement artifact. In a noisy record this high frequency artifact may produce as much as 85 per cent of the square waves. It is a great convenience to be able to "filter" them out.
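In digital form the period filter is nothing more than a duration threshold applied before further processing. A minimal sketch in Python, using the five millisecond and one second thresholds quoted above; the list-of-durations data format is an assumption.

    def period_filter(durations_s, min_duration_s):
        """Pass only square waves at least min_duration_s long."""
        return [d for d in durations_s if d >= min_duration_s]

    # EEG square waves: anything shorter than 5 milliseconds is treated as noise.
    eeg_waves = [0.002, 0.040, 0.0001, 0.110]
    print(period_filter(eeg_waves, 0.005))      # -> [0.040, 0.110]

    # GSR square waves: movement artifact shorter than one second is rejected.
    gsr_waves = [0.3, 2.5, 0.05, 4.0]
    print(period_filter(gsr_waves, 1.0))        # -> [2.5, 4.0]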
While the system just described is of real practical value in the analysis of the GSR activity of a single subject, it is indispensable to the "coincidence" analysis of GSR's from four subjects in group interaction. Figure 7 is a record of this type of analysis in which a special purpose digital computer is employed to measure the amount of coincidence between the active GSR time of the individual subjects in various combinations. The square wave of Subject A is compared for coincidence or overlap in time with the square waves from the GSR of Subject B, C, D, etc. In a four-man group, the coincidence of all four GSR's at one time is a relatively rare event. Such a "four of a kind" coincidence is usually the result of a rather strong stimulus which has been experienced by all four members.

Fig. 7. Coincidence Analysis of Group GSR. Lines two through five show the square wave trains generated by the baseline cross of the first derivative of the GSRs recorded from Subjects A through D. The coincidence, or overlap, of "active" GSR time between all pairs of subjects is shown in lines six through twelve. Three of a "kind" and four of a "kind" can be seen in lines thirteen through seventeen.

Consideration of coincidence analysis has led us tentatively to formulate a model of group interaction predicated on "overlap of value systems" among the individuals of the group. The specific GSR represents perception and reaction to a specific stimulus. Generally, it may be assumed that a stimulus has meaning or an "affect investment" for the individual if it produces a GSR. The GSR, as an indicator of "investment," is taken as a "yes-no" index without regard for the affect polarity. That is to say, the occurrence of a GSR reveals that the stimulus is "invested" but does not reveal whether the stimulus produces a positive affective response or a negative affective response. We are aware of some of the difficulties implicit in this rather simplified interpretation of the GSR in relation to the psychological variables of affect and investment. We suspect that under certain circumstances extreme high negative affect may "freeze" the GSR and wipe out all response. It may be that this sort of inhibition effect is an idiosyncratic response of the individual or that such a phenomenon may be seen more often in the schizophrenic than in the normal patient.

The group interaction is seen as a continuously moving field which presents a sequence of stimuli to all individuals in the group. Some individuals may not perceive a given stimulus or may derive no meaning from it. When two or more individuals perceive and are invested in a given stimulus, it is postulated that each will produce a GSR and that these GSR's will be approximately coincident. Insofar as two individuals have coincident GSR's to a finite but large stimulus array, our hypothesis would suggest that they have "overlap of value systems." In clinical group therapy we might ask the following question: "In the course of group therapy, will two patients in the same diagnostic category show more GSR coincidence as a pair than would one of these patients and a third patient in a different diagnostic category?" Also of interest is the total number of overlaps for a particular group.
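Coincidence analysis can be sketched as logical ANDs over the subjects' "active GSR time" trains laid out on a common time base. In the Python/NumPy sketch below the trains are random stand-ins and the time base is an assumed ten samples per second; the original used special purpose digital circuitry rather than a program.

    import numpy as np
    from itertools import combinations

    rng = np.random.default_rng(0)
    fs = 10.0                                   # assumed samples per second of the time base
    n = int(fs * 300)                           # a five minute epoch

    # Boolean "active GSR time" trains for Subjects A-D (random stand-ins for real records).
    active = {s: rng.random(n) < 0.15 for s in "ABCD"}

    def coincidence_seconds(subjects):
        """Seconds during which every listed subject is simultaneously 'active'."""
        overlap = np.ones(n, dtype=bool)
        for s in subjects:
            overlap &= active[s]
        return overlap.sum() / fs

    for pair in combinations("ABCD", 2):        # the pairwise overlaps
        print("".join(pair), coincidence_seconds(pair))
    for k in (3, 4):                            # three of a kind and four of a kind
        for combo in combinations("ABCD", k):
            print("".join(combo), coincidence_seconds(combo))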
At present we are able to analyze only four people at a time, even if the group is composed of 8 to 10 individuals; but unfortunately in a four-man group or in a four-man subgroup there are various degrees of "coupling" and communication between members that may change the number or degree of coincident GSR's. For such interpretation the total GSR population should be taken into account, because the number of overlaps must be some function of the total number of GSR's generated. In a very loosely coupled group, such as four people in four different rooms without communication, there is a certain probability of overlap that can be computed theoretically. A somewhat more difficult theoretical problem is that of the expected overlap value in the moderately coupled group of four people in the same room in therapeutic group interaction. In an empirical approach to the problem of moderate coupling, we have plotted the total number of GSR's generated by a group against the total number of overlaps for that group. Figure 8 is the scatter diagram of two different groups in therapy. These data represent approximately 50,000 GSR's. The rather good linear relationship in a fair-sized population, with respect to number of subjects and hours of interaction, suggests an expected value of GSR coincidence which may be used as a baseline for the interpretation of overlap between subjects for small increments of time. This technique may allow us to reconsider group process studies in terms of this new approach.

Fig. 8. Scatter Diagram Representing Approximately 50,000 GSR's Recorded in Group Therapy. The number of coincident GSR's between pairs of subjects per group (Sigma 2) plotted against total number of GSR's per group (Sigma 1) shows a linear relationship which may be used as a baseline for interpretation and correction for overlap expected on a probability basis.

The final application of period analysis which we would like to describe is its use in connection with the electrocardiogram (EKG). Figure 9 summarizes some of the parameters, relationships and questions which are of interest to us in reduction of the EKG. The figure, headed "Data Obtainable from Period Analysis," tabulates, for the primary EKG, its first and second derivatives, and the primary plus first derivative: the presently analyzed measures (the QRS complex and T wave, and the PR, QT, and ST intervals and segments); other measurable parameters (the time duration of any wave or of any interval at the zero crossings, and signature recognition); and the analysis questions which might be posed (whether the P, R, or T waves are inverted; whether the magnitude, rate of change, or acceleration of the P, Q, R, S, T, and U waves exceeds some constant; the symmetry of the P and T waves; high frequency activity; and notches in the P, R, and T waves).

Fig. 9. Classical EKG Wave Shape and Derivatives. Parameters employed in clinical interpretation are related to other parameters not usually considered and to questions which might be posed in the analysis.
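Several of the analysis questions in Figure 9 reduce to very small tests once each complex has been reduced to signed square waves and durations. The Python sketch below is purely illustrative: the "beat" description, the sign convention, and the tolerance are invented, and the original system posed these questions in hardware.

    # Each beat is reduced to labeled waves with a sign (+1 or -1, from the positive or
    # negative square wave train) and a duration in seconds (from the baseline crosses).
    beat = {"P": (+1, 0.09), "Q": (-1, 0.02), "R": (+1, 0.05), "S": (-1, 0.02), "T": (+1, 0.16)}

    def is_inverted(wave):
        """'Is the wave inverted?': read the sign of its square wave."""
        sign, _ = beat[wave]
        return sign < 0

    def q_equals_s_in_time(tolerance_s=0.005):
        """'Q = S? (time)': are the Q and S waves of equal duration within tolerance?"""
        return abs(beat["Q"][1] - beat["S"][1]) <= tolerance_s

    print(is_inverted("P"), is_inverted("T"))   # False False for this synthetic beat
    print(q_equals_s_in_time())                 # True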
Fig. 10. EKG Positive and Negative Square Wave Trains. The square wave trains generated by baseline crosses of the primary, first derivative and second derivative of the EKG are detailed in this illustration. All durations and intervals are available for computation.

Figure 10 presents the square waves generated by both the positive and negative portions of the EKG and its derivatives. The physiological information contained in these square waves and in their relations to one another is still largely unknown. Reports of recent studies employing general purpose computers and utilizing coding points similar to period analysis indicate success in characterizing and classifying normal and pathological subjects (8).

We would like to expand one particular problem of EKG analysis as we have approached it in our laboratories. Both low wave sway artifact and high frequency movement artifact demand, for practical analysis, automatic rejection of the contaminated complex. The "filter" we have employed is a system of signature recognition or pattern recognition. A somewhat different type of pattern recognition, as defined in the recent work of Steinberg, et al. (9), is a hybrid combination of Cases 3 and 4, and again utilizes several coding points of period analysis.

Fig. 11. Recognition of a "fat-thin-fat" Square Wave Set. The synthetic function of mixed sine waves slowly changes wave shape over time. If, and only if, the wave shape generates a square wave sequence within acceptable limits, the complex is "recognized," as indicated by the spike in the recognition pulse trace.

Fig. 12. Artifact-Contaminated EKG. Slow wave artifact distorting wave shape and high frequency "pop" artifact are "filtered" out by absence of "recognition." Only those complexes within acceptable limits are held in intermediate storage for further computation. Sixty-cycle artifact is filtered by conventional resonant rejection circuits.

Figure 11 again presents a synthetic function of mixed sine waves. The two oscillators drift in relation to one another over time, and the wave shape pattern changes with this phase shift. The square wave train generated by the baseline cross of the primary, the major period, is presented to digital circuitry which "recognizes" a complex if, and only if, it is made up of a "fat-thin-fat" square wave set. The definition of "thin" and "fat" square waves and the combination of these square waves may be set up with any desired limits or sequence; we adjust them for a given EKG signal. The limits for this particular example are 320 to 66 milliseconds for a "fat" square wave and 88 to 2.7 milliseconds for a "thin" square wave.
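Signature recognition of the "fat-thin-fat" set can be sketched as a scan over the sequence of major period durations, using the limits quoted above. Python; the example duration list is invented, and real settings would be adjusted for the particular EKG signal, as the text notes.

    FAT = (0.066, 0.320)     # acceptable "fat" square wave duration, seconds
    THIN = (0.0027, 0.088)   # acceptable "thin" square wave duration, seconds

    def within(d, limits):
        return limits[0] <= d <= limits[1]

    def recognition_pulses(durations_s):
        """Indices (of the middle wave) where a fat-thin-fat set is recognized."""
        pulses = []
        for i in range(1, len(durations_s) - 1):
            fat1, thin, fat2 = durations_s[i - 1], durations_s[i], durations_s[i + 1]
            if within(fat1, FAT) and within(thin, THIN) and within(fat2, FAT):
                pulses.append(i)
        return pulses

    # Two clean complexes separated by an artifact-contaminated stretch (invented durations).
    major_period = [0.20, 0.04, 0.25, 0.60, 0.01, 0.22, 0.05, 0.19]
    print(recognition_pulses(major_period))     # -> [1, 6]; the 0.60 s artifact is not recognized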
The trace designated "recognition pulse" in Figure 11 illustrates, by the absence of a pulse, the rejection of an improper sequence and of individual square waves not falling within the defined acceptable limits. Figure 12 presents signature recognition as applied to the EKG. Both high frequency artifact and baseline sway distort this signal and are rejected, so that in this figure only three complexes have been "recognized." Figure 13 is the identical EKG signal taken at a slower paper speed to display more clearly the rejection of sway artifact.

Fig. 13. Slow Writeout of Artifact-Contaminated EKG. Rejection of those portions of the record distorted by sway artifact is demonstrated by the absence of recognition pulses in the lower trace.

SUMMARY

Period analysis has been described as a special case of information processing in the time domain. Illustrations have been offered of the application of period analysis to the electroencephalogram, the galvanic skin response and the electrocardiogram. The period filter, coincidence analysis of GSR, and signature recognition of EKG have been detailed as special techniques appropriate to information processing in the time domain.

ACKNOWLEDGMENTS

The authors would like to thank Messrs. W. A. Spoor, A. J. Welch, and R. J. Edwards for their creative contribution in relation to the work reported.

REFERENCES

1. Burch, N. R., and Childers, H. E.: Physiological data acquisition. In Psychophysiological Aspects of Space Flight, ed. by Col. Bernard E. Flaherty. New York, Columbia University Press, 1961.
2. Goldman, S.: Information Theory. New York, Prentice-Hall, 1955, p. 67.
3. Goldman, S.: Information Theory. New York, Prentice-Hall, 1955, p. 73.
4. Burch, N. R.: Automatic analysis of the electroencephalogram: A review and classification of systems. EEG & Clin. Neurophysiol., 11:827-834, 1959.
5. Blackman, R. B., and Tukey, J. W.: The Measurement of Power Spectra. New York, Dover Publications, 1958.
6. Saltzberg, B., and Burch, N. R.: A rapidly convergent orthogonal representation for EEG time series and related methods of automatic analysis. IRE WESCON Convention Record, Part 8, 1959.
7. Lorente de No, R.: A study of nerve physiology. Studies From the Rockefeller Institute for Medical Research, 132:384-482, 1947.
8. Rikli, A. E. et al.: Computer analysis of electrocardiographic measurements. Circulation, 24:643-649, 1961.
9. Steinberg, C. A., Abraham, S., and Caceres, C. A.: Pattern recognition in the clinical electrocardiogram. IRE Trans. on Bio-Med. Elect., 9:23-30, 1962.

DISCUSSION OF CHAPTER XIV

H. W. Shipton (Iowa City, Iowa): May I ask two questions, please. First, have you used the advantages of your period analysis system to study the so-called "squeak" effects that were reported by Storm van Leeuwen about two years ago? Second, what is your approach to the inherent difficulty with all these systems of analysis of presenting multichannel displays? Have you, for example, written out the records for two channels recorded simultaneously?

Neil R. Burch (Houston, Texas): The answer to your first question is no. We have not investigated the "squeak" effect reported by W. S. van Leeuwen.
The answer to your second question is that the single-channel processing we have been doing for a number of years has been directed toward trying to quantify changes in the state of consciousness. We are particularly interested in minimal shifts in the state of consciousness rather than in conditions when a man is in coma or in a state of panic. The work we have done in the last year and a half has been directed toward the problem you ask about. For the display of multiple channel information and for better display of the single channel, we are using a type of analysis that is the inverse of the overlap analysis of the group GSR. We generate a train of square waves with signal A. These square waves are minor period square waves gated by the major period. This yields a burst of minor period square waves, a blank space, a burst of minor period square waves, a blank space, etc. The duration and positioning of these waves are characteristic of the wave shape in this signal. We then take signal B and do exactly the same thing. Now we have two trains of square waves. We put them into NOR-logic circuits and ask the question: "How much anticoincidence is present?" If these are identical waves, we get no readout at all. If there is dissimilarity between signal A and signal B, even in very minor phase shifts, then this system reads out either the exact amount of instantaneous anticoincidence or the sum over one second or more. We also plan to display this information toposcopically, and hope to be able to handle up to 10 channels in this way.

PART V
SUMMARY AND GENERAL DISCUSSION
Moderator: Ralph W. Gerard, M.D., Ph.D.

CHAPTER XV
SUMMARY AND GENERAL DISCUSSION
Ralph W. Gerard, M.D., Ph.D.

I am not confronted here with the problem that so often emerges in trying to summarize a symposium of this kind, because Drs. Fields and Abbott have so clearly exhibited the logical bones of the organization. I think it has been beautifully planned and, on the whole, beautifully executed. There have been many good talks and many interesting lines of thought developed, not all of which, obviously, can I allude to; nor shall I attempt to mention all the participants in the course of my discussion, although I shall refer to things said by practically all.

A few items to start us off. Dr. Lindsay, in the opening theory session, made rather a point of distinguishing product theories from process theories. I had not previously heard the dichotomy in that particular form, but I liked it. It is equivalent, I should think, to molar and molecular theories and to the terms introduced by Mainz, order-analytical interpretations and causal-analytical interpretations; and it does, as Lindsay suggested, imply a progressive reduction from one level to another. He seemed to think this is primarily because psychologists are reaching out hands toward neurophysiologists. I think the hands are coming from both sides of the gap; and, indeed, still partly an act of faith, I am quite convinced that the hands have about touched.

At the level of genes, Kit and Echols gave the beautiful evidence showing that the genetic code is about to be broken; and, as I listened, it seemed that here, also, interest was moving from one level of thought to another. There was again reductionism; problems that started pretty clearly as biological ones have now become of interest almost entirely at the level of pure chemistry.
The problems here are very sharp and, therefore, will very soon become dull, because when it is possible to formulate the issues as clearly as it now is, getting the answers is a matter of hard work but often lacks major intellectual excitement. I think the great epoch of the nucleotides is rapidly drawing to a close, although several Nobel prizes are still lurking there; I am not denigrating it, I assure you. I think the most exciting area for the future is rather in reducing behavior to neurophysiology. The questions here are still fuzzy enough so that almost any kind of answer is likely to be exciting.

Going on with the group, Bateson gave us his charming presentation as raconteur and experimenter. He exemplified beautifully the story that psychologists love to pass around: One rat says to another, "By golly, I've got my experimenter trained now! Every time I push the lever, he feeds me." He discussed the fact that one deals with metasignals for information as to the kind of world one is facing, and, in this connection, there are several points that I cannot resist making. There is an obvious experimental prediction, which perhaps has been checked. (I understand such experiments do give the predicted results.) Bateson compared the classical conditioning experience of one rat with the instrumental conditioning experience of another, and said that each rat then allowed free experience in the world would find his experiment-induced expectations more or less reinforced. This is part of establishing a particular learning set. An animal given a learning set in terms of experience with classical conditioning should learn an instrumental conditioning situation less easily than would a naive animal, and vice versa.

At the human level, we at Michigan have an interdisciplinary study on schizophrenics, attempting to break them up into subcategories. Our social scientist came to the interesting conclusion that the social space in which a schizophrenic subject lives (in contradistinction to the non-schizophrenics in the same hospital and under the same conditions) — his social world — is different from that of non-schizophrenics, and that the behavior of the schizophrenic, so abnormal relative to our world, may not be too inappropriate to his. This is closely related to what Mr. Bateson was saying.

In the section on the nervous system, Dr. Brazier gave us an excellent picture of the whole field, with some emphasis on how spontaneous wave generation might give an internal comparison standard. Dr. John picked this up in his research report, then he and Dr. Morrell had a good discussion on the mechanism of fixation, to which I shall return. At the human level, Dr. Miller contrasted the problem of energy and information flow and introduced the concept of levels, and Dr. Burch discussed similar problems in connection with his technique of extracting information from a complex temporal signal.

Now, what can one do to integrate all these fine materials? I should like to conduct this discussion in terms of four major headings: 1) the question of order and information in general, and as applied to organisms; 2) the role of the environment; 3) the problem of malleability; and 4) the problem of fixation. At the end I shall say a word about our own work on fixation.
I am not an information theoretician, but it seemed to me when I began to put this summary together that organizing the material as follows gave me further clarification of the session on information. Think of a deck of cards in any particular order; obviously the energy in it is exactly the same for any order. If you burn the deck, the calories obtained are the same whatever the order. Furthermore, any particular order in a well-shuffled deck is just as probable as any other particular order. Certain orders are of more interest than others, but any order would be of great interest to a player, for it determines the hands that are dealt.

I think it is useful to distinguish a structural order such as the kind of order in which the cards come from the manufacturer (ace through king and one suit after another). Such structural order we easily recognize in architecture. Usually it implies some regularity and symmetry and repetitiveness, and ordinarily we are likely to call this "order." But I can easily demonstrate to you another very different order which I might call functional order — an apparent "disorder" in arrangement that emits ordered behavior. You may have played this little trick as a child: Organize the cards so that by moving the top card to the bottom at each letter and turning up one at the word you spell out o-n-e — one; the ace appears: t-w-o — two, the two is turned; and so on, right through the deck, ending with the last two cards of the last suit. Examination of the cards as they have been ordered in the deck so as to give this functional output, which recreates the structural order of the original package, reveals nothing at all; the deck seems to be completely messed up.

Either kind of order is produced by some operation of the environment on the system, on the deck of cards; and the amount of information contained in it, in the technical sense, is a matter of how well we know the rules that produced that particular order. If, for example, one gives the value of π to many hundreds of digits, the number of bits needed to transmit it would increase without limit at the rate of over three bits per digit. But if the formula for calculating π is given, very few bits are needed for a limitless number of digits. I suggest that one sees structural order quite easily and recognizes the rule almost intuitively; whereas one does not see functional order nearly so easily nor tumble at once to the rule. But when we do find the rule, the information collapses and we no longer have the element of surprise. Certainly the whole history of scientific development has followed such lines. In every area we have recognized structural elements, structural entities, and regularities long before we have paid attention to functional ones.

Turning now to organisms in this connection, stored information need not require any expenditure of energy. It may, of course, if storage is dynamic, but it need not, as in the structural storage of books or pictures. Information flow does take energy, but negligible amounts will ordinarily suffice. One can think, in organisms, of an overall structural information, seen in the total morphology that has been built up. This is what Patten was concerned with in his study of the morphology of an ecosystem, a kind of epiorganism. This is of interest per se to the anatomist, the structuralist; but to the behaviorist, the physiologist, it is of interest more in terms of what it can yield as patterned behavior.
If the system is suddenly made unable to behave, if it is killed, most of this information remains present, at least for a time, but it is no longer of any functional use or interest. In a way, what I have just called structural information is the same as stored information; but we tend to think of these gross structures a little differently from the micro ones of ordinary memory, to which I shall return. In all, of course, storage of information is a matter of past experience, either of the race, with phylogeny and ontogeny laying down structures that are essentially uniform from individual to individual in the species, or of individual experience and learning, with the attendant high variance.

The flow of information was discussed fully by Dr. Miller, but I shall add a few general comments. First, all the informational aspects of organisms are induced originally by the environment acting upon the system, and changes in these aspects are overwhelmingly the result of continued environmental influence. There are, therefore, two extremely interesting questions to raise about such influence. The first concerns the sensitivity of the system to environmental influence; the second, the establishment of an enduring change. Sensitivity can be of two kinds: 1) quantitative — what threshold of an environmental disturbance or alteration is necessary for the system to recognize it, so to speak; and 2) qualitative — what specificity exists, what discrimination is made between different kinds of environmental influences — which is perhaps even more interesting. So we have the subquestions of threshold and of specification.

The other large question has to do with the conditions under which a transient action of the environment leads to a response of the system. The environmental action, although originally ephemeral, may become irreversible and lead to a permanently altered system. When and how does a reversible response of the system become an irreversible change? This is the essential problem of evolution, of individual development, of group history, and, of course, of individual learning; and I have liked the term "becoming" for this collectivity of irreversible change of the system over time — the "becoming" of the system. The architecture, essentially constant in time, is its "being," the reversible changes in time, its "behaving," and the irreversible changes its "becoming."

Let us look at the environment system in a little more detail. The environment alone is able to induce inhomogeneities in a homogeneous system; and if the latter is appropriately responsive to particular inhomogeneities, there will be a morphogenetic action and internal structure will result. Some of you may not remember the vast argument that occurred near the turn of the century when the German zoologist Driesch shook apart the two half cells of a fertilized egg. Normally, of course, one would become the right side, say, of a frog and the other the left side; but after separation, each became an intact frog with perfectly good right and left sides. The outer cell surfaces exposed to pond water developed skin in the proper fashion, but the medial surfaces, which became backbone and nervous system when left stuck together, now also developed skin.
This phenomenon caused Driesch to turn vitalistic and invoke guiding entelechies, but it was explained decades later by the American zoologist Child in terms of concentration gradients from outside to center. In the intact embryos the medial cell surfaces are at the low or high end of a gradient of oxygen, carbon dioxide, or any other substance that must diffuse from or into the environment; but in the separated cells the end of the gradient has moved to the center of each cell instead of the center of the double cell mass. So, provided the cell is more than a sac of water and is able to respond to different oxygen concentrations by different morphological responses, the organized morphology results from these quantitative changes imposed by the environment.

The same sort of thing operates throughout embryonic development. With further cell divisions the germ layers become differentiated and then organs are specified. Often it is only a matter of minutes between the appearance of the endoderm and the irrevocable commitment of a given endoderm cell to become a bit of liver or of gut. In this particular case we know what the environmental determiner is: if the cell is near heart, it becomes liver; if not, it becomes gut. So environmental influences operate all the way through ontogenesis, in gated time periods, to produce firm outcomes.

We are thoroughly familiar with this in many other areas as well. We can tell what kind of environment a person has lived in if he has thick soles or horny hands or a weathered face. Frown or smile wrinkles are morphological consequences of oft-repeated behaviors. In this case, the environment of the skin is internal to the system (the facial muscles), but this does not alter the principle. The ontogenesis of an ecological community, i.e., the evolution of the group roles and structures that form during community, is similar. Such roles and structures can form only in certain sequences and at certain stages in the interactions of the individuals that constitute the "cells" of society, and in time can become irreversible. These include customs and rules, libraries, and all sorts of appurtenances that form a morphological substrate and channel social behavior. And, of course, the engram in the brain is entirely comparable to horny skin or to bowed legs or to wrinkles. It is interesting that a time-gated period of specification has more recently been found not only in differentiation of cells but in "imprinting" the nervous system and in fixation of experience in still other areas. One is inclined to raise the question of whether the units involved are in a sort of soft-shelled state, like a molting crab, all at the same time, or whether different units, particular neuron groups, become impressionable in separate, temporally ordered periods. This also relates to the earlier argument on memory, and I shall come back to it.

Now a word about malleability. This, you will remember, refers to the sensitivity and the specificity of an organism relative to its environment, particularly to the rain of information from the environment. Over evolutionary sequences there develops greater ability to respond, with greater discrimination, to more kinds and lesser amounts of such information.
In fact, I would urge that the major theme of organic evolution is what I have called the epigenetic mode and is not just the ability to respond to the environment, to learn, or to be molded by it; beyond that, it is also the ability to be molded more and more easily — to learn to learn. This learning to learn occurs, I think, at all levels and in all systems in the course of "becoming," not only in evolution and history but also in the individual, as psychologists well know.

Several major inventions of life have favored this successful increase in the ability to learn. Perhaps the first, certainly one of the very early and important ones, was the invention of an array of molecules able to replicate themselves and to produce other particular molecules (in other words, the invention of an array of genes with sufficient stability and sufficient mutability). This permitted very slow evolution. A great speeding up of modification of the system by environmental impact, i.e., an enhancement of response to the information available, allowed a second forward step — the invention of sex. This latter maneuver made it possible to mix the genes in two individuals, to shuffle the cards, and so get an almost infinite number of hands with the same small array of individual items.

The third major landmark was the invention of multicellularity. This made possible the setting off of groups of cells, tissues, and organs for particular functions, including susceptibility to environmental influences. Multicellularity made possible a meaningful nervous system, the appearance and steady improvement of which is the most important invention for us. This evolution over successive epochs probably involved an initial improvement of the individual unit neurons from decrementing to all-or-none conduction, from reciprocal to irreciprocal synapses, from lower to higher speeds, from higher to lower thresholds, and all the rest. Then there developed better circuitry between the neurons, including such effective physiological devices as the simple reflex, the reverberating loop, the negative feedback loop, etc.

Two of the circuits already mentioned are worth a moment. Dr. Brazier, particularly, referred to one as the "inhibitory surround." This term emphasizes recent work by investigators such as Hartline, Hubel, and many others, dealing with the sensory input, but the mechanism really goes back to Sherrington's reciprocal inhibition. This mechanism not only cuts in a clean group of motor neurons to give a sharply integrated act, very possibly via the feedback inhibition by Renshaw cells, but it also operates all through the nervous system. I have suggested in The Handbook of Neurophysiology that it functions in giving attention to one or another sensory input or thought train and in shifting mood sharply, as well as in selecting a behavior. This device (active units blocking out nearby ones that could have become engaged in the activity but are in this way kept inactive) is the basic mechanism for dissecting a graded continuum into sharp classes. "Nature doesn't come as clean as we can think it," as Whitehead said, but our whole nervous system and our sense organs are designed to clean it up for our thought processes.
Sometimes we err grievously by over- commitment to a typology, as did the scholastic philosophers; but without such a commitment we could not think at all, and with sophistication we can return to graded or probabilistic thinking. The mechanisms are standard orthodox neurophysiology; their behavioral consequences are still being explored. The second neural circuitry worth mentioning — it has received much attention here — is the double system, discrete and diffuse. The diffuse system gives the metasignals which are the set. It acts like the basic adjustments of the television set that make a picture possible: adjusting brightness and discrimination, locking in the vertical and horizontal, etc., but not giving the actual picture. The discrete system presents the picture, the particular pattern that receives our attention. I have probably oversimplified this (an example of oversharpening nature) but there is much evidence for it. The diffuse system can modulate thresholds and responses of the cortical neurons that are thrown into action initially by the discrete system; and the diffuse system does affect mood, set, emotional background, even level of conscious awareness and attention. The whole question of novelty, stress, anxiety, and performance has been discussed (Gerard, R. W.: Neurophysiology; an integration, in, Handbook of Physiology — Neurophysiology III, Victor E. Hall et al., eds., Amer. Physiol. Society, I960, p. 1919) in relation to the interaction of the two systems in modifying the size of a "physiological neuron reserve." Returning to the overall evolution of the nervous system, the third stage, after improved units and organized circuits, is increase in number. The great rise in capacity of the vertebrates, and particularly of the mammals, is attended, so far as I know, neither by improvements in the neurons and their connections nor by any better circuitry. It is a remarkable consequence of simply adding more of the same. While this is surprising at first, a little thought recognizes that more of the same can add entirely new dimensions of richness in performance. In fact, I was struck by the, I am sure accidental, parallel in the number of base pairs in genes and of neurons in brains. The small virus has about 362 Information Storage and Neural Control 6,000 base pairs, the mammal close to 10'", according to Dr. Kit. The simplest animals possess a few hundred or thousand neurons, man about 10'". Adding more of the same does, indeed, multiply richness and capacity. The next major breakthrough in increasing overall malleability of living things became possible only when the nervous system had become large enough and sufficiently complex to generate those new capacities of interaction which led to culture. Culture, while not completely limited to man, is tremendously more enveloping for this social animal, and I suggest four sub-epochs in its development. The first stage of culture probably can be dated from the invention of the symbol, the use of an arbitrary sign for a thing, a communicable representation of the outside world. Next came organized symbols, which are language, as a tremendous advance, and tested organized symbols, which are science, as a further great step. I strongly suspect that we are just entering a fourth epoch in increased malleability of collective man with the invention and rapid growth of the computer, a prosthetic instrument for thinking, much as bulldozers are for muscles and telescopes and microphones are for receptors. 
In fact, perhaps the most interesting thing about present-day man is that the world in which he lives, the one that matters, that gives problems and satisfactions, is no longer very much a material world of "things." These have been taken care of. We have established homeostatic control of our physical and biological environment so that these no longer present our primary problems. We live as social beings in an ocean of information, information that did not exist before we created it. Languages of all sorts, pictures of all sorts, a great variety of communication means and contents — these are the things that matter to us. Our interactions with other human beings, mainly at the symbolic level, are what we care about. Indeed, the storing, processing, and retrieving of information at the machine level are undergoing such tremendous advances that the entire transmittal and use of the information which is the corpus of our culture will soon be revolutionized.

There is still another exciting aspect of the evolution of malleability that requires mention. In the earlier phases, this evolution took place primarily by a biological, Darwinian kind of process; later it continues primarily by an environmental social, Lamarckian kind. I shall return to this shortly, but must first examine the last major topic, the fixation of information.

For experience to be fixed or information to be stored, there must be a material change of some kind. If a system is to retain an enduring difference induced by the environment, not just a relatively ephemeral change in dynamic state, as a spinning top, the different responsiveness must rest on a morphological difference. Such a material change can be only in the number or kind or position of units, such as ions, molecules, organelles, cells, or perhaps all of these. One is tempted to look at the macromolecules because, at least at that level, they are the only units that have considerable endurance in cells. It is by no means excluded that the lipids, which endure very well (some of them, once formed, apparently have no turnover during the life of the brain), or the proteins might be involved; but most investigators interested in this field have a strong predilection for the polynucleotides. Moreover, as pointed out earlier in this symposium, there is growing evidence that implicates them, and there is an especially intriguing reason for interest in RNA and memory.

DNA molecules produce another generation of DNA, these produce another generation, and so on. For a series of generations, the important thing, of course, is that means of replication do exist and that they are precise enough to give both great stability and appropriate freedom for change. Change is produced very gradually over generations with the environment acting primarily by means of selection. The environment normally does not alter the DNA molecules, although it is ultimately responsible for the rare and random genetic mutations. Rather, it selects one or another set of these molecules in terms of the phenotypes produced and of the relative degree of their adaptation to the environment. This is Darwinian evolution — natural selection of certain molecules from an array of possible DNA molecules or groups of molecules. But when a given DNA molecule starts to operate in a given organism, it produces messenger RNA and ribosome RNA and proteins and enzymes and all the rest; and somehow or other this sequence is under pretty direct control of the environment.
Indeed, it looks as if there is here a Lamarckian kind of influence by the environment (Fig. 1).

Figure 1

    DNA --(Darwinian selection by environment)--> DNA
    DNA --> messenger RNA --> RNA --> protein    (ontogeny)
              ^ Lamarckian modification by environment

Just where in the sequence it acts, we do not know; but a reasonable guess would be that it operates on the messenger RNA, which is small in amount and relatively unstable, to modify it in kind or amount or distribution. This, I think, reveals the nub of the earlier discussion between Dr. John and Dr. Morrell. The extremely basic question arises: Must we assume, or is it better to assume, that the environment operates here by modifying the RNA (or other) molecules, which is Lamarckianism; or is it possible that, as in genetic selection, there is a large array of molecules, say a gene-like array of RNA's, on which environment operates by some kind of selection? I am sure nobody knows the answer at the moment; the situation does not have quite the feel of selection to a biologist, but feelings can be very wrong.

Moreover, I would point out that, if molecular modification is involved, we have not solved the critical problems when we recognize that this occurs. It is important to get this far; but some workers have talked as if identifying a memory trace with a change in RNA is essentially the solution of the engram. Rather, we are then at the very beginning of our troubles. Exactly the same problems face us here that faced Lamarck in getting the giraffe's neck longer. Let me point out what these problems are. The environment leads the giraffe to stretch his neck; somehow stretching the neck generates a substance, or influence, which goes from the neck to the gonads and produces a change in the sex cells; this change specifically favors the development of a longer neck in the offspring giraffe — a truly formidable requirement, which alone made Lamarckian inheritance improbable.

Our demands are no less. We require, also, transduction from a process to a structure and back to a process, from information flow to information storage to information retrieval. Nerve messages and events must be fixed in some kind of stable architectural alteration which favors regeneration of comparable events from the system. The flow of information is a matter, essentially, of action at synapses where nerve cells junction. Synapses can vary only in number, or intensity, which is really equivalent; position; kind, to some extent, as excitatory or inhibitory; and, of course, the temporal phase of their activity. There are no other parameters, for these synaptic attributes also express the patterns of neuron connectivities. The storage occurs during a period of fixation, as I have called it, or consolidation, as Dr. Morrell called it, during which a reversible change becomes irreversible and an enduring memory is established. This engram probably includes a molecular change and, as just discussed, may involve production of an altered molecule or selection of a particular molecule from a pre-existent array. Selection might be in position or in number as well as in architecture of molecules. Given the molecular change, still further consolidation processes over time might well involve more gross morphological changes, such as enlargement of end-feet or actual sprouting of axon branches (there are many more in old neurons than in young ones); but this is all guess work.
Perhaps there are only a given number of slots, so to speak, in which memories can last, although any notion of one memory in one slot is untenable. There is conclusive physiological and psychological evidence that, at most, there are different arrays or patterns of neuron groups which subserve different memories, with some spatial separation as well as overlap. Then, finally, we must account for the ability of the particular morphological residue left by a given pattern of impinging impulses in turn to make the neuron sensitive to just that pattern of impulses, so that in the future this input can fire the cell more easily than other inputs.

Dr. John made a noble effort to reduce all this to a single quantitative picture by pointing out that an increase, say, in total cellular RNA would bind more ions and thereby cut down intracellular potassium, which would slow the discharge of the neuron membrane and the optimal frequency at which it would respond. Explanations of this sort we eagerly welcome. Many workers are engaged in such efforts to push understanding further. My own feeling is that if one reduces the RNA change to a single overall quantitative parameter, even if parceled out to different cell regions or membrane areas, there does not remain the necessary great specificity; but this is certainly a matter of opinion at the moment. In any event, here are the active growing points of experiment, as well as theory, in this field.

I shall take a final moment to add to those facts already before you a few new ones regarding fixation. Dr. Morrell referred to our earlier work, paralleled independently by others, of giving an animal a certain learning experience and then, after different intervals, stopping the activity of the brain. We found that if brain activity was stopped early enough, either by abrupt cooling or by massive electric shock, there had not been time for the experience to become fixed in the nervous system. A hamster or rat given an electric shock within a few minutes of an experience had no recollection of the experience; the animal learned nothing, much like the retrograde amnesia of man after a concussion. The fixation time so established was fifteen minutes, although changes continued for fully an hour.

To grapple more firmly with the engram, we wished a more localizing preparation, but without encroaching on Morrell's elegant mirror spot technique in the cortex. There has been much argument as to whether the cord can or cannot fix experience, or learn. Chamberlain, Haleck, and I decided to follow a clue provided by an Italian physiologist, Di Giorgio, relating to enduring postural asymmetries after unilateral lesions in the cerebellum or other cephalad structure. Many mammals show the phenomenon. We have used rats mainly. After an asymmetrical lesion, the right hind leg is, say, more flexed, the left one more extended. Now, of course, if the cord is cut, the asymmetric streams of descending impulses are stopped and cord discharges should lapse back to symmetry. This is, indeed, what happens if the cord is cut within three quarters of an hour after the start of asymmetry. But if the asymmetry has been allowed to persist longer than this, and the time discontinuity at forty-five minutes is too sharp for comfort, then the asymmetry remains for hours or days after the cord is cut.
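The fixation intervals reported in this passage, roughly fifteen minutes for the learning experience and forty-five minutes for the spinal asymmetry, can be summarized in a minimal sketch that treats fixation as an all-or-none threshold in time. The sharp cutoff is a deliberate simplification; only the two intervals themselves come from the text.

```python
"""Minimal sketch of the fixation-time observations described above:
a trace survives disruption (electroshock, cooling, cord section) only if
the disrupting event comes after the fixation interval has elapsed.
Treating fixation as all-or-none at a sharp threshold is a simplification;
the intervals themselves (15 and 45 minutes) are the ones quoted in the text."""

FIXATION_MINUTES = {
    "brain_learning": 15.0,   # hamster or rat learning, stopped by shock or cooling
    "cord_asymmetry": 45.0,   # postural asymmetry fixed in the spinal cord
}


def trace_survives(preparation: str, disruption_at_min: float) -> bool:
    """True if the experience had time to become fixed before disruption."""
    return disruption_at_min >= FIXATION_MINUTES[preparation]


if __name__ == "__main__":
    for prep in FIXATION_MINUTES:
        for t in (5, 20, 50):
            outcome = "retained" if trace_survives(prep, t) else "lost"
            print(f"{prep}: disrupted at {t:>2} min -> {outcome}")
```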
Clearly, physiological activity has been fixed in cord neurons; and one has an obvious place to look for shifts in DC potentials across the cord, in unit activity of motor neurons, in RNA and enzyme content in various cells in the cord, and the like. Further, we are examining the influence on fixation time of drugs which speed or slow the formation of RNA, and Rothschild is making comparable studies on the learning abilities of rats and mice in various maze and avoidance situations. It does look as if 8-azaguanine, which slows RNA formation, slows learning and prolongs fixation time; and that a malononitrile dimer (Upjohn U9189), which is reported to speed RNA formation, may have the reverse effect. But results are still coming in and all this is very preliminary.

In any event, many workers are zeroing in on many preparations, including the flatworm, and we are really beginning to come to grips with the problems of information processing and storing by the nervous system.

DISCUSSION OF CHAPTER XV

Ralph W. Gerard (Ann Arbor, Michigan): I would like to invite questions and comments from those who participated in the symposium. All those in the audience will wish to hear the views of the participants on what some other speaker has said.

Frank Morrell (Palo Alto, California): Dr. Gerard, may we ask you to amplify the details of this beautiful experiment. I would particularly like to know the details of how the operation to produce asymmetry was done, whether drugs influence this, and whether, for example, the same relations exist if such an operation is performed using anesthesia.

Gerard: The preparation is made using anesthesia, and time is from the appearance of asymmetry, not from the time of the cord cut. Anesthesia (ether or nembutal) is light, and the animal is ordinarily pretty well out when asymmetry appears. Before that, presumably, impulses coming down the cord have not been effective.

E. Roy John (Rochester, New York): I would like to mention a couple of experiments related to your remarks and ask if you would react to them. I am sure the first one will be of interest to you, although it is not directly related to the question of memory in the nervous system, but rather to your comments on the loss of plasticity and functional specialization in tissue. The data are contained in a recent paper by Buchsbaum in the Journal of Experimental Zoology. He and his co-workers were trying to develop a planarian tissue culture method, and succeeded in making a pleasantly simple medium in which explants were grown. They observed that a small explant occasionally proliferated as a sheet, reached a certain size, folded back on itself, apparently dedifferentiated, and developed into a planarian. This rather unexpected observation suggests that, at least at this level, the loss of plasticity with specialization is reversible.

More directly relevant to our major concern here is the recent paper by Sporn and Dingman in the Journal of Psychiatric Research in which 8-azaguanine was used to interfere with RNA synthesis, and a significant decrease in the rate of maze learning was observed. I would also like to mention the on-going thesis work of Eugene Sachs, in our laboratory, which may provide additional insight into aspects of information storage.
Some time ago, in collaboration with Wenzel and Tschirgi, we observed that small intraventricular injections of electrolytes seriously interfered with the performance of some previously established conditioned responses. Mr. Sachs has investigated the effects of small alterations in central potassium or calcium on learning and performance by making intraventricular injections before each training session. Control groups are first trained, and then receive an appropriate number of central injections. Sachs' results indicate that animals perform conditioned responses best under conditions of central electrolyte concentration like those present during training and poorly under other conditions, including normal cerebrospinal fluid concentration. Control groups that receive the injections after training show no evidence of accommodation effects. In these animals, central injection causes performance deterioration, while in the animals in which these changes were present during learning, performance continues perfectly. Certain chemical changes seem to facilitate learning, while others slow it. These groups showed differential sensitivity to drugs many months after training, indicating that the effects of the small electrolyte shifts are long lasting. These various findings show that very small local electrolyte shifts seem capable of affecting the long-term storage of an experience in such a way that readout is optimal when the electrolyte microenvironment of the readout mechanism resembles the situation during the initial experience.

I would like to add just one thing. We have replicated the cannibalism experiments of McConnell using a blind procedure. It seems to me that the most striking evidence in favor of a sequence specificity model comes from such studies.

Gerard: Let me answer your second question about the azaguanine findings first. That paper appeared while our own experiments were in progress and were coming out the same way. As to your rounded-up planaria, I seem to remember that Child and Hyman got smaller segments to regenerate, but this is an unimportant detail. I am not quite sure what you are asking of me. Maybe another question will help: When the cells reorganize inside such a sheath or coating and are all mixed up, if you have previously trained them, do they remember?

John: That is one reason why we are doing tissue culture experiments. I do not know yet.

Gerard: Regarding your electrolyte shift, this strikes me as exactly what would be expected, on the following argument. Small shifts in the calcium-potassium ratio produce large changes in the neuron thresholds; high potassium lowers threshold, high calcium raises it. If you have done your conditioning under one set of thresholds of the neuron group, then the engram set up would be congruent with that distribution of neuron thresholds in that neuron population. Having once established the pattern, which is more difficult with calcium and easier with potassium, you would need the same balance of thresholds that then existed in order to re-evoke the engram in a given assembly of cells, because various cell thresholds do not change exactly in proportion to the ion ratio. If learning was under the normal ion ratio, then any shift in ion balance would disturb it. It seems to me this is exactly what one would expect.

John: That is one possible explanation.
Yet experiences learned under normal circumstances may be retrieved in situations in which it is unreasonable to argue that the configuration of excitability of neuron populations is quite as it was during the experience. An alternative to your suggestion might be that the altered electrolyte surround directly affects storage mechanisms on the molecular level.

Gerard: It is a matter of how it has shifted. These are probably very big shifts, even with small amounts of electrolytes. You remember the work Ochs did with the Bures potassium technique. It is a nice way of locating the engram, besides the split-brain technique. He had rats learn a performance with one hemisphere inhibited with high potassium chloride. He removed this and the animals behaved perfectly well. But sometime later, when he blocked the other hemisphere with potassium chloride, with the first still ticking away happily, the rats had no knowledge of what they had learned. The engram was in only the part of the brain which was active during learning. It is this kind of an effect, I think, that you are dealing with.

Let me discuss your last question. Maybe we should not go into it because this whole planaria business, while fascinating, is a bit off the line of the discussion. You may not know, though, that your student Corning turned up with Jim McConnell and reported the RNAse results, but could not interpret them. My interpretation, and I think the one you have used, seemed reasonable. Let me remind the group of the basic experiment. A flatworm is trained, cut in two pieces, and the head allowed to regenerate a tail and the tail a head. Both new worms remember, as McConnell demonstrated. Dr. John and his group showed, further, that if each of these two parts is regenerated in RNAse, the head worm still remembers but the tail worm does not. One can explain this in terms of the fact that the head worm has more organized structural units in it to begin with and does not have to re-create many neurons. Now, would something of this kind apply to your question?

John: I am sorry. I am talking about the cannibalism studies. One group of planaria is fed shredded, trained worms. Another group of planaria is fed shredded, naive worms. On subsequent training, the group which ate trained worms was found to acquire that conditioned response significantly more rapidly than the group which ate naive worms. This experiment was run blind in our laboratory, and the results confirmed previous reports by McConnell's group. One is probably justified in assuming the absence, in the planarian gut, of enzymes which would degrade macromolecules. The reason I refer to this work is to ask what sort of mechanism you would suggest to account for these results.

Gerard: Well, it is even more unbelievable than the earlier stuff, but I still think that what I was saying was relevant to your question. I would have to assume that these informed molecules are not completely degraded in being digested and absorbed, and so supply templates on which the organized learning can be based, just as for the tail regrowing a head with its neurons. I wonder if we should not let some of the other people get in before pushing this one point.

Morrell: I had hoped to get a specific comment on the plausibility of my suggestion for chemical "protection." I wonder whether a possible mechanism for preservation of an imposed shift in charge distribution might be the bonding of the charged moiety to phospholipid.
Conceivably such bonding might not only protect this given molecular rearrangement, but also fix it to sites within the membrane where influences on synaptic transmission might be expected. There is some evidence by Tobias which indicates that axons treated with proteases continue to conduct action potentials for many hours, while treatment with lipase rapidly abolishes conduction. Tobias (personal communication) has now found that similar treatment with ribonuclease also impairs the capacity of the axon to generate action potentials. Moreover, there is some preliminary evidence from Dr. Herzenberg (personal communication) to the effect that the DNA-RNA specification system may not only regulate protein synthesis but also influence molecules containing phospholipid. In fact, these lipid molecules are antigenic and thus conceivably could provide a chemical mechanism for cell recognition.

Gerard: I think that is fine to have on the record. I had rather not push it, although I must say that I heard recently that Tobias' finding, which I have also quoted with enthusiasm, is under question as to whether the lipase at the pH and ionic strength used was acting on lipids or exhibiting its other, venomlike, action. So this may not hold.

Saul Kit (Houston, Texas): Dr. John's question is a very complicated one. I believe I would have to discuss it with him to understand fully all of its implications. I think we should be very careful in extrapolating from the molecular biology level to the neurophysiology frame of reference. I should prefer, therefore, to let Dr. Gerard's answer stand.

Gregory Bateson (Palo Alto, California): This is changing the subject somewhat, but going back to what Dr. Gerard said about evolution and the relations between Lamarckian and Darwinian theories, there are some rather peculiar problems in the economics of communication within the organism which indicate, at first glance, that neither the Lamarckian nor the Darwinian system will work. Let me put it this way. We have an organism. We describe it at any given time or over any given finite time in terms of all necessary variables to define all possible states — V1, V2, . . . , Vn — perhaps many thousands of variables. Any one of these has a finite set of values. If the organism exceeds any of these finite thresholds, it dies.

Now consider a pre-giraffe which has the good fortune to get the mutant "long neck" as an item in the genotypic corpus of genes. That genotypic system is not going to tell the heart of the giraffe that it now has to enlarge in order to supply the head with blood. It is not going to deal with the new problems of the intervertebral disks. It is not going to solve all sorts of other new somatic problems which, in fact, the happy giraffe, the lucky giraffe, is going to have to deal with at the somatic level. The giraffe is going to have to occupy servo-circuits within its soma to modify the size of the heart, and so on. By doing so, it has reduced the finite set of possible states of its organism. Later, this pre-giraffe is lucky enough to get another externally adaptive mutation — let us say big feet, which it needs for kicking lions. It is now again limited to a subset of its possibilities; and if it has to deal with both mutations simultaneously, it is limited to that overlapping subset of possibilities which is compatible with both mutations.
You see that very soon a sequence of externally adaptive mutations of this sort is going to lead to a nonviable giraffe. It is using up its somatic flexibility with every adaptive mutation that it gets. The only way it can regain flexibility is by getting those mutations which will enlarge its heart or do whatever is necessary to cope with the externally adaptive changes. It has to shift some of its acquired characteristics from the somatic servo-systems to the soldered-in genotypic systems. The system can only work if there is a comparatively large number of mutations which will simulate a Lamarckian process, and evidently God set it up this way to deceive the Russians.

Now, let us look at the other side of the picture. Suppose the system were set up on Lamarckian lines. The genotype would then have to pick up from the soma (and it is difficult enough to imagine it picking up anything) those particular acquired characteristics which are the essential ones. But the enlarged heart is not just an enlarged heart. It is one item in a general shift in value all around servo-circuits to enlarge that heart. All those values at other points around those circuits are going to be picked up in a Lamarckian system, passed on by inheritance, and soldered into the genotype. In fact, a Lamarckian system will very rapidly gum up the works by decreasing the somatic flexibility just as badly as the Darwinian system.

We face, therefore, an economics of communicational pathways. Evolution will only work if you have one system (the genotype) relatively independent of the other (the somatic), with natural selection playing on the whole thing. You have to have a digital genotype, soldered in, with random changes, and you have to have a system (the soma) of analogue operations. The soma is being, so to speak, a trial model to test the genotype. The hen is the egg's way of finding out if it was a good egg or not. The whole economics of the system depends upon keeping the soma and the genotype separate.

If you are right in saying that cultural evolution is something much more like a Lamarckian system, I think we may look forward to considerable chaos in the culture. The genotype is the analogue of a legislator. He can only afford to make those changes which affirm changes that have already occurred at the somatic or popular level. If we live in a Lamarckian system in which the lower levels are maximally able to affect the higher ones, then perhaps we are headed for chaos.

Gerard: I am sorry we are talking about the giraffe. It seems a camel would be more appropriate. As you know, the definition of a camel is an animal made by a committee. This seems to be the problem you are bothered about. I also think that, in a sense, it should be a camel, because I let his head under the tent and you have brought the whole animal in. The issues you are raising are really not too close to the basic one of the fixation of the experience as I was trying to discuss it. Let me simply say this in response to these important considerations. As you know, this difficulty — the fact that there must be multiple changes that interact with each other — has been recognized by evolutionary theorists for a long time. One of the earliest criticisms of natural selection was that it could explain the survival of the fittest but not the arrival of the fittest. I think this is partly what you are raising.
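Bateson's bookkeeping can be caricatured in a short sketch: each externally adaptive mutation consumes part of a finite budget of somatic flexibility, and only a later "soldering-in" mutation returns it. The numerical budget and per-mutation costs below are arbitrary illustrative values, not anything quantified in the discussion.

```python
"""Sketch of the flexibility argument above: externally adaptive mutations
are paid for by somatic adjustments that use up part of the organism's
finite flexibility, unless a later mutation 'solders in' the adjustment
and hands the flexibility back.  All numbers are illustrative only."""

from dataclasses import dataclass


@dataclass
class Soma:
    flexibility: float = 1.0  # fraction of viable states still reachable (assumption)

    def adaptive_mutation(self, somatic_cost: float) -> None:
        """An externally adaptive mutation (long neck, big feet) whose
        side problems must be handled by somatic servo-circuits."""
        self.flexibility -= somatic_cost

    def soldering_mutation(self, somatic_cost: float) -> None:
        """A mutation that builds the adjustment into the genotype,
        releasing the somatic circuits that were holding it."""
        self.flexibility += somatic_cost

    @property
    def viable(self) -> bool:
        return self.flexibility > 0.0


giraffe = Soma()
giraffe.adaptive_mutation(0.4)   # long neck: heart, discs, etc. handled somatically
giraffe.adaptive_mutation(0.4)   # big feet: still more somatic compensation
print(f"two adaptive mutations:   flexibility {giraffe.flexibility:.2f}, viable {giraffe.viable}")
giraffe.adaptive_mutation(0.4)   # a third adaptive mutation overdraws the budget
print(f"three adaptive mutations: flexibility {giraffe.flexibility:.2f}, viable {giraffe.viable}")
giraffe.soldering_mutation(0.4)  # enlarged heart becomes genotypic; flexibility returns
print(f"after soldering one in:   flexibility {giraffe.flexibility:.2f}, viable {giraffe.viable}")
```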
I am certainly no expert in the field of evolution, but I have been in close touch with many of the experts in this field over many years. They remain an absolutely solid phalanx on selection as an adequate and satisfactory mechanism for evolution, without bringing in Lamarckianism. Waddington and Dobzhansky have recognized very clearly the fact that natural selection has favored mutable genes, which is a bit in your direction.

Hyman Olken (Livermore, California): I have one question I would like to ask Dr. Morrell. Bottley pointed out that if you increase the frequency of light pulses toward a certain value, you get increased response; then if you increase beyond that frequency, the response decreases. Would that have any effect on the results that you pointed out yesterday where you tested the memory of certain frequencies and recovered other ones?

Morrell: Well, it would have an influence on the detectability of any frequency in the system with which we were working. You could see from the illustrations that frequencies beyond seven, say, would gradually fill in the interval, and you could not possibly count a frequency; therefore, it would be undetectable with these methods.

Max E. Valentinuzzi (Atlanta, Georgia): I think that this is the appropriate moment to bring up three questions which have not been answered as yet. They are related to the amount of information necessary to transmit or to organize one unit of information. As you know, it is not possible to transmit information if there is not a previous amount of information available as a storage unit. So, the first question is: How many units of information do we need as a minimum to store one unit of information? The second question is: How much energy is necessary to organize one unit of information? The third question is: How much energy is necessary to transmit from one point to another the same unit of information?

Warren S. McCulloch (Cambridge, Massachusetts): Do the first two questions amount to how much information you have to have to make another unit of information? This is one of the nasty questions that is puzzling us at the present time. There is a way of approaching it, but no one is happy about it. You cannot say in a simple way, "How much for a unit?"; but you can ask — and it is the famous question put by John von Neumann — "How much of a computing machine do you have to have for that computing machine to make more?" This is the same question; you have the problem of the generation of a computer, and it does not matter whether you make it formally or make it in hardware. The actual problem is that of starting with no form. This means starting from noise, and from noise it is hard to get anything, to generate any form. The answer is that nobody knows how much information you need.

Gerard: What about the second question on the energy for transmitting the unit of information?

McCulloch: With regard to the last question, as small an amount of energy as you can get in one packet can carry one bit. The limit is strictly that of the physics.

Kit: I wonder if this question is not too general. Should we not be thinking about the kind of information that we are storing, transmitting, and replicating? I think estimates could be made of the amount of energy needed to replicate a DNA molecule. Also, one can measure the amount of energy consumed by a bacterial cell during the replication of the DNA of a phage.
This measured value will be greater than the amount needed for phage DNA synthesis and presumably will be an upper limit of the amount of energy needed. However, I feel that if we investigate another information system, the amount of energy required to make another unit of information might be very different.

McCulloch: Light does not come in packets smaller than a single photon, and from Bowman's figures one photon can excite. That is the lowest figure that anybody has and the lowest anyone will ever have.

Gerard: Time has gone on. It is now my privilege and pleasure, since I am acting as moderator at the moment, to thank the organizers of this symposium, the Houston Neurological Society and Baylor University, the various local people who have been kind to us, and, above all, the speakers who have given us such interesting material. We are adjourned.

APPENDIX A

INTRODUCTION

Michael H. Arbib

"A Logical Calculus of the Ideas Immanent in Nervous Activity" by Warren S. McCulloch and Walter Pitts is the classic paper on neurophysiological automata theory and still merits reading today, almost twenty years after its publication. Section I, which gives the neurophysiological basis for the model, is still valid in all its essentials and remains the most readable discussion of this basis. Section II, on the theory of nets without circles, and the discussion of Section IV are equally excellent. However, Section III, the theory of nets with circles, was only intended as a sketchy account. It was presented in Carnap's notation, which was not apt for the task at hand, and is incomplete, hard to read, and contains many errors. Hence, for this part of the theory, we advise the reader to turn to more recent publications. The theory of nets with circles was first fully worked out by Kleene (1) and has since been given an elegant re-presentation by Copi, Elgot, and Wright (2). The assertions of McCulloch and Pitts concerning the connection between the neural nets and Turing machines [Turing (3)] have been fully worked out by Arbib (4).

REFERENCES

1. Kleene, S. C.: Representation of Events in Nerve Nets and Finite Automata. In Automata Studies, ed. by C. E. Shannon and J. McCarthy, Princeton, Princeton University Press, 1956, p. 3.
2. Copi, I. M., Elgot, C. C., and Wright, J. B.: Realization of events by logical nets. J. Assn. Computing Mchy., 5:181-196, 1958.
3. Turing, A. M.: On computable numbers, with an application to the Entscheidungsproblem. Proc. London Math. Soc. (2), 42:230-265, 1936; with a correction, ibid., 43:544-546, 1937.
4. Arbib, M.: Turing machines, finite automata and neural nets. J. Assn. Computing Mchy., 8:467-475, 1961.

A LOGICAL CALCULUS OF THE IDEAS IMMANENT IN NERVOUS ACTIVITY*

Warren S. McCulloch and Walter H. Pitts

Because of the "all-or-none" character of nervous activity, neural events and the relations among them can be treated by means of propositional logic. It is found that the behavior of every net can be described in these terms, with the addition of more complicated logical means for nets containing circles; and that for any logical expression satisfying certain conditions, one can find a net behaving in the fashion it describes. It is shown that many particular choices among possible neurophysiological assumptions are equivalent, in the sense that for every net behaving under one assumption, there exists another net which behaves under the other and gives the same results, although perhaps not in the same time.
Various applications of the calculus are discussed. T. INTRODUCTION HEORETIClAL neurophysiology rests on certain cardinal as- sumptions. The nervous system is a net of neurons, each having a soma and an axon. Their adjunctions, or synapses, are always be- tween the axon of one neuron and the soma of another. At any in- stant a neuron has some threshold, which excitation must exceed to initiate an impulse. This, except for the fact and the time of its occurrence, is determined by the neuron, not by the excitation. From the point of excitation the impulse is propagated to all parts of the neuron. The velocity along the axon varies directly with its diameter, from less than one meter per second in thin axons, which are usually short, to more than 150 ixieters per second in thick axons, which are usually long. The time for axonal conduc- tion is consequently of little iinportance in determining the tiine *Reprinted from The Bulletin of Mathematical Biophysics, 5:115-133. 1943, with permission of the Editor, N. Rashevsky. 379 380 Information Storage and Neural Control of arrival of impulses at points unequally remote from the same source. Excitation across synapses occurs predominantly from axonal terminations to somata. It is still a moot point whether this depends upon irreciprocity of individual synapses or merely upon prevalent anatomical configurations. To suppose the latter requires no hypothesis ad hoc and explains known exceptions, but any assumption as to cause is compatible with the calculus to come. No case is known in which excitation through a single syn- apse has elicited a nervous impulse in any neuron, whereas any neuron may be excited by impulses arriving at a sufficient number of neighboring synapses within the period of latent addition, which lasts less than one quarter of a millisecond. Observed temporal summation of impulses at greater intervals is impossible for single neurons and empirically depends upon structural properties of the net. Between the arrival of impulses upon a neuron and its own propagated impulse there is a synaptic delay of more than half a millisecond. During the first part of the nervous impulse the neuron is absolutely refractory to any stimulation. Thereafter its excitability returns rapidly, in some cases reaching a value above normal from which it sinks again to a subnormal value, whence it returns slowly to normal. Frequent activity augments this sub- normality. Such specificity as is possessed by nervous impulses depends solely upon their time and place and not on any other specificity of nervous energies. Of late only inhibition has been seriously adduced to contravene this thesis. Inhibition is the ter- mination or prevention of the activity of one group of neurons by concurrent or antecedent activity of a second group. Until recently this could be explained on the supposition that previous activity of neurons of the second group might so raise the thresholds of internuncial neurons that they could no longer be excited by neurons of the first group, whereas the impulses of the first group must sum with the impulses of these internuncials to excite the now inhibited neurons. Today, some inhibitions have been shown to consume less than one millisecond. This excludes internuncials and requires synapses through which impulses inhibit that neuron which is being stimulated by impulses through other synapses. As yet experiment has not shown whether the refractoriness is relative or absolute. 
We will assume the latter and demonstrate .1 Logical Calculus of the Ideas Immanent in Jservous Activity 381 that tlie difference is immaterial to our argument. Either variety of refractoriness can be accounted for in eitlier of two ways. The "inhibitory synapse" may be of such a kind as to produce a sub- stance whicii raises the tlireshold of the neuron, or it may be so placed that the local chsturbance prockiced by its excitation opposes the alteration induced by tlie otlierwise excitatory syn- apses. Inasmuch as position is already known to have such effects in the case of electrical stimulation, the first hypothesis is to be excluded unless and until it be substantiated, for the second involves no new hypothesis. We have, then, two explanations of inhibition based on the same general premises, differing only in the assumed nervous nets and, consecjuently, in the time required for inhibition. Hereafter we shall refer to such nerv'ous nets as equivalent in the extended sense. Since we are concerned with properties of nets which are invariant under equivalence, we may make the physical assumptions which are most convenient for the calculus. Many years ago one of us, by considerations impertinent to this argument, was led to conceive of the I'esponse of any neuron as factually equivalent to a proposition which proposed its ade- quate stimulus. He therefore attempted to record the behavior of complicated nets in the notation of the symbolic logic of proposi- tions. The "all-or-none" law of nervous activity is sufficient to insure that the activity of any neuron may be represented as a proposition. Physiological relations existing among nervous activ- ities correspond, of course, to relations among the propositions; and the utility of the representation depends upon the identity of these relations with those of the logic of propositions. To each reaction of any neuron there is a corresponding assertion of a simple proposition. This, in turn, implies either some other simple proposition or the disjunction or the conjunction, with or without negation, of similar propositions, according to the configuration of the synapses upon and the threshold of the neuron in question. Two difficulties appeared. The first concerns facilitation and ex- tinction, in which antecedent activity temporarily alters responsive- ness to subsequent stimulation of one and the same part of the net. The second concerns learning, in which activities concurrent at some previous time have altered the net permanently, so that a stimulus which would previously have been inadequate is now 382 Information Storage and Neural Control adequate. But for nets undergoing both alterations, we can sub- stitute equivalent fictitious nets composed of neurons whose con- nections and thresholds are unaltered. But one point must be made clear: neither of us conceives the formal equivalence to be a factual explanation. Per contra! — we regard facilitation and extinction as dependent upon continuous changes in threshold related to electrical and chemical variables, such as after-potentials and ionic concentrations; and learning as an enduring change which can survive sleep, anaesthesia, convulsions and coma. The importance of the formal equivalence lies in this: that the altera- tions actually underlying facilitation, extinction and learning in no way affect the conclusions which follow from the formal treat- ment of the activity of nervous nets, and the relations of the corresponding propositions remain those of the logic of propositions. 
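As an editorial illustration of the premise just stated, that an all-or-none neuron realizes a proposition about its afferents, the following sketch enumerates the Boolean function computed by one hypothetical cell with two excitatory synapses, one absolutely inhibitory synapse, and a threshold of two; the particular parameters are assumptions chosen only for the example and do not come from the paper.

```python
"""Illustration of the paper's central premise that an all-or-none neuron
realizes a proposition about its inputs: a cell with two excitatory synapses
and one absolutely inhibitory synapse computes a definite Boolean function of
the propositions 'input i fired one synaptic delay ago'.  The threshold and
synapse counts below are an arbitrary example, not taken from the text."""

from itertools import product


def mp_neuron(inputs, weights, threshold, inhibitors=()):
    """McCulloch-Pitts rule: fire iff no inhibitory input is active and the
    number of active excitatory synapses reaches the threshold."""
    if any(inputs[i] for i in inhibitors):
        return 0
    excitation = sum(w for x, w in zip(inputs, weights) if x)
    return int(excitation >= threshold)


# Two excitatory afferents (one synapse each), one absolute inhibitor
# (index 2), threshold 2: the cell asserts "N1(t-1) and N2(t-1) and not N3(t-1)".
weights = (1, 1, 0)
for pattern in product((0, 1), repeat=3):
    fired = mp_neuron(pattern, weights, threshold=2, inhibitors=(2,))
    print(pattern, "->", fired)
```

The printed truth table is exactly the proposition "N1(t-1) and N2(t-1) and not N3(t-1)," in the sense in which the text speaks of a neuron's action as a proposition about its adequate stimulus.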
The nervous system contains many circular paths, whose ac- tivity so regenerates the excitation of any participant neuron that reference to time past becomes indefinite, although it still implies that afferent activity has realized one of a certain class of con- figurations over time. Precise specification of these implications by means of recursive functions, and determination of those that can be embodied in the activity of nervous nets, completes the theory. THE THEORY: NETS WITHOUT CIRCLES We shall make the following physical assumptions for our cal- culus. 1. The activity of the neuron is an ^'all-or-none" process. 2. A certain fixed number of synapses must be excited within the period of latent addition in order to excite a neuron at any time, and this number is independent of previous activity and position on the neuron. 3. The only significant delay within the nervous system is syn- aptic delay. 4. The activity of any inhibitory synapse absolutely prevents excitation of the neuron at that time. 5. The structure of the net does not change with time. A Logical Calculus of the Ideas Immanent in jYervous Activity 383 To present the theory, the most appropriate symbolism is that of Language II of R. Carnap (1938), augmented with various notations drawn from B. Russell and A. N. Whitehead (1927) including the Pnncipia conventions for dots. Typographical neces- sity, however, will compel us to use the upright 'E' for the existen- tial operator instead of the inverted, and an arrow ('-^') for implication instead of the horseshoe. We shall also use the Carnap syntactical notations, but print them in boldface rather than German type; and we shall introduce a functor S, whose value for a property P is the property which holds of a number when P holds of its predecessor; it is defined by 'S{P) (/) . = . P(A'.v) . / = .v')'; the brackets around its argument will often be omitted, in which case this is understood to be the nearest predicate-expression [Pr] on the right. Moreover, we shall write S-Pr for S{S{Pr)), etc. The neurons of a given net '^ may be assigned designations '^I'j '^2', . . . , 'c„'. This done, we shall denote the property of a number, that a neuron c, fires at a time which is that number of synaptic delays from the origin of time, by ^A^' with the numeral i as subscript, so that N ,{t) asserts that c, fires at the time t. N, is called the action of c,. We shall sometimes regard the subscripted numeral of ' N' as if it belonged to the object-language, and were in a place for a functoral argument, so that it might be replaced by a number-variable [z] and quantified; this enables us to abbre- viate long but finite disjunctions and conjunctions by the use of an operator. We shall employ this locution quite generally for sequences of Pr\ it may be secured formally by an obvious dis- junctive definition. The predicates '.Vi', '.V^', . . . , comprise the syntactical class ' N\ Let us define the peripheral afferents of V)! as the neurons of ^^I with no axons synapsing upon them. Let N,, . . . ^ N^ denote the actions of such neurons and A^,,+i, N,^,, . . . , N„ those of the rest. Then a solution of VX will be a class of sentences of the form S- A^p+i (21) . ^ . Pr, {N,, N,, ... , N„ 2i), where Pr, contains no free variable save Zi and no descriptive symbols save the A'' in the argument [Arg], and possibly some constant sentences [sa]; and such that each S, is true of VX. 
Conversely, given a Pvi {^i, ^2 ^p\, Zi, s), containing no free variable save those in its Arg, we shall say that it is realizable in the narrow sense if there exists a net 9l 384 Information Storage and Neural Control and a series of A^, in it such that M (zi) . = . Pr^ (M, A^o, •• • , Zi, sai) is true of it, where sax has the form A^(0). We shall call it realizable in the extended sense, or simply realizable, if for some n S"{Pri) ipi, • ... pp. Zu s) is realizable in the above sense. Cp, is here the realizing neuron. We shall say of two laws of nervous excitation which are such that every S which is realizable in either sense upon one supposition is also realizable, perhaps by a different net, upon the other, that they aie equivalent assumptions, in that sense. The following theorems about realizability all refer to the ex- tended sense. In some cases, sharper theorems about narrow realizability can be obtained; but in addition to greater com- plication in statement this were of little practical value, since our present neurophysiological knowledge determines the law of ex- citation only to extended equivalence, and the more precise theorems differ according to which possible assumption we make. Our less precise theorems, however, are invariant under equiva- lence, and are still sufficient for all purposes in which the exact time for impulses to pass through the whole net is not crucial. Our central problems may now be stated exactly: first, to find an effective method of obtaining a set of computable S constituting a solution of any given net; and second, to characterize the class of realizable S in an effective fashion. Materially stated, the problems are to calculate the behavior of any net, and to find a net which will behave in a specified way, when such a net exists. A net will be called cyrlic if it contains a circle: i.e., if there exists a chain c„ C/+i , ... of neurons on it, each member of the chain synapsing upon the next, with the same beginning and end. If a set of its neurons Ci , c-i , . . . , Cp is such that its removal from VX leaves it without circles, and no smaller class of neurons has this property, the set is called a cj>clic set, and its cardinality is the order o/vX. In an important sense, as we shall see, the order of a net is an index of the complexity of its behavior. In particular, nets of zero order have especially simple properties; we shall discuss them first. Let us define a temporal propositional expression (a TPE), desig- nating a temporal propositional function {TPF), by the following recursion: A^, A Logical Calculus of the Ideas Immanent in Nervous Activity 385 1. A^p^ [zi] is a TPE, where Pi is a predicate-variable. 2. If Si and So are TPE containing" the same free individual variable, so are SS\, SivSo, Si.S-> and S,. ^-^ S2. 3. Nothing else is a TPE. Theorem I Every net of order 0 can be solved in terms of temporal propositional expressions. Let Ci be any neuron of V^l with a threshold 6, > 0, and let Cn, Ci2, . •• , (',p have respectively //,i, '?,2, •• • , n,p excitatory synapses upon it. Let Cj], r,2, • • • , ^'jy have inhibitory synapses upon it. Let Ki be the set of the subclasses of \n,i, n,2, •• • , fi,p\ such that the sum of their members exceeds 6,. We shall then be able to write, in accordance with the assumptions mentioned above, where the 'E' ^'^^ 'H' are syntactical symbols for disjunctions and conjunctions which are finite in each case. 
Since an expression of this form can be written for each C; which is not a peripheral afferent, we can, by substituting the corresponding expression in (1) for each A''^,,, or A'',,- whose neuron is not a peripheral afferent, and repeating the process on the result, ultimately come to an expression for A^, in terms solely of peripherally afferent A^, since ^^l is without circles. Moreover, this expression will be a TPE, since obviously (1) is; and it follows immediately from the definition that the result of substituting" a TPE for a constituent p{z) in a TPE is also one. Theorem II Every TPE is realizable by a net of order zero. The functor .9 obviously coi"nmutes with disjunction, conjunction, and negation. It is obvious that the result of substituting any S,, realizable in the narrow sense (i.n.s.), for the p{z) in a realizable expression Si is itself realizable i.n.s.; one constructs the realizing net by replacing the peripheral afferents in the net for Si by the realizing" neurons in the nets for the Si. The one neuron net 386 Information Storage and Neural Control realizes p\{z\) i.n.s., and Figure 1-a sliows a net tliat realizes Spi{zi) and hence SS-i, i.n.s., if So can be realized i.n.s. Now if So and S3 are realizable then S"'S-2. and S"Sz are realizable i.n.s., for suitable m and n. Hence so are S^'^'^So and ^''""'""Sa. Now the nets of Figures lb, c and d respectively realize S{pi{zi)\ p2{z\)), S{pi{zi) . p2{zx)), and S\pi{z,) . ~ poiz,)) i.n.s. Hence S'-+"+' (SiV S2), ^"'+"+1 (Si . So), and ^''«+"+i (Si . ~ So) are realizable i.n.s. Therefore Si v SoSi . SoSi . ~ So are realizable if Si and So are. By complete induction, all TPE are realizable. In this way all nets may be regarded as built out of the fundamental elements of Figures la, b, c, d, precisely as the temporal propositional ex- pressions are generated out of the operations of precession, dis- junction, conjunction, and conjoined negation. In particular, corresponding" to any description of state, or distribution of the values true and false for the actions of all the neurons of a net save that which makes them all false, a single neuron is constructible whose firing is a necessary and sufficient condition for the validity of that description. Moreover, there is always an indefinite number of topologically different nets realizing any TPE. Theorem III Let there be given a complex sentence Si built up in any manner out of elementary sentences of the form p(zi — zz) where zz is any numeral, by ary of the propositional connections: negation, disfunction, conjunction, implication, and equivalence. Then Si is a TPE and only if it is false when its constituent p(zi — zz) are all assumed false — i.e., replaced by false sentences — or that the last line in its truth-table contains an 'F', — or there is no term in its Hilbert disjunctive normal form com- posed exclusively of negated terms. These latter three conditions are of course equivalent (Hilbert and Ackermann, 1938). We see by induction that the first of them is necessary, since p{zi — zz) becomes false when it is replaced by a false sentence, and Si v So, Si . S2 and Si . ~ S2 are all false if both their constituents are. We see that the last condition is sufficient by remarking that a disjunction is a TPE when its constituents are, and that any term Si . So . . . . Sm . -^ S,„+i . '^ . . . . -^ s„ can be written as A Logical Calculus of the Ideas Immanent in Nervous Activity 387 (Si . So ... . S„0 . ~ {Sm+xV S,n + lV . . . .V S„), which is clearly a TPE. 
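Theorem II builds every temporal propositional expression out of the four elementary nets of Figures 1a through 1d: precession, disjunction, conjunction, and conjoined negation. The sketch below merely evaluates such expressions over an invented afferent history; it is an editorial aid for reading the construction, not a rendering of the original nets, and the sample history is an assumption.

```python
"""Editorial sketch of the four elementary operations out of which, by
Theorem II, every temporal propositional expression can be realized:
precession S (one synaptic delay), disjunction, conjunction, and conjoined
negation.  Afferent histories are represented as functions of discrete time;
this is only a way of evaluating TPEs, not a model of the nets themselves."""


def S(expr):
    """Precession: S(P) holds at t when P held at t - 1 (Figure 1a)."""
    return lambda t: t >= 1 and expr(t - 1)


def disj(a, b):          # Figure 1b
    return lambda t: a(t) or b(t)


def conj(a, b):          # Figure 1c
    return lambda t: a(t) and b(t)


def conj_neg(a, b):      # Figure 1d: a and not b
    return lambda t: a(t) and not b(t)


def N1(t):
    """An arbitrary peripheral afferent history (assumption for the example)."""
    return t in (3, 7)


# Realize S(N1 or S(N1)): "N1 fired one or two synaptic delays ago".
expr = S(disj(N1, S(N1)))
print([int(expr(t)) for t in range(10)])  # -> [0, 0, 0, 0, 1, 1, 0, 0, 1, 1]
```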
The method of the last theorems does in fact provide a very convenient and workable procedure for constructing nervous nets to order, for those cases where there is no reference to events indefinitely far in the past in the specification of the conditions. By way of example, we may consider the case of heat produced by a transient cooling. If a cold object is held to the skin for a moment and removed, a sensation of heat will be felt; if it is applied for a longer time, the sensation will be only of cold, with no preliminary warmth, how- ever transient. It is known that one cutaneous receptor is affected by heat, and another by cold. If we let Ni and A^2 be the actions of the respective receptors and N?. and A^4 of neurons whose activity implies a sensation of heat and cold, our requirements may be written as N^{t) : = : A'i(/-1) . v . N^.{t-^) . ^N~,{t-2) Ndt) . = .No(t-2) .No(t-l) where we suppose for simplicity that the required persistence in the sensation of cold is, say, two synaptic delays, compared with one for that of heat. These conditions clearly fall under Theorem III. A net may consequently be constructed to realize them, by the method of Theorem II. We begin by writing them in a fashion which exhibits them as built out of their constituents by the operations realized in Figures la, b, c, d: i.e., in the form N^(t) . ^ . S{A\it) V S[{SN,{t)) >'^N,(t)]} N,(t) . ^ . S{[SN,{t)] .N,{t)]. First we construct a net for the function enclosed in the greatest number of brackets and proceed outward; in this case we run a net of the form shown in Figure la from Co to some neuron r„, say, so that Nait) . = . SN,(t). Next introduce two nets of the forms Ic and Id, both running from Ca and c^, and ending respectively at Ci and say Cb. Then A^4(0 . = . S[NAt) . N,it)] . ^ . S[(SN2(t)) . N,(t)]. 388 Information Storage and Neural Control Finally, run a net of the form lb from C\ and Cb to fs, and derive .¥3(0 . ^ . .S[.Vi(Ov.V,(0] . ^ . .StVi(0 v.S'[GSWo(0) . ~A'2(0]1. These expressions for N z{t) and iV4(/) are the ones desired; and the realizing net in toto is shown in Figure le. This illusion makes very clear the dependence of the correspond- ence between perception and the "external world'' upon the specific structural properties of the intervening nervous net. The same illusion, of course, could also have been produced under various other assumptions about the behavior of the cutaneous receptors, with correspondingly different nets. We shall now consider some theorems of equivalence: i.e., theorems which demonstrate the essential identity, save for time, of various alternative laws of nervous excitation. Let us first dis- cuss the case of relative inhibition. By this we mean the supposition that the firing of an inhibitory synapse does not absolutely prevent the firing of the neuron, but merely raises its threshold, so that a greater number of excitatory synapses must fire concurrently to fire it than would otherwise be needed. We may suppose, losing no generality, that the increase in threshold is unity for the firing of each such synapse; we then have the theorem: Theorem IV Relative and absolute inhibition are equivalent in the extended sense. We may write out a law of nervous excitation after the fashion of (1), but employing the assumption of relative inliibition instead; inspection then shows that this expression is a TPE. An example of the replacement of relative inhibition by absolute is given by Figure If. 
The reverse replacement is even easier; we give the inhibitory axons afferent to c, any sufficiently large number of inhibitory synapses apiece. Second, we consider the case of extinction. We may write this in the forni of a variation in the threshold 6, after the neuron Ct has fired; to the nearest integer — and only to this approximation is the variation in threshold significant in natural forms of excita- tion— this may be written as a sequence di + bj for j synaptic A Logical Calculus of the Ideas Immanent in Nervous Activity 389 delays after firing, where bj = 0 for / large enough, say 7 = M or greater. We may then state Theorem V Extinction is equivalent to absolute inhibition. For, assuming relative inhibition to hold for the moment, we need merely run M circuits U\, U'2, . . . 'hi containing respectively 1, 2, ... , A/ neurons, such that the firing of each link in any is sufficient to fire the next, from the neuron c, back to it, where the end of the circuit Wj has just b,- inhibitory synapses upon c,. It is evident that this will produce the desired results. The reverse substitution may be accomplished by the diagram of Figure Ig. From the transitivity of replacement, we infer the theorem. To this group of theorems also belongs the well-known Theorem VI Facilitation and temporal summation may be replaced by spatial sum- mation. This is obvious: one need merely introduce a suitable secjuence of delaying chains, of increasing numbers of synapses, between the exciting cell and the neuron whereon temporal summation is desired to hold. The assumption of spatial summation will then give the required results. See e.g. Figure Ih. This procedure had application in showing that the observed temporal summation in gross nets does not imply such a mechanism in the interaction of individual neurons. The phenomena of learning, which arc of a character persisting over most physiological changes in nervous activity, seem to re- quire the possibility of permanent alterations in the structure of nets. The simplest such alteration is the formation of new synapses or equivalent local depressions of threshold. We suppose that some axonal terminations cannot at first excite the succeeding neuron; but if at any time the neuron fires, and the axonal terminations are simultaneously excited, they become synapses of the ordinary kind, henceforth capable of exciting the neuron. The loss of an inhibitory synapse gives an entirely equivalent result. We shall then have 390 Information Storage and Neural Control Theorem VII Alterable synapses can be replaced by circles. This is accomplished by the method of Figure li. It is also to be remarked tliat a neuron which becomes and remains spon- taneously active can likewise be replaced by a circle, which is set into activity by a peripheral afferent when the activity commences, and inhibited by one when it ceases. THE THEORY: NETS WITH CIRCLES The treatment of nets which do not satisfy our previous assump- tion of freedom from circles is very much more difficult than that case. This is largely a consequence of the possibility that activity may be set up in a circuit and continue reverberating around it foi an indefinite period of time, so that the realizable Pr may involve reference to past events of an indefinite degree of remote- ness. Consider such a net VX, say of order /;, and let Ci, c-2, . . . , r^ be a cyclic set of neurons of ^^l. It is first of all clear from the definition that every N^ of ^^\ can be expressed as a TPE, of M, A^2, . . . 
, Np and the absolute afferents; the solution of v)l involves then only the determination of expressions for the cyclic set. This clone, we shall derive a set of expressions [A\: Ndz,) . ^ . Pr.[S"" M(zi), *S"''-^ N,(z,), ... , S"''^ N,{z,)], (2) where Pr , also involves peripheral afferents. Now if n is the least common multiple of the n,„ we shall, by substituting their equiva- lents according to (2) in (3) for the A^„ and repeating this process often enough on the result, obtain S of the form N,{z,) . ^ . Pr,[S"N,{zr), S"N,(zi), ... , S" Np(z,)]. (3) These expressions may be written in the Hilbert disjunctive nor- mal form as N,{z,) . = . E S„ n '?" A^. (zi ) n ~ *^" ^i(2i), for suitable ^ (-i) where S„ is a TPE of the absolute afferents of V^I. There exist some 2" different sentences formed out of the pN, by conjoining to the conjunction of some set of them the conjunction of the A Logical Calculus of the Ideas Immanent in Nervous Activity 391 negations of the rest. Denumerating these by A'i(zi), ^'2(21), . .., X-2p{zi), we may, by use of the expressions (4), arrive at an equi- pollent set of equations of the form X,{z,) . ^ .ZPruiz,) . S^Xjiz,). (5) Now we import the subscripted numerals i,j into the object- language: i.e., define Pri and Pr2 such that Pri(zzi,Zi) . = . X,{zi) and Prj(zzi,zz2,Zi) . = . Pr,j{zi) are provable whenever zzi and ZZ2 denote i and / respectively. Then we may rewrite (5) as (zi)zzp : Pri(zi, Z3) . = . {EZ'{)zZp . Pr-iizi. Zo, z-i - zzn) . Priizo, Zz - zZn) (6) where zz^ denotes n and zZp denotes 2'\ By repeated substitution we arrive at an expression (zi)zZp : Pri(zi, zz„zzo) . = . {Ez-z)zZp {Ezz)zZp . . . {Ez„)zZp Pr-zizi, z-2, ZZn {zzo — 1)) . Pr-i{z2,Zz,zZn {zz2 - 1)) (7) Pr2(z„_i,z„,0) . Pri(Zn,0), for any numeral ZZ2 which denotes s. This is easily shown by induction to be equipollent to {zi)zzp : . Pri{Zi,zZnZZ2) : = : (Ef) (Z2) zzo — l/(zoZZ„) ^ ZZp . fiZZnZZ2) = Zi . Pr2{f(,ZZn (Z2 + 1)), (8) f(zz,a-z)) . PrAf {0),0) and since this is the case for all ZZ2, it is also true that (Z4) {z,)zzp : Pn{z,,z,) . = . (Ef) (Z2) (Z4 - 1) ./(Z2) ^ ZZp . /(Z4) = Zi/(Z4) - zi . Pro[/(z2 + 1),/(Z2), Z2] . (9) Pri[/(res (Z4. zZn)), res (Z4, zz,,)], where zz„ denotes n, res {r,s) is the residue of /• mod s and zZp denotes 2''. This may be written in a less exact way as N^t) . ^ . (Ecf>) ix)t - 1 . <^(.r) ^ 2' . 0(0 = i . P[0(.f+ l),0(.r) ..V,(o^ (0)], where a and t are also assumed divisible by n, and Pr2 denotes P. From the preceding remarks we shall have 392 Information Storage and Neural Control Theorem VIII The expression (9) for neurons of the cyclic set oj a net S'X together with certain TPE expressing the actions oJ other neurons in terms oj them, constitute a solution of V)I. Consider now the question of the reahzabihty of a set of S,. A first necessary condition, demonstrable by an easy induction, is that (z.2)zi . pAz-2) ^ p,{z,) .^.Si^ sMj (10) should be true, with similar statements for the other free p in Si'. i.e., no nervous net can take account of future peripheral afferents. Any S, satisfying this requirement can be replaced by an equi- pollent S of the form {Ef) (z,)zy {z,)zz,r.hPr,„, :f{Zr,Z,,Zs = 1 . ^ ./>.3(Z2) (11) where zZp denotes p, by defining Pr„,i = /[(zi) {Z2)zi{zs)zzp : . f(zu z-i, Zs) = 0 . v . /(zi, Zo, Zs) = 1 :/(zi, Zo, Z3) = 1 . = . /),3(z,) : -^ : S,]. Consider now these series of classes a,, for which N ,{V) : = : {E<\>) {.v)t(^m)q : 4>ecxi :ISf„,{x) . = . {t, x, m) = 1. [/ = ry + !,••• ,M] (12) holds for some net. 
These will be called prehensible classes. Let us define the Boolean ring generated by a class of classes k as the aggregate of the classes which can be formed from members of k by repeated application of the logical operations; i.e., we put -^ aeX : a, ^eX . — > . — a, a . (3, aW jSeX]. We shall also define ^(k) . = . (R(k) - t'p' - "'V', f-i\e(K) =p X[(a, /3) : atK -^ ae\ . ^ . — a, a . (3, aV (3, S "aeX and G{'\>,t) = i[{m) . cf>{t -\- l,t, m) = '!^(m)]. A Logical Calculus of the Ideas Immanent in Nervous Activity 393 The class !-iv,,(/c) is formed from k in analogy with H\(>'), but by repeated apphcation not only of the logical operations but also of that which replaces a class of properties P e a by S{P) e S ^^ a. We shall then have the Lemma Priipu Pi. • . . , p.,. Zi) is a TPE if and only if (Zl) (pu ... , pra) {Ep„, + i) : />,„+! e ir^^eilpl, p2, • • • , P,n] ) A„+i(zi) = PuiPuPi, ... ,A»,Zl) (13) is true; and it is a TPE not involving \S" if and only if this holds when '<-R,.' is replaced by 'f-R', and we then obtain Theorem IX A series of classes ai, a-^, ... a, is a series of f)rehensible classes if and only if (Em) (En) (p)n(i) ('V) : . i.r)ni';^ix) = Ov •b{x = 1 :^ : (E^) {Ey)m . 'M^) = 0 . fSeiillyiiEi) . y = a,)) . v . {x)m . ^(.r) = 0 . l3efk[yaE,) . y = «,)] : (0 (0) : ^ea. . (14) 'i4>, nt + p) . ^ . (Ef) . fef3 . {w)m{.v.)t - 1 . (t){n{t + 1) + p, nx + p, iv) = f(nt + p, nx + p, iv). The proof here follows directly from the lemma. The condition is necessary, since every net for which an expression of the form (4) can be written obviously verifies it, the t];'s being the charac- teristic functions of the S„ and the (3 for each -^ being the class whose designation has the form JJ ^r, J J PTj, where Pr,, denotes I'-i J-4i,-, a,, for all k. Conversely, we may write an expression of the form (4) for a net VX fulfilling prehensible classes satisfying (14) by putting for the Pra Pr denoting the ']j's and a Pr, written in the analogue for classes of the disjunctive normal form, and denoting the a corresponding to that '4^, conjoined to it. Since every S of the form (4) is clearly realizable, we have the theorem. It is of some interest to consider the extent to which we can by knowledge of the present determine the whole past of various special nets: i.e., when we may construct a net the firing of the cyclic set of whose neurons requires the peripheral afferents to 394 Information Storage and Neural Control have had a set of past values specified by given functions 0,. In this case the classes a, of the last theorem reduced to unit classes; and the condition may be transformed into {Em, n) {v)n{i, <];) {Ej) : . {x)m : '\>{x) = 0 . w , '^(x) = I : ^i€j(({;, nt + p) : -^ : (:w)m(x)t — 1 . (f)i{n{t + 1) + p, nx -\- 'p,w) = 4>j{nt + p, nx + p, w) : . (:u, v) (w)m . 4>iix>'{u + 1) -\- p, nn + p, w) = (t)i{n(v + 1) -]r p,nv -{- p,w). On account of limitations of space, we have presented the above argument very sketchily; we propose to expand it and certain of its implications in a further publication. The condition of the last theorem is fairly simple in principle, though not in detail; its application to practical cases would, however, require the exploration of some 2-" classes of functions, namely the members of fjv(|ai, ••• , «..j). Since each of these is a possible ^ of Theorem IX, this result cannot be sharpened. But we may obtain a sufficient condition for the realizability of an S which is very easily applicable and probably covers most practical purposes. 
This sufficient condition is given by

Theorem X

Let us define a set $K$ of $S$ by the following recursion:

1. Any TPE and any TPE whose arguments have been replaced by members of $K$ belong to $K$;

2. If $Pr_1(z_1)$ is a member of $K$, then $(z_2)z_1 \,.\, Pr_1(z_2)$, $(Ez_2)z_1 \,.\, Pr_1(z_2)$, and $C_{mn}(z_1) \,.\, Pr_1(z_1)$ belong to it, where $C_{mn}$ denotes the property of being congruent to $m$ modulo $n$, $m < n$;

3. The set $K$ has no further members.

Then every member of $K$ is realizable.

For, if $Pr_1(z_1)$ is realizable, nervous nets for which

$$N_i(z_1) \,.\equiv.\, Pr_1(z_1) \,.\, SN_i(z_1)$$
$$N_i(z_1) \,.\equiv.\, Pr_1(z_1) \vee SN_i(z_1)$$

are the expressions of equation (4), realize $(z_2)z_1 \,.\, Pr_1(z_2)$ and $(Ez_2)z_1 \,.\, Pr_1(z_2)$ respectively; and a simple circuit, $c_1, c_2, \ldots, c_n$, of $n$ links, each sufficient to excite the next, gives an expression

$$N_n(z_1) \,.\equiv.\, N_1(0) \,.\, C_{mn}$$

for the last form. By induction we derive the theorem.

One more thing is to be remarked in conclusion. It is easily shown: first, that every net, if furnished with a tape, scanners connected to afferents, and suitable efferents to perform the necessary motor-operations, can compute only such numbers as can a Turing machine; second, that each of the latter numbers can be computed by such a net; and that nets with circles can compute, without scanners and a tape, some of the numbers the machine can, but no others, and not all of them. This is of interest as affording a psychological justification of the Turing definition of computability and its equivalents, Church's $\lambda$-definability and Kleene's primitive recursiveness: if any number can be computed by an organism, it is computable by these definitions, and conversely.

[Figure 1]

EXPRESSION FOR THE FIGURES

In the figure the neuron $c_i$ is always marked with the numeral $i$ upon the body of the cell, and the corresponding action is denoted by $N$ with $i$ as subscript, as in the text.

Figure 1a: $N_2(t) \,.\equiv.\, N_1(t - 1)$
Figure 1b: $N_3(t) \,.\equiv.\, N_1(t - 1) \vee N_2(t - 1)$
Figure 1c: $N_3(t) \,.\equiv.\, N_1(t - 1) \,.\, N_2(t - 1)$
Figure 1d: $N_3(t) \,.\equiv.\, N_1(t - 1) \,.\, \sim N_2(t - 1)$
Figure 1e: $N_3(t) :\equiv: N_1(t - 1) \,.\vee.\, N_2(t - 3) \,.\, \sim N_2(t - 2)$
  $N_4(t) \,.\equiv.\, N_2(t - 2) \,.\, N_2(t - 1)$
Figure 1f: $N_4(t) :\equiv: \sim N_1(t - 1) \,.\, N_2(t - 1) \vee N_3(t - 1) \,.\vee.\, N_1(t - 1) \,.\, N_2(t - 1) \,.\, N_3(t - 1)$
  $N_4(t) :\equiv: \sim N_1(t - 2) \,.\, N_2(t - 2) \vee N_3(t - 2) \,.\vee.\, N_1(t - 2) \,.\, N_2(t - 2) \,.\, N_3(t - 2)$
Figure 1g: $N_3(t) \,.\equiv.\, N_2(t - 2) \,.\, \sim N_1(t - 3)$
Figure 1h: $N_3(t) \,.\equiv.\, N_1(t - 1) \,.\, N_2(t - 2)$
Figure 1i: $N_3(t) :\equiv: N_2(t - 1) \,.\vee.\, N_1(t - 1) \,.\, (Ex)t - 1 \,.\, N_1(x) \,.\, N_2(x)$
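Of the expressions just listed, the one for Figure 1i is the only one involving a circle, and it can be checked mechanically. The Python sketch below verifies that expression by brute force over every input history of a fixed length. The wiring used here, an inner neuron $M$ that keeps itself firing once neurons 1 and 2 have fired together, is an assumed realization introduced only for the check; it is not a transcription of the circuit drawn in the figure.

# Brute-force check of the Figure 1i expression
#   N3(t) :=: N2(t-1) .v. N1(t-1) . (Ex)t-1 . N1(x) . N2(x)
# against an assumed realization with a self-re-exciting inner neuron M.

from itertools import product

def simulate(n1, n2):
    """Return N3(0..T) produced by the assumed net for input histories n1, n2."""
    T = len(n1)
    M = [0] * (T + 1)           # cyclic neuron: latches once N1 and N2 coincide
    N3 = [0] * (T + 1)
    for t in range(1, T + 1):
        M[t] = M[t - 1] or (n1[t - 1] and n2[t - 1])
        N3[t] = n2[t - 1] or (n1[t - 1] and M[t - 1])
    return N3

def expression(n1, n2, t):
    """Direct evaluation of the temporal propositional expression for N3(t)."""
    if t == 0:
        return 0
    ever_together = any(n1[x] and n2[x] for x in range(t - 1))
    return int(n2[t - 1] or (n1[t - 1] and ever_together))

if __name__ == "__main__":
    T = 5
    for n1 in product((0, 1), repeat=T):
        for n2 in product((0, 1), repeat=T):
            N3 = simulate(n1, n2)
            assert all(N3[t] == expression(n1, n2, t) for t in range(T + 1))
    print("Figure 1i expression verified for all input histories of length", T)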
CONSEQUENCES

Causality, which requires description of states and a law of necessary connection relating them, has appeared in several forms in several sciences, but never, except in statistics, has it been as irreciprocal as in this theory. Specification for any one time of afferent stimulation and of the activity of all constituent neurons, each an "all-or-none" affair, determines the state. Specification of the nervous net provides the law of necessary connection whereby one can compute from the description of any state that of the succeeding state, but the inclusion of disjunctive relations prevents complete determination of the one before. Moreover, the regenerative activity of constituent circles renders reference indefinite as to time past. Thus our knowledge of the world, including ourselves, is incomplete as to space and indefinite as to time. This ignorance, implicit in all our brains, is the counterpart of the abstraction which renders our knowledge useful. The role of brains in determining the epistemic relations of our theories to our observations and of these to the facts is all too clear, for it is apparent that every idea and every sensation is realized by activity within that net, and by no such activity are the actual afferents fully determined.

There is no theory we may hold and no observation we can make that will retain so much as its old defective reference to the facts if the net be altered. Tinnitus, paraesthesias, hallucinations, delusions, confusions and disorientations intervene. Thus empiry confirms that if our nets are undefined, our facts are undefined, and to the "real" we can attribute not so much as one quality or "form." With determination of the net, the unknowable object of knowledge, the "thing in itself," ceases to be unknowable.

To psychology, however defined, specification of the net would contribute all that could be achieved in that field — even if the analysis were pushed to ultimate psychic units or "psychons," for a psychon can be no less than the activity of a single neuron. Since that activity is inherently propositional, all psychic events have an intentional, or "semiotic," character. The "all-or-none" law of these activities, and the conformity of their relations to those of the logic of propositions, insure that the relations of psychons are those of the two-valued logic of propositions. Thus in psychology, introspective, behavioristic or physiological, the fundamental relations are those of two-valued logic.

Hence arise constructional solutions of holistic problems involving the differentiated continuum of sense awareness and the normative, perfective and resolvent properties of perception and execution. From the irreciprocity of causality it follows that even if the net be known, though we may predict future from present activities, we can deduce neither afferent from central, nor central from efferent, nor past from present activities — conclusions which are reinforced by the contradictory testimony of eye-witnesses, by the difficulty of diagnosing differentially the organically diseased, the hysteric and the malingerer, and by comparing one's own memories or recollections with his contemporaneous records. Moreover, systems which so respond to the difference between afferents to a regenerative net and certain activity within that net, as to reduce the difference, exhibit purposive behavior; and organisms are known to possess many such systems, subserving homeostasis, appetition and attention. Thus both the formal and the final aspects of that activity which we are wont to call mental are rigorously deducible from present neurophysiology. The psychiatrist may take comfort from the obvious conclusion concerning causality — that, for prognosis, history is never necessary. He can take little from the equally valid conclusion that his observables are explicable only in terms of nervous activities which, until recently, have been beyond his ken. The crux of this ignorance is that inference from any sample of overt behavior to nervous nets is not unique, whereas, of imaginable nets, only one in fact exists, and may, at any moment, exhibit some unpredictable activity.
Certainly for the psychiatrist it is more to the point that in such systems "Mind" no longer "goes more ghostly than a ghost." Instead, diseased mentality can be understood without loss of scope or rigor, in the scientific terms of neurophysiology. For neurology, the theory sharpens the distinction between nets necessary or merely sufficient for given activities, and so clarifies the relations of disturbed structure to disturbed function. In its own domain the difference between equivalent nets and nets equivalent in the narrow sense indicates the appropriate use and importance of temporal studies of nervous activity; and to mathematical biophysics the theory contributes a tool for rigorous symbolic treatment of known nets and an easy method of constructing hypothetical nets of required properties.

REFERENCES

1. Carnap, R.: The Logical Syntax of Language. New York, Harcourt, Brace and Company, 1938.
2. Hilbert, D., und Ackermann, W.: Grundzüge der theoretischen Logik. Berlin, J. Springer, 1927.
3. Whitehead, A. N., and Russell, B.: Principia Mathematica. Cambridge, Cambridge University Press, 1925.

NAME INDEX

Abbott, W., 19, 170, 171, 353 Abraham, S., 348 Abt, J. P., 226 Ackerman, W., 133, 138 Ackermann, W., 386, 399 Adey, W. R., 240 Aldrich, A., 295 Allee, W. C., 170 Alper, T., 116 Apgar, J., 116 Aposhian, H. V., 135 Arbib, M. H., 295, 377 Arduini, A., 226 Arnon, D. I., 148, 170 Ashby, W. R., 142, 169 Astrachan, L., 110, 111, 112, 119, 124, 135 Attneave, F., 173, 184 Bach, L. M. N., 242 Bachtold, J. G., 136 Barlow, H. B., 327 Barlow, J., 277 Barnett, L., 115 Basilic, C., 71, 113, 119 Bates, J. A. V., 241 Bateson, G., 25, 173, 184, 185, 186, 242, 296, 330, 354, 355, 372 Baumol, W. J., 171 Bavelas, A., 173, 181, 182 Beadle, G. W., 59 Beavers, W. R., 186 Beckwith, W., 276 Beers, R. F., Jr., 229 Bell, D. A., 16 Bellman, R., 167, 171 Belozersky, A. N., 87, 106, 114 Benzer, S., 71 Berg, P., 71, 98, 115 Bergold, G. H., 114 Bernard, C., 233 Bidwell, R. G. S., 170 Birdsall, T. G., 306, 327 Bishop, G. H., 226 Blackman, R. B., 348 Block, L. N., in Blum, M., 285, 289, 292, 293, 295 Blustein, H., 21, 22, 184 Boltzman, L., 5, 144 Boyer, G. S., 137 Branson, H. R., 141, 169 Brattgard, S., 226 Brazier, M. A. B., 19, 226, 230, 241, 242, 277, 355, 360 Brenner, S., 71, 114, 115, 135 Brillouin, L., 141, 142, 147, 169 Britten, R. J., 135 Broadbent, D. E., 307, 308, 327 Brown, R., 310, 327 Bubel, H. C., 136 Buchsbaum, R., 368 Burch, N. R., 24, 329, 348, 349, 355 Burma, D. P., 114 Burns, B. D., 226 Burton, K., 101, 115 Bush, R. R., 49, 56 Caceros, C. A., 348 Carnap, R., 377, 383, 399 Chamberlin, M., 98, 115 Chargaff, E., 115 Cheng, P. V., 115 Cherry, C., 17 Childers, H. E., 329, 348 Chow, K. L., 197, 198, 200, 225, 226 Cohen, G. N., 138 Cohen, S. S., 115, 125, 134, 135 Copi, I. M., 377 Cordes, S., 115 Corley, K., 269 Corning, W. C., 276, 370 Courtois, G., 216, 228 Cowan, J., 294, 295 Craston, D. F., 170 Crawford, E. M., 115 Crawford, L. V., 115 Crick, F. H. C., 60, 71, 74, 77, 82, 83, 115, 119 Daesch, G. E., 137 Darnell, J. E., Jr., 74, 122, 123, 136, 138, 139 Davenport, W. F., 241 Davern, C. I., 98, 115 Davies, D. R., 71 Davison, P. F., 115 De LaHaba, G. L., 138 Dean, W., 229 Deininger, R. L., 309, 327 Deutsch, J. A., 226 Dewson, J., 197, 198, 200, 226 Dickerson, R. E., 71 Dingman, W., 276, 368 Dixon, M. K., 137 Dobzhansky, T., 374 Doty, P., 87, 100, 115, 118 Driesch, H. A.
E., 358 Dubbs, D. R., 117 Duda, W. L., 55 Dulbecco, R., 129, 136, 137 Duncan, C. P., 189, 196, 226 Dunlop, C. W., 240 Dunn, D. B., 115 Echols, H., 59, 71, 72, 73, 74, 75, 121, 353 Edstrom, J., 115 Edwards, R. J., 348 Eiduson, S., 276 Elgot, C. C, 377 Ellen, P., 276 Emerson, A. E., 170 Epstein, H. T., 115 Essman, W. B., 226 Estes, W. K., 49, 56 Feinstein, A., 17 Fields, W. S., 353 Finamore, F. J., 116 Finch, J. T., 136 Fitts, P. M., 309, 327 Flaks,J. G., 135 Fogh, J., 133, 138 Freeman, G., 137 Freese, E., 83, 116 Freifelder, D., 115 Fresco, J. R., 118 Frey, B. A., 136 Frisch-Niggemeyer, W., 116 Furth,J. J., Ill, 116 Gaarder, T., 170 Gabor, D., 17, 294 Gafford, L. G., 118 Garen, A., 71, 136 Gebhardt, L. P., 136 Geiduschek, E. P., 116 Geller, E., 276 Gerard, R. W., 26, 189, 196, 226, 227, 228, 305, 327, 353, 361, 367, 369, 370, 371, 372, 374, 375, 376 Gibbs, W., 144 Gilbert, W., 116, 135 Gillbricht, M., 170 Gillies, N. E., 116 Ginsberg, H. S., 137 Glickman, S. E., 276 Goldman, M., Ill, 116 Goldman, S., 17, 348 Goldring, S., 227 Goldstein, M. H., 241 Gorman, A. L. F., 243 Gran, H. H., 170 Gray, E. G., 285 Green, M., 137 Gregory, R. T., 27 Grey Walter, VV., 241 Gros, F., 112, 116, 135, 136 Grunberg-Manago, M., 67, 71 Gumnit, R. J., 227 Fatt, I., 289 Faulkner, P., 115 Haibt, L. H., 55 Hall, B. D., Ill, 112, 116, 118, 119, 124, 135 Name Index 403 Hall, V. E., 361 Halstead, VV. C, 211, 227 Hamberger, C. A., 227 Hammer, G., 227 Hart, R. G., 71 Hartley, J. W., 138 Hartley, R. V. L., 6 Hartline, H. K., 360 Hayashi, M., 112, 116 Hebb, D. O., 37, 54, 55, 190, 227 Hede, R., 115 Helinski, R., 71 Henderson, K., 134 Hendrix, C. E., 240 Herriot, R., 121 Hershey, A. 13., 134 Hiatt, H., 116, 135 Hilbert, D., 386, 399 Hoagland, M. B., 116 Holland, J. H., 55 Holland, J. J., 136 Holley, R. W., 103, 116 Hollingworth, B. R., 135 Hooper, L., 138 Home, R. W., 132, 136 Horowitz, N. H., 59, 70 Horvath, W.J., 314, 328 Howes, D. VV., 137 Human, M. L., 134 Hurwitz, J., Ill, 116 Hutchinson, G. E., 169 Hyden, H., 211, 226, 227, 276 Hyman, L. H., 369 lizuka, R., 211, 227 Ingram, V. M., 71 Isaacs, A., 138 Ivers, R. R., 18 Iwamura, T., 116 Jacob, F., 69, 71, 114, 135, 136 Jarvik, M. E., 226 Jasper, H. H., 228, 277 Jaynes, E. T., 144, 170 John, E. R.,20, 120,211, 227,243,247, 249, 250, 260, 261, 262, 263, 267, 276, 278, 282, 355, 364, 366, 369, 370, 372 Joklik, W. K., 136 Jones, O. W., 71 Josse, J., 135 Kaiser, A. D., 136 Katz,J. J., 211, 227, 289 Kellaway, P., 23, 24 Kcndrew, J. C., 71 Khinchin, A. I., 17 Killam, K. F., 245, 246, 247, 248, 249, 250, 259, 260, 261, 262, 263, 276 Kiinura, K., 119 Kirby, K. S., 117 Kit, S., 76, 117, 120, 122, 139, 353,362, 372, 375 Kleene, S. C., 377, 395 Kleinschmidt, W. J., 117 Klug, A., 136 Kok, I. P., 117 Kornberg, A., 125, 135 Kornberg, S. R., 135 Korolkova, T. A., 276 Kozloff, L. M., 134 Kraft, M. S., 228 Kreps, E., 211, 228, 276 Krey,J., 170 Kristiansen, K., 216, 228 Kroger, H., 114 Kurland, C. G., 116, 135 Kurtz, H., 138 Lawley, P. D., 83, 117 Lciman, A. L., 243, 267, 276 Lengyel, P., 71, 113, 117, 119 Lenneberg, E., 310, 327 Leuchtenbergcr, C., 137 Levine, M., 136 Levinthal, C., 70, 115 Levintow, L., 136, 138 Levy, H. B., 138 Liberson, W. T., 264, 268, 276 Libet, B., 227 Lichtenstein, J., 135 Lindegren, C. C., 118 Lindegren, G., 118 Lindeman, R. L., 169 Lindsay, R. K., 34, 353 404 Information Storage and Neural Control Linschitz, H., 141, 169 Lipmann, F., 1 38 Littlefield, J. \V., 120 Livanov, M. N.. 245, 246, 264, 268, 276, 277 Lockart, R. 
Z., 136 Loeb, T., 117 Lorente de No, R., 332, 348 Loucks, R. B., 272, 277 Luria, S. E., 134, 136 Lute, M., 134 Lwoff, A., 137 Lwoff, M., 137 MacArthur, R., 143, 169 MacFadyen, A., 169 Magasanik, B., 117 Majkowski, J.. 252, 253, 277 Makhinko, V. I.. 117 Maling, B., 71 Mandelbrot, B., 43, 44, 45, 46, 56 Marcus, P. I., 137 Margalef, D. R., 169 Marmur, J., 87, 115, 118 Martin, E. M., 115 Martin, R. G., 71 Matthaei, J. H., 64, 71, 112, 117, 134, 139 Mayor, H.D., 17, 18,74, 121, 122, 171, 172 McConnell, J., 369, 370, 371 McConnell, W., 170 McCulloch, W. S., 36, 38, 55, 283, 296, 297, 298, 306, 375, 376, 377, 379 McGlothlen, M., 72, 121 McLaren, L. C, 136 McQuillen, K., 135 Meier, R. L., 324, 325, 328 Merrill, S. H., 116 Meselson, M.,71,96, 114, 117, 118, 135 Miller, G. A., 52, 53, 56, 355, 357 Miller, J. G., 301 Minagawa, T. , 117 Minckler, S., 118 Minsky, M., 292 Monod, J., 69, 71 Moore, H. F., 118 Morganstern, O., 170 Morrell, F., 73, 189, 228, 244, 245, 246, 248, 277, 278, 282, 355, 364, 365, 366, 367, 371, 374 Moses, L., 225 Mostellar, F., 49, 56 Mountcastle, V. B., 237, 241 Mowbray, G. H., 328 Muller,J.. 231, 233, 234 Munier, R., 138 Myers, J., 116 Nagington, J., 132, 136 Naitoh, P., 225 Nakamoto, T., 1 11, 116, 120 Nathan, P. W., 224, 229 Nathans, D., 138 Neff, W. D., 268, 277 Newton, A., 137 Nieder, P. C., 277 Nirenberg, M. W., 64, 67. 71, 74, 75. 112, 117, 134, 139 Nomura, M., 110, 111, 118 Nyquist, H., 6 Obrist, VV. D., 228 Ochoa, S., 67,71, 74, 75, 112, 113, 114, 117, 119 Odum, H. T., 170 Oesterreich, R. E., 277 Ogur, M., 118 OXeary, J. L., 226, 227 Olken, H., 374 Onesto, N., 295 Opalskii, A. F., 117 Osawa, S., 118 Park, O., 170 Park, T., 170 Patrick, B. S., 25 Patten, B. C., 140, 169, 171, 172, 356 Pautlcr, E., 297 PhiUips, D. C.,71 Pigon, A., 227 Pitts, W. H., 36, 55, 284, 377, 379 Piatt, J. R., 305, 327 Polyakov, K. L., 245, 277 Name Index 405 Prange, E., 292 Pribram, K. H., 228, 229 Puck, T. T., 137 Quastler, H., 308, 309, 327 Rabinovvitch, E. I., 170 Rabson, A., 138 Randall, C. G., 118 Ransmeier, R. E., 228 Rapoport, A., 46. 56, 314, 328 Rashevsky, N., 167, 171, 379 Reymond, D. B., 231 Rhoades, M. V., 328 Rich, A., 110, 118 Rikli, A. E., 348 Ris, H., 71 Risebrough, R. VV., 116, 135 Roberts, L., 228 Roberts, R. B., 135 Rochester, N.. 37, 38, 55 Rogers, S., 138 Roitbak, A. E, 197, 229 Roizman, B., 138 Rolfe, R., 96, 118 Root, W. L., 241 Rose, VV. R., 138 Rosenblatt, P., 54, 56 Ross, G., 228 Ross, R. W., 137 Rothman, F., 71 Rothschild, 367 Rothstcin, J., 170 Rowland, V., 192, 229 Rubin, H., 129, 137, 138 Rusinov, V. S.. 192. 229 Russell, B., 383, 399 Russell, W. R.. 224, 229 Ryther.J. H., 170 Sachs, E., 267, 276, 368 Sachs, L., 137 Saltzbcrg, B., 5, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 296, 330, 348 Salzman, N. P., 118, 128, 136, 137, 138 Sandler. B., 228 Schafer, \V., 1 18, 131, 138 Schaffer, F. E., 118, 136 Schildkraut, C. E., 87, 96, 101, 102, 110, 118 Schlessinger, D., 135 Schmidt, K. P., 170 Schrodinger, E.. 140, 141, 142, 147, 169 Schuster, H., 74, 118 Schwartz, M., 17 Schwerdt, C. E., 118, 136 Sebring, E. D., 136 Sclfridge, O., 292 Sevring, E. D., 138 Shannon, C. E., 5, 13, 14, 15, 17, 52, 142, 144, 168, 169, 230, 240, 241, 318 Shapiro, A., 18, 19, 23, 73 Sherrington, C. S.. 51, 231, 241, 360 Shipton, H. W., 21. 241. 349 Shore, V. C., 71 Sibatani, A.. 119 Siebert, W. M., 241 Siminovitch, L.. 135, 136 Simon, E. H., 119 Simon, H. A., 41. 42, 46, 56 Sines, J. O., 277 Sinsheimer, R. E.. 119, 135 Smith, J. D., 115 Smith, K. M., 137 Snedecor, G. \V., 170 Sokolov, E. 
N., 233, 241 Spahr, P. F., 115, 135 Speyer,J. F., 71. 113, 117, 119 Spiegelman, S., 1 1 1. 1 12, 1 16, 1 18, 1 19, 124, 135 Spirin, A. S., 87, 106, 114, 119 Spoor, VV. A., 348 Sporn, M. B., 276, 368 Stahl, F. VV., 117 Stamm, J. S., 229 Steinberg, C. A., 346. 348 Steiner, R. F., 229 Stephenson, M. E., 103, 119 Stern, J. A., 248, 277 Stevens, A., 119 Stevens, S. S., 306, 307, 327 Stoker, M. G. P., 137 Storck, R., 112, 119 Strandberg, B. E., 71 Strauss, B., 117 406 Information Storage and Neural Control Streisinger, G., 135 Stuart, D. C.,Jr., 133, 138 Stumpers, F. L. H. M,, 17 Sueoka, N., 99, 115, 119 Suwa, N., 227 Sved, S., 115 Sverdrup, H. U., 171 Swets,J. A., 306, 327 Szent-Gyorgi, A., 147, 148, 170 Szilard, L., 5 Takahashi, T., 119 Takahata, N., 227 Takeda, T., 227 Taketomo, Y., 185 Tamm, I., 137 Tanabe, M., 227 Tanner, W. P., Jr., 306, 327 Tatum, E. L., 59 Temin, H. M., 129, 137, 138 Tessman, I., 97, 119 Thomas, R. S., 119 Thompson, R., 229 Thoren, M., 138 Thrall, R. M., 169 Tissieres, A., 135 Tobias, J. M., 120, 229, 371 Travers, P. L., 180, 184 Tribus, M., 144, 170 Tschirgi, R. D., 211, 368 Tukey.J. W., 348 Turing, A. M., 36, 55, 377 Volkin, E., 110, 111, 112. 116, 119, 124, 135 von Foerster, H., 27"^, von Neumann, J., 5, 36, 55, 170, 284, 285, 375 Wagner, B., 117 Warner, R. C, 114 Watson, J. D., 60, 71, 77, 80, 81, 116, 119, 120, 135, 139 Watts-Tobin, R. J., 115 Weaver, W., 17 Weill, J. D., 114 Weiner, M. F., 19, 186 Weir, H. F., 297, 298 Weiss, M., 243, 252, 254, 255, 256, 277 Weiss, S. B., Ill, 116, 120 Welch, A. J., 348 Wenzel, B. M., 211, 368 Wheelock, E. F., 137 Whitehead, A. N., 360, 383, 399 Whitfield, I. C., 234, 241 Wiener, N., 6, 169, 171, 230, 241 Winocour, E., 137 Winograd, S., 294, 295 Wittman, H. G., 71 Woese, C. R., 120 Woodward, P. M., 9, 17 Work, T. S., 115 Wright, J. B., 377 Wyatt, G. R., 135 Ulett, G. A., 277 Valentine, R. C., 115 Valentinuzzi, M. E., 24, 374 Vallentyne, J. R., 170 van Leeuwuen, W. S., 349 Vendrely, R., 119 Verbeek, L., 294, 295 Verveen, B., 289, 295 Vinograd, J., 117 Vladimirov, G. E., 229 Vogt, M., 129, 137 Yamana, K., 119 Yanofsky, C., 71 Yarmolinsky, M. B., 138 Yngve, v., 46, 47, 48, 49, 52, 56 Zamecnik, P. C., 103, 119 Zimmerman, J. B., 135 Zimmerman, T., 138 Zinder, N. D., 117, 136 Zipf, G. K., 39, 40, 43, 44, 56 Zubkoff, P. 
L., 116 Zuckermann, E., 266, 277 SUBJECT INDEX A Afferents interaction of, 285-289, 296 peripheral, 383, 385, 390-393 After discharge, 199, 218, 382 All-or-none law, 381, 397 Amino acids, 62-67, 72, 103 Amnesia, retrograde, 189, 224, 366 Assimilated rhythms, 247, 248 Attention, 48, 233, 361 Auditory mechanisms, 266, 285, 307 Automata theory, 284, 377 Axons, 190, 284, 289, 293-347, 371, 379, B Bacteria, 59, 102, 106, 139 DNA from, 102 metabolism of, 97 mutant strains, 63 Bacteriophages, 94, 110, 123, 124, 127 T-even, 79, 85 T-4 mutants, 82 nitrous acid induced mutants, 97 Base pairing, 62, 65 specific, 79 Behavior differential conditioned, 274 disturbed, 179 ordered, 306, 355 purposive, 398 Binary representation in computers, 27- 30 Binary units, 6, 9, 10 Biomass, 144, 149, 151, 155, 164 Brain information storage in, 56 information transfer ir, 240 number of neurons in, 361 c Calculation error-free, 292, 293 Calculus, logical, 55, 284, 377 Cannibalism experiments, 369 Channel Capacity, 12, 15, 142, 240, 295, 308, 311, 325 communication, 7, 52 correction, 15, 142 length of, 308 noise, 240 overloaded, 31 1 Chlorophyll, 150, 155, 157 Coding genetic, 61, 76, 82, 112, 353 in nerve cells, 246 in nervous system, 233 in time domain, 330 of language, 43-48 spatio-temporal patterns of, 279-282 Coincidence analysis, 243, 278, 340 Communication accuracy in, 26 channels, 12, 54, 142, 240 economics of, 25, 372 pathological alteration, 180 systems, 12, 15, 25, 242, 373 theory, 12, 17, 142, 230, 308 Communities adaptability of, 166 bioenergetics of, 140, 147 complexity of, 1 43 diversity of, 162 energy balance in, 163 stability of, 143 trophodynamics, 140, 147 Computation, error-free, 290 Computers averaging by, 233 coding in, 31, 36, 37 407 408 Information Storage and Neural Control general purpose, 345 generation of, 375 simulation of brain, 38, 51 Conditioning avoidance response, 249 differential to central stimulation, 268 Correction of errors, 14, 15 D Decision making, 34, 49, 167, 243, 304, 310 Decoding, 10, 295, 303, 309 Deoxyribonucleic acids (DNA) amount per cell, 84, 85 as genetic material, 60 average composition, 85-93 base composition, 93 base sequence of, 62, 66, 70, 72, 80, 88 distribution of, 98 equilibrium sedimentation of, 95 formation of hybrid molecules, 101 genetic information in, 60, 66, 222 heterogeneity of composition, 96, 99, 112^ molecular size of, 85, 96, 97, 101, 122 non-overlapping bands, 97 phage 0X1, 74, 98, 111 primer, 65, 109, 111 replication, 80, 363 structure, 60, 77, 80, 81, 101, 109, 121 synthesis, 76, 80, 109, 121, 125, 375 Dependency, 177 Deterministic models, 230, 236 Discrimination, 178, 243, 259, 264 Dominance, 177 Error correction of, 7, 13 frequency of, 14 in performance, 259-264 of commission, 261 of omission, 260 probability of, 290. 
293, 295 Exchange, interpersonal, 175 Expectation, 37, 185, 354 Experience, fixation of, 354, 359, 363, 374 Extinction, 153, 381, 388 Filter, 22, 278, 297, 340, 346, 348 Filtering, 311, 314, 319, 324 Fixation, 189, 355, 365-367 Frequency analysis, 23, 26, 329, 340 Galvanic skin response (GSR), 338, 348 Generalization, 248, 258, 272, 278 Genes, 59-63, 69, 72, 121, 126,289,353, 359, 374 as determinants of protein struc- ture, 61 mutation of, 72, 126 suppressor, 72 Genetic coding, 61, 64, 70, 114 H Habituation, 192, 233, 264 Homeostasis, 142, 232, 398 E Electrocardiogram (EKG), 330, 343- 348 Electroencephalogram (EEG), 21, 23, 190, 207, 252, 278, 330-340, 348 Energy balance, 163, 165 Energy gains and losses, 140, 149, 155, 166, 171, 356 Entropy, 5, 15, 25, 141, 147, 149, 240 Environmental influences, 355, 356 Equivocation, 13, 14, 15, 19, 336 Information capacity, 9 content, 9 coding, in brain, 268 flow, route of, 305 genetic, 59, 70, 76, 103, 108, 122, 123. 189 input overload, 311 measure, 5, 6, 7, 240 overload, 311, 314, 325 overload testing, 316 transmission of, 25, 124, 304, 310, 362 Subject Index 409 Information processing essential subsystems, 302 in computers, 31, 32, 33 in human brain, 240, 301 in time domain, 329-348 models of, 35, 38, 41 , 54, 230, 236, 239 subsystems research, 306 Information storage, 8-10 fixation of experience, 363, 366 in nerve cells, 189 long-term, 244, 369 mechanisins, 192 short-term, 195-197, 210, 211 Information theory, 5, 240, 295 in ecology, 140-149 in neurophysiology, 230-239 Inhibition, 235, 268, 284, 341, 360, 381, 388 Inputs, 53, 235, 287, 293, 302-308, 311, 314, 325, 365 Interaction group, 341, 343 of afferents, 285, 296 virus and cell, 129 Interference, 18, 22, 244 Language, 38, 303, 305, 362, 383 coding, 43-48 information in, 20 redundancy of, 26 statistical properties of, 42 Learning, 173, 181, 283, 303, 310, 357, 366, 381, 389 levels of, 174, 177, 183, 185, 190, 355, 359 process of, 177, 181 , 232, 354, 359, 371 theory of, 56, 173, 301, 310 Logic of propositions, 379, 381, 397 probabilistic, 55, 284, 293 M Machine, computing, see Computers Machine, Turing, 36, 284, 377, 395 Malleability, of processing system, 355, 360, 362 Memory, 37, 52, 225, 243, 255, 280, 304, 310, 359, 363, 374 cellular, 201, 212 enduring, 189, 365 functional, 201 recent, 189 retention of, 189 Message, 12, 15, 19, 25, 301-305 {see also Binary units) Messenger, see RNA Metacommunications, 186, 354 Metalanguage, 186 Models, information processing, see In- formation processing Mutations, 61, 74, 80 chemically induced, 81 externally adaptive, 372 genetic, 63, 363 suppressor, 72 N Natural selection, 167, 373 Nets, see Neuron nets Neuron nets anastomotic, 283-295 logically stable, 287 with circles, 390 without circles, 382 Neurons input. 
290, 294 internuncial, 380 logical functions of, 286 output, 284, 292, 297 spontaneously active, 236, 390 storage, 235 threshold of, 289, 314, 381 Noise, 13, 18, 21, 22, 171, 283, 289, 296, 305, 330, 375 errors induced by, 26, 142 fluctuation in, 309 high frequency, 340 random, 12, 18, 22 Nucleotides composition, 94, 95 sequence, 83, 101 triplets, 82, 97, 114 Numbers, binary, 27, 29, 30 410 Information Storage and Neural Control O Omission, 260, 311, 317, 319, 324, 325 Order functional, 355 structural, 355 Outputs, 38, 142, 233, 288, 292-297, 302-308, 311, 318, 326, 356 Overload, see Information overload Perception, 283, 284, 304 Performance erroneous, 269, 275 principle factors limiting, 309 under overload conditions, 320, 323 Period analysis, 189, 329, 340-348, 359 Photosynthesis, 154-156, 164, 165 Planaria, 368 cannabalism studies, 370 ■ Poliovirus biosynthesis of, 132 multiplication of, 133 properties of, 127 Prediction, 174, 181 ProbabiHstic models, 230, 233 Probability, 10, 11, 19, 20, 25, 41-43, 143, 145, 326 Problem solving, 35 Protein specificity, 59, 68 structure, 62, 70, 72 synthesis, 59, 64-68, 70, 108, 124, 371 viral, 129-132 Purines, 60, 61, 76, 79 Pyrimidine, 60, 61, 76, 79, 103 Q Queuing, 311, 317, 321, 324 R Random processes, 232 Receiver, 6, 25, 303 Redundancy, 7, 14, 25, 26, 84, 122, 173, 181, 283 Reinforcement, 175-177, 259, 269,272 Replication phage, 130 virus, 123, 132 Response, 175-177, 197 behaviorally appropriate, 257 conditioned, 190, 211, 245, 248, 253, 263, 317, 371 differentiated, 271 generalization of, 248, 258 graded, 235 labeled potentials, 245 Reverberation, 37, 280, 282 Ribonucleic acid (RNA) and information storage, 120 and memory, 244, 280 base sequence, 66, 73, 222 cellular concentration after stimula- tion, 211, 221 general characteristics, 103 informational, see messenger messenger, 64-67, 74, 76, 80, 98, 108- 114, 124, 363 ribosomal, 64, 105, 124, 363 synthesis. 
111, 121, 125, 368 total cellular, 105 transfer, 64-66, 72, 103, 112, 131 virus, 74, 103, 107, 114, 128, 132, 139 Ribosomes, 64, 139 Shannon's Theorem 10, 142, 144, 148 Signals, 11, 13, 23, 233, 283, 289, 290, 305, 310 electroencephalographic, 330 electrocardiographic, 343, 347 meaningful, 237 random, 18 reconstituted, 335 synchronous, 284 Signals-in-noise, 230, 232 Specificity genetic, 62 protein, 59, 62, 68 Stimuli, 176, 238 concurrent peripheral and central, 264 conditioned, 211, 248, 252, 258, 264- 266, 271, 275 flicker, 264 peripheral, 252, 266, 269 photic, 265, 269, 271, 275 tracer, 245, 271 Subject Index 411 Storage {see also Information storage) capacity, 8-11, 375 mechanisms, 210, 370 long-term, 244, 308, 369 short-term, 195, 197 Symbols, 14, 52, 53, 362 Synapses, 37, 234, 313, 365, 379, 382, 388, 390 alterable, 390 delay across, 380, 382, 387 excitatory, 381, 385, 388 inhibitory, 382, 385, 388 irreciprocal, 360 reciprocal, 360 System auditory, 192, 306 biological, 17, 18, 57, 305 communication, 8, 184 Darwinian, 372, 373 dissolution of, 325 equivocation of, 16 genotypic, 372, 373 homogeneous, 357 Lamarckian, 372, 374 non-linear, 234 protein synthesizing, 65 output of, 38 permanently altered, 357 random, 11, 25 receiving, 20 static, 8 visual, 306 Theorems I to X McCulloch and Pitts, 382-394 Theory behavior, 38, 53 mentalistic, 39 stochastic, 41, 43 Threshold [| changes in, 382 "8 differential, 265 neuron, 272, 289, 369 occlusion, 265, 270, 273 variation in, 388 Time domain, 329-348 Trophodynamics, 140, 141 Turing machine, see Machine, Turing u Uncertainty, 5, 17, 143, 145, 314 V Virus action formation of precursor molecules, 1 30 fowl plaque virus, 131 on cell synthesis, 126 poliovirus biosynthesis, 132 Virus replication, 123, 133 w Watson-Crick model for DNA, 60, 77