DISCRETE-OPERANT SCHEDULES OF REINFORCEMENT

By Dennis Marshall Lee

A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA
1978

ACKNOWLEDGEMENTS

I wish to express my gratitude to Dr. Marc N. Branch for providing valuable guidance in the formulation and conduct of the research herein, and to the members of my supervisory committee, Drs. H. S. Pennypacker, E. F. Malagodi, C. K. Adams, D. A. Dewsbury, and E. W. Davis, for their interest in and support of this endeavor.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
ABSTRACT
INTRODUCTION
METHOD
RESULTS
DISCUSSION
REFERENCES
BIOGRAPHICAL SKETCH

Abstract of Dissertation Presented to the Graduate Council of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

DISCRETE-OPERANT SCHEDULES OF REINFORCEMENT

By Dennis Marshall Lee
March, 1978

Chairman: Marc N. Branch, Ph.D.
Major Department: Psychology

The point of departure for the present research is the experimental analysis of animal behavior. Experimental paradigms for studying behavior in the animal laboratory prosper depending on the orderliness and generality of the data they produce. Free-operant conditioning procedures have enjoyed a good deal of popularity for this reason. Over the years the behavioral phenomena generated by operant schedules of reinforcement have become increasingly complex and subsequently increasingly difficult to analyze. The necessity for the development of alternative tactics appears imminent.

The formulation of tactics depends heavily on the datum one has chosen as a dependent measure. The fundamental datum upon which the operant approach has been built is response rate. The valid use of this measure requires that certain procedural restrictions be observed.
Response rate is legitimate as a dependent variable only when responses are free-operant. A free-operant is defined as an instance of behavior which takes a short time to execute, is easy to perform, and leaves the responding organism in the same place, ready to respond again. A pigeon pecking a key and a rat pressing a lever are examples. There are no imposed restrictions on the frequency with which the response can occur, hence the descriptor "free."

Rate of free-operant responding has proven to be inadequate as a measure in a variety of situations. Behavior in terms of rate can only be partially accounted for by schedule variables. The present study was concerned with the development of procedures for extending the study of the basic processes of operant conditioning theory to procedures and techniques utilizing the discrete-trial or discrete-operant. The objective here was to explore procedures in which probability of response was the dependent variable under conditions in which the response is restricted to programmed opportunities. A discrete-trials procedure imposes greater constraints on the measured response and provides the experimenter with additional parameters which define those constraints. Examples are trial length and the schedule of trial presentations. The experimenter has both greater control over response occurrence and a greater number of parameters to manipulate. The variables manipulated in the present research were intertrial-interval and the schedule of presentation of opportunities to respond. Opportunities were scheduled on either a fixed or variable time base. The schedules of reinforcement used were fixed-interval or fixed-ratio for two experimental groups; each consisted of three White Carneaux pigeons. The results suggested that responding on trials which are superimposed on schedules of reinforcement is differentially affected depending on the kind of reinforcement schedule used.
Probability of response is not affected by changes in intertrial-interval (ITI) value when the reinforcement schedule is fixed-interval, whereas it is with fixed-ratio. Under the same procedures, average latency to respond was found to increase with ITI value in the fixed-interval group but to vary unsystematically in the fixed-ratio group. Finally, randomizing trial presentations in time had little effect when the food contingency was fixed-interval but produced increases in response probability when it was fixed-ratio. Patterns of responding on the two schedules of reinforcement were analogous to those found in typical free-operant fixed-interval and fixed-ratio schedules.

INTRODUCTION

The experimental analysis of behavior has been constructed largely upon frequency of behavior as a basic datum. Frequency measures are applicable to, and indeed the raison d'être of, the free-operant paradigm. This experimental approach, with both its advantages and limitations, has determined to a large extent the nature and course of the science thus far. The importance of selecting a proper dependent measure cannot be overstated. Skinner (1950, p. 193) states: "Progress in a scientific field usually waits upon the discovery of a satisfactory dependent variable."

Skinner (1953) argues that frequency of behavior is both intuitively appealing and experimentally advantageous as a dependent measure. All psychologists, he maintains, are interested in how their clients or laboratory animals will behave at some future time. Often this potential for future behavior has been assigned to something in the organism at the moment. These internal states have various names: attitudes, opinions, dispositions, habits, etc. All are attempts to represent in the present organism some propensity for future action (Skinner, 1953, p. 69). Many of these traditional concepts in psychology, as well as less technical terms, invite explanation in terms of frequency.
Someone has a smoking habit if he smokes frequently. An individual is said to have an interest in football if he talks frequently about and watches football games. An organism possesses an instinct to the extent that a certain topography of behavior is emitted with a certain frequency.

The incorporation of this notion into an experimental method has resulted in some compromise. Certainly opportunity plays a large part in determining the frequency with which various behaviors occur. One cannot watch football when no games are being played. Smoking cigarettes is under the control of several variables, including the availability of smoking materials and social and legal restrictions. In most environments behavior is constrained by opportunity. The free-operant paradigm, on the other hand, arranges environments without this kind of constraint. It may not be a particularly good analog of non-experimental environments if it preempts the consideration of important variables.

One major difficulty with free-operant schedules of reinforcement is the manner in which relationships between independent variables and behavior are specified. Schedules of reinforcement are descriptions of the temporal arrangements of experimentally controlled events with certain aspects of behavior. Schedules specify the way in which independent variables are applied: how the variables of interest contact the behavior of interest. If the schedule described completely all possible conjunctions between the independent variables and the behavior, then any changes in behavior could be attributed simply to that contact, the influence of the independent variables.

Unfortunately, the contact between variables and behavior is not so simple. The temporal conjunctions between variables and behavior prescribed by a schedule constrain to an extent, but do not fully determine, the actual manner in which variables will contact the behavior. Much depends on what the organism does.
The interaction between the schedule and an animal's behavior results in a complex of empirically generated conditions or variables (Jenkins, 1970, p. 66). These variables are a source of control over behavior just as are the explicitly defined schedule variables. They have the disadvantage, however, of not being directly manipulable by the experimenter.

In fixed-interval schedules, for example, the presentation of a reinforcer is conditional on the execution of a single response following a minimum time. This is a description of the formal rules which partly constrain the behavior. The number of responses which occur in a given interval and their temporal relationship with the reinforcer are not specified by the schedule. These are parameters which emerge as a result of the schedule-behavior interaction and are not directly manipulated by the experimenter. Their contribution to the observed behavior is unknown unless they can in some way be manipulated and evaluated. An unspecified portion of the variability is determined by such indirect variables in all free-operant schedules of reinforcement. This characteristic of schedules is not unrecognized, however:

    . . . In summary, a scheduling system has no effect until an organism is exposed to it, and then it no longer fully determines the contingencies. (Skinner, 1966, p. 76)

This is in some ways a very unfortunate characteristic of schedules of reinforcement. It puts the investigator in the position of having to hypothesize what the contingencies really are. Functional relationships stated in terms of the formal schedule contingencies and some measure of behavior are, by this admission, over-simplified and inaccurate. The confounding of direct and indirect variables may not be important to investigators who are eager to display the "complexity" of the phenomena which their paradigm is capable of producing (e.g., Skinner, 1966; Morse, 1966).
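The formal fixed-interval rule just described can be written out as a short sketch. This is an illustration only, not part of the original study; the function name and the response times are invented. The point to notice is how little the rule specifies: it determines which responses are reinforced, but says nothing about how many responses occur within an interval or when.

```python
# Minimal sketch of the formal fixed-interval (FI) contingency: a
# reinforcer is delivered for the first response emitted after a minimum
# time has elapsed since the last reinforcer.  Response times here are
# hypothetical; the schedule itself leaves them entirely to the organism.

def fi_reinforced_responses(response_times, interval):
    """Return the times of responses that produce reinforcement under FI."""
    reinforced = []
    next_available = interval           # reinforcer "sets up" at this time
    for t in sorted(response_times):
        if t >= next_available:         # first response after the interval
            reinforced.append(t)
            next_available = t + interval   # interval restarts at reinforcement
    return reinforced

# Hypothetical response times (seconds) under an FI 60-s schedule:
responses = [10, 35, 55, 62, 70, 95, 118, 125, 130]
print(fi_reinforced_responses(responses, 60))   # only 62 and 125 are reinforced
```

The seven unreinforced responses in this toy record are exactly the "indirect variables" at issue: their number and spacing emerge from the schedule-behavior interaction rather than from the rule itself.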
The suggestion here appears to be that complexity in the laboratory is a good thing because the real world is similarly complex. The real world is indeed behaviorally complex, but it may not be similarly so. The analytical reduction of complex behavior to special combinations and arrangements of more elemental components may prove extremely difficult (Findley, 1962). Different levels of complexity in other sciences (e.g., physics, chemistry, and biology) are usually associated with different phenomena, methods, and laws. Assuming that behavior is at least as complex a subject matter, we might expect similar analytical stratification.

The reason that schedule-generated behavior patterns are so complex is that schedules of reinforcement fail to state adequately the relationship between independent variables and behavior. We may not really know what contingencies are operating at any given point in time:

    . . . A given set of contingencies yields a performance which combines with the programming equipment to generate other contingencies which in turn generate other performances and so on . . . . (Skinner, 1966, p. 25; emphasis added)

This is an amazing state of affairs for a science which seeks to establish relationships between contingencies of reinforcement and behavior. The contingencies' half of the equation is ambiguous. It is not simply a description of the schedule, since the programming equipment does not account for all contingencies. The description of conditions under which behavior occurs is not enough. We are led to speculate about the real conditions which are determining behavior on an "indirect" level. This situation is a result of the loose fashion in which variables are arranged with respect to behavior in the free-operant paradigm. A paradigm which encourages unambiguous arrangements would leave less speculating to do.
If the conditions under which each response occurred were specified in detail by the experimenter, then the contribution of each condition to the emission of that response could be evaluated. Free-operant schedules do not describe the conditions under which individual responses occur. Their contingencies are "loose" in the sense that they call attention to what the organism may do, rather than what he must do (Findley, 1962). The prevailing stimulus conditions and the local contingencies typically are not observable or explicit or manipulable.

This state of affairs is not due to the deviousness of the subject matter but to the choice of methodology. A chemist would not intentionally leave conditions in his experiments to chance in order that the results be "rich and varied." He controls all known variables in order that the influence of one factor may be evaluated. On the other hand, schedule research, out of a commitment to frequency as a dependent measure, acknowledges, but then experimentally ignores, the influence of a host of potential indirect variables. The paradigm does not allow these variables to be collectively controlled in any given situation since they are experimentally inaccessible (as indirect variables at least). The resulting "richness" and complexity of schedule behavior is seen as a virtue (Skinner, 1966). Significantly, performances on the simplest, most basic schedules have yet to be explained adequately. For example, there are "theories" of fixed-interval responding (e.g., Dews, 1970) because even simple schedule arrangements are incredibly complex. To reiterate, the only reason they are complex is that so many of the contingencies are deliberately left to chance. An evaluation of the relative contribution of directly manipulated variables is difficult because their effects are confounded by hypothesized indirect variables.
One of the defining characteristics of the free-operant paradigm is the opportunity for repeated occurrence of a single response. Time is "condensed," thereby allowing for the observation of a great deal of behavior in a short period of time. This effect has been likened to that of the microscope in the biological sciences. The repetition of behavior has the additional advantage of allowing repeated measurement over time. The methods of science are especially suited for analyzing repeatable events (Millenson, 1967, p. 157). Since behavior cannot be studied in its entirety it must be broken into units of special attention (Findley, 1962).

Because of the high density of behavior promoted by the paradigm, a sensitive and continuous measure of moment-to-moment responding is available. Continuity and sensitivity, however, appear to have been bought at the price of uncontrolled variability. The source of this unwanted variability seems to be the free-operant itself. The response, inasmuch as it occupies much of the organism's time and produces proprioceptive stimuli, becomes a significant stimulus factor which is an important source of control over the ongoing behavior. The stimulation arising from highly stereotyped repetitive serial responding may mask or distort the effects of manipulated variables (Blough, 1966a). An organism may, in a sense, control its own behavior to the detriment of external control. The microscope analogy is a poor one in retrospect. The concentration of responses in time is more comparable to pumping up cells so they can be more easily seen, not simple magnification. Unlike the free-operant paradigm, the microscope is relatively unobtrusive.

The idea that the organism's own behavior may be a powerful stimulus is not new. Ferster and Skinner (1957, p. 580) described a random mixed fixed-ratio (FR) schedule in which the response requirement was either 30 or 190 responses.
Fixed-ratio responding typically consists of a pause after reinforcement followed by a high steady rate of responding until the next reinforcer is presented. On a well-maintained simple FR190, responding, once begun, usually continues until the ratio is completed. A random mixed FR30, FR190 schedule provides no external signal as to which ratio is in effect at any given time. The only significant programmed event is the reinforcer presentation. If the pigeon's behavior were controlled by this event exclusively, responding would be expected to continue in typical fixed-ratio fashion until reinforcement.

The results of the experiment suggested other factors were involved. Performance was as expected during the FR30 component. However, pausing was observed after approximately every thirtieth response when the requirement was 190. The interpretation offered by Ferster and Skinner was that the pigeons had come to discriminate their own performance. In other words, pecking behavior in the pigeons provided a powerful source of stimulation which in turn determined whether and how pecking would occur in the future. In this case the behavior of pausing was also related to the reinforcement contingencies. The contingencies do not explain why the pausing occurred, however. It would be of no advantage, and perhaps a slight disadvantage, for the birds to behave as they did (in terms of reinforcement maximization). One possible explanation is that each response acts as a member of a behavioral chain (e.g., Ferster and Skinner, 1957). In this view the occurrence of a response functions as a discriminative stimulus for the next response. A similar formulation was proposed by Mechner at about this time.

The concept of "internal cohesion" in ratio schedule response runs was advanced by Mechner (1958a). In his study rats were first trained on a 16-response fixed-ratio schedule on lever A.
After this, 50 percent of the runs required that a single response be made on lever B following the completion of the run. If the switch to lever B occurred prior to the completion of a run, the ratio was reset. In this way Mechner devised a system by which run terminations would be indicated by a measurable second response rather than by the occurrence of pauses, which were previously inferred to be terminations. He found that the minimum number of responses required for reinforcement determined the lengths of obtained response runs when termination was defined as a response on the second lever. The responses in the run were characterized by Mechner as being bound together as a cohesive unit.

The apparent dependence of one response on another within a run encouraged the search for other kinds of response dependencies. In a subsequent paper Mechner (1958b) demonstrated that run lengths could be sequentially dependent. He found the length of a run to be related to the length of the immediately preceding run. On some occasions cyclic fluctuations in the length of successive runs were seen. These observations indicated another level of response interdependence. It appeared that response runs could function as discriminative stimuli for the occurrence of certain other runs of specific length. Furthermore, there was a suggestion that series of runs could be sequentially dependent; a kind of higher-order dependency.

The investigation of response dependencies has proceeded by less direct means via the analysis of interresponse times. An interresponse time (IRT) is the elapsed time between the initiation of a response and the next response. An IRT is generally considered to be a property of a response (Morse, 1966), although in its original conception (Anger, 1956) it was considered a stimulus (i.e., the time between two responses).
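Mechner's two-lever run-termination procedure can be sketched as a small state machine. The sketch below is a hypothetical reconstruction from the description above, not Mechner's apparatus or program; the event coding and the sample sequence are invented. It shows the key design feature: run length becomes directly measurable because each run is closed by an observable response on the second lever.

```python
# Hypothetical sketch of Mechner's (1958a) two-lever procedure: a run on
# lever A must reach the ratio requirement before a single response on
# lever B is reinforced; a premature switch to lever B resets the ratio.
# Either way, the B press marks a measurable run termination.

def mechner_runs(events, requirement=16):
    """events: sequence of 'A' or 'B' lever presses.
    Returns (reinforcers delivered, run length recorded at each B press)."""
    count = 0
    reinforcers = 0
    run_lengths = []
    for lever in events:
        if lever == 'A':
            count += 1
        else:                        # a B press terminates the current run
            run_lengths.append(count)
            if count >= requirement:
                reinforcers += 1     # run met the minimum; B press reinforced
            count = 0                # early switch (or reinforcement) resets
    return reinforcers, run_lengths

# A run of 16 A-presses ended by B is reinforced; a run of 10 is not.
events = ['A'] * 16 + ['B'] + ['A'] * 10 + ['B']
print(mechner_runs(events))          # → (1, [16, 10])
```

In this arrangement the distribution of `run_lengths` is the datum of interest, rather than an inferred pause, which is what let Mechner describe the run as a cohesive unit.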
Patterns in the distribution of IRTs may reveal response dependencies without the deliberate arrangement of variables to test for their occurrence (cf. Mechner, 1958a, b). All schedules are subject to this kind of post-hoc analysis. Gott and Weiss (1972) performed a detailed IRT analysis of the transition of fixed-ratio responding from FR1 to FR30 in pigeons. Following the transition to FR30, the first few ratios consisted mostly of short IRTs (less than one sec). Steady responding then gave way to a pattern in which many long IRTs (10-20 sec) occurred throughout the ratio. This increased variability would be expected, as the abrupt transition was extinction-like with respect to the subjects' experimental histories. Eventually a new pattern evolved in which a long post-reinforcement time was followed by a "border" region in which responding began and accelerated briefly. A sustained run of responses with short IRTs concluded the ratio. Eighty percent of all responses occurred in this run.

The final FR performance could be described as comparable to that found by Ferster and Skinner (1957) and Mechner (1958a, 1958b). The responses appeared to be emitted in cohesive streams. An IRT analysis of the responding revealed that the runs had special characteristics which could be correlated with topographical features of responding. This analysis, then, was able to shed light on the source of fixed-ratio run "cohesiveness." If ratio responding were homogeneous in the sense that individual pecks occurred on a regular cycle, the expected IRT distribution would be unimodal. That is, the times between most of the responses would be about the same, resulting in one category of IRT collecting most of the responses. In fact, most of the subjects in this study produced multimodal distributions (with up to 5 modes). What was particularly striking about the distributions was that the modes were spaced at equal time intervals from time zero.
Because of this characteristic Gott and Weiss called them harmonic modes or harmonics. (Similar distributions have also been observed on VI schedules; Blough, 1963; Weiss, 1970.) Visual observation of the pigeons with the most pronounced harmonics revealed a response topography consisting of ineffective "pecks" which either struck the wall or ended short of any physical contact with the key. The authors surmised that the harmonics were due to a rhythmic pecking motion which resulted in switch closures in some cases and misses in others.

Other consistent variations in topography were also noted. Some subjects repeatedly "bit" at or grasped the metal edge of the key opening. Others "nibbled" the key by making slight rapid bobbing motions accompanied by opening and closing of the beak against the key. These topographies tended to generate high rates of "responding." It is noteworthy that these topographies persisted over several months. Many subjects maintained harmonic-type responding despite the fact that it was inefficient. Others showed a small nibble component over many weeks without any increase in time spent nibbling. This is odd since this topography produced the highest rates and therefore the maximum potential reinforcement rate. These observations suggest that variants in response topography may emerge on fixed-ratio schedules and persist in a stereotypical fashion. Not all ineffective variants are selected out, nor are effective variants necessarily selected for. This brings into question the definition and identification of the response. It would be difficult indeed to say what fragment of the ongoing behavior comprised an operant. How many operants are emitted during an episode of key nibbling? Are wall pecks and key-feints operants?

Peculiarities in pecking have often been noted informally and communicated anecdotally. Variations of the "ideal" pecking response are probably the rule rather than the exception.
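The kind of IRT analysis Gott and Weiss performed can be illustrated with a short sketch. The response times and bin width below are invented for illustration, not their data; the point is that a fixed motor cycle in which only some "pecks" operate the key produces recorded IRTs clustered at integer multiples of the cycle time, i.e., the harmonic modes described above.

```python
# Illustrative sketch (not Gott and Weiss's data or program): compute
# interresponse times (IRTs) from recorded response times and bin them.
# A rhythmic 0.4-s peck cycle with occasional misses yields IRT modes at
# 0.4, 0.8, 1.2, ... seconds -- equally spaced "harmonics".

from collections import Counter

def irt_histogram(response_times, bin_width=0.1):
    """Bin successive response-time differences (IRTs) into a histogram."""
    irts = [t2 - t1 for t1, t2 in zip(response_times, response_times[1:])]
    # Round the bin label to avoid floating-point residue in the keys.
    return Counter(round(round(irt / bin_width) * bin_width, 3) for irt in irts)

# Hypothetical record: pecks 0.4 s apart, with two missed cycles.
times = [0.0, 0.4, 0.8, 1.6, 2.0, 3.2, 3.6, 4.0]
hist = irt_histogram(times, bin_width=0.4)
print(sorted(hist.items()))    # modes at 0.4, 0.8, and 1.2 s
```

A unimodal histogram here would indicate the homogeneous, regular-cycle responding described earlier; the multimodal result signals that some cycles of the motor pattern never reach the key.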
It has been hypothesized that topographical variability is the result of the intermittency with which reinforcers are presented in typical schedules (Schoenfeld, 1950). Schoenfeld contended that intermittent schedules produce response variants via an extinction-like process and that those variants are then either reinforced or extinguished. It is well known that extinction produces increased variability in rate of responding (Skinner, 1938) and in topography as well (Antonitis, 1951). Antonitis found that guinea pigs and white rats would vary the position at which they thrust their noses into a slot (the reinforced response) during extinction, whereas they had established position preferences on regular reinforcement (CRF).

Intermittent schedules appear to have similar effects on responding. Variability along quantitative response continua (e.g., force, displacement, and duration) is greater under interval and ratio reinforcement schedules than under CRF (D'Amato and Siller, 1962; Herrick, 1965; Herrick and Bromberger, 1965; Notterman and Mintz, 1965). Ferraro and Branch (1968) found that a VI schedule resulted in more variability in response location along an eight-location response strip than did regular reinforcement (CRF). In a similar experiment Eckerman and Lanson (1969) found that variability in response location increased when fixed intervals, variable intervals, random intervals, or extinction were scheduled. The generation of topographical variants appears to be a consistent characteristic of intermittent free-operant schedules. To the extent that these variants are functionally independent, they represent distinct operant classes (Blough, 1966b; Smith, 1974). For variants which include more than one measured response, the dependent variable of rate is misleading.

The analysis of interresponse times has revealed dependencies in other schedules as well.
During Sidman avoidance (Wertheim, 1965), successive IRTs show trends which indicate such dependencies. Sequential response effects have also been observed on DRL schedules (Farmer and Schoenfeld, 1964; Ferraro, Schoenfeld, and Snapper, 1965; Kelleher, Fry, and Cook, 1959). In the Ferraro et al. study, subjects were found not to maximize reinforcement density. The obtained first-order sequential dependencies indicated that the subjects tended to repeat the IRT just emitted whether it was reinforced or not. In effect, the schedule contingencies were not related in any obvious way to the probability of occurrence of some responses (see also Weiss, Laties, Siegel, and Goldstein, 1966).

Responding generated by variable-interval (VI) schedules of reinforcement often serves as a baseline in many kinds of experiments. Blough and Blough (1968) performed an IRT analysis of VI performance and found evidence that response-produced stimuli, that is, the animal's own behavior, may be a prominent source of control over responding. Interresponse times did not occur with equal probability, as might have been expected from the schedule contingencies. The obtained IRT distributions were irregular and idiosyncratic. The most frequently emitted IRTs would presumably be those which were most often reinforced. But which IRT is reinforced depends on which is emitted; that is, it depends on the organism (Shimp, 1973a). Blough and Blough also reported that short IRTs tended to follow short IRTs and long IRTs to follow long IRTs. This indicated that certain classes of IRTs were emitted in interdependent cohesive sequences. They concluded that other variables such as topography of the response, stereotypy, and response chaining affect rate on a VI schedule. All instances of the measured response, then, may not be equivalent; they may vary in their susceptibility to experimental manipulations to the extent to which these other factors exert control over their emission.
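The first-order sequential dependencies reported in these studies, short IRTs following short and long following long, amount to a lag-one statistic on the IRT series. A minimal sketch, with an invented IRT series and a hypothetical one-second short/long cutoff (neither taken from the cited studies):

```python
# Minimal sketch of a first-order (lag-1) sequential analysis of IRTs, in
# the spirit of the Ferraro et al. and Blough and Blough analyses cited
# above.  Under sequential independence the repeat probability would
# simply reflect the base rates of the two classes; cohesive runs of
# similar IRTs push it well above that.

def lag1_repeat_probability(irts, cutoff=1.0):
    """Probability that an IRT falls in the same class (short/long)
    as the IRT immediately preceding it."""
    classes = ['short' if irt < cutoff else 'long' for irt in irts]
    pairs = list(zip(classes, classes[1:]))
    repeats = sum(1 for prev, cur in pairs if prev == cur)
    return repeats / len(pairs)

# A series in which short and long IRTs occur in cohesive runs:
irts = [0.3, 0.4, 0.2, 0.5, 2.1, 1.8, 2.4, 0.3, 0.2, 0.4]
print(lag1_repeat_probability(irts))   # 7 of 9 transitions repeat the class
```

A repeat probability near the base rate would suggest control by the schedule alone; the elevated value produced by run-structured series is the signature of response-produced control described above.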
Some techniques have been devised which artificially generate statistically stable and reproducible distributions of IRTs so that the effects of independent variables can clearly be observed (Anger, 1954; Blough, 1966b; Shimp, 1973b). For example, Blough (1966b) devised a synthetic VI schedule by which he sought to produce an ideal baseline which would be free of the stereotypy and response dependencies typical of normal VI schedules. The schedule selectively reinforced pecks which terminated IRTs that occurred least often relative to the distribution of IRTs that would be expected from a random generator. In brief, the schedule favored local IRT variability by reinforcing unlikely IRTs. This procedure would tend to break up response chains which contained frequently emitted IRTs via extinction and reinforce other, less frequent IRTs. This would be a continuing dynamic process, since those IRTs which became more frequent through reinforcement would subsequently be selected against.

One class of IRT (those of less than 0.8 sec) showed large individual differences in terms of IRTs per opportunity (an estimate of the conditional probability that a response will occur), showed little or no effect of the reinforcement contingency, and demonstrated the largest session-to-session variability relative to all other IRT classes. These IRTs showed no sign of diminishing though they were never reinforced. They would be of little interest except that they constituted a very large proportion of some birds' recorded responses.

Responses that are emitted in cohesive runs or are primarily a function of the preceding response are not equivalent to responses whose source of control originates
The stimulus conditions fluctuate widely with respect to time and events. The consequences of responses are also somewhat unique. There are at least two different operants in all schedules on the basis of consequence alone; those that are reinforced and those that are not. When the effects of a schedule are highly variable or unreliable, the variability may be founded in the multitude of different operants we define as one. If the effects of a contingency are reliable in terms of overall rate, we have obtained order — but to what end? The unit of analysis in a science is crucial. The free-operant paradigm gives us rate, but rate of what? Schedules may be used as tools with which to analyze variables operating in other schedules. This, it will be remembered is the experimental strategy on which the analy- sis of schedule behavior is founded (e.g., Skinner, 1938; Ferster and Skinner, 1957; Morse, 1966; Zeiler, 1977). The aforementioned variables take many forms which reflect different levels of explanation. Ferster and Skinner in their book Schedules of Reinforcement (1957) were interested in explanations in terms of a small number of elementary processes. Jenkins describes this view. It is widely held that although the determination of performance under reinforcement schedules is extremely complex, the performance will ultimately be understood as a product of the combined action of a small number 18 of familiar, elementary conditioning processes .... processes that include, for example, primary and con- ditioned reinforcement, extinction, response differen- tiation and generalization or discrimination .... (Jenkins, 1970, p. 23) Ferster and Skinner concluded that all the phenomena pro- duced by the sequencing of events in schedules were re- ducible to the operation of these elementary conditioning processes . Interestingly — twenty years after Ferster and Skinner — the level of explanation has shifted somewhat. 
Zeiler (1977) deals less with hypothetical processes and more with variables which are directly manipulable: number of responses per reinforcer, interreinforcement time, and response dependency. This is not to say that Ferster and Skinner neglected these variables, only that the level of explanation has tended away from the elementary-conditioning-processes approach and toward one which is in terms of schedule variables and parameters. This is just the reverse of the trend one would expect if the intervening twenty years had produced evidence for general principles of behavior. Jenkins comments on the pursuit of principles of behavior:

    The laying out of elementary conditioning processes as though they were firm foundation blocks on which new construction might be erected no doubt suggests a certain naivete about the state of our knowledge of conditioning. Perhaps a more apt metaphor would have us building on shifting sands. It is clear that the so-called elementary processes, although they may be familiar, are not themselves well understood (p. 106) . . . . Every finding seems capable of many explanations. Issues become old, shopworn, and disappear without a proper burial. (Jenkins, 1970, p. 108)

More recent explanations in terms of schedule variables and parameters are, unfortunately, subject to the same criticisms. The building blocks of behavior are elusive, and yet the schedule data are reasonably orderly and reproducible. An alternative explanation is that schedules represent basic arrangements of stimuli and behavior which interact in complex ways; that each schedule is a unique confluence of variables which fundamentally determines behavior.

Morse and Kelleher (1970) coined the phrase "schedules as fundamental determinants of behavior." Their usage of the phrase reflected their belief that past schedule experience was a crucial determinant of present schedule behavior.
This included not only performances in the distant past but behavior which had just occurred (1970, p. 183). Morse and Kelleher (1977, p. 198) state, "as soon as behavior develops, its history becomes important." Zeiler (1977) also discusses the fundamental nature of schedules:

If each schedule represents a particular conjunction of variables, the only way of managing that conjunction is by establishing that schedule . . . to the extent that the precise interaction of multiple variables is responsible for a distinctive performance, each schedule is a fundamental arrangement. (Zeiler, 1977, p. 229)

The two characterizations are by no means incompatible. Morse and Kelleher emphasize the importance of momentary stimulus conditions, past and current reinforcement contingencies, and the qualitative and quantitative properties of ongoing, immediately past, and distant past behavior in determining subsequent behavior. In sum, all aspects of the environment, past and present, and behavior, past and present, are potential determinants of behavior. An example (researched extensively by Morse and Kelleher, 1970, 1977) is the manner in which the effects of consequent stimulus events are determined by schedule context, schedule history, and ongoing aspects of behavior. Stimulus events such as shock and food have implicit, common-sense connotations which have carried over into scientific thinking as well. The designation of stimuli as positive or negative involves the assumption that presentation or withdrawal of a particular stimulus will have an invariant effect. Kelleher and Morse (1968) have found that electric shock can produce different effects depending on the schedule used. In their study, squirrel monkeys were trained to produce electric shock on a fixed-interval 10-minute schedule of shock presentation. This was accomplished by superimposing an FI 10-minute schedule on a variable-interval (VI) food schedule and then gradually eliminating the VI food schedule.
Following this, a one-minute period in which each response produced an electric shock was appended to the FI 10-minute shock schedule. Positively accelerated responding, characteristic of performance under fixed-interval schedules of food presentation, was maintained during the first 10 minutes, but responding was suppressed during the eleventh minute, in which each response produced an electric shock. This was a case in which the same stimulus event (a 12.6 mA electric shock) both maintained and suppressed responding. In addition to schedule arrangements, experimental history can also be a critical determinant of whether a particular event will increase or decrease behavior. Responding maintained under a schedule of food presentation is usually suppressed by the presentation of intermittent electric shocks (Azrin, 1956; Estes and Skinner, 1941). The presentation of shocks has the opposite effect, however, when responding is being maintained under a shock-avoidance schedule. For example, a fixed-time schedule of shock presentation has been shown to increase dramatically responding by rhesus monkeys which were maintained on a shock-avoidance schedule (Sidman, Herrnstein, and Conrad, 1957). Based on these studies and many others, Morse and Kelleher (1977) argue that the same stimulus event under different conditions may increase or decrease behavior. The premise is that all events, both environmental and behavioral, are qualified by their context and the history of the organism. Zeiler's statement suggests much the same idea, excepting an explicit reference to history: variables which occur in unique arrangements interact in precise ways which, in effect, qualify the contribution of such variables. The following illustration may be helpful in understanding the implications of the concept. There are twenty-six letters in the English language. Each letter is identifiable and can be arranged with respect to other letters to form words (e.g.,
nouns) which have some effect on a reader. The contribution of each letter to the word is qualified by its relation to other letters, and they in turn are qualified by it. Words are, in a sense, fundamental arrangements of letters which have an effect unique to their construction. The effect of single letters or non-word groups of letters would not be the same on the listener. The effect of a single letter could not be evaluated by adding it to other groups of letters (e.g., pat, rap, tap plus "e" yields pate, rape, tape) or subtracting it, for that matter. The words are fundamental arrangements. Similarly, the effect of adding or subtracting a stimulus event such as shock, or substituting it for another event such as food, depends on the context of a situation. Tandem and chained schedules differ by only one feature: the component discriminative stimulus. And yet that one variable cannot be changed without significantly altering the interaction of other variables, to the extent that the two schedules cannot be unambiguously compared (Gollub, 1977). Morse and Kelleher are trying to impress on us the subtleties which exist in schedule-controlled behavior. Reinforcement does not simply consist of making food contingent upon a response. Present and future behavior depend on the sequential ordering of behavior. While admitting that their approach necessarily complicates simpler traditional formulations of behavior, they argue that it is more valid since the phenomena studied are closer to those of ordinary behavior. Their immediate goals are expressed in the following passage:

An understanding of such reproducible behavioral processes is to be found in the exact characterization of the temporal relations among the events comprising such processes and in the specification of the conditions under which they occur.
(Morse and Kelleher, 1977, p. 175; emphasis added)

As apparent advocates of microanalysis and as harbingers of a new era of increased complexity in the laboratory, they do not appear worried about whether eventual order is realizable via traditional schedule methodology. If understanding is to be found in the exact characterization of temporal relations among events and the specific conditions under which they occur, rate measures will have to be reconsidered, because rate is a summary of many temporal relations and many specific conditions. Morse and Kelleher have demonstrated that behavior is multiply determined using the free-operant paradigm. The kind of future analysis implicated by their demonstrations may not be accomplishable within that framework. Free-operant schedules do not specify the fine-grain aspects of behavior and environment which are commensurate with the suggested level of analysis. In order to evaluate the specific conditions under which events occur, those conditions need to be accessible, measurable, and manipulable. Local contingencies (of the kind suggested here) are not directly accessible, measurable, or manipulable in free-operant schedules. Morse and Kelleher suggest that schedules define reproducible behavioral processes; that the schedule itself determines a certain outcome and that schedule effects supersede traditional formulations in terms of basic processes. This is another way of saying that schedules are fundamental determinants of behavior. One problem with this conceptualization is that every new schedule defines a new process. A catalog of schedule processes is a poor substitute for an understanding of behavior in terms of general basic principles (Jenkins, 1970, p. 107). Unless the level of control and analysis shifts to take into account the immediate conditions and temporal relations which determine the occurrence of single operants, botanizing will be the extent of our analysis.
Zeiler's reference to the fundamental nature of schedules appears as an afterthought which is meant to emphasize the complexity of schedule performance and the limitations of a methodology which assumes that variable interactions do not occur. He offers no practical or theoretical solutions to these problems. Schedules are important and ubiquitous, he suggests, so we should keep studying them. It may be necessary to change some aspects of schedule analysis in order to avoid, or at least minimize, some of the problems inherent in free-operant schedules. It is proposed here that an approach utilizing discrete operants may be more appropriate. A discrete-operant procedure imposes greater constraints on the measured response and provides the experimenter with additional parameters which offer precise momentary discriminative control over the occurrence of responses. The experimenter may now formally program response antecedents (trials) and explicitly manipulate the temporal spacing of operants with respect to the reinforcing stimulus, other operants, and other significant events. Operant behavior may be analyzed by superimposing opportunities (trials) to respond on top of schedules of reinforcement and then recording the relative frequency with which responses occur during these opportunities. Confounding factors such as response-produced stimuli and sequential dependencies would be minimized by the spacing of the opportunities and the control exerted by the trial-on stimulus. The question of unit size is also handled nicely within this paradigm. Previously one could never tell what size unit was being conditioned, since only one aspect of the unit (its end point) was directly observed. The discrete-operant approach, by directly manipulating response antecedents, provides a reasonable estimate of the unit's "start point." The time elapsing between trial onset and the response (latency) could be a measure of the topographical duration of the unit (not unlike IRTs).
The discrete-operant approach has a certain intuitive appeal as well. Much of human behavior appears to consist of single discrete responses (some may involve considerable behavior) which occur in specific situations or contexts. Usually our behavior alters the environment so that a repetition of the same response is either unnecessary or impossible. Merely by moving about, an organism changes the portion of the environment that it confronts. Jenkins observes, "Neither men nor animals are found in nature responding in an unchanging environment for occasional reinforcement" (Jenkins, 1970, p. 107). This is not to say that the laboratory must mimic or study behavioral analogs of human behavior in the real world to be useful or effective. However, the apparent ubiquity of control of behavior by antecedent environmental stimuli and the apparently discrete nature of responses suggest that a laboratory analogue may be profitable. The experiments contained herein are essentially an exploratory investigation into schedules of reinforcement in which opportunity to respond is programmed in addition to reinforcement contingencies. The rationale for this kind of approach was given above. The immediate objectives were to produce and describe phenomena resulting from basic parametric manipulations of response opportunity. There are many tacks one could take in the beginning. The one chosen here involves the manipulation of the number and the spacing of opportunities (trials) against a backdrop of two common schedules of reinforcement: fixed-interval and fixed-ratio. The independent scheduling of trials provides a new dimension in the control over behavior. It is the purpose of the present research to explore the potential of this new dimension.

METHOD

Subjects

Six adult male white Carneaux pigeons were individually housed and maintained at 80 percent of their free-feeding weights. All pigeons were experimentally naive at the beginning of the first experiment.
Apparatus

The experiments were conducted in three standard pigeon chambers: two 3-key (BRS-LVE, Inc.) and one single-key modular (Coulbourn Instruments, Inc.). Only one key was used in the three-key chambers, although the other two keys remained uncovered. A force of approximately 15 g (0.15 N) was required for effective operation of the response key, which produced a 0.04-sec tone provided by a Sonalert (P.R. Mallory & Co., Inc.). Mixed grain was presented via a 4-sec operation and illumination of a food hopper. There was no houselight, and the chamber was dark except for occasional feeder and keylight (green) operations. White noise was present continuously in the room containing the chambers. In a separate room, experimental process control and data collection were accomplished using a PDP-8F minicomputer operating under the SKED Software System (Snapper, Stephens, Cobez, and VanHaaren, 1976). Records of session responses were produced by a cumulative recorder (Ralph Gerbrands Co.).

Procedure

Experiment I. The pigeons were trained to peck the key by the method of successive approximations following magazine training. Three birds (2577, 2564, and 998) were successfully autoshaped to peck during one or two sessions following initial failures to hand-shape. No systematic differences were found between those birds which were autoshaped and those which were not under subsequent procedures. All pigeons were then exposed to 3 days of a procedure in which the keylight was illuminated randomly on the average of once every 10 seconds. A single response to the lit key darkened it and produced food one-third of the time. Responding occurred to nearly every instance of keylight-on by the third day. The six pigeons were then randomly assigned to two groups of three. No members of a group were run in the same chamber. In one group (Pigeons 2577, 339, and 1771), a fixed-interval 3-minute (FI 3-min) schedule of food presentation was implemented.
The first response to a lit key after 3 minutes produced food. The interval was timed from the end of the previous food presentation. Opportunity to respond (keylight-on) was arranged on a different schedule. Following food, the keylight would be illuminated for 3 seconds, after which it was dark for a period of time. A lit key was designated as a trial or "opportunity," and the time between trials (when the key was dark) was called the intertrial interval (ITI). The onset of a trial always followed a fixed time period from the previous trial onset. This time period was comprised of the trial (3 sec) and the ITI, which was parametrically varied. The ITI values were 1, 9, 17, and 27 seconds for the fixed-interval group. The Experimental Phases Table lists the number of sessions each parameter value was in effect and the sequence in which they occurred. The ITI values, although fixed, varied somewhat, since responses extinguished the keylight for the duration of the trial. Short-latency pecks could add close to 3 seconds to the time that the key was dark. This time was presumably indistinguishable from the ITI (from the pigeon's point of view). Trials were scheduled so that an opportunity to peck would occur at regular intervals (as opposed to variable intervals) and also at the precise moment the FI 3-minute timed out, making food available on the next response. Intertrial intervals were selected which would accommodate these requirements. The numbers of trials which were programmed during a single 3-minute interval were 45, 15, 9, and 6. Those corresponded to the ITI values of 1, 9, 17, and 27 seconds. This meant that on the 46th, 16th, 10th, and 7th trials, respectively, the FI 3-minute timed out and food was presented contingent on a response. Each session was run for 20 food presentations or 60 minutes, daily, seven days a week with few exceptions. The second group consisted of Pigeons 2564, 998, and 114.
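The trial counts above follow directly from the trial cycle (3-sec trial plus ITI, onset to onset) dividing evenly into the 3-minute interval. A minimal sketch of the arithmetic (the function name is ours, not part of the original SKED program):

```python
# Number of complete trial cycles programmed within one fixed interval,
# where each cycle = trial duration + intertrial interval (onset to onset).
def trials_per_interval(fi_sec=180, trial_sec=3, iti_sec=1):
    cycle = trial_sec + iti_sec
    return fi_sec // cycle

# ITI values used for the fixed-interval group; food became available
# on the trial after the last complete cycle (the n + 1th trial).
for iti in (1, 9, 17, 27):
    n = trials_per_interval(iti_sec=iti)
    print(iti, n, n + 1)  # → 1 45 46, 9 15 16, 17 9 10, 27 6 7
```

This reproduces the values reported in the text: 45, 15, 9, and 6 trials per interval, with food on the 46th, 16th, 10th, and 7th trials.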
This group was exposed to a fixed-ratio (FR) schedule of food presentation in which food followed the emission of a fixed number of responses. Opportunity to respond (keylight-on) was arranged on a different schedule. Three-second trials were scheduled with intertrial interval values of 1, 5, 7, and 10 seconds. The Experimental Phases Table lists the number of sessions each parameter value was in effect and the sequence in which they occurred. In this procedure, as with the FI, the period of time the key was dark varied with the latency of response to the previous trial. This period was equivalent to the scheduled ITI only if no response was emitted on the previous trial. Otherwise it could be up to approximately 3 seconds longer. The fixed-ratio requirement was eleven (FR 11) in all phases of this experiment. Sessions were run and terminated as they were for the FI group. A phase was terminated when no trends were observed for the last five days in the response measures of a given group.

EXPERIMENTAL PHASES TABLE

Fixed-Interval 3-Min (Subjects 2577, 339, 1771)

    Trial Length    ITI     # of Days
    3"              1"      14
    3"              9"      18
    3"              17"     11
    3"              27"     13
    3"              9"      15
    3"              1"      18
    3"              27"     19

Fixed-Ratio 11 (Subjects 2564, 998, 114)

    Trial Length    ITI     # of Days
    3"              1"      18
    3"              5"      23
    3"              7"      15
    3"              10"     14

Experiment II. During the last phase of the previous experiment, the fixed-interval group was working under a fixed-interval 3-minute schedule of food presentation in which 3-second trials were scheduled with a fixed intertrial interval of 27 seconds. The last five days of this phase served as a baseline for the second experiment. In this experiment trials were programmed on a variable time base. Following the end of a food presentation or the termination of a trial (keylight off), a new trial would begin with a probability of 0.037 (1/27) at each tick of a 1-second clock. Trials were randomly programmed through the interval in this manner until the 3-minute FI timed out.
At this point a trial was automatically initiated. This procedure was in effect for 11 sessions. The fixed-ratio group was exposed to a similar procedure in which the ITI was varied with an average duration of 5 seconds. During the last phase of the previous experiment, all three pigeons in this group ceased responding. Two of the three (998 and 114) had to be autoshaped to regain the keypeck response. In the second experiment, an FR 11 schedule of food presentation was implemented in which opportunities to respond were scheduled a fixed time (5 seconds) after the termination of the previous trial. This procedure differed slightly from the fixed 5-second ITI phase in the first experiment in that the ITI was timed from actual trial termination rather than from the point where the trial would automatically terminate if no response were made. In this way all ITIs were 5 seconds regardless of the latency of response to trial-on. Trial initiations, on the other hand, could occur anywhere between 5 and 8 seconds apart, depending on the latency of response. After a baseline was established on the fixed 5-second ITI procedure, trials were scheduled with a variable ITI which had an average of 5 seconds. New trials now began with a probability of 0.20 at each tick of a 1-second clock. Following the emission of the eleventh trial response, food was presented. This procedure was in effect for 16 sessions.

RESULTS

Experiment I. Probability of response was calculated for each session by dividing the number of trials which were terminated by a response by the total number of trials presented in the session. Median probability of response was computed by taking the median of the individual session probabilities for the last five sessions of each phase or ITI parameter. Phase medians are plotted for the FI group in Figure 1. In the top half of the figure are the results obtained from an ascending sequence of ITI values (1, 9, 17, and 27 seconds; see Table 1).
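The variable-time trial scheduling used in Experiment II amounts to a Bernoulli draw at each 1-second clock tick, so intertrial intervals are geometrically distributed with a mean of 1/p seconds: about 27 seconds for p = 0.037 and 5 seconds for p = 0.20. A minimal simulation sketch of that logic (an assumed reconstruction, not the original SKED program):

```python
import random

def sample_iti(p, rng):
    """Seconds until the next trial begins: count 1-second ticks until
    a 'success' occurs with probability p per tick (geometric)."""
    ticks = 1
    while rng.random() >= p:
        ticks += 1
    return ticks

rng = random.Random(1978)  # fixed seed so the sketch is reproducible
for p in (1 / 27, 0.20):
    mean_iti = sum(sample_iti(p, rng) for _ in range(20000)) / 20000
    print(round(1 / p, 1), round(mean_iti, 1))
```

The empirical means converge on the scheduled averages (27 and 5 seconds), which is why the 1/27-per-tick rule produced a variable ITI comparable in average duration to the fixed 27-second ITI baseline.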
The lower half shows a descending series (27, 9, and 1 second), with a disconnected point which represents a second determination of the 27-second parameter following the descending sequence. The first value of the descending series (27 seconds) is the same as the last value of the ascending series. It can be seen that probability of response did not change systematically as a function of the length of the intertrial interval in the ascending sequence. In other words, under the fixed-interval food contingency, responses were made on the same proportion of trials when the absolute number of trials was varied between seven and forty-six per interval. A weak but consistent trend can be seen in the

[Figure 1. Median probability of response as a function of intertrial interval (ITI) time.]
[Figure (fixed-interval): latency relative-frequency distributions; only the axis label "RELATIVE FREQUENCY" is recoverable from the scan.]

[Figure 4 (fixed-ratio). Median probability of response as a function of intertrial interval time.]

. . . times were approximately 40, 80, 100, and 130 seconds, respectively. Figure 4 shows median probability of response plotted as a function of the programmed intertrial interval. The proportion of trials with responses to total trials presented decreased dramatically as the ITI increased. These data are from the last five days of each ITI value. Corresponding latencies are shown in the top half of Figure 5. Although changes were evident, there were no systematic trends common to all three subjects. In the bottom half of Figure 5, the percentages of total session trials occurring during the post-food pause (open circles) are plotted as a function of ITI value. The post-food pause was defined as the time following food presentation up to the first trial response. It can be seen that the percentage of trials occurring in the post-food pause increased as the ITI length increased, thus complementing the decline in probability of response observed in Figure 4.
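The session measures used throughout the Results (per-session probability of response, and the median of that probability over the last five sessions of a phase) can be sketched as follows; the counts below are illustrative placeholders, not the study's data:

```python
from statistics import median

def probability_of_response(trials_with_response, total_trials):
    """Per-session probability of response: trials terminated by a
    response divided by total trials presented in the session."""
    return trials_with_response / total_trials

# Hypothetical (trials-with-response, total-trials) counts for the
# last five sessions of one phase.
sessions = [(52, 60), (48, 60), (55, 60), (50, 60), (51, 60)]
phase_median = median(probability_of_response(r, n) for r, n in sessions)
print(round(phase_median, 3))  # → 0.85
```

One phase median per ITI value gives the points plotted in Figures 1 and 4.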
The filled circles represent the percentage of total trials accounted for by the post-food pause trials and response trials combined. That the majority of trials were of these two types indicates a definite two-state performance, in which no responding was followed by responding on every trial until food was presented. Figure 6 shows the percent-pause and combined data for the fixed-interval group. In contrast to the FR group, the pause functions are flat. The combined data indicate, however, the same kind of two-state performance found with the FR group. The data for pigeon 2577 on the descending

[Figure 5 (fixed-ratio): latency in seconds and percentage of total session trials; only a caption fragment, ". . . top half of the figure shows the median percentage of total session trials . . .," is recoverable from the scan.]