Structural Linguistics

"This book is the most important contribution to descriptive
linguistics since . . . Bloomfield's 'Language.'"
American Anthropologist

Zellig S. Harris 

Phoenix Books



Formerly Entitled
Methods in Structural Linguistics

The University of Chicago Press, Chicago & London 
The University of Toronto Press, Toronto 5, Canada 

Copyright 1951 under the International Copyright Union 

Published 1951. Sixth Impression 1963. Composed and printed by
The University of Chicago Press, Chicago, Illinois, U.S.A.


This set of structural methods for descriptive linguistics is intended
both for students of linguistics and for persons who may be interested
in the character of linguistics as a science.

For those who use linguistic methods in research or teaching, the tech- 
niques are given here in some detail, without employing the terminology 
of logic. For those who are primarily interested in the logic of distribu- 
tional relations, which constitutes the basic method of structural lin- 
guistics, a minimum of knowledge about language and linguistics has been 
assumed here. Chapters 1, 2, and 20 deal with the general character of 
linguistic methods. 

This book is, regrettably, not easy to read. A single reading should be 
enough for a picture of the operations and elements of linguistics. But 
anyone who wants to use these methods or to control them critically will 
have to work over the material with paper and pencil, reorganizing for 
himself the examples and general statements presented here. 

The procedures of analysis discussed here are the product and out- 
growth of the work of linguists throughout the world, to whose investiga- 
tions the meager references cited here are an inadequate guide. This 
book owes most, however, to the work and friendship of Edward Sapir and 
of Leonard Bloomfield, and particularly to the latter's book Language. 

In preparing this book for publication, I had the benefit of many dis- 
cussions with C. F. Voegelin and Rulon S. Wells III, and of important 
criticisms from Roman Jakobson, W. D. Preston, and Fred Lukoff. 
N. Chomsky has given much-needed assistance with the manuscript. 

Z. S. Harris 

January 1947 


Since this book was written, there have been several developments
which add to the general picture of linguistic methods, without
affecting the specific set of procedures presented here.¹

1. Sentence center: Chapters 12-19 show how sequences of morphological
elements constitute constructions at a higher level; but they do not
give a general indication of how these constructions constitute a
sentence. This can now be obtained from center-analysis, according to
which every sentence can be analyzed into a center, plus zero or more
constructions (which are adjoined next to specified elements of the
center or of a construction); in addition, specified elements of the
center or of a construction may be replaced by a suitable construction.
The center is thus an elementary sentence; adjoined constructions are
in general modifiers. Most constructions are themselves derivable from

2. Transformations: The basic approach of structural linguistics (in
this book) is to characterize each linguistic entity (element or
construction) as composed out of specified ordered entities at a lower
level. A different linguistic analysis can be obtained if we try to
characterize each sentence as derived, in accordance with a set of
transformational rules, from one or more (generally simpler) sentences,
i.e. from other entities on the same level. A language is then described
as consisting of specified sets of kernel sentences and a set of
transformations. The transformations operating on the kernels yield the
sentences of the language, either by modifying the kernel sentences of a
given set (with the same modification for all kernels in the set) or by
combining them (in fixed ways) with other kernel sentences. Such an
analysis produces a more compact yet more detailed description of
language and brings out the more subtle formal and semantic relations
among sentences. For example, sentences which contain ambiguities turn
out to be derivable from more than one transformational source.

¹ In addition to the three items mentioned in this Preface, which
go beyond the material of this book, reference should perhaps be made
to one method that belongs in the sequence of procedures, specifically
in chapter 12: a procedure for locating morpheme and word boundaries
among the successive phonemes of a sentence. Given a sentence m
phonemes long, for 1 < n < m we count after the first n phonemes of the
sentence how many different n + 1th phonemes ("successors") there are
in the various sentences which begin with the same first n phonemes. If
the successor count after the first n phonemes is greater both than that
after the first n - 1 phonemes and than that after the first n + 1
phonemes of the sentence, we place a tentative morphological boundary
after the nth phoneme of the given sentence. This is a first
approximation; adjustments have to be made for consonant-vowel
differences and for various other factors. Cf. Language 31.190-222
(1955).
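
The boundary-locating procedure described in the footnote to this
Preface (successor counts after the first n phonemes) can be sketched
in a few lines of Python. This is only an illustration under stated
assumptions: the function names and the toy corpus are invented here,
each letter stands in for one phoneme, and none of the adjustments for
consonant-vowel differences mentioned in the footnote are attempted.

```python
def successor_counts(sentence, corpus):
    """Map each n (1..m) to the number of distinct (n+1)th phonemes found
    in corpus sentences that begin with the first n phonemes of `sentence`."""
    counts = {}
    for n in range(1, len(sentence) + 1):
        prefix = sentence[:n]
        successors = {s[n] for s in corpus if len(s) > n and s[:n] == prefix}
        counts[n] = len(successors)
    return counts


def tentative_boundaries(sentence, corpus):
    """Return every n whose successor count exceeds the counts at n-1 and n+1."""
    counts = successor_counts(sentence, corpus)
    return [
        n
        for n in range(2, len(sentence))
        if counts[n] > counts[n - 1] and counts[n] > counts[n + 1]
    ]


# Toy corpus (invented for illustration): one letter per "phoneme".
corpus = ["hisbook", "hiscar", "hisdog", "hispen", "hitme"]
boundaries = tentative_boundaries("hisbook", corpus)
print(boundaries)  # [3]: a tentative boundary after "his"
```

In this toy corpus the count rises to four after "his" and falls to one
afterwards, so the only peak, and hence the only tentative boundary, is
after the third phoneme. A real corpus would need to be far larger for
the counts to be meaningful.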

3. Discourse analysis: Exact linguistic analysis does not go beyond 
the limits of a sentence; the stringent demands of its procedures are not 
satisfied by the relations between one sentence and its neighbors, or 
between parts of one sentence and parts of its neighbors. There are, 
however, structural features which extend over longer stretches of each 
connected piece of writing or talking. These can be investigated by more
differentiated tools, e.g. by setting up equivalence classes of elements 
which are in a restricted sense substitutable (or positionally similar) 
in respect to other elements or classes of elements throughout a connected 
discourse. The procedures useful for finding such discourse structures 
are extensions of the methods of linguistics. 

Z. S. Harris 


1. Introduction

2. Methodological Preliminaries
2.0. Introductory
2.1. The Criterion of Relevance: Distribution
2.2. Schedule of Procedures
2.3. The Universe of Discourse
2.31. Dialect or Style
2.32. Utterance or Discourse
2.33. Corpus or Sample
2.4. Definition of Terms
2.5. The Status of Linguistic Elements
2.6. Preview of the Phonologic and Morphologic Elements
2.61. Correlations Outside of Descriptive Linguistics
2.62. Relation between Phonologic and Morphologic Elements

3-11. Phonology

3-4. Phonologic Elements

3. Segmentation
3.0. Introductory
3.1. Purpose: Speech Composed of Discrete Parts
3.2. Procedure: Segmenting Utterances at Arbitrary Points
3.3. Result: Unique Segments
Appendix to 3.2: On the Segmentation of Single Utterances

4. Phonemic Distinctions
4.0. Introductory
4.1. Purpose: To Establish Linguistic Equivalence
4.2. Procedure: Grouping Substitutable Segments
4.21. In Repeated Utterances
4.22. In Different Utterances
4.23. Paired Utterances
4.3. Result: Equivalent and Non-equivalent Segments
4.31. Distinct Utterances and Distinct Elements
4.4. Length of Segments
4.5. Correcting Possible Errors
Appendix to 4.1: The Reason for Equating Segments
Appendix to 4.21: On the Equivalence of Repetitions
Appendix to 4.22: Matching in Frames
Appendix to 4.23: Interpretation of the Paired Utterance Test
Appendix to 4.3: Intermittently Present Distinctions
Appendix to 4.5: Continued Testing of New Utterances

5-11. Relations among Phonologic Elements

5. Unit Length
5.0. Introductory
5.1. Purpose: Descriptively Equal Segment Lengths
5.2. Procedure: Joining Dependent Segments
5.3. Result: Utterances Divided into Unit Lengths
Appendix to 5.3: Unit Length and Phoneme Length

6. Utterance-long Elements
6.0. Introductory
6.1. Purpose: Utterance-long Equivalent Features
6.2. Preliminaries to the Procedure: Discovering Partial Similarities
6.21. In Paired Utterances
6.22. In Otherwise Non-equivalent Utterances
6.3. Procedure: Extracting Segmental Portions of Utterances
6.4. Segmental Length of Contours
6.5. Contours Which Occur Simultaneously
6.6. Result: Suprasegmental Elements Extending over Utterances
Appendix to 6.1: Morphemic Independence of Utterance-long Elements
Appendix to 6.3: Formulaic Statement of the Procedure
Appendix to 6.4: Contours of More than One Utterance Length
Appendix to 6.5: Grouping Complementary Contours
Appendix to 6.6: Phonemic Status of Contours

7. Phonemes
7.0. Introductory
7.1. Purpose: Fewer and Less Restricted Elements
7.2. Preliminaries to the Procedure
7.21. Stating the Environments of Segments
7.22. Summing over the Environments
7.3. Procedure: Grouping Segments Having Complementary Distribution
7.31. Adjusting Environments in the Course of Phonemicization
7.4. Criteria for Grouping Segments
7.41. Number and Freedom
7.42. Symmetry in Representation of Sounds
7.421. Identity of Representation among Segments
7.422. Identity of Inter-segmental Relation among Phonemes
7.423. Relative to Complete Phoneme Stock
7.43. Symmetry of Environment
7.5. Result: Classes of Complementary Segments
Appendix to 7.21: Tabulating the Environments of a Segment
Appendix to 7.22: Tabulating Environments by Segments
Appendix to 7.3: Phonetic and Phonemic Distinctions
Appendix to 7.4: The Criterion of Morphemic Identity

8. Junctures
8.0. Introductory
8.1. Purpose: Eliminating Restrictions on Sets of Phonemes
8.2. Procedure: Defining Differences between Phoneme Sets
8.21. Matching Sets of Tentative Phonemes
8.211. Syllabification Features
8.22. Replacing Contours by Junctures
8.221. Periodicities of Segmental Features
8.222. Partial Dependence of Juncture on Contour
8.223. Partial Dependence of Contour on Juncture
8.3. Result: Group of Similarly Placed Features
8.4. More than One Juncture
Appendix to 8.2: Junctures as Morphologic Boundaries

9.0. Introductory
9.1. Purpose: Eliminating Exceptional Distributional Limitations
9.2. Procedure: Dividing the Segment
9.21. Special Cases
9.3. Result: Dependent Segments as Allophones
9.4. Sequences of Segments
9.5. Reduction of the Phonemic Stock
Appendix to 9.2: Considerations of Symmetry
Appendix to 9.21: Junctures as a Special Case of Resegmentation
Appendix to Chapters 7-9: The Phonemes of Swahili

10. Phonemic Long Components
10.0. Introductory
10.1. Purpose: Replacing Distributionally Limited Phonemes
10.2. Procedure: Phonemes Occurring Together Share a Component
10.3. Properties of Components
10.31. Various Lengths in Various Environments
10.32. Various Definitions over Various Segments
10.33. Extension of a Component
10.4. Complementary Long Components
10.5. Reducing Whole Phonemic Stocks into Components
10.6. Result: Components of Various Lengths
Appendix to 10.2: Phonemic Status of Long Components
Appendix to 10.5: Component Analysis of Swahili
Appendix to 10.1-4: Unit-Length Components; Tone Phonemes
Appendix to 10.1-5: Unit-Length Components of a Whole Phonemic Stock

11. Phonological Structure
11.1. Purpose: Phonological Constituency of Utterances
11.2. Procedure: Stating What Combinations Occur
11.21. Not All Combinations Occur
11.22. Utterance Formulae
11.3. Result: A Representation of Speech
Appendix to 11.22: Utterance Diagrams

12-19. Morphology

12. Morphological Elements: Morphemic Segments
12.0. Introductory
12.1. Purpose: Phoneme Distribution over Longer Stretches
12.2. Procedure: Independent and Patterned Combinations
12.21. Free Morphemic Segments
12.22. Upper Limit for Number of Morphemic Segments in an Utterance
12.23. Lower Limit for Number of Morphemic Segments in an Utterance
12.231. For Free Forms
12.232. For Bound Forms
12.233. Summary
12.3. Phonemic Identification of Morphemic Segments
12.31. Contiguous Phonemic Sequences
12.32. Non-contiguous Phonemic Sequences
12.321. Staggered Phonemes
12.322. Broken Sequences
12.323. Repetitive Sequences
12.324. Partially Dependent Non-contiguous Sequences
12.33. Replacement of Phonemes
12.331. Among Individual Phonemes
12.332. Among Classes of Phonemes
12.333. Replacement by Zero
12.34. Suprasegmental Elements
12.341. Components
12.342. Contour Change
12.343. Morpheme-Length Contours
12.344. Utterance Contours
12.35. Combinations of the Above
12.4. Result: Elements with Stated Distributions over Utterances
12.41. Morphemic Segments Correlate with Features of Social Situations
12.5. Correlations between Morphemes and Phonemes in Each Language
12.51. Phonemic Combinations in Morphemes
12.52. Intermittently Present Pause
12.53. Adjusting Junctures as Morpheme Boundaries
12.54. New Phonemic Junctures
Appendix to 12.22: Partial and Seeming Independence
Appendix to 12.23: The Criterion of Similar Distributions
Appendix to 12.233: Alternatives in Patterning
Appendix to 12.323-4: Complex Discontinuous Morphemes
Appendix to 12.3-4: Order as a Morphemic Element
Appendix to 12.41: The Criterion of Meaning
Appendix to 12.5: Relation between Morphologic and Phonologic Segmentation

13-19. Relations among Morphologic Elements

13. Morpheme Alternants
13.0. Introductory
13.1. Purpose: Reducing the Number of Elements
13.2. Preliminary Operation: Free Variants in Identical Environments
13.3. Procedure: Equating Unique Morphemic Segments
13.31. Phonemically Identical Segments
13.32. Phonemically Different Segments
13.4. Criteria for Grouping Elements
13.41. Matching Environments of Phonemically Identical Elements
13.42. Phonemically Different Elements
13.421. Matching Environments
13.422. Simplifying Environmental Differentiations
13.43. Choosing among Complementary Elements
13.5. Relations among the Members of a Morpheme
13.51. The Environments of Each Member
13.511. Phonemically Differentiable
13.512. Morphemically Differentiable
13.52. Phonemic Differences among the Members
13.521. Slight Difference
13.522. Partial Identity
13.523. No Identity
13.53. Similarity between Member and Its Environment
13.531. No Similarity
13.532. Identity in Phonemic Feature
13.533. Identity in Phonemes
13.6. Result: Classes of Complementary Morphemic Segments
Appendix to 13.42: Zero Members of Morphemes
Appendix to 13.43: Alternative Groupings


14.0. Introductory
14.1. Purpose: Identical Constitution for All Alternants of a Morpheme
14.2. Preliminaries to the Procedure: Morphemes Having Identical Alternations among Their Members
14.21. Unique Alternations
14.22. Alternations Generalizable within Morphemically Defined Limits
14.221. Identity of Part of the Alternation
14.222. Identical Alternation in Phonemically Undifferentiable Morphemes
14.223. Alternations in Phonemically Differentiable Morphemes in Phonemically Undifferentiable Environments
14.224. Summary
14.23. Alternations within Phonemically Differentiable Morphemes and Environments
14.3. Procedure: Interchanging Phonemes among Alternants of One Morpheme
14.31. Status of the Morphophonemic Symbols
14.32. Several Morphophonemes in One Alternation
14.33. Types of Alternation Represented by Morphophonemes
14.331. Morphophonemic Redefinition of Phonemic Symbols
14.332. New Symbols Required
14.4. Result: Morphophonemes as Classes of Substitutable Phonemes
14.5. Reconsideration of the Grouping of Phonological Segments
14.51. Morphophonemic Criterion for Grouping Segments
14.52. Phonemicization of Cross-Boundary Alternations
14.53. Equivalent Phonemic Spellings
14.6. Reconsideration of the Grouping of Morphological Segments
Appendix to 14.32: Morphophonemic Equivalent for Descriptive Order of Alternation
Appendix to 14.33: Alternations Not Represented by Morphophonemes
Appendix to 14.331: Maximum Generality for Morphophonemes
Appendix to 14.332: Choice of Marking Morpheme, Environment, or Juncture

15. Morpheme Classes
15.1. Purpose: Fewer Morphologically Distinct Elements
15.2. Preliminaries to the Procedure: Approximation
15.3. Procedure: Rough Similarity of Environments
15.31. Descriptive Order of Setting Up Classes
15.32. General Classes for Partial Distributional Identity
15.4. Alternative Procedure: Classes of Morphemes-in-Environments
15.41. General Classes for Partial Distributional Identity
15.5. Result: Morpheme-Position Classes
15.51. Morpheme Index
Appendix to 15.2: Culturally Determined Limitations and Productive Morphemes
Appendix to 15.3: Identical Distribution within Short Environments
Appendix to 15.32: Identical Morphemes in Various Classes
Appendix to 15.4: Tabulating Morpheme-Environment Classes
Appendix to 15.5: Correlation between Morpheme Classes and Phonemic Features

16. Morpheme Sequences
16.0. Introductory
16.1. Purpose: Fewer and More General Classes
16.2. Procedure: Substitutable Sequences of Morpheme Classes
16.21. Non-repeatable Substitutions
16.22. Analysis of the Complete Corpus
16.3. Sequence Substitution as a Morphologic Tool
16.31. Exceptionally Limited Morphemes
16.32. Morphemic Resegmentation
16.33. Indicating Differences among Utterances
16.4. Result: Classes of Substitutable Morpheme Sequences
16.5. Relation of Class to Sequences Containing It
16.51. Resultant Class Differing from Sequence Classes
16.52. Resultant Class Identical with One of the Sequence Classes
16.53. All Sequences Containing a Class
16.54. Immediate Constituents
Appendix to 16.1: Why Begin with Morpheme Classes?
Appendix to 16.2: Morphemic Contours in the Substitutions
Appendix to 16.21: Alternative Methods for Non-repeatable Substitutions
Appendix to 16.22: Morpheme-Sequence Substitutions for Moroccan Arabic
Appendix to 16.31: Sequence Analysis of Words Containing wh- and th-
Appendix to 16.4: From Classes of Morphemes to Classes of Positions

17. Morphemic Long Components
17.0. Introductory
17.1. Purpose: Relations among Morpheme Classes
17.2. Preliminaries to the Procedure: Disregarding the Rest of the Utterance
17.3. Procedure: Morphemes Occurring Together Share a Component
17.31. Classes Which Accompany Each Other
17.32. Restrictions among Sub-classes
17.33. Sub-classes Representable by Several Components
17.4. Result: Components Indicating Patterned Concurrences of Morphemes
17.5. Restrictions Not Represented by Components
Appendix to 17.32: Sub-classes Consisting of Single Morphemes
Appendix to 17.33: Morphemic Components for Intersecting Limitations

18. Constructions
18.0. Introductory
18.1. Purpose: Recurrent Arrangements of Morpheme Classes
18.2. Procedure: Substitution in Short Environments
18.21. Features of Construction
18.22. Successively Enclosing Constructions
18.3. Result: Constructions Included in the Next Larger Constructions
18.4. Reconsideration of Previous Results
Appendix to 18.2: Zero Segments and Voided Elements
1. Zero Segments Represented by Elements
2. Voided Elements: Non-zero Segments Represented by Absence of Element
3. Relation between Zero Segments and Voided Elements
Appendix to 18.4: Correlation with Previous Results
1. With Phonemic Features
2. With Boundaries
3. With Contours
4. With Morpheme Classes
5. With Meaning

19. Morphological Structure
19.1. Purpose: Stating What Utterances Occur in the Corpus
19.2. Procedure: Sequences of Resultants or Constructions
19.3. The Selective Substitution Diagram
19.31. Different Conditions for Different Substitutions
19.4. Result: Sentence Types
Appendix to 19.31: Detailed Diagrams

20. Survey
20.1. Summary of the Results
20.11. Phonology
20.12. Morphology
20.13. General
20.2. Survey of the Operations
20.21. To State Regularities or To Synthesize Utterances?
20.22. Operations of Analysis
20.3. Description of the Language Structure
20.4. Correlations Outside of Descriptive Linguistics
Appendix to 20.3: A Grammar of Lists

Index


This volume presents methods of research used in descriptive, or,
more exactly, structural, linguistics. It is thus a discussion of the 
operations which the linguist may carry out in the course of his 
investigations, rather than a theory of the structural analyses which re- 
sult from these investigations. The research methods are arranged here 
in the form of the successive procedures of analysis imposed by the work- 
ing linguist upon his data. It is hoped that presentation of the methods 
in procedural form and order may help reduce the impression of sleight- 
of-hand and complexity which often accompanies the more subtle lin- 
guistic analyses. 

Starting with the utterances which occur in a single language commu- 
nity at a single time, these procedures determine what may be regarded 
as identical in various parts of various utterances, and provide a method 
for identifying all the utterances as relatively few stated arrangements 
of relatively few stated elements. 

These procedures are not a plan for obtaining data or for field work. 
In using them, it does not matter if the linguist obtains the data by tak- 
ing texts, questioning an informant, or recording a conversation. Even 
where the procedures call for particular contact with the informant, as 
in obtaining repetitions of an utterance, it does not matter how this is 
carried out: e.g. the linguist can interrupt a conversation to ask the 
speaker or hearer to repeat an utterance that has occurred, and may 
then alter the conversation so as to get its recurrence in different
environments.

These procedures also do not constitute a necessary laboratory sched- 
ule in the sense that each procedure should be completed before the next 
is entered upon. In practice, linguists take unnumbered short cuts and 
intuitive or heuristic guesses, and keep many problems about a particu- 
lar language before them at the same time: they may have figured out 
the positional variants of several phonemes before they decide how to cut 
up into segments certain utterances which presumably contain a pho- 
netically unusual phoneme; and they will usually know exactly where 
the boundaries of many morphemes are before they finally determine the 
phonemes. The chief usefulness of the procedures listed below is therefore 
as a reminder in the course of the original research, and as a form for 
checking or presenting the results, where it may be desirable to make
sure that all the information called for in these procedures has been
validly obtained.¹

The methods described here do not eliminate non-uniqueness² in
linguistic descriptions. It is possible for different linguists, working
on the same material, to set up different phonemic and morphemic
elements, to break phonemes into simultaneous components or not to do
so, to equate two sequences of morphemes as being mutually substitutable
or not to do so. The only result of such differences will be a
correlative difference in the final statement as to what the utterances
consist of. The use of procedures is merely to make explicit what
choices each linguist makes, so that if two analysts come out with
different phoneme lists for a given language we should have exact
statements of what positional variants were assigned by each to what
phonemes and wherein lay their differences of assignment.
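
The kind of exact statement asked for here can be pictured with a small
Python sketch. It is an illustration only, under assumptions not found
in the text: each analysis is represented as a mapping from a
positional variant, written as a (segment, environment) pair, to the
phoneme it was assigned to, and the function name and sample data are
hypothetical. The sample difference, the unaspirated stop after s
assigned to /p/ by one analyst and to /b/ by another, is a commonly
cited kind of case and is not taken from this book.

```python
def assignment_differences(analysis_a, analysis_b):
    """Return {variant: (phoneme_in_a, phoneme_in_b)} for every positional
    variant that both analyses list but assign to different phonemes."""
    shared = analysis_a.keys() & analysis_b.keys()
    return {
        variant: (analysis_a[variant], analysis_b[variant])
        for variant in sorted(shared)
        if analysis_a[variant] != analysis_b[variant]
    }


# Hypothetical analyses; keys are (segment, environment) pairs.
analysis_a = {("ph", "#_V"): "p", ("p", "s_V"): "p", ("b", "#_V"): "b"}
analysis_b = {("ph", "#_V"): "p", ("p", "s_V"): "b", ("b", "#_V"): "b"}

differences = assignment_differences(analysis_a, analysis_b)
print(differences)  # {('p', 's_V'): ('p', 'b')}
```

The two analyses yield the same phoneme list here and differ only in
where one positional variant is placed, which is the sort of difference
the explicit procedures are meant to expose.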

The methods presented here are consistent, but are not the only
possible ones of arranging linguistic description. Other methods can be sug-
gested, for example one based upon relations of selection among seg- 
ments, whether phonemic or morphemic. As more languages are ana- 
lyzed, additional refinements and special cases of these or of comparable 
techniques come to attention. 

¹ In the interests of clarity and in order not to cloud the succession
of the procedures, only the skeleton of each procedure will be given in
the various chapters. Discussions of complicated points, justifications
of the methods proposed, and longer examples of a complete procedure,
will be given in appendices to each chapter. Each chapter will open with
a notation, in conventional linguistic terminology, of the procedure to
be discussed in it. The chapter will then contain a statement of the
objectives of the procedure, a description of the methods used, and a
statement of the results obtained thereby.

² See, for example, Yuen Ren Chao, The Non-Uniqueness of Phonemic
Solutions of Phonetic Systems, Bulletin of the Institute of History and
Philology 4.363-397 (Academia Sinica; Shanghai, 1934).

The particular way of arranging the facts about a language which is
offered here will undoubtedly prove more convenient for some languages
than for others. However, it should not have the undesirable effect of
forcing all languages to fit a single Procrustean bed, and of hiding
their differences by imposing on all of them alike a single set of
logical categories. If such categories were applied, especially to the
meanings of forms in various languages, it would be easy to extract
parallel results from no matter how divergent forms of speech; a set of
suffixes one or another of which always occurs with every noun (say,
Latin -is, -i, -e), and a selection of frequently used directional
adjectives (say, English of, to, in) can both be called case systems.³
The procedures given below, however, are merely ways of arranging the
original data; and since they go only by formal distinctions there is no
opportunity for uncontrolled interpreting of the data or for forcing of
the meaning.

For this reason, the data, when arranged according to these proce- 
dures, will show different structures for different languages. Furthermore, 
various languages described in terms of these procedures can be the more 
readily compared for structural differences, since any differences be- 
tween their descriptions will not be due to differences in method used by 
the linguists, but to differences in how the language data responded to 
identical methods of arrangement. 

The arrangement of the procedures follows the fundamental division 
into phonology and morphology, each of which is further divided into the 
determining of the elemental distinctions (phonemic or morphemic) and 
the determining of the relations among the distinct elements. In order to 
be consistent in the reduction of linguistic methods to procedures, there 
are here offered procedures even for those steps where linguists tradi- 
tionally use hit-or-miss or intuitive techniques to arrive at a system 
which works to a first approximation, but which can with greater diffi- 
culty — and greater rigor — be arrived at procedurally. Examples of cum- 
bersome but explicit procedures offered here in place of the simpler in- 
tuitive practice are: the stress upon distribution rather than meaning in 
setting up the morphemes; and the deferring of morphophonemics until 
after the morpheme alternants have been fully stated. 

The central position of descriptive linguistics in respect to the other 
linguistic disciplines and to the relationships between linguistics and 
other sciences, makes it important to have clear methods of work in this 
field, methods which will not impose a fixed system upon various lan- 
guages, yet will tell more about each language than will a mere catalogue 
of sounds and forms. The greatest use of such explicit structural descrip- 
tions will be in the cataloguing of language structures, and in the com- 
paring of structural types. These descriptions will, however, be also im- 
portant for historical linguistics and dialect geography; for the relation 
of language to culture and personality, and to phonetics and semantics; 
and for the comparison of language structure with systems of logic. In 
some of these fields much work has been done by use of individual de- 
scriptive linguistic facts, but important new results may be expected 
from the use of complete language structures. 

³ Cf. Language 16.218-20 (1940).


2.0. Introductory 

Before we list the procedures of analysis, we must first discuss what 
kind of analysis is possible in descriptive linguistics. It is possible of 
course to study speech as human behavior, to record the physiological 
motions which are involved in articulation, or the cultural and interper- 
sonal situation in which the speaking occurs, or the sound waves which 
result from the activity of talking, or the auditory impressions gained 
by the hearer. We could try to state regularities in the description of 
each of these bodies of data.' Such regularities might consist of correla- 
tions among the various bodies of data (e.g. the dictionary correlation 
between sound-sequences and social situation or meaning), or they might 
note the recurrence of 'similar' parts within any one of these bodies of 

' Phonetics is the most developed of these fields, and the one most 
closely associated with descriptive linguistics. See Arvo Sotavalta, Die 
Phonetik und ihre Beziehungen zu den Grenzwissenschaften, Publica- 
tiones Instituti Phonetici Universitatis Helsingforsiensis 4 (Annales 
Academiae Scientarum Fennicae 31.3; Helsinki, 1936). Bernard Bloch 
and George L. Trager, Outline of Linguistic Analysis 10-37 (1942); 
Kenneth L. Pike, Phonetics; Otto Jespersen, Lehrbuch der Phonetik; 
R. H. Stetson, Bases of Phonology (1945); 0. G. Russell, The Vowel 
(1928); O. G. Russell, Speech and Voice (1931); P.-J. Rousselot, Prin- 
cipes de phonetique experimentale (1924) ; P. Menzerath and de Lacerda, 
Koartikulation, Steuerung und Lautabgrenzung (1933); the Proceedings 
of the International Congresses of Phonetic Sciences; Le maitre phone- 
tique; Zeitschrift fiir E.xperimental-Phonetik; Archiv fiir vergleichende 
Phonetik; Phonometrische Forschungen; Archives of Speech; Archives 
neerlandaises de phonetique experimentale; and publications of the In- 
ternational Society of Experimental Phonetics. Helpful bibliographies 
are published by S. X. Trevino in American Speech. Most phonetic in- 
vestigations have dealt with articulation: e.g. G. Panconcelii-Calzia, 
Die experimentelle Phonetik ( 1931). More recently, the center of interest 
has been shifting to acoustic studies of the sound waves, where electronic 
instruments, physical theories, and mathematical methods permit more 
exact observations. See, for example, Harvey Fletcher, Speech and Hear- 
ing (1929); A. Gemelli and G. Pastori, L'analisi elettroacoustica del 
linguaggio (1934); P. David, L'electroacoustique (1930); J. C. Steinberg 
and N. R. French, The portrayal of visible speech, Journal of the 
Acoustical Society of America 18.4-18 (1946); G. A. Kopp and H. C. 
Green, Basic phonetic principles of visible speech, ibid. 74-89; Bell Sys- 
tem Technical Journal; R. K. Potter, G. A. Kopp, and H. C. Green, 
Visible Speech; M. Joos, Acoustic Phonetics (Language Monographs No.
23, 1948). 


data. One could note the recurrence of lip-closure in the course of some- 
one's talking, or the recurrence of various combinations and sequences 
of articulatory motions. The data of descriptive linguistics can be derived 
from any or all of these features and results of behavior — by observing 
the articulatory motions of the speaker, by analyzing the resulting air- 
waves, or by recording what the hearer (in this case the linguist) hears. 
In the first case we obtain modifications of the air-stream in the course 
of the speaker's breathing; in the second case we obtain complex wave 
forms; in the third, impressionistic identifications of sound sequences. 
Descriptive linguistics deals not with any particular one of these records 
of behavior, but with the data common to them all; for example, those 
frequencies or changes in the air-waves to which the human ear does not 
react are not included in the data of linguistics. 

2.1. The Criterion of Relevance: Distribution 

Descriptive linguistics, as the term has come to be used, is a particular 
field of inquiry which deals not with the whole of speech activities, but 
with the regularities in certain features of speech. These regularities are 
in the distributional relations among the features of speech in question, 
i.e. the occurrence of these features relatively to each other within utter- 
ances. It is of course possible to study various relations among parts or 
features of speech, e.g. similarities (or other relations) in sound or in 
meaning, or genetic relations in the history of the language. The main 
research of descriptive linguistics, and the only relation which will be 
accepted as relevant in the present survey, is the distribution or arrangement
within the flow of speech of some parts or features relatively to others.

The present survey is thus explicitly limited to questions of distribu- 
tion, i.e. of the freedom of occurrence of portions of an utterance relative- 
ly to each other. All terms and statements will be relative to this cri- 
terion. For example, if the phonemic representation of speech is de- 
scribed as being one-one (7.5), this does not mean that if a particular 
sound x is associated with a phoneme Y, then when we are given the
phoneme Y we associate with it the original particular sound x. The
one-one correspondence means only that if a particular sound x in a given
position is associated with a phoneme Y (or represented by the symbol Y),
then when we are given the phoneme Y we will associate with it,
in the stated position, some sound x', x'', which is substitutable for the
original x (i.e. has the same distribution as x). In the stated position, the 
symbol Y is used for any sound which is substitutable (in the sense of 
4.21) for x, x', etc. 
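
The position-relative reading of one-one correspondence can be sketched in a few lines of Python. The table is a hypothetical toy, not data from any language discussed here; the sound labels, the position labels, and the function names are all assumptions made for illustration.

```python
# Hypothetical toy data: (sound, position) -> phoneme symbol.
# x1 and x2 stand for sounds that substitute for each other initially.
SOUND_TO_PHONEME = {
    ("x1", "initial"): "Y",
    ("x2", "initial"): "Y",
    ("x3", "final"): "Y",
    ("z1", "initial"): "Z",
}

def phoneme_of(sound, position):
    """A sound in a stated position is represented by exactly one phoneme."""
    return SOUND_TO_PHONEME[(sound, position)]

def sounds_of(phoneme, position):
    """Given a phoneme and a position, recover the set of mutually
    substitutable sounds, not one particular original sound."""
    return {s for (s, p), ph in SOUND_TO_PHONEME.items()
            if ph == phoneme and p == position}

print(phoneme_of("x1", "initial"))        # Y
print(sorted(sounds_of("Y", "initial")))  # ['x1', 'x2']
```

Going from sound to phoneme yields a single symbol; going back yields any member of the substitutable set for that position, which is all that the one-one statement is taken to claim.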


The only preliminary step that is essential to this science is the restriction
to distribution as determining the relevance of inquiry. The particular
methods described in this book are not essential. They are offered
as general procedures of distributional analysis applicable to linguistic 
material. The specific choice of procedures selected for detailed treat- 
ment here is, however, in part determined by the particular languages 
from which the examples are drawn. The analysis of other languages 
would undoubtedly lead to the discussion and elaboration of additional 
techniques. Even the methods discussed in detail here could be made to 
yield many additional results over and above those brought out in this 
survey. Furthermore, the whole framework of basic procedures presented 
below could be supplanted by some other schedule of operations without 
loss of descriptive linguistic relevance. This would be true as long as the 
new operations dealt essentially with the distribution of features of 
speech relatively to the other features within the utterance, and as long 
as they did so explicitly and rigorously. Any such alternative operations 
could always be compared with the procedures presented here, and the 
results of one could always be put into correspondence with the results 
of the other. 

2.2. Schedule of Procedures 

The whole schedule of procedures outlined in the following chapters, 
which is designed to begin with the raw data of speech and end with a 
statement of grammatical structure, is essentially a twice-made applica- 
tion of two major steps: the setting up of elements, and the statement 
of the distribution of these elements relative to each other. First, the dis- 
tinct phonologic elements are determined (chapters 3-4) and the rela- 
tions among them investigated (5-11). Then the distinct morphologic 
elements are determined (12) and the relations among them investi- 
gated (13-19). 

There are various differences between the application of these steps
in phonology and the application of the same steps in morphology. These
derive from the differences in the material² and from the fact that when
the operations are repeated for the morphology they are being carried
out on material which has already been reduced to elements.³ Nevertheless,
the two parallel schedules are essentially similar in the type and
sequence of operations.

² E.g. the fact that in all languages which have been described there
are far more distinct morphologic elements than distinct phonologic elements.

³ An example of this is the fact that the morphologic elements could
be determined not afresh (as is done in chapter 12) but on the basis of
limitations of distribution of the phonologic elements.

In both the phonologic and the morphologic analyses the linguist 
first faces the problem of setting up relevant elements. To be relevant, 
these elements must be set up on a distributional basis : x and y are in- 
cluded in the same element A if the distribution of x relative to the other 
elements B, C, etc., is in some sense the same as the distribution of y. 
Since this assumes that the other elements B, C, etc., are recognized at 
the time when the definition of A is being determined, this operation can 
be carried out without some arbitrary point of departure only if it is car- 
ried out for all the elements simultaneously. The elements are thus de- 
termined relatively to each other, and on the basis of the distributional 
relations among them.⁴
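
As a rough illustration, the grouping of x and y into one element can be sketched in Python. The sketch takes the strictest possible sense of 'the same' (identical sets of environments), and it treats B and C as already recognized, which is precisely the arbitrary point of departure that a simultaneous determination would avoid. The data and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical observations: each item with the set of environments
# (left neighbor, right neighbor) in which it has been recorded.
OBSERVED = {
    "x": {("B", "C"), ("#", "B")},
    "y": {("B", "C"), ("#", "B")},
    "w": {("C", "#")},
}

def group_by_distribution(observed):
    """Put items whose environment sets are identical into one class."""
    classes = defaultdict(list)
    for item, envs in observed.items():
        classes[frozenset(envs)].append(item)
    return sorted(sorted(members) for members in classes.values())

print(group_by_distribution(OBSERVED))  # [['w'], ['x', 'y']]
```

Here x and y fall into one class because they share every recorded environment, while w stands alone. Looser senses of sameness would need a different comparison in place of the set identity used here.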

It is a matter of prime importance that these elements be defined relatively
to the other elements and to the interrelations among all of them.⁵
The linguist does not impose any absolute scale upon a language, so as to 

⁴ Objection might be raised here to the effect that meaning considerations
too, are involved in the determinations of elements, since, for example,
when sounds (or sound-features) x and y occur in identical environments
they are assigned to different phonemes if the complexes containing
them constitute different morphemes (e.g. [l] and [r] in the environment
/ — ayf/: life, rife). However, this differentiation of life and
rife on the basis of meaning is only the linguist's and the layman's 
shortcut to a distributional differentiation. In principle, meaning need 
be involved only to the extent of determining what is repetition. If we 
know that life and rife are not entirely repetitions of each other, we will 
then discover that they differ in distribution (and hence in 'meaning'). 
It may be presumed that any two morphemes A and B having different 
meanings also differ somewhere in distribution : there are some environ- 
ments in which one occurs and the other does not. Hence the phonemes 
or sound-features which occur in A but not in B differ in distribution at 
least to that extent from those which occur in B but not in A. 

A more fundamental exception to the distributional basis lies in the 
possibility of distinguishing the elements on the basis of physical (in 
particular, acoustic) measurements. Even in this case, however, the dis- 
tinguishing would be relative: the absolute measurements themselves 
would not determine the various elements, but rather the relative differences
among the measurements.

⁵ The most explicit statement of the relative and patterned character
of the phonologic elements is given by Edward Sapir in Sound Patterns
in Language, Language 1.37-51 (1925); now also in Selected Writings
of Edward Sapir 33-45. See also the treatment of phonologic elements in
Ferdinand de Saussure, Cours de linguistique generale; Nikolai Trubetzkoy,
Grundzüge der Phonologie (Travaux du Cercle Linguistique
de Prague 7, 1939). 


set up as elements, for example, the shortest sounds, or the most frequent
sounds, or those having particular articulatory or acoustic properties.
Rather, as will be seen in the chapters to follow, he sets up a group
of elements (each by comparison with the others) in such a way as will
enable him most simply to associate each bit of talking with some construction
composed of his elements.⁶

In both the phonologic and the morphologic analyses the linguist then
investigates the distributional relations among the elements. This task 
can be made simpler by carrying it out in successive operations such as 
those procedurally described here. In those cases where the procedure 
seems more complicated than the usual intuitive method (often based on 
the criterion of meaning) of obtaining the same results, the reason for the 
more complex procedure is the demand of rigor.⁷

It thus appears that the two parallel analyses lead to two sets of de- 
scriptive statements, constituting a phonologic system and a morpho- 
logic system. Each set of statements consists of a list of relatively- 
defined, or patterned, elements, plus an organized specification of the 
arrangements in which they occur. In the following chapters many such 
specifications are given by defining a new stock of elements formed out 
of the previous stock on the basis of the distributional relations among 
the previous elements. However, it does not matter for the basic de- 
scriptive method whether the statements are expressed in this or any 
other way: Instead of defining a new stock of elements in terms of the 
old, in such a way that the distributional characteristics of the old ele- 
ments are included in the definition of the new (this makes for compact 

⁶ The fact that the determination of elements is relative to the other
elements of the language means that all such determining is performed 
for each language independently. All lists of elements, relations among 
them, and statements about them are applicable only to the particular 
language for which they are made. The research methods of the linguist 
may be roughly similar for many languages, but the statements that re- 
sult from his work apply in each case to the language in question. 

⁷ It may be noted that distributional procedures do more than offer
a rigorous alternative to meaning considerations and the like. Distribu- 
tional procedures, once established, permit, with no extra trouble, the 
definite treatment of those marginal cases which meaning considera- 
tions leave indeterminate or open to conflicting opinion. Thus distribu- 
tional considerations may be more cumbersome than meaning in de- 
termining whether boiling is boil + ing (similar to talking) or boy + ling 
(similar to princeling). But distributional considerations can determine 
whether sight is see + -t and flight is flee + -t (similar to portray and
portrait) as readily as they can determine the question of boiling; whereas 
meaning considerations might not be decisive for these forms. 


symbolic manipulation), we can keep the old elements and merely list 
the distributional statements (element x occurs next to y only in environ- 
ment z). All that matters is that the defining of the elements and the 
stating of the relations among them be based on distribution, and be 
unambiguous, consistent, and subject to check. Beyond this point, it is 
a matter of other than descriptive purposes how compact and convenient
the formulation is, or what other qualities it may have.⁸

2.3. The Universe of Discourse 

2.31. Dialect or Style 

The universe of discourse for a descriptive linguistic investigation is a 
single language or dialect. 

These investigations are carried out for the speech of one particular 
person, or one community of dialectally identical persons, at a time. 
Even though any dialect or language may vary slightly with time or 
with replacement of informants, it is in principle held constant through- 
out the investigation, so that the resulting system of elements and state- 
ments applies to one particular dialect. In most cases this presents no 
problem, since the whole speech of the person or community shows dialectal
consistency; we can define the dialect simply as the speech of the
community in question. In other cases, however, we find the single 
person or the community using various forms which are not dialectally 
consistent with each other. Several ways are then open to us. We can 
doggedly maintain the first definition and set up a system corresponding 
to all the linguistic elements in the speech of the person or the commu- 
nity. Or we may select those stretches of speech which can be described 
by a relatively simple and consistent system, and say that they are cases 
of one dialect, while the remaining stretches of speech are cases of an- 
other dialect. We would usually do this on the basis of a knowledge of 
the different dialects of other communities. The material which is rejected 
as being not of the dialect in question may consist of scattered words used 
with the trappings of foreignisms (e.g. use of role, raison d'être by some

⁸ It therefore does not matter for basic descriptive method whether
the system for a particular language is so devised as to have the least 
number of elements (e.g. phonemes), or the least number of statements 
about them, or the greatest over-all compactness, etc. These different 
formulations differ not linguistically but logically. They differ not in 
validity but in their usefulness for one purpose or another (e.g. for teaching
the language, for describing its structure, for comparing it with
genetically related languages). 


speakers of English); or it may consist of whole utterances and conversations,
as in the speech of bilinguals.⁹

In contradistinction to dialect, there are various differences in speech
which are not held constant throughout a descriptive investigation. It may 
be possible to show, in many languages, that there are differences in style 
or fashion of speech, in respect to which whole utterances or even discourses
are consistent.¹⁰ Thus we may not readily find an utterance
which contains both good morning and also good mornin' or good evenin', 
nor one containing a brighty and also sagacious. We may say that forms 
like good morning and good mornin' occur in different styles of speech, as
do forms like o brighty and sagacious. These differences are usually dis- 
tributional, since forms of different style do not generally occur with 
each other. In many cases, the differences between two stylistic sets of 
forms (such that members of one set don't occur with members of the 
other) affect only limited parts of the descriptive system; for example, a 
distinctive stylistic set may include particular members of a morpheme 
class and particular types of morpheme sequence. This differs only in 
degree from dialect differences, which in many cases are also restricted 
to particular parts of the descriptive system, the rest of the system being 
identical for both dialects. 

As in the case of different dialects, different styles too can be written 
with marks, each mark extending over all the material specific to its 

⁹ For productivity, as an example of problems involving variation of
language, see ch. 12, fn. 81. In investigations which run across dialect 
lines and include material from more than one dialect, the material of 
one dialect can be marked so as to distinguish it from the material of 
the other. All forms which have in common the fact that they occur in a 
particular dialect would be written with a mark indicating that dialect. 
These marks could be manipulated somewhat along the lines of the 
phonemic components of chapter 10. For example, if in the material in 
question dialects are never mixed in one utterance, so that each utterance 
is wholly in one dialect or in the other, we would say that the mark in- 
dicating dialect extends over whole utterances. Cf. W. L. Wonderly, 
Phonemic Acculturation in Zoque, International Journal of American 
Linguistics 12.192-5 (1946). 

¹⁰ These styles may be related to various cultural and interpersonal
situations. In addition to the examples discussed here, which border on 
social dialect difference, we could consider styles which mark par- 
ticular speakers or socially differentiated groups of speakers (e.g. adoles- 
cent girls' style), styles which mark particular types of interpersonal re- 
lation (e.g. styles of respect and the like; these border on gesture-like 
intonations, such as that of anger). The latter types of style are discussed 
by Karl Bühler in his Sprachtheorie (1934).


style. Furthermore, because of the great degree of structural identity 
among various styles within a dialect, it is usually feasible to keep the 
indications of style as subsidiary differentiations, within utterances 
which are otherwise structurally identical. Thus in the stylistic contrast 
between be seein' ya and be seeing you (13, fn. 5), the utterances are 
identical except for one difference in style. Since seein' does not occur 
before you, and seeing does not occur before ya, we can set up just one 
style marker which extends over the whole utterance and indicates the 
differences between seeing you and seein' ya. 
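
A single style marker extending over the whole utterance might be sketched as follows. This is only an illustration under assumed names: the style labels, the table of style-specific forms, and the function are hypothetical and are not part of the procedures of this book.

```python
# Hypothetical table of style-specific forms, keyed by (base form, style mark).
STYLE_FORMS = {
    ("seeing", "casual"): "seein'",
    ("you", "casual"): "ya",
}

def render(base_forms, style):
    """Apply one style mark to every form in the utterance.
    Forms with no style-specific variant are left unchanged."""
    return " ".join(STYLE_FORMS.get((form, style), form)
                    for form in base_forms)

utterance = ["be", "seeing", "you"]
print(render(utterance, "plain"))   # be seeing you
print(render(utterance, "casual"))  # be seein' ya
```

Because the mark applies to the utterance as a whole, mixed outputs such as seein' you cannot arise, which matches the observation that seein' does not occur before you.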

Although differences of style can be described with the tools of de- 
scriptive linguistics, their exact analysis involves so much detailed study 
that they are generally disregarded.¹¹ The procedures presented in the
following chapters will not take note of style differences, but will assume 
that all styles within a dialect may be roughly described by a single struc- 
tural system. 

2.32. Utterance or Discourse 

The universe of discourse for each statement in the descriptive anal- 
ysis is a single whole utterance in the language in question. 

Investigations in descriptive linguistics are usually conducted with 
reference to any number of whole utterances. Many of the results apply 
explicitly to whole utterances. Even when studies of particular interrela- 
tions among phonemes or morpheme classes are carried out, the frame 
within which these interrelations occur is usually referred ultimately to 
their position within an utterance. This is due to the fact that most of 
the data consists (by definition) of whole utterances, including longer 
stretches which can be described as sequences of whole utterances. When 
we consider an element which has occurred as part of a whole utterance
(say, the [d], or fair, or ly in Fairly good, thanks.), we note its relation to 
the utterance in which it is attested. 

On the other hand, stretches longer than one utterance are not usually 
considered in current descriptive linguistics. The utterances with which 
the linguist works will often come in longer discourses, involving one 
speaker (as in texts taken from an informant) or more than one (as in 
conversations). However, the linguist usually considers the interrela- 
tions of elements only within one utterance at a time. This yields a pos- 
sible description of the material, since the interrelations of elements with- 

" It must also be recognized that predictions based on statements 
about style are generally less accurate than predictions based on state- 
ments about dialect. 


in each utterance (or utterance type) are worked out, and any longer discourse
is describable as a succession of utterances, i.e. a succession of elements
having the stated interrelations.

This restriction means that nothing is generally said about the inter- 
relations among whole utterances within a discourse. Now in many, per- 
haps all, languages there are particular successions among types of ut- 
terances within a discourse. This may be seen in a stretch spoken by one 
speaker (compare the first sentence of a lecture with one of the later sentences),
or in a conversation (especially in such fixed exchanges as "How
are you?" "Fine; how are you?"). Since these are distributional limitations
upon the utterances with respect to each other within the discourse, they 
could be studied with the methods of descriptive linguistics. The amount 
of data and of analytic work required for such a study would, however, 
be much greater than that required for stating the relations of elements 
within single utterances. For this reason, the current practice stops at
the utterance, and the procedures described below do not go beyond 
that point. 

2.33. Corpus or Sample 

Investigation in descriptive linguistics consists of recording utterances 
in a single dialect and analyzing the recorded material. The stock of 
recorded utterances constitutes the corpus of data, and the analysis 
which is made of it is a compact description of the distribution of ele- 
ments within it. The corpus does not, of course, have to be closed before 
analysis begins. Recording and analysis can be interwoven, and one of 
the chief advantages of working with native speakers over working with 
written texts (as is unavoidable, for example, in the case of languages no 
longer spoken) is the opportunity to check forms, to get utterances re- 
peated, to test the productivity of particular morphemic relations, and 
so on.¹²

¹² If the linguist has in his corpus ax, bx, but not cx (where a, b, c are
elements with general distributional similarity), he may wish to check
with the informant as to whether cx occurs at all. The eliciting of forms
from an informant has to be planned with care because of suggestibility
in certain interpersonal and intercultural relations and because it may not
always be possible for the informant to say whether a form which is
proposed by the linguist occurs in his language. Rather than constructing
a form cx and asking the informant 'Do you say cx?' or the like, the linguist
can in most cases ask questions which should lead the informant to
use cx if the form occurs in the informant's speech. At its most innocent,
eliciting consists of devising situations in which the form in question is 
likely to occur in the informant's speech. 
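
The search for gaps of this kind, where distributionally similar elements are attested with a given neighbor but one combination is missing, lends itself to a small sketch. The corpus below is hypothetical, and the function name is an assumption; the output is merely a list of forms the linguist might try to elicit, not a claim that they occur.

```python
from itertools import product

# Hypothetical corpus of attested two-element sequences.
CORPUS = {("a", "x"), ("b", "x"), ("a", "z"), ("b", "z"), ("c", "z")}

def unattested(corpus, firsts, seconds):
    """List the combinations of the given elements that the corpus lacks."""
    return sorted(pair for pair in product(firsts, seconds)
                  if pair not in corpus)

print(unattested(CORPUS, ["a", "b", "c"], ["x", "z"]))  # [('c', 'x')]
```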


To persons interested in linguistic results, the analysis of a particular 
corpus becomes of interest only if it is virtually identical with the analysis 
which would be obtained in like manner from any other sufficiently large 
corpus of material taken in the same dialect. If it is, we can predict the 
relations among elements in any other corpus of the language on the basis 
of the relations found in our analyzed corpus. When this is the case, the 
analyzed corpus can be regarded as a descriptive sample of the language. 
How large or variegated a corpus must be in order to qualify as a sample 
of the language, is a statistical problem; it depends on the language and 
on the relations which are being investigated. For example, in phonologic 
investigations a smaller corpus may be adequate than in morphologic in- 
vestigations. When the linguist finds that all additional material yields 
nothing not contained in his analysis he may consider his corpus adequate.

The procedures discussed below are applied to a corpus of material 
without regard to the adequacy of the corpus as a sample of the language. 
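
The working test of adequacy mentioned above (additional material yields nothing not contained in the analysis) can be sketched as a comparison of relations found in new utterances against those already recorded. The sketch assumes, purely for illustration, that the 'relations' are immediate-neighbor environments and that utterances are given as lists of element symbols; both the representation and the function names are hypothetical.

```python
def neighbor_relations(utterance):
    """Yield (element, left neighbor, right neighbor) for each position.
    '#' marks silence at either edge of the utterance."""
    padded = ["#"] + list(utterance) + ["#"]
    for i in range(1, len(padded) - 1):
        yield (padded[i], padded[i - 1], padded[i + 1])

def new_relations(analyzed, additional):
    """Relations found in the additional utterances but not in the
    analyzed corpus. An empty result suggests nothing new was added."""
    known = {rel for utt in analyzed for rel in neighbor_relations(utt)}
    extra = {rel for utt in additional for rel in neighbor_relations(utt)}
    return extra - known

analyzed = [["a", "x"], ["b", "x"], ["a", "z"]]
print(new_relations(analyzed, [["a", "x"], ["b", "x"]]))  # set()
print(sorted(new_relations(analyzed, [["b", "z"]])))
# [('b', '#', 'z'), ('z', 'b', '#')]
```

An empty result on one batch of additional material does not by itself show that the corpus is a sample of the language; as stated above, how much material is needed remains a statistical question.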

2.4. Definition of Terms 

For the purposes of descriptive linguistic investigations a single 
LANGUAGE or dialect is considered over a brief period of time. This com- 
prises the talk which takes place in a language community, i.e. among a 
group of speakers, each of whom speaks the language as a native, and 
may be considered an informant from the point of view of the linguist. 
None of the terms used here can be rigorously defined. The limits for a 
community vary with the extent of language difference as geographic
distance or boundaries, and social divisions, increase. Only after lin- 
guistic analysis is under way is it possible to tell definitely whether two 
individuals or two sub-groups in a community differ in respect to the lin- 
guistic elements or the relations among these elements. Even the speech 
of one individual, or of a group of persons with similar language histories, 
may be analyzable into more than one dialect: there may be appreciable 
linguistic differences in a person's talk in different social situations (e.g., 
in some societies, in talk to equals or to superiors). And even when the 
social matrix is held constant, the talk of an individual or of a language 
community may vary stylistically in ways which would register variations
in elements or in relations among elements.

The question as to who talks a language as a native is also one which 
can, in the last analysis, be settled only by seeing if the analysis of a per- 
son's speech agrees with that of the speech of others in the community. 
In general, any person past his first few years of learning to talk speaks 


the language of his community as 'a native', if he has not been away from
the community for long periods. However, persons with more checkered
language careers may also speak a language natively from the point of
view of the linguist. 

An UTTERANCE is any stretch of talk, by one person, before and after 
which there is silence on the part of the person. The utterance is, in gen- 
eral, not identical with the 'sentence' (as that word is commonly used), 
since a great many utterances, in English for example, consist of single 
words, phrases, 'incomplete sentences', etc. Many utterances are com- 
posed of parts which are linguistically equivalent to whole utterances 
occurring elsewhere. For example, we may have Sorry. Can't do it. I'm 
busy reading Kafka, as an utterance, and also Sorry. I'm busy reading 
Kafka, or Sorry, or Can't do it. as an independent utterance.¹³

Utterances are more reliable samples of the language when they occur 
within a conversational exchange. The situation of having an informant 
answer the questions of a linguist¹⁴ or dictate texts to him is not an ideal
source, though it may be unavoidable in much linguistic work. Even 
then, it must be remembered that the informant's answers to the linguist 
are not merely words out of linguistic context, but whole utterances on 
his part (e.g. bearing a whole utterance intonation). 

The linguistic elements are defined for each language by associating 
them with particular features of speech — or rather, differences between 
portions or features of speech — to which the linguist can but refer. They 
are marked by symbols, whether letters of the alphabet or others, and 
may represent simultaneous or successive features of speech, although 
they may in either case be written successively. The elements will be said 
to represent, indicate, or identify, rather than describe, the features in 
question. For each language, an explicit list of elements is defined. 

The statement that a particular element occurs, say in some position, 
will be taken to mean that there has occurred an utterance, some feature 
of some part of which is represented linguistically by this element. 

Each element may be said to occur over some segment of the utterance,
i.e. over a part of the linguistic representation of the time-extension

¹³ Linguistic equivalence requires identity not only in the successive
morphemes but also in the intonations and junctural features. Hence, 
while the utterance 'Sorry, can't do it.' may be linguistically equivalent 
to the two utterances 'Sorry.' and 'Can't do it.' the utterance 'Can't do it.' 
is not linguistically equivalent to 'Can't', and 'Do it.' since the intona- 
tions on the latter two do not together equal the intonation on the first. 

¹⁴ About how something is said in his language, not about his language.
Cf. Leonard Bloomfield, Outline Guide for the practical study of
foreign languages (1942). 


of the utterance. A segment may be occupied by only one element (e.g. 
only an intonation in the English utterance which is written Mm.), or
by two or more elements of identical length (e.g. two simultaneous com- 
ponents), or by one or more short elements and one or more elements 
occupying a long segment in which the segment in question is included 
(e.g. a phoneme, plus a component like Moroccan Arabic ' stretching 
over several phonemic segments, plus an intonation extending over the 
whole utterance).¹⁵

The ENVIRONMENT or position of an element consists of the neighbor- 
hood, within an utterance, of elements which have been set up on the 
basis of the same fundamental procedures which were used in setting up 
the element in question. 'Neighborhood' refers to the position of ele- 
ments before, after, and simultaneous with the element in question. Thus 
in I tried /ay#trayd./, the environment of the phoneme /a/ is the phonemes
/tr — yd/ or, if phonemic intonations are involved in the discussion,
/tr — yd/ plus /./, or most fully /ay#tr — yd./. The environment
of the morpheme try /tray/, however, is the morphemes I — ed or, if
morphemic intonations are involved in the discussion, I — ed with the
assertion intonation.¹⁶

The DISTRIBUTION of an element is the total of all environments in 

¹⁵ The segment over which an element extends is in some cases called
the DOMAIN or interval or length of the element (cf. C. F. Hockett, A System
of Descriptive Phonology, Lang. 18.14 [1942]). In the course of
analysis it is usually more convenient not to set up absolute divisions, 
e.g. word and phrase, and then say that various relations cross these 
divisions (e.g. syllabification rules cross word division in Hungarian but 
not in English). Instead, the domain of each element, or each relation 
among elements, is indicated when the element in question is set up. If 
many of these domains appear to be equivalent, as is frequently the case, 
that fact may then be noted and we may define a domain such as word 
or the like. 

¹⁶ Traditional spellings, and the variables of general statements, will
be given in italics: e.g. tried, filius, the morpheme X. Impressionistic
phonetic transcription will be given in square brackets [ ]: e.g. [trayd];
for the usual values of the alphabetic letters see Bernard Bloch and
George L. Trager, Outline Guide of Linguistic Analysis 22-6 et passim.
Phonemic elements will be given in diagonals / /: /trayd/. Classes of
complementary morphemic elements will be indicated by { }: e.g.
{-ed}. The position of an element within an environment will be indicated
by a dash —: e.g. /tr — yd/ or I — ed. Silence or break in the sequence of
elements will be indicated by #. Italics within diagonals will indicate
the name of a phoneme: e.g. /glottal stop/ instead of /'/. Roman letters
within braces will indicate the name of a morpheme: e.g. {plural
suffix}, instead of {-s}. Loud stress will be indicated by ' before the
stressed syllable, while ˌ marks secondary stress. Length will be indicated
by a raised dot (•).


which it occurs, i.e. the sum of all the (different) positions (or
occurrences) of an element relative to the occurrence of other elements.
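
The definitions of environment and distribution can be illustrated with a small sketch. It assumes, purely for illustration, that utterances are already written as sequences of morphemic elements; the three-utterance corpus and the function names are hypothetical and are not part of the procedures given in this book.

```python
def environments(utterance, element):
    """Yield a (left, right) context pair for each occurrence of element."""
    for i, item in enumerate(utterance):
        if item == element:
            yield (tuple(utterance[:i]), tuple(utterance[i + 1:]))


def distribution(corpus, element):
    """Collect all the different environments of element in the corpus."""
    return {env for utt in corpus for env in environments(utt, element)}


# Toy corpus (an assumption): each utterance is a sequence of morphemes.
CORPUS = [
    ("I", "try", "-ed"),
    ("they", "try"),
    ("I", "walk", "-ed"),
]

try_dist = distribution(CORPUS, "try")
walk_dist = distribution(CORPUS, "walk")

# Environments that the two elements have in common.
shared = try_dist & walk_dist
```

In this toy corpus try has two environments, I —ed and they —, and shares the first of them with walk.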

Two utterances or features will be said to be linguistically,
descriptively, or distributionally equivalent if they are identical as to their
linguistic elements and the distributional relations among these elements.

The particular types of elements (phonemes, morphemes), and the 
operations such as substitution and classification which are used
throughout this work, will be defined by the procedures in which they are used
or from which they result. 

2.5. The Status of Linguistic Elements 

In investigations in descriptive linguistics, linguistic elements are as- 
sociated with particular features of the speech behavior in question, and 
the relations among these elements are studied. 

In defining elements for each language, the linguist relates them to the 
physiological activities or sound waves of speech, not by describing these 
in detail or by reproducing them instrumentally, but by uniquely
identifying the elements with them.¹⁷ Each element is identified with some
features of speech in the language in question:¹⁸ for most of linguistic
analysis the association is one-one (the features in question are associated
only with element X, and element X is associated only with the features
in question); in some parts of the analysis the association may be
one-many (element X is associated only with certain features, but these
features are sometimes associated with X and sometimes with another
element Y).

The features of speech with which the elements are associated do not 

¹⁷ It is widely recognized that forbidding complexities would attend
any attempt to construct in one science a detailed description and
investigation of all the regularities of a language. Cf. Rudolf Carnap,
Logical Syntax of Language 8: "Direct analysis of (languages) must fail just
as a physicist would be frustrated were he from the outset to attempt to 
relate his laws to natural things — trees, etc. (He) relates his laws to 
the simplest of constructed forms — thin straight levers, punctiform 
mass, etc." Linguists meet this problem differently than do Carnap 
and his school. Whereas the logicians have avoided the analysis of exist- 
ing languages, linguists study them; but, instead of taking parts of the 
actual speech occurrences as their elements, they set up very simple ele- 
ments which are merely associated with features of speech occurrences. 
For more advanced discussion of related problems, see now the
Proceedings of the Speech Communication Conference at M.I.T., in the Journal
of the Acoustical Society of America 22.689-806 (1950), especially M.
Joos, Description of Language Design 701-8.

¹⁸ See Leonard Bloomfield, Language 79.


include all the features of a speech occurrence, nor are they ever unique 
occurrences, which happened at a particular place and time.¹⁹ Element X
may be associated with the fact that the first few hundredths of a second 
in a particular bit of talking involved a given tongue position, or a given 
distribution of intensity per frequency, or produced a sound as a result 
of the occurrence of which (in relation to the following sounds) the hearer 
acted in one way rather than in another way. No matter how this is
defined, element X will then be associated not only with that feature of that
bit of talking, but also with a feature of some other bit of talking (e.g. in 
which the tongue position was very close to that in the first instance), 
and to features in many other bits of talking, the class consisting of all 
these features being determined by the fact that in each case the tongue 
was within a certain range of positions, or the hearer's action was of one 
kind rather than another, or the like. 

For the linguist, analyzing a limited corpus consisting of just so many 
bits of talking which he has heard, the element X is thus associated 
with an extensionally defined class consisting of so many features in so 
many of the speech occurrences in his corpus. However, when the lin- 
guist offers his results as a system representing the language as a whole, 
he is predicting that the elements set up for his corpus will satisfy all 
other bits of talking in that language. The element X then becomes as- 
sociated with an intensionally defined class consisting of such features of 
any utterance as differ from other features, or relate to other features, in 
such and such a way. 
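
The difference between the two kinds of class can be sketched as follows. The numerical scale for tongue position, the particular values, and the name is_X are assumptions made only for this illustration; the text does not prescribe any such measurement.

```python
# Features observed in a toy corpus: tongue position on an arbitrary 0..1 scale.
corpus_features = [0.31, 0.33, 0.35, 0.80]

# Extensional class: an explicit list of the corpus features assigned to X.
X_extensional = {0.31, 0.33, 0.35}


def is_X(tongue_position, low=0.30, high=0.40):
    """Intensional definition: any feature within the range counts as X."""
    return low <= tongue_position <= high


# Applied to the corpus, the intensional rule picks out the same features.
X_from_rule = {f for f in corpus_features if is_X(f)}

# Applied to a bit of talking outside the corpus, it makes a prediction.
new_feature = 0.37
predicted = is_X(new_feature)
```

The listed class says nothing about new_feature; the rule assigns it to X, which is the sense in which the linguist's system predicts beyond his corpus.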

Once the elements are defined, any occurrence of speech in the lan- 
guage in question can be represented by a combination of these elements, 
each element being used to indicate the occurrence in the speech of a 
feature with which the element is associated by its definition. It is then 
possible to study these combinations (mostly, sequences) of elements, and 
to state their regularities and the relations among the elements. It is
possible to perform upon the elements various operations, such as
classification or substitution, which do not obliterate the identifiability
of the elements²⁰ but reduce their number or make the statement of
interrelations simpler. At each point in the manipulation of these elements,
statements about them or about their interrelations represent statements

¹⁹ See W. F. Twaddell, review of Stetson's Bases of Phonology,
International Journal of American Linguistics 12.102-8 (1946); also W. F.
Twaddell, On Defining the Phoneme, Language Monograph 16 (1935).

²⁰ I.e., which maintain their one-to-one correspondence with features
of speech.


about selected features of speech and their interrelations. It is this that 
underlies the usefulness of descriptive linguistics: the elements can be 
manipulated in ways in which records or descriptions of speech can not
be; and as a result regularities of speech are discovered which would be
far more difficult to find without the translation into linguistic symbols. 

The formulation of 2.5 could be avoided if we considered the elements
of linguistics to be direct descriptions of portions of the flow of speech. 
But we cannot define the elements in such detail as to include a complete 
description of speech events. Linguistic elements have also been defined 
as variables representing any member of a class of linguistically equiva- 
lent portions of the flow of speech. In that case, each statement about 
linguistic elements would be a statement about any one of the portions of
speech included in the specified classes. However, in the course of reduc- 
ing our elements to simpler combinations of more fundamental elements, 
we set up entities such as junctures and long components which can only 
with difficulty be considered as variables directly representing any
member of a class of portions of the flow of speech. It is therefore more
convenient to consider the elements as purely logical symbols, upon which
various operations of mathematical logic can be performed. At the start 
of our work we translate the flow of speech into a combination of these 
elements, and at the end we translate the combinations of our final and 
fundamental elements back into the flow of speech. All that is required 
to enable us to do this is that at the beginning there should be a one-one 
correspondence between portions of speech and our initial elements, 
and that no operations performed upon the elements should destroy this 
one-one association, except in the case of particular branching operations 
(e.g. in chapter 14) which explicitly lose the one-one relation and which 
cannot be kept in the main sequence of operations leading to the final 
elements (unless special lists or other devices are used to permit return 
at all times to the one-one identification). 
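
The requirement of a one-one correspondence can be checked mechanically. The following is a minimal sketch under assumed names: the labels portion_1, portion_2, portion_3 stand for portions of speech, and the merging step is a hypothetical classification operation, not one of the procedures of later chapters.

```python
def is_one_one(mapping):
    """True if no two keys share a value, so that the mapping can be inverted."""
    return len(set(mapping.values())) == len(mapping)


def invert(mapping):
    """Return the reverse mapping, refusing if the association is not one-one."""
    if not is_one_one(mapping):
        raise ValueError("mapping is not one-one; it cannot be inverted")
    return {value: key for key, value in mapping.items()}


# Hypothetical labels for portions of speech and the initial elements.
portions = ["portion_1", "portion_2", "portion_3"]
initial = {"portion_1": "t", "portion_2": "r", "portion_3": "a"}

# Translating into elements and back again recovers the portions of speech.
encoded = [initial[p] for p in portions]
back = invert(initial)
decoded = [back[e] for e in encoded]

# An operation that merges two elements loses the one-one association,
# unless a special list (here, the unmerged mapping) is kept alongside it.
merged = {"portion_1": "C", "portion_2": "C", "portion_3": "a"}
merged_is_reversible = is_one_one(merged)
```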

Furthermore, the formulation of 2.5 enables us to avoid the reificatory 
question of what parts of human behavior constitute language. This 
question is not easy to answer. We can all agree that much of the vocal 
activity of human beings past the age of two is to be considered as
language. But what of a cough, or of the utterance Hmm!, or of gestures
whether accompanying speech or not? In terms of 2.5 we do not attempt
to answer this question. We merely associate elements or symbols with
particular differences between particular bits of human behavior. Let
x, x', x'' be the various bits of behavior with which our element Y is
associated. Then if an aspect of behavior z occurs in x and in x' and in x'',
we consider z as associated with Y (included in the definition of Y). If z
occurs in x and x' but not in x'', we do not consider z as associated with Y.
Thus a glottal release, which might be considered as a slight cough, oc- 
curs with every occurrence of post-junctural German [a]-sounds. If we 
associate all these occurrences of sound with the symbols [a], we include 
the glottal release as something represented by that sequence of symbols. 
On the other hand, the somewhat different sound of a light cough may be 
found to occur with some of the German [a]-sounds, or with various other 
German sounds. However, we are not able to state the regularity of dis- 
tribution of this cough in such a way as to associate a special symbol for 
it, nor does it occur in all the sound-occurrences which we have associated 
with any other particular symbol. We may therefore say, if we wish, that 
the glottal release is included in our linguistic description, while the 
cough, which cannot be included in any of our symbols, is not. The de- 
scription which we will make in terms of our symbols will yield a state- 
ment about the occurrences of the glottal release, but will not yield a 
statement about the occurrences of the cough. Thus we do not have to 
say whether the cough (which may have such meanings as 'hesitation') is 
part of the language. We say merely that it is not such a part of the be- 
havior as we can associate with any of our elements. The linguistic ele- 
ments can be viewed as representing always the behavioral features as- 
sociated with them, and irregularly any other behavioral features (such as 
coughs) which sometimes occur. If we ever become able to state with 
some regularity the distribution of these other behavioral features, we 
would associate them too with particular linguistic elements. 

Of course, it is possible to get out from the symbols only a more con- 
venient organization of what was put in. The symbols and statements of 
descriptive linguistics cannot yield complete descriptions of speech oc- 
currences (in either physiological or acoustic terms), nor can they yield 
information about the meaning and social situation of speech occurrences, 
about trends of change through time, and the like. In most of current 
linguistic research, the statements cannot even deal adequately with
certain differences between slow and fast speech (e.g. good-bye as compared
with g'bye), or with stylistic and personality differences in speech.²¹

²¹ The attack made upon the validity of descriptive linguistics in R. H.
Stetson, Bases of Phonology 25-36, is therefore not quite applicable. It 
is true that the linguistic elements do not describe speech or enable one 
to reproduce it. But they make it possible to organize a great many 
statements about speech, which can be made in terms of the linguistic 
elements. When the results of linguistic analysis are given in conjunction 
with detailed descriptions of speech, or with actual samples of speech, a 
description of the language is obtained. 


2.6. Preview of the Phonologic and Morphologic Elements 

It may be useful to see now how the relevant categories of investiga- 
tion are determined. In so doing it must be remembered that speech is a 
set of complex continuous events (talking does not consist of separate
sounds enunciated in succession), and the ability to set up discrete
elements lies at the base of the present development of descriptive
linguistics.

The question of setting up elements may be approached with little 
initial sophistication. It is empirically discoverable that in all languages 
which have been described we can find some part of one utterance which 
will be similar to a part of some other utterance. 'Similar' here means 
not physically identical but substitutable without obtaining a change in 
response from native speakers who hear the utterance before and after 
the substitution : e.g. the last part of He's in. is substitutable for the last 
part of That's my pin. In accepting this criterion of hearer's response, 
we approach the reliance on 'meaning' usually required by linguists. 
Something of this order seems inescapable, at least in the present stage 
of linguistics: in addition to the data concerning sounds we require data 
about the hearer's response.²² However, data about a hearer accepting
an utterance or part of an utterance as a repetition of something pre- 
viously pronounced can be more easily controlled than data about mean- 
ing. In any case, we can speak of similar parts, and can therefore divide 
each utterance into such parts, or identify each utterance as being com- 
posed of these parts. The essential method of descriptive linguistics is to 
select these parts and to state their distribution relative to each other. 

Since the occurrences of speech are bits of continuous stretches of 
physiological activities or sound waves, we could cut each one into small- 
er and smaller parts without limit. However, there is no point in doing so : 
once we have gotten such parts or features with which we can associate 
linguistic elements which can also be associated with parts or features of
various other utterances, we may find that nothing is gained by setting 
up elements associated with yet smaller segments of the utterance. 
Unity of practice, and simplicity of method, is achieved in linguistics by 
fixing a point beyond which the division of utterances into parts for 
linguistic representation is not carried. If we are dividing Let's go 
[ˌlec'gow.] and To see him? [ta'siyim?], we will break the affricate [c] into
two parts [t] and [s] which occur separately in the second utterance. But 

²² Cf. Leonard Bloomfield, A set of postulates for the science of
language, Language 2.153-64 (1926).


we will not break the [s] of both utterances into three successive parts : 
say, the curving of the tongue blade, the holding fast of the curved 
tongue blade, and the straightening out of the blade and sliding of the 
tongue away from the [s] position. The point at which segmentation stops 
may be stated as follows : We associate elements with parts or features of 
an utterance only to the extent that these parts or features occur inde- 
pendently (i.e. not always in the same combination) somewhere else. 
It is assumed that if we set up new elements for successive portions of 
what we had represented by [s], and then used them in representing vari- 
ous other utterances, these new elements would not occur except to- 
gether. We therefore do not subdivide [s] into these parts. As will be 
seen, this means that we associate with each utterance the smallest num- 
ber of different elements which are themselves just small enough so that 
no one of them is composed of any of the other elements. We may call 
such elements the minimum, i.e. smallest distributionally independent, 
descriptive factors (or elements) of the utterances.²³
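
The stopping point for segmentation can be sketched as a test of independent occurrence. The sketch assumes simplified one-letter transcriptions of Let's go and To see him?, and the names s1, s2, s3 for the three hypothetical successive portions of [s]; none of these are notations used elsewhere in this book.

```python
def occur_independently(first, second, corpus):
    """True if first and second are not always found as the pair first + second."""
    for utt in corpus:
        for i, seg in enumerate(utt):
            following = utt[i + 1] if i + 1 < len(utt) else None
            preceding = utt[i - 1] if i > 0 else None
            if seg == first and following != second:
                return True
            if seg == second and preceding != first:
                return True
    return False


def expand(utterance, segment, parts):
    """Rewrite every occurrence of segment as the given sequence of parts."""
    out = []
    for seg in utterance:
        out.extend(parts if seg == segment else [seg])
    return tuple(out)


# Simplified transcriptions (assumptions), with the affricate written t + s.
CORPUS = [
    ("l", "e", "t", "s", "g", "o", "w"),  # Let's go
    ("t", "a", "s", "i", "y", "i", "m"),  # To see him?
]

# The same corpus with [s] further divided into three successive portions.
FINER = [expand(utt, "s", ["s1", "s2", "s3"]) for utt in CORPUS]

split_affricate = occur_independently("t", "s", CORPUS)
split_s = occur_independently("s1", "s2", FINER)
```

Here [t] and [s] occur apart from each other in the second utterance, so the division of the affricate is kept; the portions of [s] never occur except together, so that division is not made.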

Linguists use two choices of criteria, leading to two different sets of 
elements, the phonologic and the morphologic. Each of these sets of ele- 
ments by itself covers the whole duration of all utterances : every utter- 
ance can be completely identified as a complex of phonemic elements, 
and every utterance can be completely identified as a complex of mor- 
phemic elements. 

The elements in each set are grouped into various classes, and state- 
ments are made about the distribution of each element relative to the 
others in its set. 

2.61. Correlations Outside of Descriptive Linguistics

Studying the interrelation of the short phonologic elements enables us
to make various general statements and predictions, in which no
information about morphemes is necessarily involved. E.g., we may show that
all the sounds made in a given language can be grouped into a more or
less patterned set of phonemes, or into a smaller set of components. We
may predict that if glottalized consonants do not occur in English, or
if [ŋ] does not occur after silence, then English speakers will in general
find difficulty in pronouncing them.²⁴ We may predict that if in Hidatsa
[w] and [m] are allophones both of one phoneme and of one
morphophoneme, while in English they are phonemically distinct from each other,
then English speakers will be able to distinguish [w] from [m], whereas
Hidatsa speakers will not.²⁵

Studying the interrelations of the frequently longer morphologic
elements enables us to make various general statements and predictions
independently of any phonological information. E.g. we may show that all
the morphemic elements of a language can be grouped into a very few
classes, and that only particular sequences of these classes occur in
utterances of that language. Given that we have no record of anyone having
ever said either The blue radiator walked up the window, or Here is man the.
we can devise a few situations in which the former will be said but can
predict that the latter will be said far less frequently (except in situations
which can be stated for each culture, e.g. explicitly linguistic
discussions).²⁶

Phonology and morphology, therefore, each independently provides
information concerning regularities in selected aspects of human
behavior.²⁷ The general methods of scientific technique are the same for
both: associating discrete elements with particular features of portions
of continuous events, and then stating the interrelations among these
elements. But the results in each (the number of elements and classes
of elements, the type of interrelations) are different. The applications
too are also often different. Both fields give us information about a
particular language; but phonology is more useful in taking down
anthropological texts, learning a new dialect, etc., while morphology is more
useful in the understanding of texts, in discovering "what is said" in a new
language, etc.

²³ Leonard Bloomfield, Language 79, 166.

²⁴ All such predictions are outside the techniques and scope of
descriptive linguistics. Linguistics offers no way of quantifying them.
Nevertheless, taking the linguistic representation as a clear and systemic
model of selected features of speech, we may find that this model
correlates with other observations about the people who do the speaking.
Cf., for instance, the data and examples in Edward Sapir, La réalité
psychologique des phonèmes, Journal de Psychologie 30.247-65 (1933);
now also in Selected Writings of Edward Sapir 46-60.

²⁵ After linguistic science has developed sufficiently, it may be possible
also to predict some of the direction of the phonologic diachronic change
through time on the basis of descriptive (synchronic) phonological

²⁶ In these utterances the intonations are of course to be taken into
account. E.g. in the second, the end of the assertion intonation would
have to occur with the final the.

²⁷ This does not imply that we can speak of any identifiable linguistic
behavior, much less phonologic or morphologic behavior. There is
interpersonal behavior which may include gesture, speech, etc. Linguistics
sets up a system of relations among selected features of this general
behavior.

2.62. Relation between Phonologic and Morphologic Elements 

Although the scientific status and uses of phonology and of morphol- 
ogy are independent of each other, there is an important and close con- 
nection which can be drawn between them. If, disregarding phonology, 
we have first determined the morphemes of a language, we can proceed, 
if we wish, to break these morphemes down into phonemes. And if we 
have only determined the phonemes, we can use these phonemes to 
identify uniquely every morpheme. 

As will be seen in the Appendix to 12.5, it is possible to determine the
morphemes of a language without any previous determination of the
phonemes.²⁸ The morphemic elements obtained in this manner would
each represent an unanalyzed segment of utterances, e.g. mis, match, s
(plural), z (plural), etc., in We both made mistakes. Some mismatched pairs.
However, just as utterances can be represented by sequences of elements 
such that each element occurs in various utterances, so the morphemic 
elements, which represent segments of utterances, can be considered as 
sequences of smaller elements. Thus we find that the first part of mis is 
substitutable for the first part of match, or the last part of mis for the
whole of s. It is therefore possible to describe each morphemic element as 
a unique combination of sound elements. Breaking the morphemic ele- 
ments down into these smaller parts does not help us in stating the in- 
terrelations among morphemes; we could deal just as well with whole 
unanalyzed morphemes. This further analysis of the morphemic ele- 
ments merely enables us to identify each of them more simply, with a 
much smaller number of symbols (one symbol per phoneme, instead of 
one symbol per morpheme). 
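
The description of morphemic elements as unique combinations of sound elements can be sketched as a small table. The spellings below are rough stand-ins chosen for the illustration, not phonemic analyses of English, and the names MORPHEMES and uniquely_identified are hypothetical.

```python
# Each morphemic element written as a sequence of smaller sound elements.
# The spellings are rough stand-ins (assumptions), not a phonemic analysis.
MORPHEMES = {
    "mis": ("m", "i", "s"),
    "match": ("m", "a", "c"),
    "-s": ("s",),
    "-z": ("z",),
}


def uniquely_identified(table):
    """True if no two morphemes are given the same sequence of sound elements."""
    return len(set(table.values())) == len(table)


# The first part of mis is the same element as the first part of match ...
same_first_part = MORPHEMES["mis"][0] == MORPHEMES["match"][0]

# ... and the last part of mis is the same element as the whole of s.
last_of_mis_is_s = MORPHEMES["mis"][-1:] == MORPHEMES["-s"]

# The symbols needed to write every morpheme in the table.
sound_elements = sorted({p for seq in MORPHEMES.values() for p in seq})
```

With only four morphemes the saving in symbols does not show; it appears when some thousands of morphemes are written with a few dozen sound elements.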

Just as we can go from morphemes to phonemes, so can we go, but far
more easily, from phonemes to morphemes. Given the phonemic
elements of a language, we can list what combinations of them constitute
morphemes in the language. The phonemic elements, being fewer and in
general shorter than the morphemic elements, are much easier to
determine, so that identifying each morphemic element as a particular
combination of the previously discovered phonemes is more convenient than
determining afresh the phonetic uniqueness of each morphemic element.
This does not mean that the phonemes automatically give us the
morphemes. In most languages only some of the combinations of phonemes
constitute morphemes, and in all languages a morphological analysis such
as that used in 12.23 is required to tell which these are.

²⁸ This is not done for a whole language, because of the complexity of
the work.

There are thus two independent reasons for carrying out phonologic 
analysis: to find the interrelations of the phonemic elements, or to obtain 
a simple way of identifying the morphemic elements.

Whether it arises from breaking morphemes down or from combining 
phonemes, the connection between phonology and morphology lies in 
using phonemes to identify morphemes. This connection does not make 
the two divisions ultimately identical. There remain phonological in- 
vestigations which are not used in identifying the morphemes and would 
not be derived from the morphemes: e.g. phonetic classification of 
phonemes or of their positional variants. And morphologic techniques 
are required which cannot be derived from phonology: e.g. in some cases, 
what sequences of phonemes constitute a morpheme. 

The practice of linguists is usually a combination of methods. The lin- 
guist makes a first approximation by setting up tentative morphemes. 
He then uses his phonologic investigation to verify his postulated mor- 
phemes. In some cases where he has the choice of two ways of assigning 
phonemic elements, he chooses the way that will fit his guess: if the [t] 
of mistake could be equally well grouped phonemically with the [tʰ] of
take or the [d] of date, he will choose the former if he wants to consider 
the take of mistake to be the same morpheme as take. In some cases he 
has to distinguish between two morphemic elements because it turns out 
that they are phonemically different: e.g. /'ekanamiks/ and /iykanamiks/
(both economics) have to be considered two distinct morphemes. 




3.0. Introductory 

As the first step toward obtaining phonemes, this procedure represents 
the continuous flow of a unique occurrence of speech as a succession of 
segmental elements, each representing some feature of a unique speech
sound. The points of division of these segments are arbitrary here, since 
we have as yet no way of enabling the analyst to make the cuts at pre- 
cisely those points in the flow of speech which will later be represented 
by inter-phonemic divisions. Later procedures will change these
segmentations until their boundaries coincide with those of the eventual
phonemes.
3.1. Purpose: Speech Composed of Discrete Parts 

Utterances are stretches of continuous events. If we trace them as 
physiological events, we find various parts of the body moving in some 
degree independently of each other and continuously: e.g. the tongue 
tip may move forward and upward toward the upper gum while the base 
of the tongue sinks in the mouth, the vocal chords begin to vibrate, the 
nasal passage is stopped off, etc. In general, the various muscles start and 
stop at various times ; the duration of each separate motion of theirs often 
has little to do with descriptive elements. If we trace utterances as acous- 
tic events, we find continuous changes of sound-wave periodicities: there 
may be various stretches during each of which the wave crests are similar 
to one another, but the passage from one such stretch to a second will 
in general be gradual.¹

Fortunately, it is possible to represent each continuous speech event
in such a way that we can then compare various speech events and say 
that the first is different from the second to such and such an extent. 
Our ability to do this rests on the observation that in each language we 
can substitute a close imitation of certain parts of one utterance for
certain parts of another utterance without getting any consistent difference

¹ Cf. the descriptions of speech-sounds in the sources cited in chapter 2,
fn. 1; and the speech spectrograms published in the Journal of the
American Acoustical Society 18.8-89 (1946) and in R. K. Potter, G. A.
Kopp, and H. C. Green, Visible Speech (1947).



of response from native hearers of the changed second utterance. We can 
take the utterances Can't do it and Cameras cost too much. If we substitute 
a repetition of the first short part of Can't do it for the first short part of
Cameras cost too much, the changed form of the second utterance will be 
accepted by every native hearer as a repetition of the original Cameras 
cost too much. We therefore set out to represent every utterance by
segmental elements which are substitutable for segments of other
utterances.² For when we have done so we have some way of describing each
utterance: by saying that it is composed of such and such segments. And 
we have some way of comparing utterances : by saying that one utterance 
(say, Can't do it.) is similar to another utterance (say, Cameras cost too 
much.) in respect to one segment (say, their first part), but differs from 
it in the remaining segments. 
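
The comparison of utterances by their segments can be sketched in a few lines. The division into segments and the spellings of the segments are arbitrary assumptions made for the illustration (section 3.2 leaves the points of division arbitrary at this stage), and the function names are hypothetical.

```python
def matching_positions(utt_a, utt_b):
    """Positions at which the two utterances have the same segment."""
    return [i for i, (a, b) in enumerate(zip(utt_a, utt_b)) if a == b]


def substitute(utterance, position, segment):
    """Return a copy of the utterance with one segment replaced."""
    changed = list(utterance)
    changed[position] = segment
    return tuple(changed)


# Arbitrary segmentations with rough spellings (assumptions).
CANT_DO_IT = ("kae", "nt", "duw", "it")
CAMERAS = ("kae", "mrez", "kost", "tuw", "mac")

shared = matching_positions(CANT_DO_IT, CAMERAS)

# Putting the first part of Can't do it in place of the first part of
# Cameras cost too much leaves the representation of the latter unchanged.
changed = substitute(CAMERAS, 0, CANT_DO_IT[0])
```

The two utterances are alike in respect to their first segment only, and the substituted form has the same representation as the original.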

3.2. Procedure: Segmenting Utterances at Arbitrary Points 

We represent an utterance by a succession of segments which end at 
arbitrary points along its duration. We hear a (preferably brief) utter- 
ance, i.e. a stretch of sound, say English Sorry. Can't do it., and consider 
it as a succession of any number of smaller elements. Each of these
segments may be described very roughly as the sum of particular coincident
movements of speech organs³ (lip closing, etc.), or as so many
sound-wave crests of such and such form.⁴

² Such a dissection can be attempted in various ways. We could find
a mathematical basis for selecting points at which the sound-wave crests 
change appreciably in form. We could trace the path of each body organ 
clearly involved in speech, from rest through various motions and on to 
rest again. Or we can break the flow of speech, as it is heard, into an 
arbitrary number of time-sections. Menzerath snipped a sound track 
film, rearranged the parts, and played the revised sound track to ob- 
tain new sequences of sound segments. Cf. P. Menzerath, Neue Unter- 
suchungen zur Lautabgrenzung und Wortsynthese mit Hilfe von Ton- 
filmaufnahmen, Melanges J. van Ginneken 35-41 (1937). 

³ Speech organs are those parts of the body whose motions affect the
air stream in the process of breathing in or out, in such a way as to make 
speech sounds. This is done by determining the extent of air pressure, 
the shape of closed or partly closed resonance chambers, the manner of 
forming or releasing these resonance chambers, and by moving in such a 
way as to communicate a vibration to the air stream. 

⁴ Linguists usually select the segments in such a way as to include
traditional articulatory features, e.g. the maximum approach of the
tongue tip to the teeth in the course of the movement of the tongue.
They may select the segments so that their boundaries represent the
points where the sound waves change appreciably in form, so that each
segment represents a portion of speech within which the wave is relatively 


3.3. Result: Unique Segments 

In order to write about our segments, we assign a mark to each one, 
e.g. kʰ for the first part of Can't do it. Each sign corresponds to a unique
and particular segment in a particular stretch of speech. And each sign 
(or the segment which it indicates) is now considered a single element : 
how the sound waves or speech organs changed continuously throughout 
its duration is no longer relevant, except when we reconsider and adjust 
the lengths, below. 

Appendix to 3.2: On the Segmentation of Single Utterances 

It is necessary to justify our use of a single brief utterance, i.e. the 
total speech of a single person from silence to silence, as a sufficient initial
sample of the language. We have to show that we can perform upon this 
single utterance the same operations we propose to perform for any 
stretches of speech. That is to say, we have to show that relative to 
what we are now investigating (namely, segmentation) each utterance 
has the same structure as the whole language (i.e. as the totality of all 
utterances in all situations). The justification depends on the empirical 
observation that practically every speech event, from the briefest utter- 
ance of an individual to the longest discussion, consists of an integral 
number of sound-elements of phonemic length. Even interrupted speech 
hardly ever stops except at the end of a phoneme-length sound-element. 

All we want to do in the first few procedures is to show that the 
totality of all speech occurrences which make up a language can be repre- 
sented by segments, which can then be adjusted so that the length of 
each segment is the length of a phoneme. In order to do this to various 
arbitrary utterances instead of to the totality of events in that language, 
we must show that segmental representation of these few utterances is 
equivalent to all the occurrences of speech. 

uniform. Or they may mark as a segment any stretch which sounds like 
what they have elsewhere (e.g. in English orthography) learned to
regard as 'one sound.' However, neither these nor any other criteria can
always show us what points of division will turn out later to be most 
useful (i.e. which will come out at the boundaries between the eventual 
phonemes). For example, we may have to recognize two phonemic seg- 
ments in a stretch during which all organs involved are each making 
single continuous movements. This uncertainty leads to no loss in exact- 
ness, because later procedures will determine the boundaries of these 
segments. If the segment divisions arbitrarily selected here do not pass
the test of the later procedures, they can be adjusted, and if necessary
the utterance can be recorded anew, with the symbols that will be chosen
for the adjusted segments. 


If brief utterances were likely to contain scattered broken bits of
phonemes, or to end in the middle of a phoneme, then it would be im- 
possible to cut that utterance into segments of phonemic length (or into 
segments which could be adjusted to phonemic length). But this is not 
the case, and the extremely rare case of speech breaking off not at the 
end of a phonemic-length segment could as well occur at the end of a 
long conversation as at the end of a brief utterance. Practically all com- 
plete speech occurrences, from silence to silence, are thus sequences of 
phonemic-length segments. For our present purposes of representing 
speech by segments (arbitrary at first but later to be adjusted to phone- 
mic length), any utterance, no matter how brief, is equally serviceable 
as a sample of speech. The few cases where the utterance does not con- 
tain an integral number of phonemic-length segments can be treated as a 
residue; i.e. the part which cannot be described as such a segment can 
nevertheless be described in our terms by calling it a fraction of a
phonemic-length segment. Finally we must note that the totality of speech
occurrences in a given language is merely an integral number of
utterances (including some interrupted utterances).

This does not mean that for other purposes a brief utterance is also 
serviceable as a sample of speech. Some tone, stress, and rhythm se- 
quences, and some morphological features and morpheme-class sequences 
may appear only in longer utterances (long sentences, or long dis- 
courses), or perhaps only in the conjunction of more than one utterance 
(by more than one speaker) in a conversation. There are limitations upon 
successions of sentences by one speaker, characteristic features marking 
the beginnings and ends of long discourses by one speaker, special fea- 
tures of the succession of utterances among different speakers in natural 
and in hurried conversations, and the like. For the incidence of formal 
features of this type only long discourses or conversations can serve as 
samples of the language. 


4.0. Introductory 

This procedure establishes free variants. It first determines the range 
of variation of a particular sound-segment in repetitions of a particular 
utterance. It then takes utterances (or parts of utterances) which are 
not repetitions of each other and enables us to recognize when a sound- 
segment in one of them is a free variant of a sound-segment in the other. 

4.1. Purpose: To Establish Linguistic Equivalence 

I.e. to enable us to say whether any two segments are descriptively 
equivalent. As long as every utterance is composed of unique segments, it 
cannot be compared with other utterances, and our linguistic analysis 
cannot make headway. 

4.2. Procedure: Grouping Substitutable Segments 
4.21. In Repeated Utterances 

We make analogous segmentations of repetitions of the utterance. 
Having recorded an utterance in terms of the segments we associate with 
it, we now record repetitions of the utterance in identical environment.¹
We then say that each segment of one repetition is freely substitutable
for (or a free variant of) the corresponding segment of every other
repetition. That is, if an utterance represented by segments A'B'C' is a
repetition of the utterance recorded as ABC (where A' is the first n%,
e.g. the first third, of the length of A'B'C' and A is the first n% of

¹ In many cases this involves asking an informant "say it again" or
"what", or asking another informant who is present "Would you say 
that?". In some cultures and in some social situations there may be diffi- 
culties in obtaining repetitions. Where it is impossible, we must wait until 
the utterance recurs in the informants' speech; this may happen more 
frequently in certain situations, e.g. in the course of a conversation 
between informants or in a stylized recital. When what we obtain is not 
an admitted repetition, (and, sometimes, even when it is) we have to 
judge whether utterance B is indeed a repetition of utterance A, by
considering the situation, meaning, and sounds. The validity of our
judgment is checked in 4.5 and the Appendix to 4.21. This is equivalent to
Bloomfield's 'fundamental assumption of linguistics: we must assume
that in every speech-community some utterances are alike in form and 
meaning' (Language 78). 



the length of ABC, etc.),² then A' = A, B' = B, C' = C. If segments
are freely substitutable for each other they are descriptively equivalent,
in that everything we henceforth say about one of them will be equally
applicable to the others.³
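
The matching of corresponding segments by their share of the whole utterance, rather than by absolute length, can be illustrated with a small sketch. The durations in milliseconds are invented for illustration and the function name relative_spans is hypothetical; the sketch only shows why a slower repetition can still be cut at the same relative points.

```python
def relative_spans(durations):
    """Return (start, end) of each segment as a fraction of the whole utterance."""
    total = sum(durations)
    spans, elapsed = [], 0
    for d in durations:
        spans.append((elapsed / total, (elapsed + d) / total))
        elapsed += d
    return spans


# Hypothetical durations (ms) for segments A, B, C and for a slower repetition A', B', C'.
original = relative_spans([100, 200, 100])
slower = relative_spans([150, 300, 150])
```

In this invented case the two lists of relative spans coincide, so A' is paired with A, B' with B, and C' with C even though every absolute length differs.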

4.22. In Different Utterances

We substitute a repetition of segments of one utterance for equivalent
segments of another. As preparation for doing this, we may first note the
range of free variants in a segment of a repeated utterance, e.g. what we
may have recorded as [kʰ, kh], in repetitions of Can't do it. We then
choose another utterance whose repetitions show an apparently similar
range for one of its segments, e.g. what we may have recorded as [kʰ] in
Cameras cost too much. We now substitute the [kʰ] or [kh] segment of the
first utterance for the [kʰ] segment of the second. We do this by
pronouncing Can't do it with our best imitation of the [kʰ] we had heard in Cameras
and seeing if the informant will accept it as a repetition of his (or another
informant's) Can't do it.'^ Alternatively, we may wait to hear some in- 

^ This is necessary in case one repetition is much slower than another, 
so that only the relative and not the actual lengths of the segments are 

³ Any differences among the mutually substitutable segments are not
due to linguistic environment or relevance. It is therefore immaterial if
we recognize many or few differences among the equated segments. In
some cases, we may be unable to hear any difference among free variants,
as when the initial segments of two repetitions of Sorry, sound absolutely
identical to us. In other cases, we may notice the difference, as between
a very strongly and a less strongly aspirated [kʰ] in two pronunciations
of Can't do it., or between an [o]-like and an [u]-like segment in two
pronunciations of a foreign utterance.

It is in general easier to notice differences between freely varying 
segments in a foreign language than in one's native language, where one 
has become accustomed not to notice such linguistically irrelevant facts. 
On the other hand, one may easily fail to notice slight but relevant dif- 
ferences in a foreign language if these differences do not occur in one's 
native language, or if they occur there only between members of one 
phoneme and morphophoneme. 

However, if we used exact measurement (such as sound-wave records), 
or if we can hear each repetition many times over by machine duplication 
(as in magnetic recording), we would probably find that every segment
differed in some way from each of its equivalent segments. What
we hear as identical free variants are therefore merely an impressionistic 
special case of different free variants. 

⁴ This substitution is not as simple as the direct repetition of 4.21. If
we cannot quite tell whether the substituted form is accepted as a repeti- 
tion, we may check to see if the informant identifies the new pronuncia- 


formant pronouncing Cameras cost too much with a segment identical to 
our ears with the original [kʰ].⁵

More generally: We take an utterance whose segments are recorded 
as DEF. We now construct an utterance composed of the segments DA'F, 
where A' is a repetition of a segment A in an utterance which we had 
represented as ABC. If our informant accepts DA'F as a repetition of 
DEF, or if we hear an informant say DA'F in a situation which permits 
us to judge that utterance as equivalent to DEF, and if we are similarly 
able to obtain E'BC (E' being a repetition of E) as equivalent to ABC,
then we say that A and E (and A' and E') are mutually substitutable
(or equivalent), as free variants of each other, and write A = E.⁶ If we
fail in these tests, we say that A is different from E and not substitutable 
for it. 
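
The two-way test just stated can be put as a minimal sketch. The informant is represented here by a hypothetical function accepts(candidate, original), and the segment labels and the toy judgments are invented for illustration; in field work the judgment is of course the native speaker's own response, not a lookup table.

```python
from typing import Callable, Tuple

Utterance = Tuple[str, ...]


def substitute(utterance: Utterance, index: int, segment: str) -> Utterance:
    """Return a copy of the utterance with the segment at `index` replaced."""
    return utterance[:index] + (segment,) + utterance[index + 1:]


def mutually_substitutable(
    utt_a: Utterance, i: int,
    utt_e: Utterance, j: int,
    accepts: Callable[[Utterance, Utterance], bool],
) -> bool:
    """True only if utt_a[i] and utt_e[j] replace each other in both directions."""
    a, e = utt_a[i], utt_e[j]
    a_into_e = accepts(substitute(utt_e, j, a), utt_e)  # DA'F offered as DEF
    e_into_a = accepts(substitute(utt_a, i, e), utt_a)  # E'BC offered as ABC
    return a_into_e and e_into_a


# Toy informant (assumption): a fixed table of accepted repetitions.
accepted = {
    (("D", "A", "F"), ("D", "E", "F")),
    (("E", "B", "C"), ("A", "B", "C")),
}


def toy_informant(candidate: Utterance, original: Utterance) -> bool:
    return candidate == original or (candidate, original) in accepted


result = mutually_substitutable(("A", "B", "C"), 0, ("D", "E", "F"), 1, toy_informant)
```

Requiring both directions keeps the sketch in line with the statement above: if only one of the two substitutions is accepted, the segments are not written as equal without qualification.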

The test of segment substitutability is the action of the native speaker:
his use of it, or his acceptance of our use of it.'' In order to avoid misunder- 

tion by the same rough translation which had identified the original ut- 
terance. This still avoids struggling with exact meanings. 

Behind our ability to substitute parts of one utterance for parts of
another lies the empirical fact that in every language the speakers recog- 
nize not an indefinitely large number of distinct, unsubstitutable sounds 
(so that every new utterance may contain a new distinct sound), but a 
relatively small stock of distinct classes of sound. This stock is in general 
closed. I.e. when a sound occurs in speech we can in general assign it to 
one of relatively few classes of sounds used in describing the language;
or we may say that the speaker and hearer react to it as to a member of
one of these few classes of sounds in the language — or else as to a sound 
from outside their language. The classes of sounds recognized in the 
language are thus limited in number. 

⁵ If it were possible to work with sound tracks, we would record
Can't do it with [kh] and Cameras cost too much with [kʰ]. We would then
snip the [k]-segments out (leaving the smoothest break possible) and
interchange them, and play the film back to our informants to see if they
will accept the new Cameras cost too much with [kh] as a repetition of their
original Cameras cost too much which had [kʰ]. Distortion would occur, of
course, at the points of snipping, but that should not preclude the
acceptance of the repetition.

⁶ If we obtain DA'F as repetition of DEF, while E'BC is not acceptable
as a repetition of ABC, we can say that A = E in the environment 
D — F but not in the environment — B or — BC. 

⁷ The use of instruments which permit exact measurement, as the ear
does not, may enable us to employ tests of measured similarities (of
sound waves or body motions) instead. But only those tests will be
linguistically relevant which will accord with (even if they are not based
on) the speakers' actions. This ultimate correlation is the only one 
which has so far been found to yield a simple language structure. 


standings or false informant responses, it is sometimes necessary to repeat 
the test under various conditions and to obtain statistical reliability for 
the response.⁸

4.23. Paired Utterances

A more exact test is possible when we wish to find out if two utterances 
are repetitions of each other, i.e. equivalent in all their segments (homo- 
nyms): e.g. She's just fainting as against She's just feigning. We ask two 
informants to say these to each other several times, telling one informant 
which to say (identifying it by some translation or otherwise) and seeing 
if the other can guess which he said. If the hearer guesses right about 
fifty percent of the time then there is no regular descriptive difference 

⁸ At this point the question could be raised whether the procedure of
4.22 does not include and render superfluous the procedure of 4.21. For 
both procedures show that particular segments are equivalent to each 
other, and the range of differences among the mutually equivalent seg- 
ments is identical in both sections. The only advance made in 4.22 is 
our ability to spot the mutually equivalent segments in any utterance in 
which they occur, whereas 4.21 permitted this only in repetitions.
However, it is precisely this advance that requires the preliminary procedure
of 4.21. For 4.22 finds that different utterances are similar in some fea- 
tures of parts of their duration. But since each segment of each utterance 
is a unique event, presumably different in some way or other from every 
other unique segment, how are we to decide which features should be 
subjected to the test of substitutability? If we take the [kʰ] of a
particular occurrence of Can't do it. we may find that it is similar to the
unique [kʰ] of cameras in general character (in articulation: voicelessness,
aspiration), but somewhat different from it in loudness, while it may be
similar to the [g] of I'll gather some. in loudness but somewhat different
in general character. It is true that in this case we would unhesitatingly 
guess that the initial segments for can't and cameras will prove equiva- 
lent, rather than those of can't and gather. However, in working with 
languages foreign to us, we may be hard put to decide what substitutions 
are worth attempting. The point is that these unique segments are sub- 
stitutable for each other because they are identical in some respects (e.g. 
voicing, in English) without regard to any differences they may have in 
other respects (e.g. absolute differences in loudness, in English). If we 
take 4.22 without a preceding 4.21, we would be unable to supply an 
orderly method of treating the data, such as would tell us what respects 
to disregard, what substitutions are worth attempting. Instead of that, 
we use the procedure of 4.21, which offers a simple program for discover- 
ing what respects to disregard. The range of differences among the unique 
but mutually substitutable segments is identical both in 4.22 and in 4.21, 
but in 4.21 we assume in advance that certain unique segments are equiv- 
alent and all we need do is note their differences in order to disregard 
them. Equipped with this information, we can then seek in 4.22 for seg- 
ments whose equivalence we do not know in advance but whose differ- 
ences are similar to those which we have decided to disregard. 


between the utterances; if he guesses right near one hundred percent, 
there is. 
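
The reading of the hearer's guesses can be made a little more explicit. The text asks only whether the guesses are right about half the time or nearly always; the sketch below swaps in a one-sided binomial calculation as one possible way of drawing that line. The function names and the threshold alpha = 0.01 are assumptions of this illustration, not part of the procedure.

```python
from math import comb


def chance_probability(correct: int, trials: int) -> float:
    """Probability of at least `correct` right guesses in `trials` by pure guessing."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


def interpret_pair_test(correct: int, trials: int, alpha: float = 0.01) -> str:
    """Call the pair distinct only if so many right guesses are unlikely by chance."""
    if chance_probability(correct, trials) < alpha:
        return "distinct"
    return "no regular difference found"


# Hypothetical scores from two sessions of twenty trials each.
near_chance = interpret_pair_test(11, 20)
near_perfect = interpret_pair_test(20, 20)
```

With these invented scores the first session is read as showing no regular difference and the second as showing a distinction. Scores between the two extremes call for more trials rather than for a verdict.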

4.3. Result: Equivalent and Non-equivalent Segments 

We can now tell what segments in any utterances are descriptively 
equivalent to each other. Whatever symbol we use for a particular seg- 
ment we will now use for every segment equivalent to it. When we get a 
class of segments which are free variants of each other, we use one symbol 
which indicates any member of that class: we write just [kʰ] both in
can't and in cameras. Any differences we may have noticed among the
equivalent segments are henceforth disregarded.⁹ This reduces considerably
the number of different symbols (or of differentiated segments) in
our record of utterances. 
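
The reduction of symbols described here can be sketched as the grouping of recorded segments into classes of free variants. The sketch assumes, for simplicity, that equivalence carries over from pair to pair without regard to environment, which the footnotes show is not always the case; the segment labels are hypothetical tokens, not data.

```python
def group_free_variants(segments, equivalences):
    """Group segment tokens into classes, given pairs found mutually substitutable."""
    parent = {s: s for s in segments}

    def find(s):
        # Follow parent links to the class representative, shortening the path.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in equivalences:
        parent[find(a)] = find(b)

    classes = {}
    for s in segments:
        classes.setdefault(find(s), []).append(s)
    return list(classes.values())


# Hypothetical tokens: three recorded k-like segments and one g-like segment.
tokens = ["k1", "k2", "k3", "g1"]
classes = group_free_variants(tokens, [("k1", "k2"), ("k2", "k3")])
```

Each resulting class would then be written with a single symbol, so that the three k-like tokens share one symbol and the g-like token keeps its own.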

The comparisons of utterances in 4.22 and 4.23 not only enable us to 
say that certain segments are descriptively equivalent, but also enable 
us to say that certain segments (which have proved to be not mutually 
substitutable) are descriptively non-equivalent or distinct (i.e. unsub- 
stitutable). For the further analysis of the language, the explicit record 
of descriptive (or, as they are called, phonemic) distinctions is as impor- 
tant as that of descriptive equivalences. If we have a body of text in a 
language, and do not know which segments in it are equivalent to each 
other (e.g. whether a g in one line of the text is substitutable for a k in
another), we can do little in the way of further analysis. If we do not 
know which segments are distinct from each other, e.g. whether a word 
gam in one line is distinguishable from a gam, or kam, in another, we still 
can do little analysis. When these two sets of data are explicitly given, 
however, it is possible to carry out the rest of the analysis. The funda- 
mental data of descriptive linguistics are therefore the distinctions and 
equivalences among utterances and parts of utterances. The operations 
of 5-11, including the setting up of phonemes in chapter 7, are manipu- 
lations of these distinctions on the basis of distribution. 

4.31. Distinct Utterances and Distinct Elements 

The fundamental purpose of descriptive linguistics would be served if 
we merely listed which utterances were distinct from which others: e.g. if 

⁹ The difference among equivalent segments here is no greater than
that which may be noticed among the analogous segments of repetitions
of an utterance such as had been equated in 4.21. The procedure of 4.22
reduces to that of 4.21; for when we substitute the [kʰ] of cameras in
can't we get a new repetition of can't containing the new [kʰ] heard in
cameras, and by 4.21 that new [kʰ] thereupon becomes equivalent to the
other [kʰ] segments heard in previous repetitions of can't.


we said that tack, pack, tip, dig, It's lacking, It's lagging, were each dis- 
tinct from each other. However, in order to operate with this informa- 
tion, it is necessary to put the data in the form of elements, to localize 
the distinctions between tack and tip in particular segments. Equivalent 
utterances are then defined as being equivalent in all their segments; 
distinct utterances are non-equivalent in at least one of their segments.
If we wish to work out a system of distinct elements for many utterances 
(e.g. for tap as well as for tack and tip) we will have to recognize more 
than one distinction between certain non-equivalent utterances. Thus 
tack will be distinct from tap in its last segment, and tip will be distinct 
from tap in its middle segment, and instead of saying that tack is distinct 
from tip in just one segment we will use our two previous distinctions
and say that tack is distinct from tip in its middle and last segments.¹⁰
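
The localizing of distinctions in particular segments can be shown with the three words just discussed. The sketch assumes the utterances have already been cut into the same number of segments, and the symbols used for the segments are simplified stand-ins for the recorded ones.

```python
def distinct_positions(first, second):
    """Return the positions at which two equally segmented utterances differ."""
    if len(first) != len(second):
        raise ValueError("utterances must have the same number of segments")
    return [i for i, (a, b) in enumerate(zip(first, second)) if a != b]


tack = ("t", "æ", "k")
tap = ("t", "æ", "p")
tip = ("t", "i", "p")

tack_tap = distinct_positions(tack, tap)  # last segment only
tip_tap = distinct_positions(tip, tap)    # middle segment only
tack_tip = distinct_positions(tack, tip)  # middle and last segments
```

Once tap is taken into account, the difference between tack and tip is stated as two distinctions rather than one, as in the paragraph above.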

It may be noted that the representation of speech as a sequence or ar- 
rangement of unit elements is intimately connected with the setting up 
of phonemic distinctions between each pair of non-equivalent utterances. 
If each utterance were considered by itself, it might be represented as a 
continuum or as a simultaneity of features which change with time; and 
the segmenting operation of chapter 3 might not come into consideration 
at all. However, if we match utterances, we obtain some individual dif- 
ference between the members of each particular pair of utterances; that 
is, we obtain discrete elements each of which represents some particular 
inter-utterance difference. By the method of chapter 5, fn. 3, these dif- 
ferences may be expressed as combinations of a few basic differences: the 
difference between some utterances is exactly one basic difference (e.g. 
tack-tap); the difference between others is some particular sum of
particular basic differences (e.g. tack-tip). We thus obtain discrete elements
which can be combined together. These elements are phonemic distinc- 
tions, rather than phonemes; i.e. they are the difference between /k/ 
and /p/ (more exactly, between tack and tap, between sack and sap, etc.) 

¹⁰ See chapter 5, fn. 3. The equivalent segments are phonemically not
distinguished from each other, since substitutable segments will be con- 
sidered in chapter 7 to be free variants of each other within the same 
phoneme. If we find a group of equivalent segments (in a particular en- 
vironment) which is not substitutable for another group of equivalent 
segments in the same environment, we say that the two groups are pho- 
nemically distinct from each other. In chapter 7 it will not be possible to 
include in the same phoneme two segments (or two groups of segments) 
which are phonemically distinct from each other. This establishment of 
the basis for phonemic distinction is the major contribution of the 
present procedure toward the setting up of the phonemes of a language. 


rather than being /k/ and /p/ themselves. However, for convenience, 
we will set up as our elements not the distinctions, but classes of
segments so defined that the classes differ from each other by all the
phonemic distinctions and by these only. These elements are obtained by
summing over all distinctions: [k]-[p], [k]-[l] (pack-pal, sick-sill), [k]-[s]
(pack-pass), etc.; [l]-[t] (sill-sit), [l]-[s] (pal-pass), etc. In this way
we define /k/ to represent all the paired distinctions in which [k] was
a member, /l/ to represent all the distinctions in which [l] was a member,
and so on. The classes, or phonemes, are thus a derived (but one-one)
representation for the phonemic distinctions. The segmentation of chap- 
ter 3 was carried out in order to permit the representation of continuously 
varying speech to express the discrete elemental phonemic differences. A 
phonemically written form therefore is not a direct record of some spoken 
form, but rather a record of its difference from all other spoken forms of 
the language. 

4.4. Length of Segments 

So far, we have left the length of segments arbitrary (3.2). The
utterance I'll tack it could be divided into seven segments a, l, tʰ, æ, k, i, t.
Later we will find that this division, into segments whose length we will
call phonemic, is convenient for linguistic analysis. However, to start
with we might just as easily have divided it into four segments, say
A (= al), T (= tʰæ), k, I (= it).

It can now be shown that the procedure of 4.22 will prevent any seg- 
ment from being longer than one phoneme length. This breaking down of 
longer segments results automatically from the repeated use of substitu- 
tion if we make one condition: that we will carry out the test of 4.22 not 
only between a previously derived segment and some new segment which 
seems similar to it, but also between any part of a new segment which 
seems similar to some previous segment or part of segment. 

Suppose we segmented I'll tack it as ATkI, I'll pack it as AP (= pʰæ)
kI, I'll tip it as AQ (= tʰi) pI, and I'll dig it as AD (= di) gI. Then as
soon as we investigate the substitutability of parts of our segments, we
would find that the first part of T was substitutable for the first part of Q,
the last part of T for the last part of P, and the last part of Q for the last
part of D. We would thus isolate tʰ, æ, and i; and this would force us to
isolate pʰ (as the remainder of P) and d (as the remainder of D).
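
The breaking down of the long segments T, P, Q, and D can be followed in a small sketch. It assumes that each long segment has been provisionally heard as a first and a last part, and it lets identity of label stand in for a successful substitution test; the segment X, with its parts x1 and x2, is a hypothetical case of a long segment whose parts match nothing else.

```python
from collections import Counter


def split_by_shared_parts(long_segments):
    """Isolate the parts of every long segment that shares a first or last part."""
    firsts = Counter(first for first, _ in long_segments.values())
    lasts = Counter(last for _, last in long_segments.values())
    isolated, unsplit = set(), []
    for name, (first, last) in long_segments.items():
        if firsts[first] > 1 or lasts[last] > 1:
            # A shared part is cut off, which isolates the remainder as well.
            isolated.update((first, last))
        else:
            unsplit.append(name)
    return isolated, unsplit


segments = {
    "T": ("tʰ", "æ"),
    "P": ("pʰ", "æ"),
    "Q": ("tʰ", "i"),
    "D": ("d", "i"),
    "X": ("x1", "x2"),  # hypothetical: neither part recurs elsewhere
}
isolated, unsplit = split_by_shared_parts(segments)
```

In this toy inventory the parts tʰ, æ, i, pʰ, and d are isolated, while X stays whole because neither of its parts is substitutable for anything else.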

The only segments longer than one phoneme which would remain
would be those whose parts are not substitutable for any other segment:
e.g. a linguist might find no reason for dividing English [č] (as in That's
his chair) because its first (stop) part differs from English [t], and its
last (continuant) from [š]. However, all these remaining long segments
will be broken down in the procedure of chapter 9 below.

4.5. Correcting Possible Errors 

In obtaining repetitions of an utterance we may have equated
utterances and segments which would later prove to be phonemically
different. The substitution test will bring these out. If a foreign linguist took
English men as a repetition of man, he would discover his error as soon 
as he tried to substitute these presumed free variants in ten or plan. If 
we had thought Moroccan Arabic y't'iuh 'they give him' a repetition
of y't'iu 'they give', and had equated uh with u, we could still succeed in
substituting uh for u in byau 'they want' (obtaining unawares byauh
'they want it'), but would fail in fau 'god', where we could not get a form 
lauh. In all such cases we go back and correct our record of what we 
thought had been repetitions.¹¹

Appendix to 4.1: The Reason for Equating Segments 

The procedure of chapter 3 represents each uniquely heard whole ut- 
terance by a sequence of unique segmental elements. If we are to be able to 
compare various utterances and to make general statements about them, 
the mere representation by segments will not suffice, if the segments re- 
main unique. We must therefore find ways of comparing the segments, so 
that we should be able to say that segment A is equivalent to, or
different from, segment B.¹²

In order to do this, we first find out how to compare the segments in 
two occurrences of a repeated utterance. We assume the repetitions to be 
descriptively equivalent to the first pronunciation. Therefore, if a per- 

" In some cases we may have to make this correction even though 
we can find no mistake in our original work. E.g. Moroccan Arabic
[bgər'] and [bqər'] 'cow' occur as repetitions of each other, yielding
[g] = [q] according to 4.21. However, we now find that [g] and [q] are
not mutually substitutable in [gr'a'] 'squash' and [qr'a'] 'ringworm'. If
upon checking back we find that the first two are actually simple
repetitions of each other, then we have [g] = [q] in some utterances and
[g] ≠ [q] in others, a crux which will remain unresolved until chapter 7, fn. 14.

¹² Then we would be able to say whether utterance X, represented by
segments A, D, E, is descriptively equivalent to utterance Y, represented
by segments B, C, F. We will say that X = Y if A = B, D = C, E = F:
one occurrence of Yes. is equivalent to a second occurrence of Yes. if the 
y of the first is equated to the y of the second, etc. The utterances may 
have differed, of course, in many other respects (e.g. energy of speech), 
but we are equating only the segmental representation. 


son says Can't do it. twice' ^ we disregard any small differences that we 
may notice between the two pronunciations. Since these two utterances 
are frequently not distinguishable one from the other by native hearers,¹⁴
we say that as far as our present linguistic analysis goes, the repetition 
is equivalent to the original utterance. In order to be able to show this, 
we say that the various segments of the repetition are each equivalent 
to the corresponding segments of the original utterance. In order to do 
this, in turn, we will agree to disregard any differences between the corre- 
sponding segments of the two pronunciations. However, if we are to treat 
utterances in general, we must be able to compare even utterances which 
are not repetitions of each other. In two utterances which do not repeat 
each other (say, Why? and Did you try?), we must be able to say whether 
segments A and B (say, the final [ay] of each utterance) are equivalent, 
and whether segments C and D (say, the initial [w] and [d] of each) are
different. Given utterances X and Y, the procedure of chapter 4 enables 
us to tell wherein (i.e. in which of their segments) X is equivalent to Y 
and wherein they differ. 

Appendix to 4.21: On the Equivalence of Repetitions 

It must be borne in mind that when we ask for repetitions we may get 
a totally different utterance (e.g. auto for car), or a partially different one 
(rocking-chair for rocker). The different utterance may be sufficiently
similar in sound to mislead us. If the putative repetitions come in dif- 
ferent environments (e.g. knife, but knive before -s), the present pro- 
cedure does not enable us to equate the corresponding segments (we can- 
not say [f] = [v]). If we fail to notice a difference of environment, as 
may happen in Moroccan Arabic y't'iuh 'they give him' for y't'iu 'they 
give' (where even in conversational situations we may fail to recognize 
the presence of h 'him'), or in the differently pitched Auto and Auto?, 
we may falsely conclude that uh = u, or that the segments with falling 
intonation are equivalent to corresponding segments with rising intona- 
tion. There is no great loss in such false conclusions because they will 
necessarily fail to pass the test of 4.5. 

The procedure of 4.21 enables us to equate two utterances (and their 
component segments) as repetitions of each other without knowing what 

¹³ With no recognizable difference in intonation: i.e. the second time
is not Can't do it! or the like. 

¹⁴ Except that one occurred before the other; or if the repetition was
made by another person, the hearer could identify the individual or, say,
his social group (e.g. age or sex) by the difference in some features which
we are not selecting to measure at present.


morphemes they are composed of or what the boundaries of the component
morphemes are. It avoids any reliance upon the meaning of the
morphemes, or any need to state, at this early stage, exactly what the 
morphemes or utterances mean. It precludes our asking the informant 
if two morphemes are 'the same'. All that is required is that we have an 
explicit repetition of an utterance, or an utterance which we tentatively
consider to contain the same morphemes (whatever they may be) as 
another utterance contained. If two utterances which we consider to be 
repetitions of each other are actually different in morphemic content, 
the error will necessarily be brought out in the following procedures 
of 4.5. 

Appendix to 4.22: Matching in Frames 

Where simple substitution is impracticable, any other method of 
matching two segments will serve the same purpose. Such matching is 
easiest when the two segments can be tried in a single (repeated) frame. 
E.g. if we know that Fanti denkem 'crocodile' and poon 'pound' each 
have a relatively higher pitch on the first vowel than on the second, but 
we want to know if the absolute pitch of denkem is higher than that of 
poon, we get a frame through which we can pass both of them. We may
get the informant to say each of them before anan 'four' and then we
see that in denkem anan 'four crocodiles' the last tone of denkem is higher
than the first of anan whereas in poon anan 'four pounds' the last tone
of poon is of the same height as the first of anan. Hence the second tone
of denkem is relatively higher than that of poon and we have at least three
phonemically different tone levels: high, mid, and low. In doing this, we
must make sure that the tone of poon or denkem, when these words occur
before anan, does not differ phonemically from their tone when they
occur by themselves; and that the tone of anan after poon does not differ
from the tone of anan after denkem.¹⁵

Appendix to 4.23: Interpretation of the Paired Utterance Test 

If the test of paired utterances shows them to be not linguistically 
equivalent, we still do not know exactly what the differences between
the two utterances are. We cannot assume that the two utterances are
pairs,¹⁶ i.e. will later turn out to be phonemically different in only one
segment: If we compare Marx sat with Mark's sad we could find a regu- 
lar difference which will turn out to be phonemic in only one segment [t] — 

¹⁵ Data from W. E. Welmers, A descriptive grammar of Fanti,
Language Dissertation 39 (1946).


[d] (although a difference also occurs between [æ·] and [æ]). But if we
compare He sat with He said the differences will later appear to be
phonemic in two segments: [æ]-[ɛ·] and [t]-[d]. Even if the utterances will
later be shown to constitute a pair, their one phonemic difference is not 
always in the segments which most obviously differ phonetically: in The 
writer passed by as against The rider passed by the clearest difference is in 
the vowel, with little (or no) difference in the middle consonant; but we 
should note both differences because later we may wish to pin the pho- 
nemic difference on the consonant. Knowing that at least one segment of 
the first utterance differs from the corresponding one of the second (since 
the two utterances are distinguishable by natives), we must perform the 
test of 4.22 and note all the pairs of segments which are not mutually sub- 

Appendix to 4.3: Intermittently Present Distinctions 

In most cases, if we find that one utterance is not equivalent to an- 
other, the distinction to which this non-equivalence is due remains no 
matter how often we have each utterance repeated. This is a necessary 
condition for the operations of chapter 4. Not infrequently, however, we 
meet an utterance which is pronounced with different, non-equivalent, 
segments in different repetitions. 

In some of these cases, two segments appear equivalent in repetitions 
of one utterance but not in another utterance: [e] and [iy] seem equiva- 
lent if we get [ekanamiks] and [iykanamiks] economics as repetitions of 
each other, but not in repetitions of even, ever, elemental. Because of the 
latter, we consider [e] and [iy] to be distinct segments. The relation be- 
tween them in forms like economics will be treated in 13.2. 

In other cases, while the alternation of segments occurs freely in 
some utterances and not at all in others, it never constitutes the sole 
distinction between two distinct utterances (in the manner of 4.23 and 
4.31). Thus in repetitions of The seat is loose, we may obtain both 
[ðə 'siytiz'luws.] and [ðə 'siyr'iz'luws.] (where [r'] indicates a single 
alveolar flap of the tongue). In repetitions of meter ['miyr'ər] we hardly 
ever get [t] instead of [r']. Thus a variation of segments which occurs 
apparently freely in one utterance hardly ever occurs in another. Similarly, 
in repetitions of Take one. we will obtain both ['t'eykˌwən] and 
['t'eyk#ˌwən], while in repetitions of inquest we will only obtain ['inˌkwɛst] 
without any occurrences of a break (#) between the [k] and [w].'^ Many 

'* This pause which occurs only in some, not all, repetitions of an ut- 
terance is called facultative pause in Bernard Bloch, Studies in Colloquial 
Japanese II. Syntax, Language 22.201 (1946). 


variations of this type will appear, in chapters 8 and 12, to be related to 
junctures and morpheme boundaries. There we can treat them by saying 
that the two utterances (e.g. Take one and inquest) differ in that one of 
them has an intermittently present segment which the other lacks. In- 
termittently present segments or variations are then such as occur in 
some but not all repetitions of an utterance. If we take two utterances 
which are distinguished from each other only by the presence of an in- 
termittent segment in one which is lacking in the other, they will be 
equivalent to each other in some of their repetitions and not in others.'^ 
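
The definition of an intermittently present segment can be made concrete with a small tally over recorded repetitions. The following is only a minimal sketch: the segment lists are invented for illustration, the spellings of the segments are simplified, and the function names `presence_rate` and `is_intermittent` are hypothetical helpers, not part of the procedure as stated.

```python
def presence_rate(repetitions, segment):
    """Proportion of repetitions in which `segment` occurs at least once."""
    hits = sum(1 for rep in repetitions if segment in rep)
    return hits / len(repetitions)


def is_intermittent(repetitions, segment):
    """True if `segment` occurs in some but not all repetitions."""
    rate = presence_rate(repetitions, segment)
    return 0.0 < rate < 1.0


# Invented records: four repetitions of each utterance, written as
# lists of segments, with "#" standing for the break.
take_one = [
    ["t", "ey", "k", "w", "a", "n"],
    ["t", "ey", "k", "#", "w", "a", "n"],
    ["t", "ey", "k", "w", "a", "n"],
    ["t", "ey", "k", "#", "w", "a", "n"],
]
inquest = [["i", "n", "k", "w", "e", "s", "t"]] * 4

print(presence_rate(take_one, "#"))     # share of repetitions with the break
print(is_intermittent(take_one, "#"))   # the break comes and goes
print(is_intermittent(inquest, "#"))    # the break never appears
```

A single repetition can never show intermittency under this tally, since the rate is then either 0 or 1; the judgment needs a sufficiently large number of repetitions.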

Appendix to 4.5: Continued Testing of New Utterances 

Since what we test is the substitutability of segments, we cannot tell 
whether in a new utterance, e.g. Cash it! the segment we record as [kʰ] 
is equivalent to our previous [kʰ] segments until we have substituted one 
of these for it. It is true that after several attempts, we get to recognizing 
the differences among the various segments, so that even without making 
the test we can be quite sure that, say, the [kh] of a newly recorded 

'^ The probability of [k] and [w] occurring in Take one and in inquest 
is 1. The probability of # occurring between these two in inquest is in 
effect 0. The probability of # occurring between these two in any par- 
ticular pronunciation of Take one is larger than zero and smaller than 1 : 
the # occurs only intermittently in repetitions of Take one. It could be 
objected that we are here changing the definition of repetition and 
equivalence, that in terms of 4.22 we should consider [kw] and [k#w] as 
non-equivalent (as we do here) and that therefore we should say that 
['teyk#ˌwən] is not a repetition of ['teykˌwən]. This latter could indeed 
be done. But the conditions of obtaining the data (the fact that in- 
formants will regularly give the two pronunciations as repetitions of each 
other, and that many utterances will have this feature), and the con- 
venience of later morphological analysis, make it preferable in such 
cases to preserve the repetitive relation between the two pronunciations 
by defining the intermittent segment. Then if we wish we can write in- 
termittent segments in parentheses, irrespectively of whether they occur 
in a particular pronunciation, and say that Take one is ['teyk(#)ˌwən], 
meaning that it is sometimes pronounced with the # and sometimes 
without. Segmental representations which are not designed to indicate 
intermittent segments can be based on a single occurrence of the utter- 
ance, and can be tested by a single occurrence of it (which must show 
exactly the segments of the representation). However, segmental repre- 
sentations which are designed to indicate intermittent segments can be 
based only on a number of repetitions of the utterance (since a single oc- 
currence of it would either contain or not contain the segment in ques- 
tion, and intermittency could not be noted), and can be tested only by a 
sufficiently large number of repetitions of the utterance. Only after such 
a number of repetitions can we say that an utterance has or does not 
have intermittently present segments. 


Can you? would be substitutable for the segments we have marked [kʰ]. 
But there is at present no way of measuring the difference between the 
various groups of mutually equivalent segments, no way of measuring 
the range of free variation within each group. For example, we can say 
that various labial nasal segments (e.g. the initial ones of Must I? 
Missed it?) are free variants of [m] and various alveolar nasal segments 
are free variants of [n]. But we cannot say that all alveolar nasal segments 
we ever meet in the language in question will be free variants of [n]. In 
fainting we get a nasal alveolar flap. A non-native linguist might take it 
for granted that this segment is an additional free variant of the [n], 
substitutable for it. Only if he runs into a pair like fainting-feigning and 
tests them in the manner of 4.23 will he find that these two are not mutu- 
ally substitutable.'* 

Until we know the language very well, therefore, we must be ready to 
apply the substitution or pair tests whenever circumstances suggest that 
a segment in a new utterance might not vary freely with the other seg- 
ments with which we wish to equate it. 

'* If the segments to be tested are so similar that the linguist cannot 
be sure that he can distinguish them, or that he can pronounce them dif- 
ferently for the informant, he cannot be sure of the results of the test 
in 4.22. The best test is that of 4.23, for which the linguist must try to 
find pairs which differ if at all only in the segments under suspicion. 


5.0. Introductory 

This procedure (and that of 4.4) sees to it that the length of the seg- 
mental elements should be that of phonemes. It provides that the seg- 
ments should be neither longer nor shorter than is necessary to differen- 
tiate phonemically distinct utterances, so that minimally different utter- 
ances will differ in only one of their segments. 

5.1. Purpose: Descriptively Equal Segment Lengths 

We now seek to obtain a linguistically (not physically) fixed length 
for segments. When stretches of speech were first segmented, the points 
of segmentation were left arbitrary (3.2). Later, an upper limit to the 
length of segments was automatically obtained as a result of extending 
our procedure of substitution to any sub-divisions of our original seg- 
ments (4.4). Our procedures have hitherto fixed no lower limit on length: 
e.g. in the stretch of speech Patsy, the section before the [æ] could be 
considered to constitute one segment [pʰ], two segments [p] and [h], 
three segments ([lip closure], [lip opening], [h]), and so on.¹ 

To obtain a uniform way of determining the number of segments for 
any given utterance, all that is now necessary is a method for fixing a 
lower limit to segment length. 

5.2. Procedure: Joining Dependent Segments 

We join into one segment any succession (even if discontinuous) of 
segments which always occur together in a particular environment. Sup- 
pose we have originally divided Tip it!, Pick it!, Stick it! into segments 
[thip it], [phik it], [stik it]; the [t] of [thip], [stik] are mutually sub- 
stitutable, as are the [h] of [thip], [phik]. We now find that utterance- 
initial [t], [p] never occur directly before a vowel : i.e. between silence and 
a sound such as [i] there is only the sequence of segments [th], [ph], never 

¹ The procedure of this section is designed to yield automatically a 
fixed number of segments to represent any given stretch of speech. It 
should enable us to decide whether a given stretch of the utterance (ex- 
cept for boundary regions of segments), is part of the segment preceding 
or following it, or constitutes a segment by itself. This is desirable in order 
to simplify the manipulation of the segments, which is described in 
chapters 6 and 7. 



[t], [p] alone. We then say that the sequence of our original segments 
[th] between silence and vowel, is to be henceforth considered as one seg- 
ment (which we may write [t']). 

More generally, if in a given portion of many utterances (a given 
environment) segment A never occurs without segment B, we consider 
A + B to constitute together one segment. For this procedure it is not 
required that B should also never occur without A: B may occur inde- 
pendently of A, as [h] does in hill. It is also not required that A be always 
attended by B in any environment: after [s], [t] occurs without [h], but 
our joining of [t] and [h] was limited to the position between silence and 
vowel. 

If we had originally taken lip-and-nose closure and lip opening as 
separate segments, we would find that in most positions they occur al- 
ways together: in those positions, e.g. in pin, happy, the two would then 
constitute one segment.² 
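
The joining test of this section can be sketched mechanically. This is an illustration under simplifying assumptions, not the procedure itself: the environment is reduced to the immediately preceding element, "#" stands for silence, the small corpus is invented, and the names `always_followed_by` and `join` are hypothetical.

```python
def always_followed_by(utterances, a, b, prev):
    """True if `a`, whenever it stands directly after `prev`, is directly
    followed by `b` (and `a` occurs in that environment at least once)."""
    seen = False
    for utt in utterances:
        for i in range(1, len(utt)):
            if utt[i] == a and utt[i - 1] == prev:
                seen = True
                if i + 1 >= len(utt) or utt[i + 1] != b:
                    return False
    return seen


def join(utt, a, b, prev, joined):
    """Rewrite `a` + `b` after `prev` as the single segment `joined`."""
    out, i = [], 0
    while i < len(utt):
        if (utt[i] == a and i > 0 and utt[i - 1] == prev
                and i + 1 < len(utt) and utt[i + 1] == b):
            out.append(joined)
            i += 2
        else:
            out.append(utt[i])
            i += 1
    return out


corpus = [
    ["#", "t", "h", "i", "p", "i", "t"],   # Tip it!
    ["#", "p", "h", "i", "k", "i", "t"],   # Pick it!
    ["#", "s", "t", "i", "k", "i", "t"],   # Stick it!
    ["#", "h", "i", "l"],                  # hill: [h] without [t]
]

if always_followed_by(corpus, "t", "h", "#"):
    corpus = [join(u, "t", "h", "#", "t'") for u in corpus]
print(corpus[0])
```

Note that the test is one-directional, as in the text: [h] occurring without [t] (hill) and [t] occurring without [h] after [s] (Stick it!) do not block the joining after silence.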

5.3. Result: Utterances Divided into Unit Lengths 

We now have a determinate way of placing the segment dividers in 
every utterance. Each stretch of speech now has some fixed number of 
segments, or, we may say, a fixed number of unit lengths each of which 
is occupied by some segment. All our segments now have this in common: 
there are as many segments in each utterance throughout the language 
as will enable us to distinguish each utterance from each other utterance 
which is not a repetition of it, and no more.³ 

In effect, this means that the distinct phonemic composition of each 
utterance is defined as the sum of its minimal differences from all other 
utterances of the language. For a pair such as pick-pit we find no smaller 
minimal differences than the difference between these two members of 
the pair. The difference in the pick-pat pair can be stated as the sum of 

² We would also find a few positions in which lip-and-nose closure oc- 
curred alone, e.g. in unreleased final [p] (map), or before [m] (shipmate) 
where the lip opening occurred only after a lip-closed nose-open segment. 
These would be found in chapter 7 to be positional variants of the com- 
bined lip-and-nose closing plus lip opening segment (i.e. the [p] of the 
other positions). 

³ The first part of this sentence results from 4.4; and 'no more' from 
5.2. Thus stark will have 5 segments, sark 4 segments, arc 3 segments, 
are 2. The first of these may seem to have more segments than the pre- 
vious sentence requires, since there is no tark; so that if stark merely had 
a different S than sark, it would suffice to distinguish the two utterances. 
However, the substitutions of 4.4 would break the S (= [st]) of stark 
into a sequence of the [s] of I'm sorry and the [t] of tar. 


the pick-pit and pit-pat differences. The difference in sick-land can be 
stated as the sum of the differences in, say, sick-sack, sack-lack, rack-ran, 
fine-find, each of these having been shown previously to be a minimal 
difference. We identify sick by its difference, stated in this manner, from 
all other utterances of the language. 
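
The statement that a larger difference is the sum of minimal differences can be checked on the pick-pit-pat example. A minimal sketch, assuming each utterance has already been cut into unit-length segments and that the compared utterances have the same number of segments; the segment spellings and the helper names `differences` and `is_minimal_pair` are illustrative only.

```python
def differences(a, b):
    """List the positions at which two equal-length segment
    sequences differ, as (position, segment_in_a, segment_in_b)."""
    if len(a) != len(b):
        raise ValueError("sketch handles equal-length utterances only")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


def is_minimal_pair(a, b):
    """True if the two utterances differ in exactly one segment."""
    return len(differences(a, b)) == 1


pick = ["p", "i", "k"]
pit = ["p", "i", "t"]
pat = ["p", "ae", "t"]

print(differences(pick, pit))   # one minimal difference
print(differences(pit, pat))    # another minimal difference
print(differences(pick, pat))   # the two together
```

Pairs of unequal length, such as sick-land, would need an alignment step that this sketch leaves out.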

Appendix to 5.3: Unit Length and Phoneme Length 

It will be seen that the conditions of 5.3 are precisely true of phonemes, 
so that we may say that each of these unit-length segments now has the 
length of one phoneme.⁴ These segments differ from phonemes in that 
the operations of chapters 6 and 7 have not yet been carried out upon 
them. 

The segment lengths obtained in 5.2 may still differ from phonemic 
lengths in one case, which also escaped the net of 4.4: when there is a 
sequence of unique segments which are shown in chapter 9 to constitute 
a sequence of more than one phoneme. Thus the segment [č] (back [t] 
plus [š]-release), and in some English dialects perhaps the sequence [tr] 
(post-dental [t] plus voiceless spirant release), are each composed of 
smaller segments which occur only next to each other (back [t] only 
next to [š], [r]-spirant only next to post-dental [t]). Hence if we had taken 
the two rather different [t] parts of [č] and of [tr] as equivalent segments 
separate from their respective spirant parts, we would probably now be 
forced in each of these cases to join the respective stop and spirant parts 
of each into one segment (back [t] + [š] into a single [č], [t] + [r]-release 
into a single [tj]) on the basis of 5.2, since the particular type of stop of 
each of these segments occurs only before the particular type of spirant 
and vice versa. Only chapter 9 will enable us to break such segments into 
smaller phonemes. 

⁴ This length, of course, is not an absolute time measurement, but 
marks the number of segments per utterance as defined in 5.3. Length is 
thus a distributional and relative term. It measures how much of the 
duration of the utterance is dependent upon other parts of the duration 
of the same utterance (5.2), or is equivalent to parts of the duration of 
other utterances (4.4). 


6.0. Introductory 

This procedure develops representations for those features of speech 
such as tone or stress sequences and other contours ('secondary pho- 
nemes', prosodemes) which extend over whole utterances, whether or 
not these contours have independent meanings. Extraction of these con- 
tours as distinct single elements leaves in each utterance a sequence of 
segments which are devoid of such features as tone and stress, and which 
are in fact the traditional positional variants of phonemes. 

6.1. Purpose: Utterance-long Equivalent Features 

We want to be able to say that two different utterances are equivalent 
throughout their duration in some one of their utterance-long features, 
whether or not they are similar in any of their successive segments. 

The procedures of 4.21 and 4.23 enabled us to tell if two utterances 
were similar, in their entirety: e.g. two repetitions of I sewed it; or the 
two utterances I sewed it and I sowed it. The procedure of 4.22 further 
enabled us to tell if two utterances were identical in part of their length: 
e.g. Can't I?, Cameras. This was the result of dividing the utterances into 
successive segments, so that we could distinguish or equate individual 
segments, and not only whole unitary utterances. 

The division of utterances into segments was performed on the basis 
of the fact that short stretches (segments) of one utterance were sub- 
stitutable (hence, equivalent) in our representation to those of other ut- 
terances. 

However, in the utterances of many languages we can find some fea- 
ture which extends throughout the length of an utterance and is de- 
scriptively equivalent to a comparable feature extending over the length 
of the others. Did he come? May I enter? He saw you? all have equivalent 
tone sequences: rising on every stressed vowel and on every segment 
after the last stressed vowel. In contrast, I met him. He's here. Just got in. 
may all be represented as having in common another tone sequence, dif- 
ferent from the preceding one. 

If our only object is to obtain some representation in terms of discrete 
elements for the utterances of a language, it would not be necessary to 
pay special attention to these utterance-long features. There are un- 
doubtedly many other features in respect to which some utterances are 



similar to each other and different from other utterances. For example, 
we could consider all the utterances pronounced with a particular degree 
of loudness as against those with other degrees; or all utterances con- 
taining 15 unit-length segments as against those containing 14 or 16. 
However, the utterance-long features to be discussed below are particu- 
larly important for several reasons. In the first place, they are linguistical- 
ly relevant in many languages: e.g. a small difference in utterance intona- 
tion in English can be more easily correlated with different speech situa- 
tions or hearer responses than can small differences in loudness, or in 
number of unit lengths per utterance. Secondly, they often occur in only 
a limited number of contours (i.e. of successive changes in grade): 
whereas almost any degree of loudness (within limits) occurs in English 
utterances, we have only certain sequences of tone changes, such that 
most utterances occur with one or another of these particular sequences 

6.2. Preliminaries to the Procedure: Discovering Partial Similarities 

6.21. In Paired Utterances 

It was seen in chapters 4, 5 that if two utterances are not repeti- 
tions of each other,¹ the elements by which one is represented must differ 
in at least one point from the elements of the other. If, in seeking to de- 
termine which elements are those that differ we notice a difference in 
some utterance-long feature such as intonation, but have no reason to 
localize the difference in any one segment rather than another (because 
all the segments may differ in the contour feature which differentiates 
the utterances), we must consider all or several of the segments to be 

Thus He's coming? [hiyz kamiŋ]² is not a repetition of He's coming. 

¹ If we can find no difference in the other features (e.g. consonants and 
vowels) while the pairing test of 4.23 shows the utterances to be descrip- 
tively unequal, we may assume, as a working hypothesis, that the de- 
scriptive difference lies in the observed difference in the utterance-long 
feature (intonation or the like). 

² Using higher numbers for higher relative pitch levels. The raised bar 
before letters indicates loud stress on the vowel following, the lowered 
bar secondary stress, and two raised bars (") extra loud stress. For a 
rather similar analysis of English intonation, see Rulon S. Wells, The 
Pitch Phonemes of English, Lang. 21.27-39 (1945); for a rather different 
analysis, see Kenneth L. Pike, The Intonation of American English 
(1946). See also H. E. Palmer, English Intonation (1922); Stanley S. 


[hiyz kamiŋ]. The two utterances differ in their tone changes; hence it is 
this feature which we select to investigate.³ 

The regular difference between repetitions of the first utterance and 
repetitions of the second is most noticeable in [i] — [i], [i] — [i], [ŋ] — [ŋ]; 
no other difference between the two is regularly noticeable in repetitions of 
each of them. In 4.23 and the Appendix to 4.23, it was assumed that if only 
one regular difference appeared between two paired utterances, it would 
be convenient to have these utterances differ in only one segment, i.e. 
to localize the difference in only one segment. This accords with the con- 
siderations of unit length in 5.3. In the present case such economy is im- 
possible if we consider the noticeable differences in the corresponding 
segments. However, if we represent He's coming? not by 9 segments 
each having its stated tone, but by 9 tone-less segments plus an utter- 
ance-long contour [hiyz kamiŋ + 123],⁴ we can localize the difference 
between the two utterances in the utterance-long element. The tone-less 
segments are identical in both. 

6.22. In Otherwise Non-equivalent Utterances 

It is possible to extract utterance-long features from two utterances 
even if these utterances differ in other features too (i.e. are not paired). 
To do this, we represent each utterance by its segmental elements, and 
then extract from these segments the successive segmental portions of 
the feature in question. 

We begin with the segments of 5.3, in which, e.g., the weak-stressed 
and low-pitched [i] of I'm marking. [aym 'markiŋ] is distinct from the 
loud and high-pitched first [i] of Kingsley! ["'kiŋzliy], since the two seg- 
ments are not substitutable for each other in these contexts: no one 
says [aym 'mar"kiŋ] or [kiŋzliy].⁵ 

Newman, On the stress system of English, Word 2.171-187 (1946); 
Einar Haugen, Phoneme or Prosodeme?, Lang. 25.278-282 (1949). 

³ We may test this by having He's coming? repeated, and seeing if all 
the repetitions show the same tonal sequence (intonations) as compared 
with repetitions of He's coming. 

⁴ The italic numbers represent a single contour element, extending in 
most cases over more than one unit-length segment of the utterance. 
When not in italics, the digits indicate relative phonetic tones (pitch) 
without reference to the contours which they constitute. Since the pre- 
cise phonetic data is relevant to phonemic discussion only when phonemic 
distinctions are being established, no attempt is made here to give exact 
phonetic transcriptions. 

⁵ The latter non-extant form has zero stress on both vowels, indicated 
by the absence of stress bars. 


We then note where there are limitations of occurrence for these seg- 
ments, in respect to the feature which we suspect may constitute an ut- 
terance-long contour. E.g., we note that in the utterance Is that a bite? 
[iz ˌðæt ə 'bayt] we cannot substitute low-pitched [a] for the [a] at the 
end; we can substitute [ow] for the [ay], obtaining Is that a boat?, but 
we cannot substitute [ow] or [ow]. Furthermore, in [ðæt] we can sub- 
stitute ["æ] (for contrastive stress) or ["æ] (for surprise), but not [æ]. In 
general, we note that while almost every vowel quality will occur in the 
positions of the vowel-segments in this utterance (e.g. [waz ˌðis ðeyr 
'bowt] Was this their boat?), we can obtain only a few different tone 
sequences: e.g. in Is that a bite? and in very many other utterances such 
as those cited here we may get 01123, and 03123, and 04123, but never 
11114, or 41111, or 14226, etc. We find that for all utterances having a 
given number of vowels only very few of the permutationally possible 
tone sequences occur.⁶ 

The operation here is thus the testing of a working hypothesis. To 
carry out this operation we must have noticed that I'm marking, differs 
from Kingsley! not only, for example, in that the former contains the 
segments [a, m, r] not contained in the latter, but also in that the [i] of 
I'm marking. is not equivalent to the [i] of Kingsley!⁷ It then remains for 
us to guess what feature would most conveniently represent the differ- 
ence between these two [i]. Having fixed on this feature (say, tone), we 
observe its occurrence (in various grades of, in this case, tone) in all the 
segments of many utterances. If we find that only a few sequences of 
different grades of this feature occur, or that the feature correlates with 
other linguistic elements, we extract the whole sequence as an utterance- 
long contour. The two [i] segments which remain after the extraction are 
now identical. 
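
The working hypothesis of this section rests on a count: how many of the permutationally possible sequences of grades are actually attested. A rough sketch of that count follows; the list of observed sequences is invented (using only sequences the text cites as occurring), five grades of tone are assumed for the arithmetic, and `share_attested` is a hypothetical helper.

```python
from collections import Counter


def share_attested(tone_sequences, n_grades):
    """Fraction of the possible grade sequences (of this length)
    that actually occur among the recorded utterances."""
    length = len(tone_sequences[0])
    if any(len(seq) != length for seq in tone_sequences):
        raise ValueError("compare utterances with the same number of tones")
    return len(set(tone_sequences)) / n_grades ** length


# Invented records: one tone sequence per recorded utterance.
observed = [
    (0, 1, 1, 2, 3),
    (0, 3, 1, 2, 3),
    (0, 1, 1, 2, 3),
    (0, 4, 1, 2, 3),
    (0, 1, 1, 2, 3),
]

attested = Counter(observed)      # how often each sequence was met
possible = 5 ** 5                 # five assumed grades, five positions

print(len(attested), "attested out of", possible, "possible")
print(share_attested(observed, n_grades=5))
```

A very small share suggests that the feature is a candidate for extraction as a contour; what counts as small enough is left to the analyst's judgment.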

⁶ There are, of course, a great number of limitations of occurrence 
among these segments. All the remaining sections of the phonology, will, 
in one way or another, deal with various of these limitations. If we want 
to obtain primarily the results suggested in the Appendix to 6.1, i.e. 
elements which will turn out later to be due to independent simultane- 
ous morphemes, it will suffice to limit ourselves here to those limitations 
of segments, such as the pitch limitation of vowel segments, which show 
only a few of all the possible sequences, extending over the whole of an 
utterance. 

⁷ Our awareness of the linguistic relevance of this feature may result 
from our having established it as the factor of difference in some paired 
utterances (6.21). 


6.3. Procedure: Extracting Segmental Portions of Utterances 

From the segments of each utterance we extract each feature which 
is such that relatively few fixed sequences of the various grades of that 
feature occur in all our utterances. We call that feature a long component 
(or a contour) over the utterance. 

We approach this operation by first considering the restrictions on 
occurrence of segments. We note those cases where there are successive 
restrictions throughout the utterance on the grades of some particular 
feature of all segments in the utterance (or all segments which noticeably 
possess that feature; in the case of tone this is usually the vowels). 

We note in particular those cases where the restriction on this feature 
in successive segments is so severe that only a few different sequences of 
the different grades of this feature occur in any of our utterances (e.g. 
the examples in 6.22).⁸ This means that for each of the sequences that 
do occur we have many utterances which have the identical sequence 
though they differ in the remaining components of their segments. We 
then say that the utterance consists of two simultaneous sections: First, 
a suprasegmental component which extends over the length of the ut- 
terance, and represents the fixed sequence of grades of the feature in 
question, e.g. the tone sequence 01123. Second, a sequence of segmental 
remnants identical with the original segments except for the extraction 
of the feature in question, e.g. the pitchless remainders [iz ðæt ə bayt]. 
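
The division of an utterance into a suprasegmental component and a sequence of segmental remnants can be pictured as follows. This is a sketch under stated assumptions: each segment is represented as a pair of quality and tone grade, tone is carried only by the vowels, and the tone grades assigned to He's coming? and He's coming. below are invented for illustration rather than taken from the text.

```python
def extract_contour(segments):
    """Split (quality, tone) segments into the tone contour and the
    tone-less segmental remnants. Tone is None where a segment does
    not noticeably carry the feature."""
    contour = tuple(tone for _, tone in segments if tone is not None)
    remnants = [quality for quality, _ in segments]
    return contour, remnants


# Nine segments each; tone grades are hypothetical.
question = [("h", None), ("i", 1), ("y", None), ("z", None), ("k", None),
            ("a", 2), ("m", None), ("i", 3), ("ng", None)]
statement = [("h", None), ("i", 0), ("y", None), ("z", None), ("k", None),
             ("a", 2), ("m", None), ("i", 0), ("ng", None)]

q_contour, q_remnants = extract_contour(question)
s_contour, s_remnants = extract_contour(statement)

print(q_contour, s_contour)        # the utterances differ here
print(q_remnants == s_remnants)    # and are identical here
```

After extraction the whole difference between the two utterances is localized in the contour, which is the economy sought in 6.21.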

6.4. Segmental Length of Contours 

In noting English contours we will find many short utterances with 
the tone sequence 020 (e.g. I'm coming. or Alaska.), many with 1020, etc. 
However, we will also find many utterances, usually longer ones, which 
have 20 not only at the end but also elsewhere : e.g. 0201020 in I'm going. 
Back at seven. It is clearly possible to consider this a succession of two 
tone contours, 020 and 1020, each of which sometimes covers a whole ut- 
terance by itself. Other sequences can be broken down into successive 
contours which do not have identical endings: e.g. 0200123 in I'm ready. 
Are you coming? is divisible into 020 and 0123. The point between the 

⁸ This procedure will apply equally well to any segments (not neces- 
sarily components of segments) which are restricted in this manner, over 
whole utterances. If successive tones are restricted, then each tone is de- 
pendent on the other tones in the utterance. Distributionally, the tones 
are not independent, and hence need not be regarded as separate ele- 
ments. The independent elements are the whole sequences of tones within 
the utterance. 


two successive contours into which the long sequence is divided, will 
often contain a brief pause; or the end of each successive contour will 
very frequently be somewhat drawled (in the manner of the Appendix 
to 6.3, last paragraph), or will exhibit other characteristics otherwise 
found only in utterance final. On the other hand, many occurrences of 
such sequences in hurried conversations may have none of these addi- 
tional features. 

We may therefore divide any sequence of tones (or of grades of any 
other feature) which extends over an utterance, into successive contours, 
if each of the successive contours occurs elsewhere extending over some 
whole utterance by itself.⁹ This operation materially reduces the number 
of different contours which we have to recognize: 0200123 above is no 
longer a new contour. 

The length of a contour (the number of successive segments over which 
it extends) is in general more than one unit-length segment, but may be 
as short as the operation described here permits. 
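
The division of a long tone sequence into successive contours is a segmentation against a list of contours already known to cover whole utterances. A minimal sketch, assuming tone sequences are written as digit strings and that, where more than one division is possible, the one with the fewest contours is preferred (that preference is an assumption of this sketch); `split_into_contours` is a hypothetical name.

```python
def split_into_contours(sequence, known):
    """Return a list of known contours whose concatenation equals
    `sequence`, or None if no such division exists."""
    n = len(sequence)
    best = [None] * (n + 1)   # best[k]: division of sequence[:k]
    best[0] = []
    for end in range(1, n + 1):
        for contour in sorted(known):
            start = end - len(contour)
            if start < 0 or best[start] is None:
                continue
            if sequence[start:end] == contour:
                candidate = best[start] + [contour]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[n]


# Contours met elsewhere over whole utterances by themselves.
known = {"020", "1020", "0123"}

print(split_into_contours("0201020", known))   # I'm going. Back at seven.
print(split_into_contours("0200123", known))   # I'm ready. Are you coming?
print(split_into_contours("1320", known))      # not divisible: None
```

A sequence for which no division is found has to be recognized as a new contour, or handled by the superposition of 6.5.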

6.5. Contours Which Occur Simultaneously 

In some cases, when the extraction of a feature for a putative contour 
reveals a relatively large number of sequences, it is possible to represent 
these sequences as various simultaneous occurrences of just a few con- 
tours. Thus the following contrastively-stressed utterances THAT'S his 
BEDroom, That's HIS bedroom, That's his BEDroom, have tone contours 
3030, 1320, 1030 respectively, as compared with 1020 for That's his bed- 
room (with extra-loud stress accompanying tone 3). If we stop here, we 
would have to say that there are here four independent tone sequences. 
However, we notice that wherever the tone is not 3, the tone of each 
segment is that of the corresponding position in the sequence 1020. We 
therefore extend the procedure of 6.3 and extract from each of these 
tone sequences yet another sequence, consisting of tone 3 plus extra- 
loud stress on any one vowel and its neighboring consonants. We can 
then say that 3030 is our old contour 1020 plus two occurrences of tone 
contour 3, while 1320 and 1030 also contain that same sequence plus 
one occurrence of contour 3 in different positions. Instead of four or more 
different sequences, we now have 1020 extending over each utterance, 

⁹ Such subdivisions will be useful if we find in chapter 12 that these 
contours extend over the same morpheme class sequences (and have the 
same morphemic status when they extend over part of an utterance as 
when they extend over a whole utterance). 


and tone contour 3 placed at any vowel in the utterance.¹⁰ The length of 
the 1020 contour is the utterance; the length of the contour consisting 
of tone 3 plus extra-loud stress is one word (rarely one morpheme or 
vowel) within the utterance. 

The search for restricted sequences, therefore, leads us not only to 
extract particular phonetic features, but also to repeat the extraction in 
some cases. We may thus break any sequences of tone, or of any other 
feature, into two simultaneously-occurring contours, if we can thereby 
analyze many different component sequences as being varying combina- 
tions of a few contours. We then say that the two contours were super- 
posed upon one another. 
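
The analysis of a tone sequence as a base contour with a second contour superposed on it can be sketched as a position-by-position comparison. The sketch assumes digit-string tone sequences and a one-tone overlay (tone 3, as in the bedroom examples); the function name `decompose` is hypothetical.

```python
def decompose(sequence, base, overlay="3"):
    """If `sequence` equals `base` everywhere except where it shows
    the `overlay` tone, return the positions of the overlay;
    otherwise return None."""
    if len(sequence) != len(base):
        return None
    positions = []
    for i, (tone, base_tone) in enumerate(zip(sequence, base)):
        if tone == base_tone:
            continue
        if tone == overlay:
            positions.append(i)
        else:
            return None
    return positions


base = "1020"
for seq in ["3030", "1320", "1030", "1020"]:
    print(seq, "=", base, "+ tone 3 at positions", decompose(seq, base))
```

Each of the four sequences is thereby stated as the one contour 1020 plus zero, one, or two occurrences of the tone 3 contour, instead of as four independent sequences.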

6.6. Result: Suprasegmental Elements Extending over Utterances 

We now have for our representation of the language a number of new 
elements, contours whose length is in general greater than unit-segment.¹¹ 
As a result of 6.3, our original segments have now lost those features 
which have been extracted into the new long (contour) components.¹² 
Each contour is defined as occurring over certain stated lengths. It may 
be symbolized by a mark at the end of that length (e.g. ? at the end of 
an utterance, to indicate 123, 1234, etc.).¹³ 

¹⁰ In chapter 12 we shall find that this breakdown into superposed se- 
quences not only is economical but also may correlate with morphemic 

¹¹ The fact that many (but not necessarily all) of them have morphe- 
mic status is irrelevant here, and will only appear later (cf. Appendix to 
6.1, and 12.344). 

¹² The extraction of the long components greatly simplifies our further 
analysis. Formerly, many segments had differed only in features that 
have now been extracted to make up the long components : e.g. the weak- 
stressed, low-pitched [i] of I'm marking and the loud high-pitched first [i] 
of Kingsley. Now that utterance-long tone and stress contours have been 
extracted all we have here is two equivalent occurrences of the tone-less 
and stress-less remnant [i], which is identical in both utterances and oc- 
curs in them simultaneously with any one of the extracted contours. In- 
stead of the great number of original segments we now have far fewer 
segmental remnants, plus the long contour components. 

¹³ The utterance-long intonations are entirely different from the one- 
vowel-long tones which occur independently over each vowel. The latter 
are called phonemic tones (sometimes, tonemes) and languages contain- 
ing them are often called tone languages. They are discussed in chapter 9, 
fn. 2 and the Appendix to 10.1-4. 


Appendix to 6.1: Morphemic Independence of Utterance-long 

The extraction of those contours from the segments of every utterance
is particularly important because in many languages some of the con-
tours will turn out later to constitute suprasegmental contour morphemes
(intonations, etc.) which may occur elsewhere separately from the mor-
phemes constituted by the segmental remnants. For example, if we ex-
tract the tone contour out of the segments of I'm going, and He isn't, we
get the tone sequence 020 for each; we can later identify this tone contour as
a morpheme indicating assertion which occurs in both of these utterances.
With this tone morpheme out of the way, we are left with tone-less
morphemes [gow], [iŋ], etc. which are independent of their tone mor-
phemes and which occur simultaneously with other tone contours as
well, e.g. in Going? ['gowiŋ]; here the tone sequence is 123, a morpheme
indicating question. It would be difficult later on to identify morphemes,
such as [gow], if we do not now separate out those sound elements which
belong to different morphemes (even if, like tone, they occur simul-
taneously with the former morphemes). For if we did not separate out
such elements, we would get units like [gow] in I'm going, and [gow] in
Going? which would have to be considered as different morphemes, since
they differ in form and often in environment (and meaning).

We have thus two purposes in breaking our segments into their simul-
taneous components: Primarily, we wish to find the distributionally inde-
pendent factors (e.g. tone-less vowels, and tone by itself) which can be 
variously combined to yield the segments of chapter 4. Secondarily, if we 
have segments some of whose components (e.g. sibilant position) will 
later appear to be members of one morpheme, while others of their com- 
ponents (e.g. tone) are members of another co-occurring morpheme, it is 
desirable that we separate these two groups of components. There is no 
procedure by which we can easily discover at this stage, when we have 
no knowledge of the phonemic limitations of the language, what break- 
down of our segments into independent component factors is most use-
ful; this will be done in chapter 10. There is also no procedure which will
tell us at this stage, when we have no knowledge of the morphemes of 
the language, what components are members of simultaneously-occurring 
independent morphemes. However, it is possible to carry out here a pro- 
cedure which will separate off the independent utterance-long com- 
ponents. In many cases this includes most of the elements which will 
later turn out to be members of suprasegmental independent morphemes. 


Appendix to 6.3: Formulaic Statement of the Procedure 

If the segments ABCD occur (e.g. Am I? [æm ay]) while ABEF do
not (e.g. [æm ay]), where EF is equivalent to CD except for a stated
feature (e.g. EF = [ay], CD = [ay]), we say that ABCD includes
throughout its length a single contour component consisting of this
feature, e.g. 123, and that the contour 123 occurs whereas 111 does not.'^
Then since AB contains the beginning of some contour which begins with
1 (as in [æm-]), we know that it may be the beginning of ABCD, since
123 occurs, but that it could not be the beginning of ABEF, since 111
does not occur.

This formulation enables us to state what sequences of whole segments
do not occur: If 123 is a tone contour which extends over a whole utter-
ance, while 111 is not, and if we have an utterance beginning with [æm],
i.e. an utterance whose first segments contain the beginning of the 123
component, then we can predict that the remaining segments may con-
tain 23, but hardly ever 11. If we have an utterance beginning is he,
we know that the rest of the utterance may be sick, but will hardly ever
be sick.
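
The prediction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the inventory OCCURRING_CONTOURS is a hypothetical list of utterance-long contours (the values 123, 1234, and 120 are taken from examples in the text, but a real inventory would come from the analysis of a particular language), and the function name possible_remainders is invented here.

```python
# Hypothetical inventory of utterance-long tone contours that occur.
# A real inventory would be established by the procedure of 6.1-3.
OCCURRING_CONTOURS = {"123", "1234", "120"}


def possible_remainders(prefix, contours=OCCURRING_CONTOURS):
    """Return the tone sequences that may follow `prefix`.

    A remainder is admitted only if prefix + remainder is a contour
    that occurs; everything else is predicted not to occur.
    """
    return sorted(
        contour[len(prefix):]
        for contour in contours
        if contour.startswith(prefix) and len(contour) > len(prefix)
    )


# After an initial tone 1 the remainder 23 is admitted, while 11 is not,
# because 111 is not in the inventory.
print(possible_remainders("1"))
```

With this toy inventory, a beginning that carries tone 1 admits the remainders 20, 23, and 234; a beginning 11 admits nothing, since no occurring contour starts that way.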

The procedure which has brought out these utterance-long compo- 
nents is equivalent to two steps: First, the extraction from each segment 
of whatever unit-length components we wish (e.g. extracting relative 
tone 4 from [i] and from [i]). Second, the application of the procedure 
of 5.2 to the components which have thus been obtained. That is, we 
join into one long component any succession of unit-length components 
if part of that succession occurs regularly with the other part. For ex- 
ample, the sequence of tone components 1, 2, 3 becomes one long com- 
ponent 123, and 1, 2, 3, 4 becomes a unitary 1234, because the portion 23 
occurs only with the portion 1, or 1 — 4, or various other portions, but 
not with a great many other portions, such as 33 — 33 (there would be 
no 332333)}^ 

''' In general the statement that an element does not occur indicates 
not that the feature it represents never occurs in speech, but that it oc- 
curs very rarely and when it does there is often cultural disapproval or 

'* It is generally found that when this second operation is attempted, 
the tonal or analogous components of each segment occur regularly with 
the particular tonal components of the other segments, whereas the con- 
sonantal and vocalic remnants of each segment do not have few regular 
sequences with the other segmental remnants. I.e. given the remnant 


The whole operation of chapter 6 is in effect a search for regularities
which extend over whole utterances. We are investigating utterances in
order to see if there may be a small number of long stretches of some
sound-feature, such that each utterance can be represented as having
one or another of these few stretches. If we find such stretches (e.g. the
01123 of Is that your home? and So Gardner drank it?), we will clearly
get a simpler description by taking the whole tonal stretch as a unit, as-
sociated with many particular utterances, than by taking each successive
part (the 0 of Is, or the 1 of that) as an independent unit, each associated
with the segment in which it occurs. The method of attack in 6.1-3 was
followed here so that we might have an orderly procedure which would
satisfy this specific objective: the limitations of occurrence among the
successive component features 0, 1, 2, etc. (in 6.22) yield the few ut-
terance-long components which we seek.'*

In view of this, it is not important if all the successive parts are grades 
of some one sound feature, e.g. if they are all various tones, or if some 
of them are physically different features, e.g. loudness, whispering, etc. 
Many contours consist of several features which vary throughout their 
length. In English, higher tone is usually associated with louder stress.'^ 

home after the tone has been extracted, we find a great variety of partially
similar remnants: foam, loam, whom, hum, hole, hose. After how we
may find almost any phoneme; but after tones 12 we usually find 3 or 0, 
with other successors only in stated environments. Hence we will suc- 
ceed in setting up utterance-long elements (dependent sequences) of the 
tonal parts, but not of the segmental remnants. 

'* It was not necessary to extract the contours in this way, via the re- 
strictions upon the segments which have the feature in question. The 
contours could have been extracted from the utterances before the seg-
mentation was carried out, by searching the unbroken utterances for
features (such as tone) in which only a few variations existed, many ut-
terances having the same variation in common. However, had we done so
we could not have used the more exact method of searching for restric-
tions which is seen in 6.22. Furthermore, if we had extracted the contours
before setting up segments for each utterance, the tone-less segments
would not have represented whole successive portions of the utterance,
such as could be impressionistically heard in succession or could be
snipped and substituted for one another on a sound-track. These con- 
tour-less segments would not have been amenable to the fundamental 
substitution operation of chapter 4. Therefore, it was preferable to carry 
out the operations of 3-4 before that of 6. 

'^ But in varying ways. In the tone sequences which end in 20, pre- 
ceding zero stressed vowels have tone 0, preceding loud stressed vowels 


Many utterance-final tone sequences, such as 20, are accompanied by 
an increased duration of phonemes and a laxness of articulation: in 
Look at his book, the second [u] is longer than the first. We must therefore 
be prepared to recognize long contours (or fixed sequences) representing 
combinations of various features of speech. 

Appendix to 6.4: Contours of More than One Utterance Length 

When we test the extraction of various features which may comprise 
a contour, we may find that some of the putative contours extend not 
over whole utterances but over many separated utterances. In an argu- 
ment, for instance, one or all of the participants may speak loudly and 
with high pitch (or, given a different culture or personality, slowly; with 
low pitch and fortis articulation) ; these tone and stress features continue 
over many utterances in that conversation, and are superposed upon the 
previously-described utterance contours. Some phonetic features occur 
regularly in particular social situations: in sufficiently stratified societies 
there are recognizable differences between the way a, say, upper middle 
class woman talks to a social equal and to a servant (even when the ut- 
terance is otherwise identical). Other features may be present in all the 
speech of a particular person during several years of his life, witness the 
fact that we can recognize a person by his voice. Still others characterize 
the members of a particular age group, social class, etc., witness the fact 
that in certain cases we can tell the class or age group of a speaker be- 
fore we see him. 

All such features extend over more than single utterances, and can 
be described as superposed upon the utterance-length contours (or, as 
consisting of special combinations of them, after the method of 6.5). 
Since we have limited our contours to those whose extension is peculiar to 
utterance length, these features are excluded from our present considera- 
tion. For the purposes of present day descriptive linguistics, segments or 
contours which differ only in these features are considered as free vari- 
ants, i.e. descriptively equivalent to each other. In incidence and mean- 
ing, these features border closely upon gesture and are of importance 
to any consideration of how language occurs in social interaction. In 
their representation of speech, these features can be distinguished from 

have tone 1. In those which end in a rise (23, or 34, etc.), the first loud 
stressed vowel may be said to have tone 1, the next 2, and so on, while 
every zero stressed vowel has the same tone as the loud stressed one 
before it. 


the segments and contours which have been treated in these procedures
only by the fact that they extend over more than single utterances.'* 

Appendix to 6.5: Grouping Complementary Contours 

The effect of 6.4 and 6.5 is to reduce the number of distinct contours
in terms of which we describe the sequence of suprasegmental features 
extending over the utterance. This number can be further greatly re- 
duced if we find that the differences among some of our contours are de- 
pendent upon differences in their environment.'^ 

Thus, the difference between 0100120 (The fellow out there fumbled.)
and 120 (I fumbled) correlates with the difference in number and position
of loud and zero stressed vowels. Following the method of 7.3 and 10.4,
we may group all the contours which consist of various 1's and 0's fol-
lowed by a final 20, and list them as positional variants of one contour
— 20. The environment which determines what positional variant of — 20
occurs in any particular utterance which bears — 20 is the number of
loud and zero stressed vowels in that utterance.
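
As a rough illustration of this grouping, the following Python sketch classes every contour made up of 1's and 0's plus a final 20 under one label. The label "-20" (written with a plain hyphen), the regular expression, and the function names are assumptions made for the example; the text itself gives only the two contours 0100120 and 120.

```python
import re
from collections import defaultdict

# Assumed pattern: any run of 1's and 0's (possibly empty) followed by a final 20.
FINAL_20 = re.compile(r"[01]*20")


def contour_class(contour):
    """Return the label of the contour class to which `contour` belongs."""
    return "-20" if FINAL_20.fullmatch(contour) else contour


def group_contours(contours):
    """Group contours so that positional variants share one label."""
    groups = defaultdict(list)
    for contour in contours:
        groups[contour_class(contour)].append(contour)
    return dict(groups)


# 0100120 and 120 fall together; 0123 (a rising contour) stays apart.
print(group_contours(["0100120", "120", "0123"]))
```

The sketch only performs the grouping. Which variant appears in a given utterance would still have to be stated in terms of the loud and zero stressed vowels of that utterance.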

Appendix to 6.6: Phonemic Status of Contours 

The advantages of reducing the stock of distinct segments, and of 
separating out the elements which may later have independent mor- 
phemic status, are obtained as soon as we extract the contours, no mat- 
ter how we view them, nor what we do to them thereafter. Nevertheless, 
it may be of interest to note the status of the new contour elements rela- 
tive to our other linguistic elements. 

Both the contour elements and the segment-remnant elements are 
now needed in identifying an utterance, for each represents features of 
the utterance not indicated by the other. In chapters 7-9 the unit-seg- 
ment elements will be subjected to certain operations, leading to com- 
pletely phonemic elements; these cannot, by their nature, be carried out 
upon the contours.^" 

'« Cf. 2.31-2 above. 

'^ In 6.5 the differences among 1320, 1030, 1020, etc., depended upon 
the differences in occurrence and position of the contour 3 among the 
respective utterances. Hence as soon as 3 was recognized as a separate 
contour, we were free to consider 1320, 1030, etc., as members of one 
1020 contour. 

^° This is related to the fact that the unit-segment elements and the 
contour elements have been established in quite different ways. The 


However, just as the operations of 7-9 will reduce the number of ele- 
ments and make them more convenient for linguistic analysis, so it may 
be possible to reduce the number of contours which have been set up for 
a particular language. To do this, we would take all the contours in that 
language and compare them. Several contours may turn out to be identi- 
cal in part of their length, while differing in the remainders. It may be 
possible to contrast one contour with another, in a single environment, 
and to find that contours A and B are equivalent except at one point 
while contours A and C are equivalent except at another point. In such 
cases we would say that the section of difference between A and B is one 
element, the section of difference between A and C another.^' These 
would be successive phonemic elements of contours, and the contours 
could be identified as consisting of them. 

For example, we take the low-rising intonation for /?/ (0123 in Are
you coming?), high-rising for impatient question /??/ (1234 in Are you
coming??), rising-falling for surprised question /?!/ (0132 in Are you com-
ing?!), mid-falling for assertion /./ (120 in He's coming.), high-falling
for excited assertion /!/ (241 in He's coming!), mid-level for to-be-con-
tinued assertion /,/ (122 in He's coming, — ). Instead of considering each
of these an independent irreducible long component, we may purely on 
phonetic grounds say that the superposition of the mid-level intonation 
upon one of the other contours has the effect of raising the relative pitch 
of the contour: thus high-rising /??/ would be merely low-rising /?/ 
plus mid-level /,/; and high-falling /!/ would be mid-falling /./ plus 
mid-level /,/. Similarly, rising-falling /?!/ could be considered a super- 
posed combination of low-rising /?/ plus mid-falling /./. Instead of the 
six original elements, one for each contour, we now have only three ele- 
ments: /?/ for rise, /./ for fall, and /,/ for middle register (as against 
low register) base-line. Whereas our previous six elements each had both 
phonemic and morphemic status, these three elements are only phonemes 
(stretching over the unit segments of the utterance, and sometimes over 
each other), while the six (or more) morphemic contours are each a par- 

unit-length elements resulted from operations of substitution (chapters
3-5), whereas the contour elements express limitations in the variety of
succession of features of the segments.

^' This may be regarded as an extension of the procedure of 6.2-3, 
since we may say that it is the sections of difference between A and B, 
rather than the whole of A and of B, that constitute the stretches within 
which the successive tone components are dependent on each other.


ticular combination of various of these phonemic elements (just as seg- 
mental morphemes are combinations of various segmental phonemes). 
We now write the morphemic contour 0123 as /?/, 1234 as /,?/, 0132 as
/?./, 120 as /./, 241 as /,./, 122 as /,/.^^

^^ This, of course, has nothing to do with the meaning or morphemic
status of the contours; when the element /./ occurs with the element
/?/ in /?./, it no more has the meaning of the contour which is written
with the component /./ above than does the phoneme /a/ which repre-
sents the morpheme a in a man, represent that morpheme in /alart/ alert.

Note that the combinations suggested above are not phonetically 
exact. We do not have to restrict ourselves to cases where perfect pho- 
netic similarities may be found among the contours. We can say that /,/ 
is defined as mid-level tone (e.g. 122) when it occurs by itself, but as a 
raising of tone-level (register) when it occurs simultaneously with other 
components (so that /?/ is 0123, while /,?/ is 1234). 
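
The reduction from six contours to three elements can be laid out as a small table. The Python sketch below is illustrative only: the dictionary simply restates the spellings given in the text, and the names CONTOUR_ELEMENTS, phonemic_inventory, and spell are invented for the example.

```python
# Each morphemic contour (a tone sequence) is written as a combination of
# the phonemic elements "?" (rise), "." (fall), and "," (middle register).
CONTOUR_ELEMENTS = {
    "0123": ("?",),       # low-rising
    "1234": (",", "?"),   # high-rising = low-rising plus mid-level
    "0132": ("?", "."),   # rising-falling = low-rising plus mid-falling
    "120": (".",),        # mid-falling
    "241": (",", "."),    # high-falling = mid-falling plus mid-level
    "122": (",",),        # mid-level
}


def phonemic_inventory(table):
    """Return the distinct phonemic elements used in the table."""
    return sorted({element for elements in table.values() for element in elements})


def spell(contour, table=CONTOUR_ELEMENTS):
    """Write a contour between slanting lines in terms of its elements."""
    return "/" + "".join(table[contour]) + "/"


print(len(CONTOUR_ELEMENTS), "contours,", len(phonemic_inventory(CONTOUR_ELEMENTS)), "elements")
print(spell("1234"), spell("0132"))
```

The table makes the economy visible: six morphemic contours are identified with only three phonemic elements, each contour being a particular combination of them.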

It is not necessary that the phonemic elements which we will combine 
in order to form contours should also constitute contours by themselves 
as they are in the example above. We could have taken any new elements 
common to several contours, e.g. level-mid, level-high-rising, etc., and 
have defined each contour as some simultaneous combination of these 
new elements. The contours can be defined as successions of shorter 
phonemic elements, or as superpositions of phonemic elements each of 
which is as long as the contour itself. 


7.0. Introductory 

This procedure takes the segmental elements of chapter 5, after they 
have lost the components which were extracted in 6, and groups them 
into phonemes on the basis of complementary distribution.

7.1. Purpose: Fewer and Less Restricted Elements 

We now seek a more efficient set of symbols for our segments,^ one in 
which there are fewer elements, and in terms of which we can state 
more compactly which sequences of them occur. 

^ The literature concerning the technical setting up of phonemes in a 
language is post-1930. The basic methodological considerations are given 
in Leonard Bloomfield, Language (1933); Bernard Bloch and G. L. 
Trager, Outline Guide of Linguistic Analysis (1942) ; Nikolai Trubetzkoy, 
Anleitung zu phonologischen Beschreibungen (1935); Nikolai Trubetz-
koy, Grundzüge der Phonologie (Travaux du Cercle Linguistique de
Prague 7, 1939); Morris Swadesh, The phonemic principle, Lang.
10.117-129 (1934); Morris Swadesh, The phonemic interpretation of
long consonants, Lang. 13.1-10 (1937); G. L. Trager, The phonemic treat-
ment of semivowels, Lang. 18.220-223 (1942); J. Vachek, Can the
phoneme be defined in terms of time?, Melanges van Ginneken (1937);
Manuel Andrade, Some questions of fact and policy concerning pho-
nemes, Lang. 12.1-14 (1936); H. E. Palmer, The Principles of Romaniza-
tion (1931); D. G. Mandelbaum, ed., Selected Writings of Edward Sapir;
K. Bühler, Phonetik und Phonologie, Travaux du Cercle Linguistique de
Prague 4.22-53 (1931); Witold Doroszewski, Autour du phoneme, ibid.
61-74; Daniel Jones, On phonemes, ibid. 74-9; J. Vachek, Phonemes and
phonological units, ibid. 6.235-40 (1936); Daniel Jones, Some thoughts
on the phoneme, Transactions of the Philological Society 1944.119-35
(1945); E. Haugen and W. F. Twaddell, Facts and Phonemics, Lang. 
18.228-37 (1942). 

Articles giving the phonemic analysis of particular languages, or dis- 
cussing phonemic problems in various languages have appeared chiefly 
in Language, International Journal of American Linguistics, Travaux du 
Cercle Linguistique de Prague, and the Proceedings of the International
Congress of Phonetic Sciences. 

As examples, the following may be noted here: Leonard Bloomfield, 
The stressed vowels of American English, Lang. 11.97-116 (1935); Mor-
ris Swadesh, The vowels of Chicago English, Lang. 11.149-151 (1935);
Morris Swadesh, Twaddell on Defining the Phoneme, Lang. 11.244-250
(1935); Edward Sapir, Glottalized continuants in Navaho, Nootka, and
Kwakiutl, Lang. 14.248-274 (1938); Morris Swadesh, On the analysis
of English syllables, Lang. 23.137-150 (1947).

^ In the remaining chapters, the term segments will be used for the 
remnants of our original segments after the component contours of 
chapter 6 have been extracted. 



With the segments as we now have them, we can represent any stretch
of speech by one or more contours plus a sequence of letters (symbols for
the segments), each representing a length of sound (or sound feature)
which is substitutable for any other written with the same letter. For
many languages this representation has drawbacks. First, each segment
is highly restricted in occurrence: e.g. the alveolar flap [r'] written tt in
I'm setting it here occurs only after a loud stressed vowel and before a
weak stressed one. It thus occurs in a relatively small variety of environ-
ments: e.g. we have setting ['ser'iŋ] but no [se'r'iŋ] or [ser'] or [r'es], etc.
Second, the number of different segments is very great.

Both of these considerations can be met by a classification of our seg- 

7.2. Preliminaries to the Procedure 

7.21. Stating the Environments of Segments 

We list each different segmental element. 

For each segment, we state all the environments in which it occurs.'' 
Since this can be an almost endless task, it is best to attempt tentative 
approximations. For each segment, we may list the most differentiated 
environments we can find, within short utterances, and then see if every 
other environment is identical, over a short stretch or in a particular 
respect, with one of the environments noted at first. 

We may soon find that all our listed environments for any given seg- 
ment have some one feature (or any one of several stated features) in 
common, and are otherwise random, e.g. [r'] is always preceded by ' +
vowel, which is in turn preceded by any random segment including #.
We may further find that this is also true of all additional environments
which we add to the table. We then stop listing additional environments
and summarize our results. That is, we state that certain features, or 
some one or another of several features, are present (or absent) in every 

' The classificatory operation of 6.3 will be carried out only upon the 
segments as they remain after the operations of 5.3. The contours which 
were obtained in chapter 6 have been subjected to an analogous classifica- 
tory operation in the Appendix to 6.5. 

* As defined in 2.4, the environment of X will mean the rest of the 
utterance (or of some stated part of the utterance) in which X occurs, 
stated in terms of elements comparable to X. In the case of our present
segments, the environment is the other segments around it, plus the 
phonetically recognizable silence or pause. 


environment of the segment: e.g. [r'] always has a loud stressed vowel
before it and a weak stressed vowel or [r̩, l̩, m̩] after it.^
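
The listing of environments can be sketched as follows. This is a simplified illustration, not the full procedure: it assumes that each utterance is given as a list of segment symbols and that the environment is approximated by the immediately neighboring segments (with # for silence), whereas the text defines the environment as the rest of the utterance. The function name environments and the sample data are hypothetical.

```python
from collections import defaultdict


def environments(utterances):
    """Map each segment to the set of (preceding, following) neighbors.

    `utterances` is an iterable of lists of segment symbols; "#" stands
    for the silence or pause at the edges of an utterance.
    """
    found = defaultdict(set)
    for utterance in utterances:
        padded = ["#"] + list(utterance) + ["#"]
        for i in range(1, len(padded) - 1):
            found[padded[i]].add((padded[i - 1], padded[i + 1]))
    return dict(found)


# Hypothetical toy data; the symbols are placeholders, not a real transcription.
sample = [["t", "a"], ["s", "t", "a"]]
for segment, envs in sorted(environments(sample).items()):
    print(segment, sorted(envs))
```

Once such a table is available, the summary step amounts to looking for a feature shared by every environment listed for a segment.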

7.22. Summing over the Environments 

We now arrange the segments according to the sum of environments 
in which each occurs. This totality of environments is called the distri- 
bution of the segment, or its freedom of occurrence.^ 

It is now possible to consider how segments can be complementary in 
distribution. We take any segment and note the sum of environments in 
which it occurs: say, [tʰ] which occurs in [# — V].^ We then cast about
for some other segment which never occurs in any of the environments
in which the first segment occurs: say, [t] which occurs only in [— r] as
in ['triy] tree. We say that such a segment is complementary to the first
one. We then look for a segment which is complementary to the first two
(i.e. which never occurs in any environment in which either of the first
two occurs); say, [r'] which occurs in the position stated in 7.21; neither
[tʰ] nor [t] ever occurs in that environment. We continue the search until
we can no longer find a segment which is complementary to all the pre- 
vious segments. At this point we close the list of these mutually com- 
plementary segments and begin afresh with a new segment for which 
we will seek other complementaries, forming a second set of mutually 
complementary segments. 
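
The search just described is in effect a greedy grouping, and may be sketched in Python as below. The sketch assumes that the distribution of each segment is already available as a set of environment labels; the labels and the toy distribution are hypothetical, and the order in which segments are tried (here, dictionary order) is an arbitrary choice that can change the resulting sets.

```python
def complementary(a, b, distribution):
    """Two segments are complementary if they share no environment."""
    return distribution[a].isdisjoint(distribution[b])


def complementary_sets(distribution):
    """Greedily build sets of mutually complementary segments."""
    remaining = list(distribution)
    sets = []
    while remaining:
        current = [remaining.pop(0)]           # begin afresh with a new segment
        for segment in list(remaining):
            if all(complementary(segment, member, distribution) for member in current):
                current.append(segment)         # complementary to all previous ones
                remaining.remove(segment)
        sets.append(current)                    # close the list
    return sets


# Hypothetical distributions: environment labels are placeholders.
toy = {
    "th": {"#_V"},
    "t": {"_r"},
    "r'": {"V'_V"},
    "kh": {"#_V"},
    "k": {"_r"},
}
print(complementary_sets(toy))
```

With this toy input the first set collects th, t, and r', and the second collects kh and k. A different starting order could yield different sets, which is why further criteria for grouping are needed.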

7.3. Procedure: Grouping Segments Having Complementary Distribution

We take any number of segments, each of which is complementary to 
every other one we have taken, and say that they comprise a single class 
which we call a phoneme (writing its symbol between slanting lines). 
E.g. segments [K, k, k]* can all be included in a phoneme /k/. Each of 
the mutually complementary sets of 7.22 may thus constitute a phoneme 
by itself. On the first chart of the appendix to 7.22 we can determine a 
phonemic group of segments merely by drawing a line which will pass, 
from left to right, through not more than one check in each column, i.e. 
not more than one segment for each environment: e.g. [t] in [# — fl and 
[t] in [s — æ] can be crossed by one line and included in one phoneme, but

* The mark i under a letter indicates vocalic quality (syllabicity). 

* 'Privileges of occurrence' in Bloomfield's Language. 

' V indicates any segment of a group which we call vowel. In most 
languages it is convenient to set up this group, on distributional groimds, 
in contrast to consonants (C). 

* K indicates back k; k indicates front k. 


[t] in [s — æ] and [k] in [s — æ] cannot. Then all the checks which are
crossed by the line indicate members of the phoneme which is represented
by the line.'

7.31. Adjusting Environments in the Course of Phonemicization

The environments are themselves composed of segments; e.g. in a 
complete chart, the same segments appear both in the vertical segment 
axis and in the horizontal environment axis. Therefore, whenever a num-
ber of segments is grouped into one phoneme, we must find these seg-
ments in the environment list and replace them by that phoneme. E.g. 
when we group the segments [r] of cry and [r] of try into one phoneme /r/, 
we must change the two environments [# — r] and [# — r] into a single 
(phonemic) environment /# — r/, since [r] is now identified with [r]. 
When we have done this, we can no longer group [t] and [k] into one 
phoneme, since they now contrast in /# — r/: both [t] and [k] occur be-
fore /r/. This prevents us from using a single restriction ([t] only be-
fore [r], [k] only before [r]) twice: e.g. from grouping [t] and [k] into one
phoneme because they are complementary before [r] as against [r] (as- 
suming that they do not contrast elsewhere), and at the same time 
grouping [r] and [r] into one phoneme because they are complementary 
after [t] as against [k].'° 
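
The effect of rewriting the environments can be shown with a small Python sketch. It is illustrative only: the two varieties of [r] are given the hypothetical labels r1 and r2, environments are simplified to (preceding, following) pairs, and the function names are invented for the example.

```python
def rewrite_environments(distribution, mapping):
    """Replace segment symbols inside environments by their phoneme symbols."""
    return {
        segment: {tuple(mapping.get(x, x) for x in env) for env in envs}
        for segment, envs in distribution.items()
    }


def contrast(a, b, distribution):
    """Two segments contrast if they share at least one environment."""
    return not distribution[a].isdisjoint(distribution[b])


# Hypothetical segmental environments: [t] only before r1, [k] only before r2.
segmental = {
    "t": {("#", "r1")},
    "k": {("#", "r2")},
}

# Before grouping r1 and r2, [t] and [k] look complementary.
print(contrast("t", "k", segmental))

# After grouping r1 and r2 into /r/, both occur in /# _ r/ and so contrast.
phonemic = rewrite_environments(segmental, {"r1": "/r/", "r2": "/r/"})
print(contrast("t", "k", phonemic))
```

The first check reports no contrast and the second reports a contrast, so the single restriction cannot be used a second time to merge [t] and [k].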

7.4. Criteria for Grouping Segments

The operation of 7.2-3 determines whether segment X can be asso- 
ciated with segment Y in a single phoneme. But it is not sufficiently se-
lective to determine which of two complementaries, X and Z, shall be
included with Y (if X and Z are not mutually complementary, so that
only one of them, but not both, can be associated with Y).

^ This leaves the line free, for any given column, to pass through no 
check at all; in this case the phoneme (which consists of the segments 
checked for each environment) is represented by no segment in the en- 
vironment indicated by that column. I.e. the phoneme doesn't occur in 
that environment. E.g. the line for /g/ in the chart of the Appendix 
to 7.22 may go through no checks in the [s — ] columns; we then say 
that /g/ does not occur after /s/.

'" If we did not do this, but had included [r] and [r] in one phoneme 
/r/ ([r] after [t], [r] after [k]) and [t] and [k] in one phoneme /T/ ([t] be-
fore [r], [k] before [r]), we would have try and cry both phonemically writ-
ten /Tray/. This would conflict with a basic consideration of phonemics, 
namely, to write differently any two utterances which are different in 
segments. This inadmissible situation does not arise if we group [r] and 
[r] into /r/ while keeping [t] and [k] phonemically distinct from each 
other, since they contrast before the new /r/. 



In most cases there will be more than one way of grouping segments 
into phonemes. If we choose one segment, say, [t], we may find several 
other segments, say, [tʰ], [kʰ], [pʰ] all occurring in /# — V/ and so com-
plementary to [t] but not to each other. Only one of these can be put into
one phoneme with [t]; which shall it be?'^ 

It is therefore necessary to agree on certain criteria which will deter- 
mine which of the eligible segments go together into a phoneme.'^ 

7.41. Number and Freedom 

A general desideratum is to have as few phonemes as possible, and to 
have each phoneme occur in as many environments as possible, i.e. to 
give it the maximum freedom of occurrence among the other phonemes. 
To this end, we would try to have every phoneme include some segment 

" In the chart we will have, for each column, a choice of any one of the 
blocks, checked or empty, since our only instruction so far is to take not 
more than one check in each column. 

'2 We select such criteria, of course, as will yield phonemes most con- 
venient to our language description. Other criteria might be better for 
different purposes. The criteria should be stated not in order to fix a 
single method of segment grouping, but to make explicit in each case 
what method is being followed. 

It will be noticed that the criteria listed below serve primarily to give 
a simple distributional description of the phonemes, and only secondarily 
to give a convenient reference system for any description of the speech 
features represented by the segments. 





for each environment, as far as possible. In the Appendix to 7.22, this 
means selecting a check from every column if possible; in the less ex-
plicit form of 7.22 it requires that the sum of the environments of all the
segments in a phoneme equal if possible the total of environments any
segment could possibly have. 

This means that it is not necessary to set up more phonemes than the 
greatest number of different segments in any single environment (the 
greatest number of checks in any single column of the chart). '^ If every 
environment had the same number of segments which contrasted in it 
(i.e. which were differentiated in terms of the operation of chapter 4),
each phoneme would consist of one segment from each environment, and 
the number of different phonemes would be the number of contrasting 
segments in any environment. However, we will usually find that in 
some environments there is a greater number of contrasting segments 
than in others. E.g., if we consider the small portion shown here of the 
English chart, we find [pʰ, tʰ, kʰ, b, d, g] in [# — V], against which we
can match only [p, t, k] in [s — V]. If then we were to associate segments
[p, t, k] with [pʰ, tʰ, kʰ] respectively, there would be no segments in the
[s — V] environment to associate with segments [b, d, g]. The phonemes in
which [b, d, g] are included would thus have no representative in the [s — V]
position; they would be said not to occur there. 
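
The counting argument can be made concrete with a short Python sketch. It is a minimal illustration under assumptions: the chart is reduced to the two columns mentioned in the text, the environment labels and the ASCII spellings of the segments (ph, th, kh for the aspirated stops) are hypothetical, and the function names are invented here.

```python
def phonemes_required(chart):
    """Greatest number of contrasting segments in any single environment."""
    return max(len(segments) for segments in chart.values())


def unfilled_environments(chart):
    """Environments in which some phoneme must lack a representative."""
    needed = phonemes_required(chart)
    return {env: needed - len(segments)
            for env, segments in chart.items()
            if len(segments) < needed}


# Small portion of a chart: environment -> segments contrasting there.
chart = {
    "#_V": {"ph", "th", "kh", "b", "d", "g"},
    "s_V": {"p", "t", "k"},
}

print(phonemes_required(chart))
print(unfilled_environments(chart))
```

For this portion of the chart six phonemes suffice, and three of them have no representative in the s_V column, whichever way the segments are matched.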

7.42. Symmetry in Representation of Sounds 

7.421. Identity of representation among segments. Since the
segments are defined as identifying particular sound stretches or sound 
features, and since the phonemes will be defined as groupings of seg- 
ments, it is convenient to have the definitions of the various segments 
within a phoneme simply related to each other. We may try to group 
segments into phonemes in such a way that all the segments of each 
phoneme represent sounds having some feature in common which is 
not represented by any segment of any other phoneme: to use articu-
latory examples, all segments included in /p/ would represent the feature
of lip closure plus complete voicelessness (or fortisness) which would not 
be represented by any other segment. We would then be able to speak of 

¹² The first chart in the Appendix to 7.22 shows at a glance in which of
the listed environments the greatest number of distinct segments are
differentiated. These environments are [f — C], [æ — C], [g — C].

¹³ The number of phonemes may be reduced below the highest number of
segments in any single environment, with the aid of the operations of
chapters 8-10. In some cases the criteria of 7.42-3 may lead to the setting
up of more phonemes, for reasons of symmetry, than the arbitrary group-
ing of complementary segments would require.


the phoneme as representing this common feature, rather than as being 
a class of segments. Relations between phonemes would then represent 
relations between sound features. 

As a special case of this, we try to keep in one phoneme all occurrences
of a segment, in all its environments. If [K] occurs in /u—C/ and in
/s—u/, we would try to have both sets of occurrence in the same pho-
neme, say /k/. However, more powerful reasons will in some cases appear
below which lead to listing a segment in two phonemes, depending on its
environment: e.g. since [k] and [g] contrast in most environments while
only [k] occurs after /s/, if for some reason we preferred to have /g/
rather than /k/ represented after /s/ we might assign [K] in /u—C/ to
/k/, and [K] in /s—u/ to /g/.¹⁴

¹⁴ This is called partial overlapping of the two phonemes in respect to
the segment [K]; cf. Bernard Bloch, Phonemic overlapping, American
Speech 16.278-284 (1941). Complete overlapping, which associates a
segmental element in one environment sometimes with one phoneme and
sometimes with another, is excluded, since it conflicts with the one-one
requirement of phonemics. E.g. if some occurrences of [K] in [s—u]
were /k/ and others were /g/, we could hear the segments [sku] and not
know whether to write them /sku/ or /sgu/. Partial overlapping may also
occur among the parts of segments (which are no longer considered here,
since we are now dealing only with whole segments). Thus [h] in hill is a
member of the phoneme /h/; but the somewhat similar [h] of pill was
included in the segment [pʰ] (5.2). A different case arises with the crux
of chapter 4, fn. 11, in which two distinct sounds were freely varying
repetitions of one another in some utterances, but constituted different,
not mutually substitutable segments in the other utterances: [bgar'] and
[bqar'] 'cow', but only [gr'a'] 'squash' and [qr'a'] 'ringworm'; similarly,
only [Trag] 'he was parched' and [iraq] 'it sank'.

If we can show a difference in phonemic environment (short of listing all
the utterances) between the cases where the two vary freely and those
where they do not, we will say that there is partial overlapping: in the
first environment [g] is a free variant of the /q/ phoneme, and in the
second it is a member of the /g/ phoneme. The /q/ phoneme then will
have free members [g] and [q] in the first environment, and only [q] in
the second (while /g/ will have only [g] in the second, and will not occur
in the first). If we cannot show such an environmental difference, the
best we could say is that in some utterances, which we would have to list,
[g] is a free variant member of the /q/ phoneme while in others it is a
member of the /g/ phoneme. That would be complete overlapping, since
given [bgar] and [gr'a'] we would not know which of these can also be
pronounced with [q] unless we know the list of utterances or of mor-
phemes. In such cases, therefore, we say that [g] is always in /g/ and [q]
in /q/, and that utterances like /bgar/ 'cow' and /bqar/ 'cow' are not
phonemically repetitions of each other (as nick and niche are not), but
are different utterances with similar or identical meanings; this will
require us to go back and correct our result in 4.21 where [bgar] and [bqar]
may have been taken as repetitions of each other.
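
The difference between partial and complete overlapping can be stated as a check on the assignment of segments to phonemes: partial overlapping keeps the assignment a function of segment plus environment, while complete overlapping does not. The following is a minimal sketch under that reading; the function name `overlapping` and the triple format are assumptions made for illustration.

```python
from collections import defaultdict

def overlapping(assignments):
    """Classify each segment as 'none', 'partial', or 'complete' overlapping.

    `assignments` is a list of (segment, environment, phoneme) triples.
    """
    by_pair = defaultdict(set)     # (segment, environment) -> phonemes
    by_segment = defaultdict(set)  # segment -> phonemes
    for segment, env, phoneme in assignments:
        by_pair[(segment, env)].add(phoneme)
        by_segment[segment].add(phoneme)

    result = {}
    for segment, phonemes in by_segment.items():
        if any(len(p) > 1 for (s, _), p in by_pair.items() if s == segment):
            result[segment] = "complete"  # one environment, two phonemes
        elif len(phonemes) > 1:
            result[segment] = "partial"   # two phonemes, but in different environments
        else:
            result[segment] = "none"
    return result

# [K] assigned to /k/ in /u—C/ and to /g/ in /s—u/: partial overlapping.
partial_case = [("K", "u—C", "k"), ("K", "s—u", "g")]
# [K] in /s—u/ sometimes /k/ and sometimes /g/: complete overlapping.
complete_case = [("K", "s—u", "k"), ("K", "s—u", "g")]

print(overlapping(partial_case))   # {'K': 'partial'}
print(overlapping(complete_case))  # {'K': 'complete'}
```

Only the second case violates the one-one requirement, since hearing [K] in /s—u/ would no longer tell us which phoneme to write.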


7.422. Identity of inter-segmental relation among phonemes.
It is also convenient to have the relation among segment definitions
within one phoneme identical with the relation in other phonemes. This
requires that the segments be grouped into phonemes in such a way that
several phonemes have correspondingly differing allophones (i.e. seg-
ment members) in corresponding environments. E.g. English [p, t, k] all
occur in /s—V/, as in stone; [pʰ, tʰ, kʰ] all occur in /#—V/ as in tone.
We could have grouped [p] and [tʰ] together, since they are complemen-
tary. But the above criterion directs us (barring other relevant relations)
to group [p] with [pʰ] into /p/, and similarly for /t/, /k/. For if we do
so, we can say that the /#—V/ member of all these phonemes is virtually
identical with the /s—V/ member except that [ʰ] is added; such a simple
general statement would not have been possible if we had grouped the
segments differently.¹⁵

More generally: if segments a and b both occur in environment
W—X (but not in Y—Z), and if c and d occur in Y—Z (but not in
W—X), and if the difference between a and c is identical, in respect to
some criterion, with the difference between b and d, then we group a and c
(rather than a and d) into one phoneme, and b and d into another. We
can rename c as a' and d as b'.


            W—X    Y—Z
    a        √
    b        √
    c               √
    d               √

[a] + [c] = /A/, and
[b] + [d] = /B/, if,
in respect to some criterion,
a:c = b:d
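
The proportion a:c = b:d can be tried out mechanically once each segment is given a description. The sketch below assumes a toy feature description of the English stops (the feature names are illustrative, not the book's) and pairs the segments of two environments so that every pair shows one and the same difference.

```python
# Illustrative feature sets; the feature names are assumptions for this sketch.
features = {
    "pʰ": {"labial", "stop", "voiceless", "aspirated"},
    "tʰ": {"alveolar", "stop", "voiceless", "aspirated"},
    "kʰ": {"velar", "stop", "voiceless", "aspirated"},
    "p": {"labial", "stop", "voiceless"},
    "t": {"alveolar", "stop", "voiceless"},
    "k": {"velar", "stop", "voiceless"},
}

def difference(x, y):
    """The difference x:y as (features only in x, features only in y)."""
    return (frozenset(features[x] - features[y]),
            frozenset(features[y] - features[x]))

def group_by_proportion(first_env, second_env):
    """Pair segments across two environments so that all pairs show the same difference."""
    candidates = {difference(a, c) for a in first_env for c in second_env}
    for candidate in candidates:
        pairs = [(a, c) for a in first_env for c in second_env
                 if difference(a, c) == candidate]
        # Accept the candidate only if it pairs every segment exactly once.
        if (sorted(a for a, _ in pairs) == sorted(first_env)
                and sorted(c for _, c in pairs) == sorted(second_env)):
            return sorted(pairs)
    return None

print(group_by_proportion(["pʰ", "tʰ", "kʰ"], ["p", "t", "k"]))
# [('kʰ', 'k'), ('pʰ', 'p'), ('tʰ', 't')]
```

With these descriptions the only difference that pairs off all six segments is aspiration, which gives the [p]-[pʰ] grouping. A different choice of features could license other groupings, which is why the criterion to be used has to be stated.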

¹⁵ Symmetrical statements can often be made for several alternative
arrangements of segments. For instance we can group [p] with [tʰ], [t]
with [kʰ], [k] with [pʰ] and say that the [#—V] member of each phoneme
involves aspiration plus a shift of the point of closure one place back or
two places forward ("place" being defined in terms of the tongue-palate
contact positions recognized for the other phonemes). However, sim-
plicity of statement, as well as phonetic similarity, decide in favor of the
[p]-[pʰ] grouping.


7.423. Relative to complete phoneme stock. The criteria of 7.421-2
do not determine around which sound features we should group the
segments, and with which difference between a and c we should match
an identical difference between b and d. If we select the articulatory
feature of aspiration, we would associate the [t] of /s—V/ with the [d]
of /#—V/. Alternatively, if we select fortisness or complete voiceless-
ness, we would associate the [t] of /s—V/ with the [tʰ] of /#—V/.
Similarly, if we select spirantization as the articulatory difference be-
tween two complementary segments a and c, then the difference between
[p] (spin) and [f] or [θ] (fin, thin) would be identical with the difference
between [t] (stint) and [f] or [θ]: we could satisfy this criterion by taking
[p] of /s—V/ with [θ] of /#—V/ into one phoneme, and [t] of /s—V/
with [f] of /#—V/ into another.¹⁵ᵃ Alternatively, if we selected difference
of aspiration plus remaining within one out of the three (front, mid, or
back) tongue-palate contact areas as the relation between a and c, then
the relation between [p] of /s—V/ and [pʰ] of /#—V/ would be identi-
cal with the relation between [t] and [tʰ], or between [k] and [kʰ]: we
would put [p] and [pʰ] into one phoneme, and so on.

If the objective is a minimal stock of phonemes, the definition of each
of which is to be as simple as our other criteria permit,¹⁶ it follows that
the selection of the common features or comparable differences should
be governed by the generality of these features and differences among all
the segmental elements of the language. If the voiced-voiceless or lenis-
fortis difference obtains among many more segments than does the
aspirated-unaspirated difference,¹⁷ we will usually obtain a more con-
venient set of phonemes by basing our grouping on the former. Similarly,
if the difference among three areas of tongue-palate contact obtains 
within a large or as yet undifferentiated class of segments, our grouping of 
the segments on the basis of this difference will help in devising a more 
simply-defined stock of phonemes, since the one statement of difference 
among segment members will then apply to many phonemes. 

¹⁵ᵃ We disregard here the occurrence of initial [sf] in a very few words,
e.g. sphere.

¹⁶ The criterion of environmental symmetry may in some cases lead to
a grouping of segments in conflict with the immediate considerations of
phonetic symmetry.

¹⁷ Or if it obtains throughout a class of segmental elements which in-
cludes all those segments unaffected by the other differences hitherto
considered. For example, the voiced-voiceless or lenis-fortis difference can
also be shown in [z]—[s], [v]—[f], etc.


Instead of independent statements about the membership of each
phoneme we would make a single statement about the membership of
several phonemes. We can say: the three English voiceless stop pho-
nemes have aspirated (but otherwise equivalent) members after #, and
unaspirated members after /s/.

The difference among members of a phoneme, which we wish to find in
other phonemic groupings too, is in general a relative difference in the
represented speech feature. Thus when [tʰ] and [t] are put into one pho-
neme /t/, there is not an absolute degree of aspiration which marks the
post-# member of /t/; but the post-# member will be relatively more
aspirated than the post-/s/ member.

In general, the bases upon which to group segments into phonemes 
are therefore determinable only in relation to the whole stock of seg- 
mental elements. We can discover which groupings will yield the most 
simply defined phonemes by testing the differentiation, upon which we 
propose to assign particular segments, throughout all the segments.¹⁸
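
Testing a proposed differentiation "throughout all the segments" can be pictured as counting how many pairs of segments it separates. The sketch below assumes a small illustrative inventory and feature description (neither is taken from the book) and compares the generality of the voicing difference with that of the aspiration difference.

```python
from itertools import combinations

# Illustrative inventory; segments and feature names are assumptions for this sketch.
segments = {
    "pʰ": {"labial", "stop", "voiceless", "aspirated"},
    "p": {"labial", "stop", "voiceless"},
    "b": {"labial", "stop", "voiced"},
    "tʰ": {"alveolar", "stop", "voiceless", "aspirated"},
    "t": {"alveolar", "stop", "voiceless"},
    "d": {"alveolar", "stop", "voiced"},
    "f": {"labial", "spirant", "voiceless"},
    "v": {"labial", "spirant", "voiced"},
    "s": {"alveolar", "spirant", "voiceless"},
    "z": {"alveolar", "spirant", "voiced"},
}

def pairs_differing_by(difference):
    """Segment pairs whose descriptions differ in exactly the given features."""
    return [(a, b) for a, b in combinations(segments, 2)
            if segments[a] ^ segments[b] == difference]

voicing = pairs_differing_by({"voiced", "voiceless"})
aspiration = pairs_differing_by({"aspirated"})

print(len(voicing), len(aspiration))  # 4 2
```

In this toy inventory the voicing difference obtains among twice as many pairs as the aspiration difference, so a grouping based on voicing lets one statement of difference serve more phonemes.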

7.43. Symmetry of Environment 

Since the phonemes are to be defined not merely as consisting of par-
ticular segments, but as consisting of particular segments each in a par-
ticular environment,¹⁹ it will be convenient to group the segments in such
a way that several phonemes, especially such as have similarities of
phonetic symmetry or may otherwise be grouped together, should have
roughly identical total environments.²⁰ Even if the member segments of
one phoneme do not severally have environments identical with those of

¹⁸ The considerations stated in 7.42 correspond to the differently stated
studies of inter-phoneme relations made by N. Trubetzkoy, R. Jakobson,
and the many linguists associated with the work of the Cercle Linguistique
de Prague. Cf. N. Trubetzkoy, Grundzüge der Phonologie (Travaux du
Cercle Linguistique de Prague 7, 1939), summarized in Lang. 17.345-349
(1941); articles on phonemics in other issues of the Travaux. Cf. chapter
10, fn. 48 below.

¹⁹ When we say that the phoneme /p/, which includes [pʰ] and [p],
occurs after /s/, as well as in other positions, we mean that only a par-
ticular member [p] of the phoneme occurs in the /s—/ position.

²⁰ Roughly rather than exactly, because if long enough environments
are taken, each phoneme will be found to occur in only some environ-
ments and not in others (the analysis of 12.1 is based on this); different
phonemes will fail to occur in arbitrarily different ones of these long en-
vironments. We say 'arbitrarily,' because the differences among these
long environments are irrelevant for present purposes; in 12.1 these
differences are taken into account.



the segments of another phoneme, the sum of all environments of all the
segments of the first could equal the sum of the second. E.g., if we avoid
the more delicate distinctions, we can record the following segments in
English:²¹ [tʰ] in the environment /#—/ (i.e. after silence); [t] in /s—,
—C/, and often /—#/; [t'] (with slight aspiration) in /C—V/ (C in-
dicating consonants except /s/), in /V—'V/, and often in /—#/;
[r'] in /'V—V/. We also record [pʰ] in /#—/; [p] in /s—, —C, —#/;
[p'] in /C—V, —#, V—V/. If we combine all the [t]s (including [r']) into
/t/ and all the [p]s into /p/, we will find that the distribution of /t/ is
identical with that of /p/: each occurs in /#—, —#, C—V, —C, V—V/.
This result obtains even though their several member allophones were
not identical in distribution, the environments of [t'] and [r'] together
being equalled by those of [p'] alone.²²

More generally: if segment a occurs in environment X—, and b in
Y— and in Z—; and if segment e occurs in X— and in Y—, while f oc-
curs in Z—; we group a and b into one phoneme, say /A/, and e and f into
another, say /E/. The result is that /A/ and /E/ have identical
distributions: each of the two phonemes occurs (is represented by some
member) in X—, Y—, Z—.














            X—     Y—     Z—
    a        √
    b               √      √
    e        √      √
    f                      √

[a] + [b] = /A/; [e] + [f] = /E/
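
The English example of /t/ and /p/ can be checked by taking the union of the environments of each phoneme's members. This is a sketch under two assumptions: the environments are simplified to the ones listed in the text, and the /V—V/ given for [p'] is expanded into its two stress variants so that it can be compared with [t'] and [r'].

```python
# Environments recorded for each segment (from the English example above).
# Assumption: the text's /V—V/ for [p'] is split into "V—'V" and "'V—V".
environments = {
    "tʰ": {"#—"},
    "t": {"s—", "—C", "—#"},
    "t'": {"C—V", "V—'V", "—#"},
    "r'": {"'V—V"},
    "pʰ": {"#—"},
    "p": {"s—", "—C", "—#"},
    "p'": {"C—V", "—#", "V—'V", "'V—V"},
}

def distribution(members):
    """Total environment of a phoneme: the union of its members' environments."""
    return set().union(*(environments[m] for m in members))

t_dist = distribution(["tʰ", "t", "t'", "r'"])
p_dist = distribution(["pʰ", "p", "p'"])

print(t_dist == p_dist)                                             # True
print(environments["t'"] | environments["r'"] == environments["p'"])  # True
```

The member segments do not match one for one, yet the two phonemes come out with the same total distribution, which is the symmetry the grouping aims at.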

²¹ For a more detailed statement of the members of /t/ see George L.
Trager, The phoneme 't': a study in theory and method, American
Speech 17.144-148 (1942).

²² A much more complicated situation, which can, however, be similar-
ly treated, occurs if we compare the environments of the various seg-
mental members of /d/ with those of /t/. Note that this criterion can
not decide for us whether the [t] which occurs in /s—/ should be in-
cluded in the /t/ phoneme or in /d/. Since [t] is the only segment occur-
ring in /s—/ which could be assigned to either one of these phonemes,
whichever we assign it to will leave the other phoneme without a member
in the /s—/ position: if stay is analyzed /stey/, there is no segment se-
quence which could be analyzed /sdey/.


As a corollary of this, we try to avoid having a phoneme limited to an
environment, or to a type of environment, to which no other phoneme is
limited. If we have not yet decided whether to consider [pʰ, tʰ, kʰ] as
members of /p, t, k/ or of the sequence /ph, th, kh/ respectively, we
note that /h/ does not occur after any other consonants (except after
certain segments which we will analyze as consonant plus open juncture;
in any case /h/ does not occur after other initial consonants). Rather
than say that /h/ occurs after no /C/ except /p, t, k/, we include the
[ʰ] in /p, t, k/ (in /#—/) and say that /h/ occurs after no /C/. Saying
that /h/ never occurs in /C—/ (except for /C-—/, where /-/ repre-
sents open juncture) still leaves /h/ with a unique distributional limita-
tion, but at least the class /C/ appears elsewhere as limiting phonemic
environments in English: e.g. /h/ never occurs before /C/ (including
/p, t, k/). However, to say that /h/ is lacking only after /C'/, where
/C'/ represents all consonants other than /p, t, k/, is more out of the
way, since nowhere else in English do we have a phonemic environment
limited by 'all consonants other than /p, t, k/'. We would therefore
prefer to put [pʰ] into /p/, etc.

This criterion may be used in complicated cases, e.g. ones involving
overlapping. Thus in some dialects the alveolar flap consonant of writer
is identical with that of rider. The preceding vowel qualities, however,
differ, so that we have, in terms of segments, [ræyr'a] and [rayr'a]. Be-
fore all segments other than [r'] the [æy] and [ay] are complementary:
[æy] before voiceless consonants, [ay] before voiced segments, as in [fæyt]
fight, [pæynt] pint, [maynd] mind. We have here two distributional ir-
regularities. First, [æy] occurs only before voiceless sounds and [r'],
while [ay] occurs only before voiced sounds and [r'].²³ Nowhere else in
English do we have phonemes with just such a distribution, nor is it
elegant to have two phonemes which are complementary through so
much of their distribution. Second, if we include [r'] in /t/, then /t/ will
have general distribution, but /d/ will not occur in /'V—V/. Our alter-
native, following the criterion above, is to phonemicize the whole se-
quence [æyr'] as /ayt/, and [ayr'] as /ayd/: /raytar/ writer, /raydar/

²³ We can avoid overlapping by placing [æy] in fight and [ay] in mind
into different phonemes. Alternatively we can have /ay/ (with [ay] and
[æy] as its members) occurring everywhere, its member before [r'] being
[ay]. We would then have to define a new phoneme /æy/ occurring only
before [r'] where alone it contrasts with /ay/. In the latter case we would
write /fayt/ fight, /maynd/ mind, /rayr'ar/ rider, but /ræyr'ar/ writer.


rider.²⁴ The segment [r'] is then a member of /t/ when it occurs after
[æy], and of /d/ when after [ay]; [æy] is the member of /ay/ occurring
before voiceless phonemes. The distribution of /ay/ is now quite like
that of /oy/, etc., and the distribution of /t, d/ like that of /p, b/, etc.²⁵

Decisions involving whole groups of phonemes can best be made with 
the aid of this criterion. Such is the question of how to phonemicize long 
vowels — whether to consider them sequences of like vowels /ee/, or 
vowel plus length phoneme /e-/, or vowel plus some other phoneme 
/ey, e''/. In all these cases the problem is to what phoneme or phonemes 
we should assign the second mora of the long vowels.²⁶

In all cases of associating segments on the basis of environmental 
symmetry, as in associating them by phonetic symmetry, the final de- 
cision rests with the way the grouping in question affects the whole stock 
of phonemes. Assigning a segment, in some environment, to a particular 
phoneme not only affects the membership and environmental range of 
that phoneme, and its similarity in these respects to other phonemes, but 
also prevents any other phonemes from having that segment in that en- 

²⁴ As in the case of chapters 8 and 9, the method used in this solution
is the assigning of a sequence of segments to a sequence of phonemes,
rather than a single segment to a single phoneme.

²⁵ If the other vowels are not comparably distinguishable before [r'],
i.e. if latter, ladder are homonymous, we will have to say that /d/ does
indeed not occur in /'V—V/ when the first /V/ is one of these others, in
our case a. Even so, the simplification of the [ay]-[æy] distribution into
one phoneme remains.

²⁶ A somewhat similar consideration can be added to the reasons for as-
signing [pʰ] to /p/. If we phonemicize [pʰ] as /ph/, etc., we would have
/#ph/ but no /#pV/, etc., whereas with the voiced stops we have
/#bV/ but no /#bh/, etc. This would make the distribution of /p, t, k/
quite different from that of /b, d, g/; whereas if we assign [pʰ] to /p/,
and so on, the distribution of /p, t, k/ will be identical to that of /b, d, g/
except that the latter do not occur in /s—/.

²⁷ In some languages we will find a number of segments which differ
from all the others (though they may be similar among themselves) in
the kinds of environmental limitations they have. These segments will in
many cases occur only in exclamation morphemes, animal calls, words
borrowed from foreign languages, and the like. We may create a separate
economy of phonemic elements out of these segments, and note their
limitation to a small or otherwise identifiable group of morphemes.


7.5. Result: Classes of Complementary Segments 

The elements of our utterance are now phonemes, each being a class of
complementary segments-per-environment. We henceforth write our ut-
terances with phonemes, after listing the segment members of each, and
thereafter disregard the differences among the members.

The occurrence of a phoneme represents the occurrence of some mem-
ber of its class of segments, each member being environmentally defined.
Whenever the phoneme appears we can always tell from the environ-
ment which segment member of the phoneme would occur in that posi-
tion (i.e. we can always pronounce phonemic writing). Conversely, since
complete overlapping is avoided (fn. 14 above), whenever we are given
a segment in an environment we can always tell in which phoneme it is
included (i.e. we can always write phonemically whatever we hear).
Phonemic writing is therefore a one-one representation of what was set
up in chapter 4 (and 6) as being descriptively relevant (i.e. contrastive,
not substitutable) in speech. Phonemes are more convenient for our pur-
poses than our former segments, since there are fewer of them, and each
has a wider distribution. Our elements now no longer necessarily repre-
sent mutually substitutable segments; only the segments in any one en-
vironment are mutually substitutable (free variants).²⁸
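
The one-one property described above can be illustrated with a pair of lookup tables, one for pronouncing phonemic writing and one for writing what is heard. The membership table below is a simplified subset chosen for illustration, and the function names `pronounce` and `write` are assumptions of this sketch.

```python
# Members of /t/ and /p/ by environment (a simplified, illustrative subset).
members = {
    ("t", "#—V"): "tʰ",
    ("t", "s—V"): "t",
    ("p", "#—V"): "pʰ",
    ("p", "s—V"): "p",
}

def pronounce(phoneme, env):
    """Phoneme plus environment determines the segment member."""
    return members[(phoneme, env)]

# Build the inverse table; this fails if two phonemes share a segment
# in the same environment (complete overlapping).
inverse = {}
for (phoneme, env), segment in members.items():
    key = (segment, env)
    if key in inverse:
        raise ValueError(f"complete overlapping at {key}")
    inverse[key] = phoneme

def write(segment, env):
    """Segment plus environment determines the phoneme."""
    return inverse[(segment, env)]

print(pronounce("t", "s—V"))  # t
print(write("tʰ", "#—V"))     # t
```

Going from phoneme to segment and back always returns the starting phoneme, which is what allows the differences among members to be disregarded once they are listed.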

Appendix to 7.21: Tabulating the Environments of a Segment 

Our work is simplified if we record the environments of each segment 
in a way that permits immediate inspection. 

²⁸ It should be clear that while the method of 7.3 is essential to what
are called phonemes, the criteria of 7.4 are not essential 'rules' for 
phonemicization, nor do they determine what a phoneme is. At a time 
when phonemic operations were less frequently and less explicitly carried 
out, there was discussion as to what had to be done in order to arrive at 
'the phonemes' and how one could discover 'the phonemes' of a language. 
Today we can say that any grouping of complementary segments may 
be called phonemic. As phonemic problems in various languages came to 
be worked out, and possibilities of alternative analysis were revealed, it 
became clear that the ultimate elements of the phonology of a language, 
upon which all linguists analyzing that language could be expected to 
agree, were the distinct (contrasting) segments (positional variants, or
allophones) rather than the phonemes. The phonemes resulted from a 
classification of complementary segmental elements; and this could be 
carried out in various ways. For a particular language, one phonemic ar- 
rangement may be more convenient, in terms of particular criteria, than 
other arrangements. The linguistic requisite is not that a particular ar- 
rangement be presented, but that the criteria which determine the ar- 
rangement be explicit. 


For each segment, e.g. [e] or [r'] (alveolar flap), we can arrange a table 
of the following type: 













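[Table: one row per utterance, giving the segments that precede and follow the segment in question; the rows shown were for setting and boating.]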



On either side of the segment in question we place the segments which 
occur around it, going tentatively as far, say, as the first vowel on either 
side, or up to the end of the utterance. For convenience, we start with 
short utterances, but vary their types, so as to notice if the segment is 
limited to particular types of utterance in any respect. We include in the 
environment any contours which seem to be regularly present in all or 
certain types of the environment of our segment, and any features such 
as the preceding loud stress ' and following weak stress in the case of 
[r']. When our material does not seem decisive, i.e. does not show that 
some particular features are always present in the environment to the 
exclusion of certain other features, we can extend the recorded environ- 
ment in the table to include a larger stretch of each utterance. 

Appendix to 7.22: Tabulating Environments by Segments 

A more tiresome, but in some cases rewarding, method for discovering
what sets of mutually complementary segments can be formed, is to ar-
range the segments in a chart, so that we can tell by inspection which
segments are complementary to which other. We list the segments along
one axis and the environments along the other, and check the environ-
ments in which each segment occurs.²⁹

The chart can readily become unwieldy because the environments of
one segment often cut across those of another. E.g. since [r'] occurs only
after ['V] and before [weak stressed V], [jl, [qi], or [i'], while [k] occurs next
to front vowels in general, and [K] next to back vowels, we would have
to list the environments shown in the lower chart on page 74.³⁰

²⁹ As in the case of k, G indicates back g, G indicates front g. C- any
consonant segment, and C^ certain consonants not including t or r or j;.
For clarity, this chart disregards certain distinctions even within the few
environments listed.

³⁰ I.e. we have to break the total environments of each segment into
their largest common denominator. C indicates j, m, j". C" indicates con-
sonant segments other than C" or r.








[Two charts: segments listed along one axis and environments along the other, with a check mark in each environment where a segment occurs. The entries are not recoverable.]


To avoid such confusing proliferation of environments³¹ we may have to
leave some segments out of the chart, adding them in an appended list, 
but keeping them in mind whenever we use the chart. 

Once this chart is completed, we can tell at a glance, for any environ- 
ment, what segments occur in it. For any segment, we can tell at a glance 
what segments are complementary to it : e.g. in the first chart, [t] is com- 
plementary to [t], but [K], [k], and [k] each contrast with (are distinct 
from) [t].³²
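
The chart and the two readings taken from it can be sketched as follows. The observations are a hypothetical handful of (segment, environment) pairs, not the book's chart, and `complementary` is an illustrative helper name.

```python
from itertools import combinations

# Hypothetical observations: (segment, environment) pairs; illustrative only.
observations = [
    ("tʰ", "#—V"), ("kʰ", "#—V"),
    ("t", "s—V"), ("k", "s—V"),
]

# The chart: for each segment, the set of environments checked for it.
chart = {}
for segment, env in observations:
    chart.setdefault(segment, set()).add(env)

def complementary(a, b):
    """Two segments are complementary if they share no environment."""
    return chart[a].isdisjoint(chart[b])

for a, b in combinations(sorted(chart), 2):
    print(a, b, "complementary" if complementary(a, b) else "contrast")
```

Reading the chart by column gives the segments that occur in an environment; reading it by row and comparing rows gives the complementary sets. Two segments that share even one column contrast in that environment.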

Appendix to 7.3: Phonetic and Phonemic Distinctions 

Note that it does not matter if more or fewer distinctions are recognized 
at the start between sounds which are not in the same environment. It 
would have been possible to recognize as phonetically different (e.g. as 
to length) the various [ow]s in bow, bowl, bone, bode, both, boat, boatman, 
sailboat, etc. However, in 7.3 they would all have been grouped into one 
phoneme, since they are mutually complementary. We would have ob- 
tained the same phonemic result if we had phonetically recognized only 
one long [ow:] in bow, one medium [ow·] in bowl, bone, bode, and one short
[ow] in both, boat, boatman, sailboat, or if we had failed to notice any
difference among all these [ow]s. 

What does matter is that we recognize all the regular phonetic distinc- 
tions which occur in each particular environment, such as the three 
lengths or transitions of [ay] in minus, slyness, sly Nestorian.³³ Suppose
we had failed to distinguish between [N] (= /nd/) of binding and [n] of
tanner, since the two are not distinguished (do not contrast) in either of
those exact environments; and suppose we had written both of them
as [n], thinking that they could be substituted for each other (as the [ow]
of bowl and of bode can be, without noticeable distortion and without
obtaining a different utterance thereby). Then at some point we might
come upon an environment in which [N] and [n] are distinguishable pho-
nemically: two whole utterances where the two segments have the same

³¹ The number of environments distinguished in this second chart
would be even greater if we choose to recognize that [k] after [e] differs
slightly from [k] after [i] and so on.

³² We say that any two distinct (non-equivalent) segments which have
at least one environment in common, i.e. which are not complementary
throughout all their environments, contrast. More exactly, any two dis-
tinct segments which occur in the same environment (in the same col-
umn) contrast in that environment. They do not contrast in another en-
vironment in which only one (or neither) of them occurs.

³³ Cf. chapter 8, fn. 17 below.


environment (We're banding it. We're banning it.). If such pairs did not
occur, i.e. if all utterances which differed as to [N, n] also differed in some
of their other segments, we might fail ever to notice the [N/n] distinction;
but this would, by definition, not affect our ability to identify utterances.³⁴
Note that our method does not depend upon pairs (4.23) to yield the pho-
nemic distinctions. The phonemes are formed from the regular differences
noticed in each environment. Paired utterances or paired parts of utter-
ances are needed only to reveal linguistic non-equivalences which we had
failed to notice.³⁵

Appendix to 7.4: The Criterion of Morphemic Identity

No knowledge of the morphemes of the language was assumed in chap-
ter 7, since the morphemes will later be defined in terms of the phonemes.
Frequently, when we have to choose which of two segments to include
in a phoneme, it happens that the choice of one of them would make for
much simpler phonemic composition of morphemes than would the choice
of the other. E.g. [t] and [p] are each complementary to [tʰ]; which shall
we group with [tʰ]? If we associate [tʰ] with [p] in one phoneme /T/,
and [pʰ] with [t] in another /P/, we would have /Teyk/ for take but
/misPeyk/ for mistake, /Pazes/ for possess, /disTazes/ for dispossess.
This would mean that later, when we set up morphemes, we would have
/Teyk/ and /Peyk/ as two forms of one morpheme, the latter occurring
after /s/. It is clearly preferable to group the segments [tʰ] and [t] to-
gether into /t/, so that there should be a single morpheme /teyk/ hav-
ing the same form after both # and /s/; this makes for a simpler de-
scription of the morpheme take.
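
The comparison of the two groupings can be sketched by counting the phonemic forms each one gives to a tentatively identified morpheme. The segment lists and the dictionary names below are assumptions made for illustration; only the take example itself comes from the text.

```python
# Segmental shapes of one tentatively identified morpheme in two environments:
# after # (I take) and after /s/ (mistake).
occurrences = {
    "take": [["tʰ", "e", "y", "k"], ["t", "e", "y", "k"]],
}

# Grouping 1: [tʰ] with [p] into /T/, and [pʰ] with [t] into /P/.
grouping_1 = {"tʰ": "T", "p": "T", "pʰ": "P", "t": "P"}
# Grouping 2: [tʰ] with [t] into /t/, and [pʰ] with [p] into /p/.
grouping_2 = {"tʰ": "t", "t": "t", "pʰ": "p", "p": "p"}

def phonemic_forms(morpheme, grouping):
    """Distinct phonemic spellings of a morpheme under a given grouping."""
    return {"".join(grouping.get(seg, seg) for seg in form)
            for form in occurrences[morpheme]}

print(sorted(phonemic_forms("take", grouping_1)))  # ['Peyk', 'Teyk']
print(sorted(phonemic_forms("take", grouping_2)))  # ['teyk']
```

The second grouping leaves take with a single phonemic form, which is the sense in which it makes for a simpler description. Since the morphemes are only guessed at this stage, such a count can be no more than tentative.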

³⁴ This is so because while we get our original differentiations by seek-
ing identical parts, we keep only those differentiations which distinguish
different utterances. Our tests demand only that different utterances, i.e.
sequences, be distinguished somewhere by the elements we set up (cf.
4.3). It does not matter if the utterances differ in, say, two respects (or
two places), if neither of those two differences ever appears elsewhere
regularly as the sole difference between two utterances; for we can then
consider the two differences as constituting only one difference (which
extends over two unit lengths), or disregard one of them, or note both.

³⁵ Other considerations too (such as phonetic or distributional sym-
metry of phonemes) may make us suspect our previous work and look for
distinctions which we might have missed; but only pairs can exactly force
us to do this. E.g., if in Moroccan Arabic we find an emphatic phonemic
counterpart for every dental phoneme except /d/ we might check back
on all our utterances containing d to see if some of our [d] segments
might not actually be emphatic, contrasting phonemically with non-
emphatic /d/.


We can generalize this as follows: suppose we have two utterances,
YA and XB (I take, and It's a mistake.), and we wish tentatively to con-
sider A [tʰeyk] and B [teyk] as two occurrences of one morpheme 'A'
(take). If the only difference between A and B is that A contains segment
a [tʰ] where B contains segment b [t], we simply group a and b into one
phoneme /a/.³⁶ As a result of this, B gets the same phonemic form as A
(both containing /a/), so that the morpheme 'A' now has only one form
instead of two.³⁷

Since a major purpose in determining the phonemes is later to
identify the morphemes, a phonemic arrangement which makes for
simple identifications of morphemes would be a convenient one. However,
since we cannot as yet identify the morphemes, but can only guess at
them, any assignments designed to satisfy this criterion could only be
tentative at this stage. Later, when we identify the morphemes, we may
find that some of those having two phonemic forms could be reduced to
one phonemic form by reassigning their segments to other phonemes. For
instance, if we had assigned the [t] which occurs in /s—/ to /d/ we would
have /disdeyst/ for distaste and /teyst/ for taste.³⁸ We can give one pho-
nemic form to both these occurrences of the morpheme taste by reassign-
ing the [t] of /s—/ to /t/. Having done this, we must go back and
change the phonemic composition of all morphemes which contained
the [t] segments: we had written stay /sdey/, but must now change it
to /stey/.

The criterion of morpheme identity is not necessary for the carrying 
out of the phonemicization operations of chapter 7. Restricting ourselves 
to the information which our previous operations have given us, we can 
group segments into phonemes at the present stage of our analysis. In 
most cases the morphemic considerations will turn out to yield the same 

³⁶ Only, of course, if a and b are complementary throughout. If they
are not, we must accept A and B as distinct utterances until we group
morpheme variants together (chapter 13). E.g. in knifing and knives we
would like to consider knife and knive- as one morpheme, but cannot do
so because segments [f] and [v] contrast elsewhere, as in fat, vat.

³⁷ It is essential that A and B occur in different environments (Y—
and X—), since otherwise a and b could not be complementary in these
utterances: a occurs next to Y but not next to X, b next to X but not
next to Y. Hence when we see morpheme 'A' written with phoneme /a/
we know what segment the phoneme indicates, according as the environ-
ment of 'A' is either Y or X.

³⁸ This example will apply only to those dialects of English in which
/s-d/ does not occur, i.e. where disdain is pronounced /diz'deyn/, etc.


phonemicization as resulted from the purely phonological considerations
of 7 (cf. the example of Swahili in the Appendix to 7-9). In some cases
we may get more phonemes than we would if we used our knowledge of
the morphemes of the language: e.g. we might have /pʰ, tʰ, kʰ/, and
unreleased /p, t, k/ as well as released /p', t', k'/. But the difference
between the phonemic system based on knowledge of the morphemes and
that not based on such knowledge would only be one of convenience'*'^ for
morphemics; as elements of linguistic description and subjects for
further analysis the two sets of phonemes would be equivalent.

^' Convenience primarily in avoiding occurrences of two phonemic
forms for one morpheme.


8.0. Introductory 

This procedure introduces junctures as a factor in phonemicization,
but only, of course, to the extent that this is possible without knowledge
of morphemes.

8.1. Purpose: Eliminating Restrictions on Sets of Phonemes 

We reduce the number of phonemes, and simplify the statement of
restrictions upon the environments in which they occur, by considering
those restrictions of environment which apply to large numbers of
phonemes.

In the first approximation toward phonemes, as we obtain them from 
the operation of chapter 7, we may find many which occur in identically 
limited environments. Thus on the basis of chapter 7 we would have to 
recognize at least two sets of vowels, distinguished chiefly by length and 
type of off-glide.¹ There would be /ay/ of minus and /Ay/ of slyness, /ey/
of playful and /Ey/ of tray-ful, and so on.² Members of the shorter or less
drawled set do not occur at the end of an utterance. Such general limita- 
tions of occurrence affecting one of two parallel sets of tentative pho- 
nemes lead us to ask whether the limitation may not be avoided and the 
two sets somehow made into one. 

8.2. Procedure: Defining Differences between Phoneme Sets 

Two or more parallel sets of tentative phonemes, such as the 
/ay, ey/ and /Ay, Ey/ sets, cannot be combined into one set because they 

¹ This was presented in detail in George L. Trager and Bernard Bloch,
The syllabic phonemes of English, Lang. 17.225-9 (1941). Juncture
indicators (without the name) occur in Edward Sapir and Morris Swadesh,
Nootka Texts 237 (1939). Cf. also Z. S. Harris, Linguistic structure of
Hebrew, Jour. Am. Or. Soc. 61.147 (1941). Some of the features of
junctures are discussed under the name Grenzsignale by linguists of the
Prague Circle: see N. Trubetzkoy, Grundzüge der Phonologie, 241-61
(Travaux du Cercle Linguistique de Prague 7, 1939).

² Differences in length between, say, [ay] of minus and the shorter [ay]
of mica are not phonemic, since the environments differ. In the cases
under discussion here the environment following the vowel does not
differ, so that /ay/ and /Ay/ must be considered phonemically distinct.
(It is the environment following the vowel that correlates with vowel 
length in all other English cases. A few complete pairs may also be found, 
where the whole environment is identical, but the two vowel lengths 



represent distinct segments in identical environments. However, they
could be combined if there were a technique for altering the
environment of one of the sets so that its environment should no longer be
identical with that of the other: if every /Ay/ and /Ey/ had some
environmental difference as against every /ay/ and /ey/, then /ay/ would be
complementary to /Ay/ and the two could be put into one phoneme, and so
for /ey/ and /Ey/. Any such alteration would have to be controlled and
reversible; otherwise writing which included this alteration would no
longer be a one-one representation of the descriptively relevant features.³

This alteration is effected by taking the features which distinguish the 
two sets of tentative phonemes, and setting them up as the definition of 
a new phonemic element, called a juncture. That juncture occurs with 
the set which had the features that have now been assigned to the junc- 
ture. Thus if the difference in length, off-glide, and vocalic quality be- 
tween /Ay, Ey/ and /ay, ey/ is now represented by the juncture /-/, 
then the tentative /Ay/ is now replaced by the new /ay-/ which is 
defined as the old tentative /ay/ plus the differences represented by 
/-/. But since these differences are those between /ay/ and /Ay/ it 
follows that the new /ay-/ is equivalent to the old /Ay/. 

There are several advantages to the use of the juncture. First, it is 
possible to replace the original two sets of phonemes by one set, plus the 
juncture which is used whenever the corresponding set would have oc- 
curred. Second, it is possible to include in the juncture not only the 
features of a particular parallel set of our first phonemic approximation, 
but also the phonemic distinctions of other parallel sets of phonemes 
which occur in comparable special positions. Thus the English /-/ can
be used to express not only the phonemic difference between /Ay, Ey/
etc. and /ay, ey/ etc., but also the aspiration of the first /t/ in night-rate
as compared with nitrate.⁴ Third, in addition to serving as indicators of
phonemic differences, the junctures can also serve as indicators of speech 
boundaries (e.g. intermittently present pause). This is possible because 
one of the chief occasions for setting up junctures, as will be seen below, 
is when one set of phonemes occurs at speech boundaries while its parallel 
set does not. 
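
The replacement just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the spellings Ay, Ey and T are the tentative phonemes used in this chapter, while the table JUNCTURE_FORM and the function with_juncture are hypothetical names introduced here.

```python
# Hypothetical table: each tentative phoneme of the special set is rewritten
# as the matching phoneme of the basic set followed by the juncture "-".
JUNCTURE_FORM = {"Ay": "ay-", "Ey": "ey-", "T": "t-"}


def with_juncture(tentative):
    """Rewrite a list of tentative phonemes using one phoneme set plus /-/."""
    return "".join(JUNCTURE_FORM.get(p, p) for p in tentative)


print(with_juncture(["n", "ay", "T", "r", "ey", "t"]))  # night-rate
print(with_juncture(["n", "ay", "t", "r", "ey", "t"]))  # nitrate
```

Because each special phoneme maps to exactly one base-plus-juncture sequence, the rewriting can be undone, which is the sense in which the alteration is controlled and reversible.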

³ The alteration would be phonemic, since the environment of a
phoneme is composed of the phonemes around it.

⁴ Cf. Trager and Bloch, op. cit. 225. In the tentative phonemes of
chapter 7 we would have had to distinguish these two forms as /nayTreyt/
night-rate and /naytreyt/ nitrate. By using the juncture, which had not
hitherto been defined in a way that would affect either /t/ or /T/, we
write /t-/ for /T/, obtaining /nayt-reyt/ and /naytreyt/ respectively.


8.21. Matching Sets of Tentative Phonemes 

The simplest approach to setting up junctures is to watch, in the 
phonemic approximation of chapter 7, for a set of phonemes which never 
occurs at the end (or at the beginning) of an utterance, while a parallel 
set of phonetically somewhat different phonemes occurs both there and 
within the utterance.⁵ E.g. in the sets of tentative phonemes /pʽ, tʽ, kʽ/
and /p', t', k'/, the slightly aspirated /kʽ/ which we hear in market never
occurs in utterance final position, whereas the unreleased or
released-but-not-aspirated /k'/ of What a lark! does occur there. This fact
does not suffice to put these tentative phonemes together into one phoneme,
because they contrast in other positions:⁶ [aym'gowiŋ tu 'markʽəttu'dey.]
(I'm going to market today.) and [aym'gowiŋ tu 'mark'əttu'dey.] (I'm
going to mark it today.). However, because the first set does not occur in
the environment /—#/ (utterance final), we may decide to say that /kʽ/
plus /#/ substitutes for /k'/, /pʽ/ + /#/ substitutes for /p'/, etc.
That is, the tentative /kʽ/ and /k'/ are now members of one phoneme
/k/, [k'] being the member which occurs before #. To get around the fact
that both [kʽ] and [k'] occur in some identical environments within
utterances, as in the examples above, we then extend # so that it is not
only a mark of utterance end but also a 'zero' phoneme which occurs
after /k/, wherever that phoneme is represented by its member [k']
(whether within or at the end of utterances). Then I'm going to mark it
today becomes /aym'gowiŋtu'mark#əttu'dey./, and lark becomes
/lark#/, while market is /markət/. Now [kʽ] no longer contrasts with
[k'] anywhere, since there is always a /#/ after [k'].⁷ Whenever we see
/k#/, we know it represents the segmental element [k'], and when we
hear the sound represented by [k'] we write it with the phonemic
sequence /k#/. Furthermore, we will often find that the points at which
junctures like /#/ are introduced within utterances are also points at
which intermittently present pauses are occasionally made in
pronouncing the utterance.
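
The effect of extending # can be checked mechanically. The following Python sketch is an illustration only, not part of the procedure as stated: the labels k_asp (the slightly aspirated segment of market) and k_unrel (the unreleased segment of lark and mark it), the simplified segment lists, and the function names are all assumptions made for the example. It collects the environments of each segment before and after the juncture is written.

```python
BOUNDARY = "#"  # utterance end, later also written as the zero phoneme


def environments(utterances, segment):
    """Collect the (preceding, following) neighbors of each occurrence."""
    found = set()
    for utterance in utterances:
        padded = [BOUNDARY] + list(utterance) + [BOUNDARY]
        for i in range(1, len(padded) - 1):
            if padded[i] == segment:
                found.add((padded[i - 1], padded[i + 1]))
    return found


def add_juncture(utterance, segment):
    """Write the juncture after every non-final occurrence of segment."""
    out = []
    for i, seg in enumerate(utterance):
        out.append(seg)
        if seg == segment and i < len(utterance) - 1:
            out.append(BOUNDARY)
    return out


# Simplified, hypothetical segment lists for the three forms discussed.
market = ["m", "a", "r", "k_asp", "ə", "t"]
mark_it = ["m", "a", "r", "k_unrel", "ə", "t"]
lark = ["l", "a", "r", "k_unrel"]

before = [market, mark_it, lark]
after = [add_juncture(u, "k_unrel") for u in before]

shared_before = environments(before, "k_asp") & environments(before, "k_unrel")
shared_after = environments(after, "k_asp") & environments(after, "k_unrel")
print(shared_before)  # a shared environment: the two segments contrast
print(shared_after)   # nothing shared: the two segments are complementary
```

Once no environment is shared, the two segments satisfy the condition for membership in one phoneme, and the juncture records which member is meant.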

⁵ Parallel means, in general, that the second set has the same number
of phonemes as the first, and that the differences among the phonemes
of the second set are identical with the differences among the
phonemes of the first.

⁶ In certain American pronunciations.

⁷ In this case our rearrangement is useful because it will later appear
that whenever the segment [k'] occurs there is a morphological boundary
following it (a boundary which also occurs at utterance end), so that /#/
becomes a mark of that boundary. Cf. the Appendix to 8.2.


The same treatment can be accorded to sets of tentative phonemes 
which fail to occur at other recognizable pauses, phonetic breaks, and 
ends of contours in speech, whereas parallel sets do occur there. 

In general, if we have segmental elements (or tentative phonemes)
a', b', c' which do not occur next to utterance end, or do not occur at some
particular type of contour end or pause; and if we can match these with
the parallel a, b, c which occur in these boundary positions (as well as
in other positions, where a', b', c' contrast with them), we may represent
a' as /a-/, b' as /b-/, etc., where /-/ is a new zero phoneme or phonemic
sequence boundary.

8.211. Syllabification features. In many cases there are large
parallel sets of tentative phonemes, in which the corresponding members
of phonemes from each set differ in what are called features of
syllabification. Such are the differences between the second elements of
analysis, a name, and an aim, or the second elements of attack, a tower
and at our. Here, instead of speaking of three phonemically different /n/
elements, or three /t/ elements, and instead of speaking of one set of
elements /n, t/ etc. plus various syllabification rules, we can speak of one
set of elements plus one juncture /-/ which may occur before or after the
element or not at all: analysis /ənælisis/, a name /ə-neym/, an aim
/ən-eym/.

8.22. Replacing Contours by Junctures 

In some cases it is possible to show that a contour need not be indicated 
if a juncture is written, because its presence can be recognized from the 
occurrence of the juncture. This may happen when we take a contour 
which has occurred over whole utterances, and wish to identify it even 
when it occurs over a length which is imbedded in a longer utterance. E.g. 
in Swahili there is loud stress on the penult vowel of every utterance:
wawíli 'two' (speaking of people). But there are also loud stresses
elsewhere in the utterance. Furthermore, corresponding to almost every
stretch (enclosed in an utterance) which contains a penult stress, e.g. the
parts of walikúǰawanawákewawíli 'two women came', we can get an
identical whole utterance (i.e. we can get walikúǰa 'they came' and
wanawáke 'women' as separate utterances).⁸ Therefore we insert a zero
phonemic boundary mark /#/ after every post-stress vowel, and say 

⁸ This fact gives us the assurance that there is a morphological
boundary after the post-stress vowel. Any other hint that a morpheme
boundary occurs at a point which can be operationally related to a
phonetic feature would also do.


that stress occurs automatically (non-phonemically) on the penult vowel
before /#/.⁹ We now write walikuǰa#wanawake#wawili# (or we use
phonemic space instead of #) and know the position of the stress from
the position of the # (which is also the position of intermittently present
pause and of utterance-end silence).
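
That the stress is recoverable from the junctures can be shown with a short Python sketch. It is a hedged illustration: the function name stress_from_junctures is hypothetical, and the Swahili example is spelled here in ordinary orthography (walikuja, wanawake, wawili) rather than in the phonemic symbols of the text.

```python
# Map each plain vowel to its stressed (acute) form.
ACUTE = {"a": "\u00e1", "e": "\u00e9", "i": "\u00ed", "o": "\u00f3", "u": "\u00fa"}


def stress_from_junctures(phonemic):
    """Mark the penult vowel before each # as stressed."""
    stretches = [s for s in phonemic.split("#") if s]
    marked = []
    for stretch in stretches:
        vowel_positions = [i for i, ch in enumerate(stretch) if ch in ACUTE]
        if len(vowel_positions) >= 2:
            i = vowel_positions[-2]  # penult vowel of this stretch
            stretch = stretch[:i] + ACUTE[stretch[i]] + stretch[i + 1:]
        marked.append(stretch)
    return marked


print(stress_from_junctures("walikuja#wanawake#wawili#"))
```

Nothing about stress is stored in the input string; the position of each # is enough to predict it, which is what it means for the stress to be automatic in respect to the juncture.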

Similar treatment can be accorded vowel and consonant harmonies, 
stress and tone contours, phoneme tempo, etc., so long as the range of 
their effect can be shown to be automatic with respect to some points in 
the utterance.¹⁰

⁹ We must make sure that it is actually automatic, i.e. that given the
position of /#/ or other juncture, we can predict the position of the
stress or other phonetic contour.

¹⁰ In going from hearing to writing: The points in the utterance are
marked on the basis of the end-points of the effect which we are treating.
We write a zero phoneme (juncture) at that point, instead of marking
the effect under discussion over the whole stretch between points. And
in going from writing to speaking: we can tell what the effect under
discussion is, and over what stretch it applies, by seeing what junctures are
written, and at what points they are placed.

8.221. Periodicities of segmental features. This applies not only
to the contours of chapter 6, but also to any other feature or sequence of
phonemes which is restricted in respect to simply divisible portions of
the utterance length. Thus Moroccan Arabic sfənž 'doughnut', bərd
'wind', ktəbt 'I wrote', xədma 'work' all have the pronunciation of the
string of consonants phonetically interrupted (by consonant release
plus /ə/) at every second consonant counting from the last. In contrast
žbəl 'hill', brəd 'cold', səwwəl 'he asked', ktəb 'he wrote', kətbət 'she
wrote', all have the pronunciation of the consonant sequence phonetically
interrupted before every second consonant counting from one after the
last (i.e., counting from the juncture after the last consonant). These
two types of short Moroccan utterances could be distinguished by the use
of two junctures, say - at the end of the former and = at the end of the
latter: /sfnž-, brd-, ktbt-/ for sfənž, bərd, ktəbt, and /žbl=, brd=,
swwl=, ktbt=/ for žbəl, brəd, səwwəl, kətbət. We now consider longer
utterances in which there may at first appear no simple regularity in
respect to /ə/: lbərdbrəd 'the wind is cold'. It is possible to divide such an
utterance into successive sections, within each of which the distribution
of the /ə/ is regular, either of the /-/ type or of the /=/ type. The
utterance would then be written: /lbrd-brd=/. It is not necessary to
write /ə/, since the occurrence of /ə/ is now automatic in respect to the
two junctures: The [ə] is no longer phonemic but is included in the
definition of the junctures, which also serve to indicate points of
intermittently present pause.¹¹
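
As a check on the claim that the vowel is automatic once the two junctures are written, the counting rule can be sketched in Python. This is an illustration under assumptions, not a statement of the author's method: the function names are hypothetical, each character is taken to be one consonant, sections containing full vowels (as in the word for 'work') are not handled, and it is assumed that no vowel is inserted before the first consonant of a section, as the forms cited above suggest.

```python
import re

SCHWA = "\u0259"  # the automatic vowel


def insert_schwa(consonants, juncture):
    """Insert the automatic vowel into one section of consonants.

    With "-" the count starts at the last consonant; with "=" it starts
    one place later, at the juncture itself. The vowel is placed before
    every second consonant in that count, except section-initially
    (an assumption made for this sketch).
    """
    offset = 0 if juncture == "-" else 1
    n = len(consonants)
    out = []
    for i, c in enumerate(consonants):
        count_from_end = (n - i) + offset
        if count_from_end % 2 == 0 and i > 0:
            out.append(SCHWA)
        out.append(c)
    return "".join(out)


def pronounce(phonemic):
    """Expand a juncture-marked consonant string section by section."""
    sections = re.findall(r"([^\-=]+)([\-=])", phonemic)
    return "".join(insert_schwa(cons, junc) for cons, junc in sections)


print(pronounce("ktbt-"))      # 'I wrote'
print(pronounce("ktbt="))      # 'she wrote'
print(pronounce("lbrd-brd="))  # 'the wind is cold'
```

The same consonant string yields two different pronunciations depending only on which juncture closes it, so the vowel need not be written.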


distribution of the contour features (including segmental features like the
Moroccan [ə] as well as suprasegmental features like tone) determines
whether it is possible to replace the contour feature by the juncture
without further information. Thus in Swahili, where all utterances end in
vowels, we can uniquely replace any length such as V C V C V C V (or
VCCV) by V C V # C V C V # (or VCCV#). But if some utterances
ended in a vowel and others in a consonant, it would not be determined
whether a sequence such as V C V C V C V should be replaced by
V C V # C V C V # (as above) or by V C V C # V C V #.¹² In such
cases, the contour is still automatic in respect to the juncture, but the 
juncture is not automatic in respect to the contour. In order to replace 
the contour by a juncture we would need additional information, such 
as where the points of intermittently present pause in the utterance are. 
Obtaining such information will in many cases involve the morphological 
techniques of chapter 12. 
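
For the Swahili case, where the replacement is unique, the step from the heard contour to the juncture can be sketched as follows. The sketch is illustrative only: the function name junctures_from_stress is hypothetical, the example is in ordinary Swahili spelling with the stress shown by an acute accent, and the rule relies on the stated fact that all utterances end in vowels.

```python
import unicodedata

# Stressed vowels and their plain counterparts.
STRESSED = {"\u00e1": "a", "\u00e9": "e", "\u00ed": "i", "\u00f3": "o", "\u00fa": "u"}
PLAIN_VOWELS = set("aeiou")


def junctures_from_stress(heard):
    """Replace penult stress by # written after the post-stress vowel."""
    heard = unicodedata.normalize("NFC", heard)
    out = []
    after_stress = False
    for ch in heard:
        if ch in STRESSED:
            out.append(STRESSED[ch])  # stress itself is no longer written
            after_stress = True
        elif ch in PLAIN_VOWELS and after_stress:
            out.append(ch + "#")      # the post-stress vowel ends the stretch
            after_stress = False
        else:
            out.append(ch)
    return "".join(out)


print(junctures_from_stress("walik\u00fajawanaw\u00e1kewaw\u00edli"))
```

If some utterances could end in a consonant, the same heard pattern would allow more than one placement of #, and a function of this kind could not be written without the additional information mentioned above.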

8.223. Partial dependence of contour on juncture. In some cases 
the various segments (or contours) whose occurrence is limited to the 
neighborhood of junctures do not become automatic, but remain phone- 
mic even after the juncture is inserted into the phonemic sequence. This 
happens when the segments or components vary in a way that does not 
depend upon the neighboring juncture alone. For instance, the acute 
accent in classical Greek occurred only on one of the last three vowels 
before word juncture, but one could not always tell, merely from the 
place of the juncture, exactly where the accent occurred.¹³ Writing the
juncture here would not obviate the use of the accent.

¹¹ And of certain morphological boundaries. The juncture /-/ can, of
course, be indicated by an empty space (between words); /=/ would then
have to be separately marked, as a different kind of space between words,
e.g. a hyphen. Or else, when the /-/ juncture is marked by an empty
space, the /=/ juncture can be marked by /ə/ before the final consonant
plus empty space after the final consonant: /ktbt/ for ktəbt 'I wrote',
/ktbət/ for kətbət 'she wrote'. This would mean that in effect we mark
only one juncture (space), but consider pre-final-consonant [ə] as
phonemic, all other [ə] being automatic (two consonants before space
or /ə/).

¹² For an example of this type, see the Hebrew case in Jour. Am. Or.
Soc. 61.148-54 (III 1.4, III 3.16-8, IV 2.1) (1941).

¹³ Choice of position for the accent, in respect to the juncture, remains
even after a descriptive analysis such as is given in R. Jakobson, Z
zagadnień prozodji starogreckiej, in Kazimierzowi Wóycickiemu (Wilno 1937).


This also happens when more than one feature is bounded by the
juncture, with nothing in the phonemic structure of the utterance to tell us
which of these features has occurred in any particular case. Thus both
the /./ and the /?/ contours occur over the interval between utterance
junctures (both 020 and 0123 over You're coming), but the mere placing
of the utterance juncture, say, /'yuwrkəmiŋ#/, would not indicate
which of these contours had occurred. In all such cases it is possible to
make only one of the contours or segments non-phonemic with respect to
the juncture, and to differentiate the others by a phonemic mark in
addition to the juncture.¹⁴ Alternatively, if every juncture is accompanied
by one or another of these features, we mark the feature and let the
boundary or silence, which would be indicated by the juncture, be
indicated directly by the mark of the contour or other feature:¹⁵ we write
/yuwrkəmiŋ./ and /yuwrkəmiŋ?/.

8.3. Result: Group of Similarly Placed Features 

In addition to the segmental and contour elements, there are now also 
juncture elements. These last are defined so as to represent the difference 
between some segments and others (or between some contours and their 
absence), as well as such features as intermittently present pause. The 
juncture elements are important as constituting part of the environment
of phonemes, even though they differ from the traditional phonemes; for
by virtue of their presence in the environment a phoneme may be
defined as representing a different member segment than it does when the
juncture is not present. The juncture must be understood as having
phonemic status, however, since the environment of a phoneme has been
defined as the phonemic elements around it.¹⁶

¹⁴ We would be doing this in English if we set up /./ as the juncture
marking utterance end, and allowed the statement intonation (020)
to be non-phonemic, automatically indicated by the occurrence of /./
without any contour mark. The other contours would then be indicated
by additional marks added on to the /./: e.g. addition of /"^ / for question
intonation, /' ' for exclamation intonation, etc. Since these contours occur
only over whole utterance intervals, the contour marks would never
occur without the /./ juncture: /?/, /!/, etc.

¹⁵ We do this in the usual English orthography, when we write /./
or /?/ at certain points, each mark indicating a particular contour, but
either mark also indicating a morphological boundary and a possible
point of silence.

¹⁶ The one-one character of phonemic writing is not lost by use of
phonemic junctures. In terms of the last paragraph of 8.21, when we
see /a-X/ we know that it corresponds uniquely to segments [a'X]; and

By the setting up of the junctures, segments which had previously
contrasted may now be associated together into one phoneme, since they
are complementary in respect to the juncture. Other segments and
contours, which are periodic in respect to the juncture, may be considered
non-phonemic and included in the definition of the juncture.

Although the explicit use of junctures is relatively recent, the funda- 
mental technique is involved in such traditional linguistic considerations 
as 'word-final', 'syllabification', and the use of space between written 
words. When a linguist sets up the phonemes of a language, he does not 
stop at the complementary elements of chapter 7, but coalesces sets of 
these complementary elements by using considerations of juncture. 

8.4. More than One Juncture 

The fact that one phonemic juncture has been recognized in a language 
does not preclude the recognition of additional independent phonemic 
junctures. Thus it has been shown that in English we must recognize, 
aside from the contours, two phonemic junctures: internal open juncture 
and external open juncture.¹⁷ The basis for this is as follows. There are
many segments which we can assign to particular phonemes only by
saying that whenever they occur a phonemic juncture is present: [a:y] is
represented by /ay#/ as in tie, [k'V] by /k#V/ as in mark it,¹⁸ [pʰ] by
/#p/ as in possess, etc. The one juncture /#/ serves as differentiating
environment for all these segments in their respective phonemes.
However, there are other segments than these, which we would wish to assign
to the same phonemes: we cannot assign the [a·y] of slyness to /ay/
because it contrasts with the [ay] of minus, nor can we assign it to /ay#/
because it differs from the [a:y] of sly which has been thus represented.¹⁹
We therefore set up a new phonemic juncture /-/ which occurs after 

when we hear segments [a'X] we know they correspond uniquely to
/a-X/. ['sla·ynəs] is uniquely phonemicized /'slay-nəs/, and
/'slay-nəs/ is uniquely pronounced ['sla·ynəs]; while ['maynəs] is uniquely
phonemicized /'maynəs/, and /'maynəs/ uniquely pronounced ['maynəs].

¹⁷ George L. Trager and Bernard Bloch, The syllabic phonemes of
English, Lang. 17.223-46 (1941). See also Bloch and Trager's Outline of
Linguistic Analysis 47 (§ 3.7 (1)). We may also use 'open juncture' for
Trager and Bloch's 'internal open juncture'; 'word or phrase juncture' for
their 'external open juncture.'

¹⁸ When no [s] precedes the [k].

¹⁹ Compare the increasing lengths of [ay] before /nəs/ in It's minus
forty. His slyness fortunately worked. The sly Nestorian monks.


morphemes within a word or phrase, and assign the segment sequence
[a·y] to the phonemic sequence /ay-/.²⁰ The same new juncture serves
as the differentiating environment which enables us to include other 
elements within one phoneme. 

These different junctures may also delimit the lengths of distinct con- 
tours. Thus the Swahili /#/ indicates the distribution of stress, but an- 
other juncture would be needed to indicate the distribution of intonations 
(e.g. question and assertion) in a long utterance where various of these 
intonations follow each other. 

Appendix to 8.2: Junctures as Morphologic Boundaries 

The great importance of junctures lies in the fact that they can be so
placed as to indicate various morphological boundaries. For example,
replacing Swahili V́CV by V́CV# is particularly useful because the
V following V́ is regularly the end of an independent morphological
element, now marked by the #. Similarly, when English [k'] is represented
by /k#/ (while [kʽ] is represented by /k/) the # is thereby regularly
placed at the end of a morphological element.

However, things do not always work out so nicely. In German, we find 
[t] but not [d] before # ([bunt] 'group,' [vort] 'word'), while [t] and [d] 
occur in identical environments within utterances ([bunde] 'in group', 
[bunte] 'colored,' [vorte] 'in word'). If we insert # after every [t], and 
then group [t] and [d] into one phoneme, we would find that we are
writing # in the middle of morphemes (e.g. /d#ayl/ Teil 'part'). We could
still phonemicize [t] as /d#/, i.e. use the /#/ to indicate that a
preceding /d/ represents the segment [t], but many of the occurrences of this
/#/ would not correlate with morphological boundaries.

In the case of the English /-/, each point at which it is set will be 
a minor morphological boundary; i.e. the phonemically equivalent seg- 
ments differentiated by /-/ always occur at morpheme boundary. 
However, not every morpheme boundary will be marked by /-/; in 

²⁰ Instead of speaking of junctures as differentiating the environments
for otherwise contrasting allophones, it is possible to speak of them as 
phonetically distinguishable types of transition between successive seg- 
ments in an utterance; so, for example, Bloch and Trager, Outline of 
Linguistic Analysis 35 (2.14 (3)). We then recognize in each language one 
less phonemic juncture than the mutually different types of transition. 
Thus in English we have noted three types of transition, but only two 
phonemic junctures. The remaining type of transition (e.g. that between 
[ay] and [n] in minus) is non-phonemic: it is automatically indicated by
the juncture-less succession of phonemes. 


English many boundaries occur without phonemic junctural features.
For example, the [ey] of playful and the [ey] of safe are both phonemicized
/ey/, while the [ey] of tray-full is phonemicized /ey-/. Both the /ey-/ of
tray-full and the /ey/ of playful occur at the end of morphemes, while the
/ey/ of safe does not. The only restriction is that whenever /-/ occurs
there is a morpheme boundary; the converse does not hold.

Many of the junctures set up in chapter 8 without reference to mor- 
phologic boundaries turn out nevertheless to come precisely at morpho- 
logic boundaries. This is due partly to the fact that utterance-end, pauses 
(including intermittently present pauses), and ends of contour lengths, 
all occur in the great majority of cases at morphologic boundaries : almost 
all utterances, intonations, etc. stop not in the middle but at the end of 
a morpheme. It is also partly due to the fact that in many languages 
there are features which extend over morphemes or over particular types 
of morphological stretches. Thus in English almost every word spoken in 
isolation has precisely one loud stress. If we note the number of loud 
stresses in an utterance, we will have an approximation to the number 
of words in it. In a language like Swahili, where the stress is 'bound', we 
can go beyond this and investigate what division of the utterance is such 
that the stress would be regular within each division.²¹ Since the stress
occurs regularly on the penult vowel of the word, including the last word
of the utterance, the only division in which all stresses (including the last
one of the utterance) would be regular is the division after the post-stress
vowel.

The agreement between the operation of chapter 8 and morphological 
boundaries is, furthermore, due in part to the partial dependence between 
phonemes and morphemes. Suppose there is a certain difference (e.g. non- 
release) which occurs in the final segment of many morphemes, as com- 
pared with otherwise similar (but released) segments when they occur 
within morphemes. Then since various morphemes end in various pho- 
nemes, it will often be the case that these different (non-released) final 
segments will be members of various phonemes. There will thus be several 
phonemes occurring at the ends of morphemes which have a consistent 

²¹ Experience shows that this technique is particularly reliable. If we
notice a contour which is automatic in respect to the end of utterances 
and also occurs elsewhere in the utterances, it is a safe bet, even without 
knowing the morphemes or points of morpheme boundary, to place junc- 
ture phonemes throughout the utterances, at such points so that each 
occurrence of the contour will be automatic in respect to these points in 
the same way that the last occurrence is automatic in respect to the end 
of the utterance. 


difference as compared with an equal number of phonemes which do not 
occur at the ends of morphemes: there will be non-released [t'], [k'], etc.,
as compared with released [t], [k], etc. If, without knowing about mor- 
phemes, we seek for parallel sets of phonemes as defined in fn. 5 above, 
and if in addition we note that one of these sets occurs at utterance-end 
(which is a special case of morpheme end) and the other set not, there is 
a good chance that we will come upon the differences which occur at 
morpheme boundaries. 

In contrast to this, we may find certain segments which we could
hardly group into one phoneme without having some knowledge about
morphemes. Such setting up of junctures can be best performed when we have
some information concerning the distribution of two segmentally
different individual morphemic segments ([tʰeyk] and [teyk]), and also
some information concerning the distribution of other individual
morphemes which we can use as models: only if we know, say, that a
morpheme manage occurs both after # and after mis (in mismanage) would
we want to have the two different segment-sequences [tʰeyk] take and
[teyk] of mistake identified as occurrences of just one morpheme take,
which, like manage, would then occur both after # and after mis.²²

In much linguistic practice, where phonemes are tentatively set up 
while preliminary guesses are being made as to morphemes, tentative 
junctures may be defined not on the basis of any knowledge that par- 
ticular morphemes are worth uniting or that their distribution equals 
that of some single morpheme; but only on the basis of suspicions as to 
where morpheme boundaries lie in given utterances. 

²² For a more general statement of this consideration in setting up
morphemes, see chapter 13. If we wish to be completely orderly in our 
work, we would not recognize at this stage any criterion of morphemic 
identity, except as the personal intuition of the particular linguist. We 
would assign the segments to phonemes on the basis of the preceding 
criteria, plus any considerations of chapter 8 which can be objectively 
applied. Then, when we set up morphemes in chapter 13 we would stop
to reconsider our phonemic assignment of segments and see if we cannot 
simplify the membership of some morphemes by revising our original 
assignment (see 14.6). The revised grouping of segments into phonemes 
would, of course, be the one used in any full grammar, and it would be 
noted that this grouping is used for the convenience of our morphemes. 


9.0. Introductory

This procedure breaks up some segments into two elements, each of
which is assigned to a separate phoneme. The effect is to regularize the
distribution of phonemes.

9.1. Purpose: Eliminating Exceptional Distributional Limitations

Or, more exactly, to increase the freedom of occurrence of exceptionally
restricted phonemes.

In many languages we will find, after carrying out the operation of 
chapter 7, that one phoneme or another does not occur in jiarticular en- 
vironments in which other j)honemes do, even \vhen those ])honemes are 
in general similar to it in distribution. This results from the fact that in 
some cases we may be unable to group segments into phonemes in a way 
that would satisfy the criteria of 7.4, because there are too many or too 
few distinct segments recognized in a given environment, or because two 
segments which we would like to group together happen to contrast. 

We would like to eliminate some of these exceptional restrictions not 
by modifying our operational definition of a phoneme (7.5), nor by chang- 
ing the criteria which we seek to satisfy, but by performing a further 
operation, if possible, on the restricted segments in order to make them 
amenable to those phonemic groupings which would satisfy our preferences. 

9.2. Procedure: Dividing the Segment 

We take the segment with whose phonemic membership we are not 
satisfied, and reconsider what, in the stretches of speech in which that 
segment occurs, constitutes the segment proper (member of our pho- 
neme) and what constitutes the environment. 

Suppose we have previously segmented a stretch of speech in such a 
way that A is a segment and the rest of the length is its environment B: 
e.g. let church be represented by the segment [č] (our A) and the en- 
vironment [ərč] (our B). We now reconsider whether the whole of [č] 
should be our segment, or whether we might not keep only the first part 
of it as the segment proper, leaving the rest of it to be added to the en- 
vironment instead. We cut A into two segments, A¹ and A²: [č] = a back 
[ṭ] plus a front [š].¹ As soon as we have done this, we have changed the 
segment-environment relation. If A is regarded as consisting of A¹A², 
the new A¹ does not have the same environment B that A had, for the 
environment of A¹ is —A²B, and the environment of A² is A¹—B. Thus 
the environment of the new [ṭ] is /#—šV/, and the environment of 
the new [š] is /ṭ—V/; while the environment of [č] had been /#—V/. 
Since the segment-environment relation was basic to the operation of 
chapter 7 (each phoneme contained not more than one segment member 
per environment), this redivision permits us to rearrange some of the 
phonemic grouping. Whereas formerly A contrasted with every other 
segment that occurred next to B, now A¹ contrasts only with those seg- 
ments that occur next to A² + B. Formerly [č] had contrasted with /t/ 
and with /š/ (since all these occurred in /#—V/: cheer, tear, shear), 
and so could be included in neither phoneme. But [ṭ] does not contrast 
with /t/ (since /t/ did not occur before [š] or /š/); nor does [š] contrast 
with /š/ (since /š/ has no member after [ṭ], or after the /t/ in which we 
are about to include [ṭ]). If therefore, instead of writing cheer as /čiyr/ 
we write it as /ṭšiyr/, we find that there is no /tšiyr/ distinct from this 
/ṭšiyr/; hence we can use the simpler writing /tšiyr/. More briefly, we 
could have said in the first place that there is no /tšiyr/ distinct from our 
original /čiyr/, so that we could have immediately replaced the /č/ here 
by /tš/. 

As soon as we have obtained A¹ and A² in place of the old segment A, 
we are free to include each of these new segments in any phoneme into 
which its environment will permit it to enter. In doing so, we merely 
repeat, for the new segments, the operation of chapter 7. E.g. [ṭ] can now 
be included in /t/, [š] in /š/. The result is that we now have one phoneme 
less (the old /č/), and that two phonemes now have wider distribution: 
/t/ now occurs before /š/ as well as elsewhere, and /š/ after /t/ as well 
as elsewhere. 
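
The redivision can be illustrated mechanically. The sketch below is not part of the procedure as stated here: it assumes a toy corpus of three transcribed words, hypothetical helper names (environments, contrast, resegment), and a simplified notion of environment (only the immediately neighboring segments). The labels t_back and sh_front are stand-ins for the back [t] and the front [š] obtained from [č].

```python
def environments(corpus, seg):
    """Set of (previous, next) neighbors of seg; '#' marks the utterance boundary."""
    envs = set()
    for utt in corpus:
        padded = ["#"] + utt + ["#"]
        for i in range(1, len(padded) - 1):
            if padded[i] == seg:
                envs.add((padded[i - 1], padded[i + 1]))
    return envs


def contrast(corpus, a, b):
    """Two segments contrast if they share at least one environment."""
    return bool(environments(corpus, a) & environments(corpus, b))


def resegment(corpus, old, parts):
    """Replace every occurrence of the segment old by the sequence parts."""
    return [[p for s in utt for p in (parts if s == old else [s])] for utt in corpus]


# Toy corpus (assumed transcriptions): cheer, tear, shear
corpus = [["č", "i", "y", "r"], ["t", "i", "y", "r"], ["š", "i", "y", "r"]]

# Before redivision, [č] contrasts with both /t/ and /š/ in /#—V/.
print(contrast(corpus, "č", "t"), contrast(corpus, "č", "š"))

# After redivision, neither new segment contrasts with the old phoneme.
new_corpus = resegment(corpus, "č", ["t_back", "sh_front"])
print(contrast(new_corpus, "t_back", "t"), contrast(new_corpus, "sh_front", "š"))

# The new segments can therefore be written with the old phoneme symbols.
rename = {"t_back": "t", "sh_front": "š"}
merged = [[rename.get(s, s) for s in utt] for utt in new_corpus]
```

In the merged writing, cheer appears as the sequence t, š, i, y, r, and no other form in the toy corpus has that writing, which is the condition the text requires before /č/ is replaced by /tš/.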

The phonemic representation of a language may be simplified by 
means of this operation when the segment A cannot be put into any 
phoneme without disturbing the over-all symmetry, and when it is pos- 
sible to partition A into such segments A¹ and A² as would fit well into 
the phonemes of the language. Assignment of A¹ and A² to some other 

1 This is possible because both the segment A and its environment B 
consist of the same type of constituent: segmented stretches of speech. 
We are merely changing the point of segmentation which we had fixed 
for this stretch in chapter 5. 


phonemes should yield a more symmetrical or otherwise convenient pho- 
nemic stock than assigning the original A to some phoneme. 

9.21. Special Cases² 

In the resegmentation of A, the portions or features of speech repre- 
sented by A¹ and A² may be simultaneous instead of successive.³ E.g. 
the flapped [N] segment of /'peyNiŋ/ painting (in some American pro- 
nunciations) occurs only in /'V—V/. It contrasts there with /n/ (as 
in paining), with /t/ (which has the member segment [r'] in that posi- 
tion: rating), and with all other phonemes. We now divide the [N] into 
two segments: articulatorily these may be called alveolar nasal continu- 
ant and alveolar flap. The nasal segment or feature then occurs in 
['V—flap V], and the flap in ['V nasal—V]. If we include the flap segment 
in /t/, we find that the nasal segment is complementary to /n/, since 
we previously had no /n/ in /'V—tV/. And if we include the nasal seg- 
ment in /n/, we find that the flap is complementary to /t/, since we 
previously had no /t/ in /'Vn—V/. We have thus been saved from hav- 
ing to recognize /N/ as a new and highly limited phoneme, and have 
eliminated from /n/ and /t/ two limitations of environment of a type 
which did not occur among phonemes having generally similar distribu- 
tions to theirs.⁴ 

Similar considerations lead to representing English syllabic [r] as a 
sequence of /ə/ and /r/: /'stəriŋ/ for stirring, as against /striŋ/ string.⁵ 

An example in which a whole group of phonemes is involved is given 

2 The redivision of tone-bearing vowels into separate vowel phonemes 
and tone phonemes may be considered a special case of this rephonemi- 
cization. This is done in 'tone languages', where the sequences of tones 
do not show a limited number of contours as is required in chapter 6. 
However, the division of, say, high-pitched [a] into /a/ and /'/ (high 
tone), and [e] into /e/ and /'/, and so on, is based not on any exceptional 
distribution of a particular tone-bearing vowel, or of the tone-bearing 
vowels in general. Rather, it is based on the convenience of separately 
describing the vowels of a sequence and its tones (see Appendix to 10.1-4). 

3 Division of these segments into simultaneous parts could not have 
been efficiently carried out at the beginning, before we had performed the 
phonemic grouping of chapter 7 because we would not have known which 
segments would turn out to be very different in their distribution from 
any other segments. Now, we are performing individual reconsiderations 
within an already existing tentative phonemic system. 

4 There is also a morpheme-identity consideration, since prior to this 
reconsideration we would have considered /peyN/ in painting and 
/peynt/ in paints as two phonemically different forms of one morpheme. 

5 Leonard Bloomfield, Language 122.3. 


by Chao,⁶ who notes that in the Wu-dialects in China there is a group 
of breathed vowel phonemes parallel to the regular vowel phonemes, and 
then analyzes each breathed vowel as /voiced h/ phoneme plus a regular 
vowel phoneme.⁷ 

9.3. Result: Dependent Segments as Allophones 

The operation of 9.2 affords the opportunity to regroup the component 
parts of certain segments into different phonemes, so as to satisfy the 
criteria of 7.4 more fully than the requirement of complementary rela- 
tion among the old segments would permit. It extends the range of defi- 
nitions which segments could have : whereas our previous segments were 
independent successive portions of utterances, except for the extracted 
contour features, our reconsidered segments may now be successive por- 
tions which are not independent (e.g. the [š] portion of [č]), or else simul- 
taneous features (e.g. the flap out of [N]), or zero (junctures).⁸ 

At this point we have reached the end of phonemic analysis as it is 
usually performed. The phonemes of 7-8, modified in some individual 
cases in 9 by reconsidering some of our segmentation points of 4.4 and 5, 
are the phonemes of the language as usually worked out by linguists. 

9.4. Sequences of Segments 

The operation of 9.2 is equivalent to establishing contrasts among se- 
quences of segments instead of among single segments alone.⁹ The nasal 

6 Op. cit. (chapter 1, fn. 2 above) p. 372. 

7 The following not infrequent situation is also a special case of re- 
segmentation of a segment for purposes of rephonemicization: We may 
find that two segments are almost always complementary in environ- 
ment: e.g. [s] may occur only before [a, o, u], [š] only before [i, e]. We 
would then phonemicize [sa] as /sa/, and [ši] as /si/, saying that [š] is 
the member of /s/ before [i, e]. However, we may find a very few utter- 
ances which contain [ša]. Rather than rescind our previous phonemiciza- 
tion, we may salvage it by phonemicizing [ša] as /sia/. To do this, we 
must make sure that no other /sia/ representing a segment sequence 
other than [ša] (e.g. a sequence [sia]) occurs. 

8 The present operation is thus a rejection of the operation of 5.2. The 
rejection does not vitiate the previous results of 5 because it is carried 
out under controlled conditions and after other operations (those of 6-8) 
had intervened. The joining of dependent segments in 5 was performed 
for all segments. The phonemic separation of dependent segments in 9 
is performed only for those few segments which are found, after 6-8, to 
have a distribution exceptionally different from that of the other segments. 

9 The contrasting of sequences, rather than single segments, occurs also 
when we decide the point of phonemic difference between two pairs 


flap [N], before it was broken up in 9.21, contrasted in /'V—V/ with each 
single segment, but it was even then complementary to the sequence 
/nt/. The operation of 7 permits any segment, or sequence of segments, 
in a given environment to be phonemicized as any sequence of phonemes 
which does not otherwise occur in that environment. We could phonemi- 
cize tray, composed of the segments [trey], as /tlney/ since the phonemic 
sequence /tln/ does not otherwise occur in that position; but there 
would be no point to doing so unless the criteria of 7.4 could be better 
satisfied thereby. In 7 it had been assumed that the operation of group- 
ing complementaries would be performed only on single segments. We 
now see that advantages result from extending the operation to apply to 
sequences of segments, with the application to single segments in 7 be- 
ing merely a special case of the application to sequences. 
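
The condition that a phonemic sequence "does not otherwise occur in that environment" can be checked mechanically. The following is a minimal sketch under stated assumptions: the two-word lexicon, the function name occurs, and the reduction of the environment /'V—V/ to the literal neighboring segments are all illustrative choices, not part of the text.

```python
def occurs(lexicon, seq, env):
    """True if the phoneme sequence seq occurs between env = (before, after) in any form."""
    before, after = env
    target = [before] + list(seq) + [after]
    n = len(target)
    for form in lexicon:
        padded = ["#"] + form + ["#"]   # '#' marks the utterance boundary
        if any(padded[i:i + n] == target for i in range(len(padded) - n + 1)):
            return True
    return False


# Assumed phonemic writings already established: paining, rating
lexicon = [
    ["p", "e", "y", "n", "i", "ŋ"],
    ["r", "e", "y", "t", "i", "ŋ"],
]

# Environment of the flapped nasal in painting, reduced to its neighbors
env = ("y", "i")

print(occurs(lexicon, ["n"], env))       # /n/ alone is already in use here
print(occurs(lexicon, ["t"], env))       # so is /t/ alone
print(occurs(lexicon, ["n", "t"], env))  # the sequence /nt/ is still free
```

Because /nt/ is unused in this environment while /n/ and /t/ singly are not, the segment may be written /nt/ without creating a homonym; whether it is worth doing so remains a question for the criteria of 7.4.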

9.5. Reduction of the Phonemic Stock 

The operations of 7-9 are designed to reduce the number of linguistic 
elements for a given language, and to obtain elements whose freedom of 
occurrence in respect to each other was less restricted. What methods 
are used on what segments, and in what way the methods are applied, 
depends on the segments of each language — their definitions as speech- 
feature representatives, and their freedom of occurrence. 

For many purposes, it is very convenient to reduce the phonemic stock, 
to simplify the segmental interrelations within each phoneme, and to 
broaden the distribution of the phonemes. The current development of 
linguistic work is in part in this direction.¹⁰ However, any degree of re- 
duction and any type of simplification merely yields a different, and in 
the last analysis equivalent, phonemic representation which may be 
more or less suited to particular purposes. 

Appendix to 9.2: Considerations of Symmetry 

The linguist may, however, decide against such broadening of the 
distribution of old phonemes if the occurrence of the phoneme in the 
new environment conflicts with general distributional statements which 

(4.23). Furthermore, it is this contrasting of sequences that gives us the 
freedom to pin the phonemic difference between writer and rider on the 
middle consonant (7.43) rather than on the vowel. (The sequence in this 
case is the vowel plus the middle consonant which follows it.) 

10 Cf., for example, Z. S. Harris, Navaho phonology and Hoijer's 
analysis, Int. Jour. Am. Ling. 11.239-46 (1945); The phonemes of 
Moroccan Arabic, Jour. Am. Or. Soc. 62.309-318 (1942). 


he could have made about groups of phonemes. E.g. before [č] was broken 
up we were able to say that there occurs no sequence of #, stop (/p, b, t, 
d, k, g/), spirant (/f, v, θ, ð, s, z, š, ž/). After we make [č] a member of 
/tš/ and [ǰ] of /dž/, we must omit /š, ž/ from the preceding statement, 
but we then have to note that /š, ž/ still do not occur after /p, k, b, g/. 
Similarly, in dialects which do not contain the sequence /sy/ and 
which pronounce soon /suwn/, sue /suw/ rather than /syuw/, the pho- 
neme /š/ does not contrast with the sequence /sy/ (since /sy/ does not 
occur). It is therefore possible to consider the sounds represented by [š] 
to be composed of two members, one being the member of /s/ before /y/ 
and the other the member of /y/ after /s/: we would write sue /suw/, 
shoe /syuw/, shift /syift/, shrimp /syrimp/, ash /asy/. If we consider 
how this affects our phonemic distribution statements, we find that be- 
fore we reinterpreted [š] as /sy/, the phoneme /y/ occurred in¹¹ 

/#— \7, /V— I /, /C— u/ (C = /k, g, p, b, f, V, m, h/) . 
Now, however, /y/ occurs in 

/#-V/, /\-l/, ,/'C'-u/. ,'#s^;/, /Is-l/ . 

V * 

The last two environments also indicate the newcomers to the range of 
environments of /s/. Our /s/ now occurs in 

and in 

in addition to its previous environments. The changes in range of en- 
vironment for /s/ and /y/ are not particularly happy ones. /y/ had a 
peculiar distribution before; it now has an even more peculiar one: note 
especially the environmental restriction to /s/ and /r/. /s/ also had a 
distribution different from that of any other phoneme, but it was one 
which involved classes of phonemes which had other distributional and 
sound-representation features in common: e.g. the class C = /p, t, k/, 
which occurred in such environments of /s/ as 

/#— C^V 
and / — C^ — , C^ — CV." Now we have /y/ in /syrimp/, so that we must 

11 Symbols above and below each other are mutually substitutable: 
s — V represents s — r and s — V. Commas may be read 'or'. 

12 Meaning that initial /s/ occurred before Cr, Cl, Cw only if C is 
/p, t, k/ (in the case of /l/ C is only /p/, in the case of /w/ C is only 
/k/): spring, string, scroll, splash, squish. 

13 Meaning that /s/ occurs only with /p, t, k/ in clusters of the form 
/sks, kst/: asks, axed. 


now restate the environment of /s/ as 

/#-C^i/ _ 
where C = p, t, k, y; and we have /y/ inserted in clusters of the type 
/Cs—C/. 

If we now review the considerations for and against reinterpreting [š] 
as /sy/, we find as an advantage the elimination of one phoneme /š/, 
and as a disadvantage the complicating of the distribution of the pho- 
nemes /s/ and /y/. As an advantage, again we have a consideration of 
morpheme identity. When morphemes which end in /s/ occur before a 
morpheme beginning with /yə/ with zero stress, we find [š] instead of 
[sy]: admissible, admission. If we could say that [š] was a member of the 
sequence /sy/ we would not have two phonemic forms for each of these 
would-be morphemes: we would write /ad'misibəl, ad'misyən/.¹⁴ 

Appendix to 9.21: Junctures as a Special Case of Resegmentation 

All phonemicizations which involved junctures depended, in the last 
analysis, upon the reconsideration procedure of chapter 9. 

Phonemicizing segments (or tentative phonemes) bX as /a-X/ in- 
volved the following steps: since aX and bX both occur, b contrasts with 
a, and so must constitute a different phoneme. However, we resegment b 
into b' plus juncture /-/, where the juncture represents the difference 
between b and b'. Then b' (in environment /—-X/) is complementary to 
a (in /—X/), and is grouped with a in the phoneme /a/. bX is b' + junc- 
ture + X, which is phonemically represented by /a-X/. 
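
As a schematic illustration of these steps (a sketch only; the function name phonemicize and the single-character symbols are assumptions made for the example), the rewriting of b as /a/ plus juncture can be expressed as a simple substitution over a segment list:

```python
def phonemicize(segments, dependent="b", base="a", juncture="-"):
    """Write each occurrence of the dependent segment as base phoneme + juncture."""
    out = []
    for seg in segments:
        if seg == dependent:
            out.extend([base, juncture])   # b = b' + juncture, and b' is grouped with a
        else:
            out.append(seg)
    return out


print(phonemicize(["a", "X"]))   # ['a', 'X']
print(phonemicize(["b", "X"]))   # ['a', '-', 'X']
```

The two writings remain distinct, so the original contrast between aX and bX is preserved, but it is now carried by the juncture rather than by a separate phoneme /b/.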

When a contour component is phonemicized as an automatic feature 
of a juncture (see chapter 8, fn. 14), the following steps are involved: 
The contour is extracted by the procedure of chapter 6, and has pho- 
nemic status. We may now reconsider the composition of the contour 
(as if it were a segment) and say that it consists of the contour (e.g. 
stress) plus zero at a point which can be determined from the contour 
(e.g. at its end). The point is then given phonemic status as a juncture, 
and the contour becomes merely the sound-feature definition of the pho- 
nemic juncture.¹⁵ 

14 An analogous discussion could be held for reinterpreting /č/ as 
/tsy/ or /ty/. 

15 In some cases several contours, e.g. stress and intonation and vowel 
harmony, may together be automatic in respect to the same juncture. 
This procedure then applies to each of them separately or to all of them 


A Sample Phonemic Analysis Prepared with the 
Collaboration of Nathan Glazer 

1. The segments 

1.0. Introduction 

1.1. Stress and tone features 

1.2. List of consonants 

1.3. List of vowels 

2. The phonemes 

2.0. The criteria for setting up phonemes 

2.1. Grouping of segments on these criteria 

2.2. Phonemes with exceptionally limited distribution 

2.3. Non-phonemic status of dependent segments 

2.4. Breaking phonemes up into a sequence of other phonemes 

2.5. Identifying i with y, u with w 

2.6. Stress phoneme 

2.7. Automatic sequences of stress and pitch 

2.8. Summary of phonemes and allophones 

1. The Segments 
1.0. Introduction 

All sounds we found it useful to distinguish in Swahili are represented 
below by segments.² Each symbol below represents many different sounds 
which could be distinguished from each other if we wished to make our 
phonetic distinctions more detailed. We know of no phonemic distinc- 
tions not represented here by distinct segments. 

1 This analysis is based on the speech of Adballah Ahamed, a native 
of Grande-Comore, who went to school in Zanzibar from the age of 13 
to 15, and lived in Zanzibar for five years after he left his native island 
at the age of about 17. 

The investigation was carried out with the support of the Intensive 
Language Program of the American Council of Learned Societies, to 
which we are indebted for making this work possible. We are also glad 
to express our thanks to Dr. George Herzog who gave us valuable sug- 
gestions in the phonology. 

2 In almost all cases the symbols used below have the values given 
them in B. Bloch and G. L. Trager, Outline of Linguistic Analysis (1942). 



The table of sound types gives the following information: the symbol 
for a sound; description of the segment in question, if it is not clear what 
the symbol represents;³ the environments, in terms of other segments, 
in which the segment in question occurs; examples of the segment in 
each environment distinguished; and, for each environment, the range 
of sounds which may vary freely with the sound in question, if such a 
range is noteworthy.⁴ 

1.1. Stress and Tone Features 

The mark ' stands for any stress and tone louder and higher than 
others in the utterances; absence of ' indicates the weakest stresses and 
lowest tones (both to be called zero) in the utterance. The ' thus indi- 
cates the position rather than the physical description of stress and tone 
other than zero. Certain sounds occur only in the neighborhood of ' and 
others occur only in the neighborhood of zero; therefore ' and zero are 
included as differentiating environments in the following table. No seg- 
ment occurs regularly next to one non-zero grade of tone or stress as 
against another; therefore the difference between these will not be given 
until later. In the symbolic listing of environments in 1.2 and 1.3 we 
use ■ to indicate regular lack of stress. Where neither ' nor ' occur in an 
environment, presence or absence of stress is not significant. 

3 Sounds heard once or twice, which we were not able to get back in 
later repetitions, are not listed. Since our informant is acquainted with 
other Swahili dialects, as well as with Arabic, we assume that these were 
forms as pronounced by him when speaking other dialects. 

4 In the column of environments, C stands for any of the segments of 
1.2; V stands for any of the segments of 1.3; # stands for the beginning 
or end of individual utterances; — stands for the segment in question. In 
some statements, the use of the collective terms C and V obscures certain 
limitations of distribution. Thus, in some cases we will say that V occurs 
in a certain position, e.g. VpV, even though we do not have examples of 
every vowel in that position. We will do this, however, only if the vowels 
which we do not have in that position do not correlate with any feature 
of the environment, leading us to suspect that the absence of these 
members of the V class is due merely to the paucity of our material. 
When we have no reason to suspect this, we list precisely the vowels 
which occur in that position. 

Parentheses around a symbol indicate that it sometimes occurs and 
sometimes does not in the given position. An asterisk before an environ- 
ment indicates that only one morpheme (or rather, only one utterance 
aside from repetitions) illustrating that environment occurs in our material. 



1.2. List of Consonants 

Segment   Environment 

p' (strong aspiration) 
# — ^V p'embeni 

V — ^V inap'aa 


Free Variants 

m— V 
V— uV 


p (medium aspiration)^ 
V — V' \vai)i 

m — V mpumbavu 

V — u V amelipya 

*■;' — i ^' IPPJ!!. k'l'pia 

t' (strong aspiration) 
#— V t'atu 

V— V k'it'anda 

m — V mt'oto 

in the corner 


it soars 





he was paid 



f (almost interdental, with labialized aspiration and labialized 

V— w \ 

fwayeni carry ye 

wamefwlfwa they called ns 

varies with t, alveo- 
lar and unaspirat- 
ed, before penult y: 
wameletya 'they 
were brought ' 

m — w V 


t (medium aspiration) 

V — V atakyenda 

m — V mto 

n — ^V nta 

carry him 

he will go 



* We hear varying degrees of aspiration in the same morphemes when 
they are repeated individually, and when they are heard in connected 
speech. Different statements of degree of aspiration are derived from the 
two types of material. For example, in words repeated in isolation, heavy 
aspiration is heard after the stressed syllable (fut'a or futa 'smear'). We 
have given the statements based on material drawn from connected 





r.— V 



s— ^' 



*f— \- 



♦§.— ^• 



\ (post -alveolar^ 


;— Qa 



s — -ga 



1. — ^oa 




k' (strong aspiration) 

# — V (V represents front vowels) 

V— \- 








to leave some- 

varies with q in par- 
ticular mor- 
phemes: ''aqili 

k' (strong aspiration, farther back than k) 
# — ■ V(V represents back vowels) 

m — V 



^ > 



to look 
to destroy 
he will die 

varies with q in par- 
ticular mor- 
phemes: mqali 

k (medium aspiration)* 

V — -V makelele 

in — V myanamke 

r. — V bar.kis 

s — V maskini 

poor man 

k (medium aspiration, farther back than k)* 
V — V ruka leap! 

ip — V mkalimani interpreter 

m — y V inkye in-law 

r.— V mar.kabu ship 

varies with q in par- 
ticular mor- 
phemes: maqa- 
biiri 'graves' 




6 (implosive) 




V— V 






V— y V 






6. (released with very short vowel) 


*i— 1 



*a— d 




m— V 



m— y V 



d" (implosive) 




V— V 






*fx— V 



r.— V 



varies with d"': 






to go out 

(almost interdental, with labialized r) 
n — wV amesind^va he was conquered 





V— V 



to taste 




varies with g^ 
snapped): mg'^eni 




g (farther 


back than g) 




divide ye 



> > 

to divide 


> > 


varies with g^ 

? > 

rj guru we 











V— \' 


to wipe 

N— \' 






*a — s 



*a— t 








two (chairs) 

V— V 


you were born 

N— V 






varies with veiipe* 

B (bilabial voiced spirant) 

*V — V eBea dodge! 



speak ! 

V— V 


I read 

m — V 












varies with sy: 


to pray 

syahibu, kusyala 



it's cooked 

s- (long s) 




varies with sV : nusu, 




c (short voiceless dental stop 

released into s, ts) 

n— V 






V— V 



in— V 



V— u V 


it was sunk 

* These free variants are unquestionably forms of different dialects. 
The i after v (as in viombo, vieupe) was only heard in the first few- 
months of work, and then disappeared. The replacement of c with s oc- 
curred much more rarely in the later stages of work than in the earlier. 






s. (released with very short vowel) 
*u— t bus.ti coat 

*a — k kas.kazi wind 





V— V 


to be born 



old person 

n— V 


I began 

V— y V 


they were sold 

n— yV 


name of an island 

(short voiceless stop released into z, dz) 




varies with mzaze 
varies with dy except 




after m: k'udyua, 

V— V 


to know 


m— V 



n— V 



n— yV 


it was broken 



in front 

varies with s in some 

V— V 


> > 

to take 

morphemes : sam- 



take him 


V— y V 





if it happens 

V— V 


to eat 

m"— V 



n— V 


I was 

^b.— i 




with very short 





^a— z 


olive oil 

^a— X 



'e— g 






V— V 


to jump 

m'— V 


return ye 




r — Continued 




V— uV 



*f.— i 



r. (rolled or released with very short vowel) 

V— k 



V— t 







V— d 





see under each C for examples 

V— V 






N, N (labiodental nasal) 







rn'^ (b release) 






nj (strongly snapped release) 






I called him 

m,N (syllabic; 

N only before f 








varies with im: imzi 

V— cv# 


I don't get up 

V— myV# 


to wring 




he saw me 

varies with n (half 



to build 

or fully unvoiced) 





initially, when un- 





stressed and be- 



I began 

fore voiceless vow- 



to break 

els : ncehele 'fly' 

n— V 










n — Continued 


V— yV 


he was seen 

varies with ji after 
a and u: k'unya, 
k'ujiya 'to drink'; 
mkanye, mk&jiye 
'drink ye!' 


(d release) 








varies with t) (voice- 
less): rjk'ima 




to wait 

m— g 



wait for him 


and ŋ (syllabic; ŋ only before k, g) 



o > 





varies with in : infii 






V— V 


to shave 

m— V 


a tree 

V— yV 



varies with n after a 


m — yV 


drink ye! 

and u ; see under n 


haw a 


varies with x in par- 

V— V 


to move 

ticular mor- 

m— V 



phemes: xadi^i 



he goes 


V— V 






(let crmincd 

V— V 


I think 

varies with '5 in par- 




ticular mor- 
phemes : ]nna^dni 


V— V 




7 all 


m— V 


I fooled him 




■'' (pharyngeal 

voiced spirant) 

V— a 



to teach 

varies with glottal 




stop 'and zero in all 

m — ^a 


extravagant 

positions: k'u'ali- 

1— a 

betel "^izaib 

sultan's house 

misa, k'ualimisa 

a — i 



'?>■ (post-alvcoh 

ir -6) 


> • 

good deed 

varies with '5 in all 





varies with k in all 





m— a 



1.3. List of Vowels 




e— e# 
i— a# 


V— V 







° > 










that it be 


to stay 




let me 

he cries 

write him 

little table' 


to understand it 

i (short, unstressed, non-syllabic) 
*p — a mpia new 

*6 — a 6iasara business 

•■f— a 
''m — a 
V— eV 







varies with zero 






y (shorter, not as high as full y) 


i — ^0 


if I see "1 

1 — u 



a — i 


you will look - 
for it 

varies with zero 

a — e 


be seated 





I'm not 



that it be 



they were eaten 



he's not drunk 






he is drunk 



it isn't 



that it be 

a— a# 





he doesn't marry 



take him out 

varies with zero 



to marry 

varies with zero 



I don't know 

varies with zero 
^isizui, siiu^i 


u e# 


take him 

varies with zero: 

mcuku^e, mci 


u— a# 



varies with "' 
zero: cuku*a, 














w (with lips drawn forward) 

t'— V 


carry ye 
he is called 

nd'— V 


they are con- 

y (non-syllabic 





he was paid 




y (non-syllabic 


k— V 



> > 







mb — V 







it was built 










drink ye 

s— V 




it was cooked 

f— V 





it was sewed 

z— V 


it was sold 

nz — V 


it was broken 









one goes 

varies with u''': 

" (shorter, less 

lip-rounding th; 

an for w) 

i — u 



varies with zero and 

e — u 


he built 

varies with zero 

a — o 


I have 

(1 (( u 

a — u 


I didn't forget 

It (1 u 

u — i 


one bends down 

U 11 II 

and y-. hu^inuwa 

u — e 


one goes 

varies with zero 

u — a 


I don't believe 

« II II 

u — 


one goes out 

ti II II 

u — u 


to create 


o (short, unstressed o) 

s — a 



varies with y 











give it 

C— C, ex- 


to pay 

cept— ^C 

V— C;C— V 


V— V 


he calls 


to weep 


he kills it 

I (lower and shorter than i) 

C— C 


to answer 







c— ^c 


to conquer 

ii (rounded i) 

C— i 


to steal 






C— C, ex- 


go down 


cept — mC 
V— C wanaiime 

u (lower and shorter than u) 
c — C ucaf u 

#— °C tjmba 

v-^mC ameunda 

he built 

varies with u: kuiba 

The environments of the remaining vowels are more complicated than 
those given above and cannot be simply shown in tabular form. 

A seemingly continuous series of e's is distinguishable from high to 
low, with a number of separate influences affecting articulatory height. 
These influences are: 

(1) In e((C)C)e, the first e is opposite in height to the second e: p'^^mbe 
'corner'; t'e^mbe^^y^^ni 'walk ye!' 

(2) e is high before ((C)C)a#: l^'^ta 'bring'; pe^l^^^ka 'send'. 

(3) e is low before # or ((C)C)V where V is other than e or a: ippe^ 
'give him'; w^'^ye^ 'he'; ame'^unda 'he built'. 

(4) After (C)w and (C)y the e is high; the effect of w overrides effects 
1-3: kw('!^li 'true'; ye'^ye^ 'he'; fahamwe^ 'remember.' 
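
The four influences on the height of e form a small rule system, which can be sketched as follows. This is an illustration under assumptions, not the authors' formulation: words are given in plain letters with one letter per segment (no aspiration marks), the vowels are taken to be a, e, i, o, u, and the function name e_heights is invented for the example. The height marks in the printed examples are not legible in this copy, so the values produced are what the rules as stated predict rather than the recorded transcriptions. Cases the rules do not cover are returned as None.

```python
VOWELS = set("aeiou")


def e_heights(word):
    """Map the index of each 'e' to 'high', 'low', or None (not covered by the rules)."""
    segs = list(word)
    heights = {}
    for i in range(len(segs) - 1, -1, -1):       # right to left: rule 1 looks ahead
        if segs[i] != "e":
            continue
        if i > 0 and segs[i - 1] in "wy":        # rule 4 overrides rules 1-3
            heights[i] = "high"
            continue
        j = i + 1
        while j < len(segs) and segs[j] not in VOWELS:
            j += 1                               # skip the consonants after this e
        if j == i + 1 and j == len(segs):        # rule 3: e before #
            heights[i] = "low"
        elif j == len(segs) or j - i - 1 > 2:    # no vowel within two consonants
            heights[i] = None
        elif segs[j] == "e":                     # rule 1: opposite of the next e
            heights[i] = {"high": "low", "low": "high"}.get(heights[j])
        elif segs[j] == "a":                     # rule 2: only before final a
            heights[i] = "high" if j == len(segs) - 1 else None
        else:                                    # rule 3: before any other vowel
            heights[i] = "low"
    return heights


print(e_heights("pembe"))   # {4: 'low', 1: 'high'}
print(e_heights("leta"))    # {1: 'high'}
```

Processing from right to left is a design choice made so that the height of a following e is already known when rule 1 is applied to the e before it.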


A seemingly continuous series of a's from front to back is distinguish- 
able, as in the case of e, with a number of independent influences oper- 
ating: 
(1) After and before the consonants t, q and S, the a is pronounced 
very far back: qaSi 'judge'; fii^ila 'a good deed'; xatoari 'danger'. 

(2) After (C)w, the a is pronounced quite far back: hi'nva 'these'; 
wana 'people'. 

(3) After labial, labiodental and velar consonants, the a is central: 
baxati 'luck'; papa 'shark'. 

(4) After dental consonants, the a is slightly fronted: sasa 'now'; 
kuandama 'to follow'. 

(5) After h and y, the a is very fronted: hawa 'these'; mayayi 'eggs'; 
haya 'these'. 

(6) Final position overrides the effect of the preceding consonant, mak- 
ing the a somewhat farther back. 

Two o's only are distinguishable, a high and a low: 

(1) In o((C)C)o, the first o is opposite in height to the second o. 

(2) o is low before # or ((C)C)V, where V is other than o: kico^co^ro^ 
'alley'; ko^ndo^o^ 'sheep'; o^na 'see'. 

All vowels may be half or fully unvoiced finally. 
Vowels are longer before nasals than before other consonants: pe"mbe 
'corner'; peke 'self'; mu"me 'man'; miike 'woman'. 

2. The Phonemes 

2.0. The Criteria for Setting up Phonemes 

The segments represented by the symbols of the preceding sections 
may be grouped into phonemes by applying the criterion that no two 
segments included in one phoneme ever occur in the same environment 
unless they vary freely (in repetitions of an utterance) in that environ- 
ment. As frequently happens, this criterion alone does not suffice to 
yield a unique grouping into phonemes: e.g., f could be combined with 
either t, t', k, k', g, d, N, x], n, t], z, v, B, S, ?i, t, x, y, '^, q, y, or 1, but not 
with all of them, since, e.g., t and k and g contrast. The particular group- 
ing of Swahili segments presented below is achieved by application of the 
following additional criteria: 

(1) If two segments vary freely with one another in every position in 
which they occur, they are grouped in one phoneme. 

(2) If two segments vary freely in one environment, and only one ap- 
pears in another environment, they are grouped in one phoneme, so long 
as the difference between the two environments is stateable in terms of 
the other segments (not in terms of morphemes). 

(3) If several sets of non-contrasting segments (i.e. segments in com- 
plementary distribution) can be selected in such a way that the differ- 
ence between the segment in environment a and the segment in environ- 
ment b of one set is identical with the difference between the segment 
in environment a and the segment in environment b of every other set, 
we recognize each of these sets as a phoneme (see 7.42). 

(4) If the sum of the environments of two or more non-contrasting 
segments a, b is identical with that of some other phonemes, a and b are 
grouped in one phoneme. We try to avoid grouping sounds into one pho- 
neme if their total freedom of occurrence is restricted by segments 
which do not appear in the environments to which any other phoneme is 
limited. I.e. we want the restrictions on occurrence of one phoneme to be 
identical with those of other phonemes (7.43). 

(5) If two segments having different environments (i.e. non-contrast- 
ing) occur in two morphemic segments which we would later wish to con- 
sider as variants of the same morpheme in different environments, we 
will group the two segments into one phoneme, provided this does not 
otherwise complicate our general phonemic statement. Our assignments of 
segments to phonemes should, if possible, be made on the basis of criteria 
1-4, since 5 introduces considerations drawn from a later level of analysis. 
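
The basic criterion of 2.0, together with criterion 2, can be sketched as a small check. This is only an illustration under stated assumptions: the environment labels ("stressed", "after n", etc.), the toy distribution table, and the function names are hypothetical and are not taken from the Swahili data.

```python
from itertools import combinations

def contrast(a, b, environments, free_variation=frozenset()):
    """True if a and b share an environment in which they do not vary freely."""
    shared = environments[a] & environments[b]
    return any((frozenset((a, b)), env) not in free_variation for env in shared)

def may_form_phoneme(segments, environments, free_variation=frozenset()):
    """Basic criterion: no two members may contrast in any environment."""
    return not any(
        contrast(a, b, environments, free_variation)
        for a, b in combinations(segments, 2)
    )

# Hypothetical toy distribution (assumed labels, not the author's data).
environments = {
    "p": {"unstressed"},
    "p'": {"stressed"},
    "t": {"unstressed"},
    "d": {"after n", "elsewhere"},
    "d'": {"elsewhere"},
}
# d and d' are assumed to vary freely in the environment "elsewhere".
free_variation = {(frozenset(("d", "d'")), "elsewhere")}

print(may_form_phoneme(["p", "p'"], environments))                  # complementary
print(may_form_phoneme(["p", "t"], environments))                   # contrast
print(may_form_phoneme(["d", "d'"], environments, free_variation))  # criterion 2
```

The check only says which groupings are permitted. Choosing among the permitted groupings is the work of criteria 3-5, which are not modelled here.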

We will not have to deal here with criterion 1, since in our list of seg- 
ments we represented free variants by one symbol. Criterion 2 enables 
us to make the following combinations into tentative phonemes (already 
made in the segment list by putting the rarer variant in the Free Variant 
column): f, t -> /t/ in — y V (read: segments f and t are assigned to 
the phoneme /t/); d', d -> /d/; n, n -> /n/; 13, 13 -> /t)/; Ji, n -> /ji/ in 
; — V; z, z -> /z/; g, g^ -> /g/; s", sV -> /sV/ before #. 

Therefore it is the new evidence admitted by criteria 3-4 that is most 
important in deciding the grouping of segments into phonemes. Below 
are listed all the necessary groupings, with the reasons for them; in some 
cases several groupings of segments will be made for one phoneme. 

2.1. Grouping of Segments on These Criteria 

p, p' -> /p/ 
t, t' -> /t/ 
k, k' -> /k/ 


(3) The members of each of the proposed phonemes differ analogously 
in corresponding environments: unaspirated and aspirated. As between 
various combinations of aspirated and unaspirated, considerations of 
symmetry call for the grouping as given above. 

(4) Since no segments except the aspirated stops above are comple- 
mentary to the unaspirated, the only alternative to phonemic grouping 
of such pairs is to have each aspirated or unaspirated voiceless stop con- 
stitute a phoneme by itself. These phonemes would then occur only in 
certain stress positions. Stress would then become a phonemic environment, 
whereas it is not a limiting environment for any other tentative 
phoneme. We thus have reasons of environmental symmetry as well as 
economy in number of phonemes for grouping aspirated and non-aspirated 
stops together. 

(5) Any other grouping would require morphophonemic statements, 
since, e.g., p alternates with p' as changes in the morpheme's environ- 
ment yield changes in its stress. 

t^ t' -> /t/ 
d^ d -> /d/ 

(3) The respective segments are analogous. 

(4) The alternative grouping, k' and t' or d' (k' does not occur be- 
fore w, so does not contrast with f, d'), would yield a phoneme which 
never occurs before a, o. Since other tentative phonemes are not limited 
in distribution before vowels, this would introduce a new phonemic 
environment. 

(5) Final t and d of a morpheme are replaced by t^ and d"' when a suffix 
beginning with w follows. 

k, k^/k, ;k-, k-^ /k/ 

(3) Differences among members analogous to those of /g/ (see below). 
This grouping is preferred to grouping the complementary k, g into one 

(4) No other segment except g is complementary to k. The remaining 
alternative, keeping k, k as separate phonemes, would yield phonemes 
occurring only before i, e, or before o, u, a kind of limitation not required 
for any other phoneme. 

g, g -> /g/ 

(3), (4) Analogous to k above. 

b, b -> /b/ 

(3) Differences among members analogous to those of phoneme d (see 
below). This grouping is more symmetrical than grouping b with some 
Arabic segment which doesn't occur after m. 


(4) No other segment is complementary to b. The alternative, to 
keep 6 as a separate phoneme, would introduce a phoneme not occur- 
ring after m, which is an environmental restriction found only in some of 
the Arabic sounds (below) whose validity in Swahili is uncertain. 

(5) Morpheme-initial b is replaced by b when the m- prefix precedes 
the morpheme. 

(f , d -^ /d/ 

(3) Analogous to b above. 

(4) Separating these two would yield a phoneme d which does not oc- 
cur after n. While many consonants do not occur after n, all dentals do; 
furthermore, all other consonants which occur after r also occur after n 
(these are t, k, g), whereas this would give us d after r phonemically dis- 
tinct from d after n. 

m, N, nj, m'' -> /m/ 

(3) See under n and s below. 

(4) Separation of any of these would yield phonemes occurring only 
or never before f, v, ui, ue, 1, r, which do not otherwise appear as dis- 
tributional limitations for phonemes. Grouping them all together gives 
us a phoneme with no limitations in distribution before consonants as 
well as vowels. This does not introduce a phoneme with a new range of 
distribution, since vowels also occur before all consonants and before 

(5) The prefix m- appears in all four of these forms, depending on the 
initial phoneme of the following morpheme. 

n, T), n** -> /n/ 

(3) Analogous to m above: nasal homo-organic with following con- 

(4) Grouping n in this way yields a phoneme which occurs before den- 
tal and palatal consonants as well as before vowels. Grouping the seg- 
ment 1] with g instead would yield two phonemes, /n/ occurring before 
dentals and vowels, and /g/ before palatals and vowels. 

(5) A morpheme t)- appears both as n and as t) depending on the 
initial phoneme of the following morpheme. 

rn, m -^ /m/ 
n, n — > /n/ 

(3) The members of each of the proposed phonemes are analogous in 
corresponding environments; syllabic initially before consonants (and in 
the case of m, before final consonant except b), non-syllabic otherwise. 


(4) These phonemes introduce a new phonemic environment; they occur 
in the positions of vowels and consonants. There is no analog in the 
distribution of other phonemes. 

(5) Initial rji and i\ in morphemes are replaced by m and n when pre- 
fixes are added. 

c, s -> /s/ 

(3) The difference between nc and ns may be considered analogous to 
the difference between m''r and mr. Each is the result of closing off the 
nasal passage before opening the oral passage in moving from n to s, or 
m to r. 

(4) Since c occurs only after n, where s does not, this grouping makes n 
occur before all dentals except § (later, even this limitation will be 

i, I -> /i/ 

(3) Analogous to u, below. This grouping is selected rather than 
grouping i with u. 

(4) Every alternative grouping except i and u (or i and u) yields a vow- 
el phoneme whose distribution is limited by factors of stress and the fol- 
lowing consonant cluster. These limitations do not occur for other 

(5) In a given morpheme, i varies with i as the stress shifts with change 
of the environment of the morpheme. 

u, u — > /u/ 

(3), (4), (5) analogous to i above. 

ii, u — > /u/ 

(4) The only alternative would be to keep these as distinct phonemes, 
since no other segments are complementary to either of these. This would 
yield vowel phonemes with distribution limited to particular positions. 

(5) Before i, final u in the morpheme ku- is replaced by ii. 

y, Q,.w -^ /y/ 

(3) Partial support for this grouping comes from comparing the inter- 
segment relations in the /u/, /o/, /w/, and /i/ phonemes. 

(4) The alternative would be to group w or o with some consonant, to 
almost all of which these are complementary, since none of them occur 
after t, d, or t, and few consonants occur after s. That would yield, how- 
ever, a phoneme which would occur after only two or four of the con- 
sonants, namely after t and d, t and s. Our tentative /y/ has a simpler 
distribution: after all consonants. 

(5) A single morpheme appears in both forms. 


y, w ^ /w/ 

(3) Analogous to y below. 

(4) y, as it results from the grouping above, is complementary to no 
other consonant (except y and possibly some of the rarer Arabic seg- 
ments), since both y and all the consonants occur after m. Grouping y 
with w gives /w/ a distribution comparable to that of m: w occurs after 
all C, m before all C. 

i, y ^ /y/ 

(3) Analogous to /w/ above. 

(4) This gives y a distribution comparable to that of n: n occurs before 
dental and velar consonants, y after labial and labio-dental consonants. 
Setting apart the dental-velar group implies recognition of the separate- 
ness of the remaining labial group. 

Vowels ranging from e'^ to e — > /e/ 

(3) Analogous in part to a (as to dependence on w and y), and in part 
to o (as to dependence on position within a sequence of segments which 
are being grouped into the same phoneme). 

(4) Separating any of these segments from the others would yield 
phonemes in which w and y, and position in a sequence of e's, are the 
limiting environments. 

(5) Within a morpheme, e of one height is replaced by e of another, if 
its position in the sequence is altered by the presence of a suffix contain- 
ing e, or if a suffix containing w is present. 

Vowels ranging from ^ to ae -^ /a/ 

(3) See under e above. 

(4) Separating any of these segments would yield phonemes in which 
a particular group of consonants, or position before pause, are the limit- 
ing environments. 

(5) Within a morpheme, back a is replaced by front a, etc., according 
as suffixes are or are not present, etc. 

o, — » /o/ 

(3), (4) Analogous to e above as far as dependence upon environment 
of phonemically identical vowels goes. 

2.2. Phonemes with Exceptionally Limited Distribution 

Additional phonemes must be set up for a few segments of limited dis- 
tribution which occur in words borrowed from Arabic, constituting a 
recognizable semi-foreign vocabulary in Swahili. x varies with h when- 
ever X occurs, and S with '6, q with k, and "^ with ' (glottal stop) or zero. 


However, h, 6, k, and zero also occur in utterances (morphemes) in 
which they do not vary with x, S, q, and ? respectively. The difference 
between the utterances in which the members of each pair vary, and 
those in which there is no variation cannot be stated in terms of segments 
or phonemes, but only by a list of the utterances (morphemes) involved. 
We therefore recognize x, S, q, "^ as phonemes, and note that every mor- 
pheme which contains any of them has a variant form with h, S, k, or 
zero respectively in their place. It is probable that the morpheme vari- 
ants with X, 3, q, "^ occur only in the speech of relatively more educated 
Swahili speakers.' 

t, 5, d, 7 also occur only in words borrowed from Arabic and having 
perhaps the status of semi-foreign vocabulary in Swahili. They do not 
appear to vary with other segments such as occur in native Swahili mor- 
phemes, and must be recognized as separate phonemes. 

From other studies, however, it appears that 15, 6, and y would be 
phonemes in the speech of non-educated Swahili speakers, but that X 
would not appear there. 

2.3. Non-phonemic Status of Dependent Segments 

Segments and components which are dependent on particular phonemic 
environment, i.e., whose limitations of distribution can be stated in 
terms of the presence of other phonemes, do not have phonemic status. 
On this basis, "^ and ^ between vowels can be eliminated. We state that 

' It is possible to set up a single phoneme /'/ to indicate the dif- 
ferences between h and x, 3 and 3, k and q, zero and ' or "^ respectively. 
Then x = /h'/, 3 = /S'/, q = /k'/, ' or ? = /'/ (when not after h, 3, 
or k). Words like hawa 'these', which do not contain /'/, would never be 
pronounced with the /'/-effect (i.e. would never have x instead of h, 
and so on). Words like h'adifli 'story', which contain /'/, would some- 
times be pronounced with the /'/-effect (as xadidi) and sometimes with- 
out it (as hadi^i). The /'/ is thus an intermittently present phoneme 
(see Appendix to 4.3), i.e. its presence in an utterance indicates that 
some but not all the repetitions of that utterance will have the segmental 
distinctions which it represents. This /'/ occurs only in morphemes bor- 
rowed from Arabic, and may be said to indicate a learned or 'foreign' 
pronunciation of these morphemes, as against a native pronunciation 
without /'/. 

It may be noted that the new phonemic /'/ does not occupy a unit 
length of its own after h, 6, and k. In this respect it is similar to the /t/ 
of painting (9.21) and to the components of chapter 10. However, the 
basis for setting up the /'/ was not an ordinary simplification of distri- 
bution as in 9.21, nor a sequential dependence as in 10, but a desire to 
isolate those phonemic features which occur only intermittently in vari- 
ous repetitions of an utterance. 


in medial sequences of vowel phonemes, there is often a slight ^ glide 
between two vowels if the first is e or i, if the second is i, and in aa, ae; 
and a slight '^ glide if the first is o or u, the second u, and in ao. Thus, in 
certain positions (iu, eu, oi, ui) both glides are heard : ki^'umbe, ki^umbe. 
Certain other factors influence the presence of these glides. A y or w ad- 
jacent to one of the vowels will reduce the corresponding glide: we^upe, 
but nyeupe or nye'^upe. Stress may also be of some importance, but it is 
difficult to disentangle the various dependences at this level of analysis. 
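
The glide statement above can be sketched as a lookup on a pair of adjacent vowels. This is a minimal sketch which assumes that the conditions "first is e or i" and "second is i" are alternatives (likewise for o, u), and which leaves aside the effect of an adjacent y or w and of stress. The function name and the labels "y" and "w" for the two glides are assumptions made for the illustration.

```python
def glides(first, second):
    """Return the set of automatic glides expected between two adjacent vowels."""
    found = set()
    pair = first + second
    # Front glide: first vowel e or i, or second vowel i, or the clusters aa, ae.
    if first in {"e", "i"} or second == "i" or pair in {"aa", "ae"}:
        found.add("y")
    # Back glide: first vowel o or u, or second vowel u, or the cluster ao.
    if first in {"o", "u"} or second == "u" or pair == "ao":
        found.add("w")
    return found

for pair in ("iu", "eu", "oi", "ui", "ao", "ae"):
    print(pair, sorted(glides(pair[0], pair[1])))
```

Under this reading the four clusters iu, eu, oi, ui come out with both glides, which agrees with the statement in the text.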

On the basis of dependence on phonemic environment, it is also pos- 
sible to eliminate most of the full w's and y's that occur in pre-final posi- 
tion. In the list of segments, two types of full w and y are distinguished 
in pre-final position: those that vary freely with zero (analiya, fiukuwa) 
and those that do not (huyu, iwe). In the pre-final position, y in free 
variation with zero occurs after e and i, and after a, before i, e, a, and 
sometimes o; w appears after o and u; after a, before o and u, ™ is heard. 
If y and w occur in other positions, they do not vary freely with zero. 
Distributionally, the less full w's heard in ao and au belong with the full 
w's heard after o and u. 

On grounds of phonetic symmetry, the ao position could be put with 
either the automatic y group or the * glide group, since we have ngayo 
and nga'^o 'shield'. However, there is also a contrast between nna'o and 
nnayo ('I have', used with substantives of different classes). This posi- 
tion is similar to the medial vowel clusters in which ^ or ''' can be heard. 

A statement of the various glides heard between vowels enables us to 
dispense with ^ and ^, and with most y and w in pre-final position. They 
are now dependent on the surrounding vowels. 

Parallel to the y and w glides between phonemes are the short vowel 
releases between consonants, which are indicated in the list by a dot after 
h, §, l, r. They are variously heard as ' and ", their quality being deter- 
mined by the succeeding vowel. For example: bar"gumu, bar"g6wa, 
sir'kali, tar'tibu. We state that in consonant clusters, where the first 
member of the cluster is not a nasal, a vowel-colored release is heard be- 
tween the consonants. If the vowel following the second consonant is 
o or u, the release is "; if it is a, e, or i, the release is '. 
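
The release statement can be sketched as a function of the first consonant of the cluster and of the vowel which follows the second consonant. This is an illustrative sketch only: the function name is hypothetical, the marks ' and " are used as they stand in the text, and only /m/ and /n/ are treated as nasals.

```python
NASALS = {"m", "n"}

def release(first_consonant, following_vowel):
    """Return the release mark heard inside a cluster, or None after a nasal."""
    if first_consonant in NASALS:
        return None  # no vowel-colored release when the first member is a nasal
    if following_vowel in {"o", "u"}:
        return '"'
    if following_vowel in {"a", "e", "i"}:
        return "'"
    raise ValueError("expected one of the vowels a, e, i, o, u")

print(release("r", "u"))  # as in bar"gumu
print(release("r", "a"))  # as in sir'kali
print(release("m", "a"))  # nasal first member: no release
```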

2.4. Breaking Phonemes Up into a Sequence of Other Phonemes 

We can eliminate some tentative phonemes by considering the seg- 
ments which compose them to be simultaneously composed of the mem- 
bers of a sequence of other phonemes, to which sequence the tentative 
phoneme is complementary (chapter 9). Of course, it is necessary to find 
the same support for these steps as we found for the steps in which we 
combined segments; otherwise we could, for example, break every vowel 
into a sequence of consonants which did not occur together. 

B -> /vw/ 

This step is of advantage in that it adds v to the consonants that occur 
before w, giving v the same environment f has, and w the same environ- 
ment y has. 
ji -> /ny/ 
i -> /dy/ 
c -> /ky/ 

This grouping eliminates four phonemes, on grounds of distributional 
and phonetic symmetry, and morphophonemic simplicity. 

Distribution: The distribution of /y/ is limited after consonants, in 
that it only occurs after labials and possibly labiodentals. This will 
broaden its distribution, which will now approximate that of /w/. On 
the other hand, this grouping introduces the new cluster /yw/, as in the 
new /mnywe/ for ippue 'drink ye!'; but this can be eliminated by 
equating /y/ and /i/ (see below).^ 

Phonetic: ji is very close to ny; z varies freely in many positions with 


Morphophonemic: The widely appearing class mark prefix /ki/ has 
the form c before vowels: /kiti kizuri/ 'fine chair', but /kiti ceupe/ 'white 
chair'. When we write /kyeupe/ the morphophonemic change is /ki/ 
to /ky/ instead of /ki/ to /c/, which may parallel other morphophonemic 

2.5. Identifying i with y, u with w 


u, w -> /u/ 

Support for this grouping is largely morphological: there is alterna- 
tion between i and y, u and w in many morphemes, and an important 
part of morphophonemics is eliminated if we phonemically identify i with 
y, and u with w. 

For the most part, they are complementary: y and w occur in "■• — V; 

^ This step also eliminates the reason for i, y ^ /y/ given under (4), 
as the distribution of y is no longer complementary to the distribution of 
n, if we break p into ny, etc. However both steps have sufficient justifi- 
cation on other grounds. 


i and u occur in V — C and C — #. However, i and u also occur in C — V, 
where they contrast with y and w. 

To compare spellings before the identification is made (first column) 
and after (second column) : 

V — V: (1) fwe 'that it be'; yiie 'kill it' /lue/, /iue/ 

C — V: (2) kwa 'for'; kiia 'to be' /kua/, /kua/ 

(3) kufikya (for kufica) 'to hide'; kusikiya 'to hear' /kuffkia/, /kusikia/ 

(4) kyiimba (for ciimba) 'room'; kiumbe 'creature' /kiiimba/, /kiumbe/ 

(5) syona (for sona) 'sew'; sioni 'I don't see' /siona/, /sioni/ 

Stress distinguishes the vocalic, and non-vocalic segments in examples 
1, 2, and 3. To the environments in which i and u are vocalic, we there- 
fore add c -^ V, and to the environments in which i and u are conso- 
nantal we add c -^ V (where the /i, u/ is unstressed). 

However, in the new spellings for 4 and 5, we lose the distinction 
between the 6 and s segments which had been phonemicized as /ky, sy/ 
(2.4 above) on the one hand, and the sequence of C plus the semi- 
vocalic member of /i/ which occurs in C -^ V : in /siona/ there is no way 
of knowing whether the first two phonemes indicate the segments si 
or the segment s (which had been phonemicized /sy/). Rather than revert 
to y and w in order to distinguish these contrasting segments, we intro- 
duce a new phoneme, written ', which occurs only above i and u in 
the position C — V, and which makes i and u consonantal. So far, we 
need it only after /k, d, s/, to represent the sounds c, i, s. We write 
tentatively: /kiumba/ for the segments cumba but /kiumbe/ for kiumbe; 
/siona/ for sona but /sioni/ for sioni. 

2.6. Stress Phoneme 

It is possible to reassign the component of stress so that we no longer 
require a stress phoneme. Non-zero stress occurs on the penultimate 
syllabic of every utterance, and on various other syllabics (but never 
on two successive syllabics) within the utterance. We now break utter- 
ances into parts such that stress position is penultimate in each of these 
parts (such parts will turn out to be 'words' in the morphology). We do 
this by placing a juncture mark # immediately after the vowel follow- 
ing the stress vowel. Then, instead of writing over V when it occurs 
before ((C) C) V# (V representing the syllabics /a, e, i, o, u, m, n/), 


we now state that stress occurs automatically on the penultimate V in 
respect to #. Stress thus becomes dependent on a phonemic word junc- 
ture #. Determining the point of juncture on the basis of the place of 
stress is possible because all words end in V, so that we have no problems 
as to whether a C following the post-stress (word-final) V is part of the 
next word or not; it always is. 
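
The dependence of stress on the juncture # can be sketched as follows. The sketch makes several simplifying assumptions: only /a, e, i, o, u/ are counted as syllabics (syllabic m and n and the non-syllabic members of /i, u/ are left out), y and w are written as consonant letters, and the stressed vowel is marked by a following ' purely for display. The function names are hypothetical.

```python
VOWELS = set("aeiou")

def stressed_index(word):
    """Index of the penultimate vowel of a word, or None if it has fewer than two."""
    positions = [i for i, ch in enumerate(word) if ch in VOWELS]
    return positions[-2] if len(positions) >= 2 else None

def mark_stress(utterance):
    """Mark automatic stress in an utterance whose words are each followed by #."""
    marked = []
    for word in utterance.split("#"):
        i = stressed_index(word)
        marked.append(word if i is None else word[: i + 1] + "'" + word[i + 1 :])
    return "#".join(marked)

print(mark_stress("umekula#nyumbani#"))
```

Since the place of stress is recovered from # alone, nothing beyond the juncture mark has to be written in the phonemic transcription. This is the sense in which stress ceases to be phonemic.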

However, the identification of i and y, u and w in 2.5 above required 
phonemic distinction of stress in the environment, as a basis for not marking 
the difference in syllabicity. If we cease marking the stressed vowel, 
we shall be writing both lue and iue as /iue/, which we would pronounce 
only as iue. We therefore require a way of indicating when i, u in C — V# 
are not stressed, and incidentally not syllabic. This need can be met by 
extending the use of the consonantizing phoneme ' of 2.5. Its distribu- 
tion is now over i and u, in some utterances, in the environments C — V, 
— V#, and — VV# (the latter is necessary for cases in which we have 
to use two unstressed semivowels before the final vowel; e.g. ifinywe 
'drink ye'). We now write /iiie#/ for fwe, /iue#/ for iue; /kua/ for kwa, 
/kua/ for kua; /kufikia/ for kufica, /kusikia/ for kusikiya; /kiumba/ 
for cumba, /kiumbe/ for kiumbe; /siona/ for sona, /sioni/ for sioni; and 
we write /mniiie/ for mnywe (mpwe).^ 

The phoneme ' is thus defined as indicating, in certain positions, non- 
syllabic segment members of /i, u/. It overlaps in effect but not in dis- 
tribution the semi-vocalic positional variants (members) of /i, u/ when 
they occur in other positions. We may say that /i, u/ have semi-vocalic 
members in the positions stated above (2.5) and under '. This phoneme 
is thus equivalent to bringing back the i-y (and u-w) distinction, but it 
has the advantage over keeping i and y in different phonemes in that it 
distinguishes them only in stated positions, which are precisely the posi- 
tions where i and y do not alternate. If i and y were phonemically dif- 
ferentiated in the other positions, the two phonemes would alternate 
there morphophonemically. 

Stress is not phonetically uniform. Over longer utterances, certain 
stress contours are distinguishable. However, any stress may, independ- 
ently of contour, be raised between one and two levels of tone (for the 
levels see 2.7 below) when the word is emphasized. In 2.7, emphatic 
stress is extracted. We therefore recognize a loudness phoneme " which 
may occur in the position of any stress, i.e. on any penultimate V. 

' When there are # plus two nasals before a vowel, it is the first which 
is vocalic. The word-bounding # were omitted in the above examples, 
except for final # in the first two. 


2.7. Automatic Sequences of Stress and Pitch 

While position of loud stress in an utterance is dependent on word 
final, and is thus non-phonemic, the varying degrees of loudness and 
height of tone which occur on penultimate vowels are not automatically 
defined by #. Utterances are occasionally distinguished from one an- 
other by contours of tone alone. Below we mark the tone-stress sequences 
by raised numerals after the stressed vowels; we distinguish four contrasting 
levels of tone and accompanying loudness, 1 being the lowest 
and weakest. 

umetoke'a nyumba^ni Are you coming from home? 

umetoke^ nyumba'ni You are coming from home. 
The most frequent contours are the following: 

3 2 (Question, with question word at end of utterance) 

utali'pa ni'^ni What will you pay? 

4 2 (Question, with question word at beginning of utterance) 

wa^pi umeku^la Where did you eat? 

1 3 (Question without question word) 

ye'ye pamo'dya nasi'ye Is he with us? 

4 1 (Command) 

ndyo'^o ha'pa Come here! 

2 1 (Statement) 

nimeku^la mgahawa'ni I ate at the coffee house. 

All these contours can be varied by the occurrence of emphatic stress " 
on any stressed vowel. 

The lengths of these contours are the lengths of minimum utterances. 
Utterances contain a succession of one or more of these contour lengths. 

2.8. Summary of Phonemes and Allophones 

The phonemes of Swahili are /p, t, k, b, d, g, f, v, s, z, l, r, m, n, h/ 
(referred to as C), /a, e, i, o, u/ (referred to as V), ' (non-syllabic), 
" (loudness), # (word divider, usually written as space), and tone-levels 
1 to 4. 

C occurs after #, V, and /m/, and before V. All C except /l, h/ also 
occur before /u/. 

/t, k, d, g, s, z, l, r, n/ occur after /n/. 

/t, d, k, g/ occur after /r/. 

/t, k, l/ occur after /s/. 

/k, d, s, n/ and rarely, /p, b, f, v, m/ occur before /!/. 

Each of the following sequences occurs uniquely, each in some one 
morpheme of Arabic origin: sx, lg, lz, lm, lx, fs, ft, fr, bd, bl. It may be 
that other Swahili speakers do not have these consonant clusters. 

Below are listed the chief members of each segmental phoneme with 
the environments in which the phoneme is represented by that member. 
If the last member in a phoneme group has no environment for it in the 
last column, then it occurs in all positions of C or V respectively (as 
given above for C, and at the head of the vowel list for V) except those 
in which other members of that phoneme occur. 






#-, -V 

— V 




— u 






# — e, i; — e, i 
— e, i 

# — a, 0, u; — a, 6, u 
— a, 0, u 




m — 



— u 

d (varies 


with dO 

n — 







— e, i 

— a, o, u 

free variant in m — 





(for , vu/ see below) 




n — 











— f, V 

— l.r 
— u| 

'" In this column ' will indicate stress and " regular lack of stress (not 
the /" ' non-syllabicity phoneme). These are not phonemic but are used 
here for convenience, instead of listing the conditions in respect to which 
they are automatic. For x, y read x or y. In the segment column ' indi- 
cates aspiration, not the back-of-mouth /"/ phoneme of Arabic words. 


Phoneme Segment Environment 

m #— C; — C(V)V#, when C is 

not /b/ 
/n/ T) — g, k 

n** — r 


N h 

Certain segments are represented by a sequence of two phonemes. We 
list phoneme sequence, segment, and the environment in which the se- 
quence represents that segment. 



V occurs after #, C, and V, and before #, C, and V. For VV, there 
are examples of every vowel occurring next to every vowel. 


environment of C 


« « « 


« « « 


u a u 


limited to one morpheme 

s. (varies freely with s" ) 





#-V, C-, V- 


1 . 


#-V, C-, V- 


a> > 

— t, s, q;t, s, q— 


(C)u-; #u- 


p, b, m, k, g, f, V— 


t, d, s, z, 1, r — 

< <a 


> (overriding 

effect , 


making all i 

i's further 


/e/ e'' — ((C)C)V, where V is u, o or i; 

• e^ -((C)C)*;i- 

'^ u — (Overriding effect, raising 

height of e regardless of other 
In e((C)C)e, the first e is opposite in height to the second. 


/o/ o^ — ((C)C)V, where V is not o; — # 

In o((C)C)o, the first o is opposite in height to the second. 

It is necessary to recognize, for the speech of our informant, a group of 
phonemes with limited C distribution. These occur in morphemes bor- 
rowed from Arabic. They are (on the basis of fn. 7 above): jB, t5, 7, t, '/• 

The first four of these occur in # — and — V. None occur before /u/; 
r I and 7/ occur after m. /'/ occurs after h, k, <S, and is an intermittently 
present phoneme. It is probable that only /9, S, 7/ are phonemes in the 
speech of most Swahili speakers. 

Members of zero: Non-phonemic segments defined by zero in stated 

w occurs occasionally in /u — V#/, /o — V#/. 

" occurs occasionally in /u — V/, /o — V/, /a — °/, /a — :#/. 

y occurs occasionally in /i — V#/, /e — V#/, /a — a, e, i#/. 

' occurs occasionally in /i — V/, /e — V/, /a — e, i/. 

" occurs occasionally in /C — Co, u/, where first C is not /m/ or /n/. 

' occurs occasionally in /C— Ca, e, i/, where first C is not /m/ or /n/. 

Stress occurs on penultimate V before # (with '\' not included in V). 

Emphatic stress occurs in position, i.e. on stressed V, and is 
written ". 

Four phonemic levels of tone occur on stressed V, arranged in contours 
or tone morphemes. 


10.0. Introductory 

This procedure breaks the usual phonemes up into long components so 
as to yield new phonologic elements, fewer in number and less restricted 
in distribution. 

10.1. Purpose: Replacing Distributionally Limited Phonemes 

We seek to express the limitations of distribution among phonemes, 
and to obtain less restricted elements. 

Even after the adjustments of chapter 9, we will find in most languages 
that various groups of phonemes have no members in various environ- 
ments : e.g. vowels will not occur in some positions, a group of consonants 
will not occur in another.' It would be convenient to develop a compact 
way to indicate these restrictions, and to bring out the similarities among 
the various limitations upon various groups of phonemes. 

Furthermore, it would be convenient for many purposes to replace the 
phonemes by a system of elements which would have no individual re- 
strictions upon their distribution.- Such extension of the freedom of oc- 
currence of our elements is impossible with the phonemes which we have 
been using, since the operations of 7-9 have gone as far as the phonemic 
contrasts of the segments permitted. The phonemes were set up so as to 
be the least restricted successive (and in some cases simultaneous) ele- 
ments representing speech. Therefore, the only possibilities for further 
analysis lie in the direction of changing our segments.^ The chief oppor- 
tunity which we can now find for changing our elements is to consider 
each segment as susceptible of analysis into simultaneously occurring 
component elements.^ 

' The operation of chapter 9 removed the exceptional limitations of 
distribution of individual phonemes. The operation of chapter 10 will in 
most cases remove or reduce the limitations of distribution of whole 
groups of phonemes. 

^ For many purposes, of course, phonemes will remain the most con- 
venient representation of speech. 

^ Something of this kind had already been done in 8-9, as when it was 
decided that instead of considering the nasal flap [n] as a troublesome 
single segment we would consider the nasal element a segment occurring 
in the environment of the flap, and the flap element a separate segment, 
occurring in the environment of the nasal. 

'' For the sound-feature considerations of simultaneous features see 
N. S. Trubetzkoy, Grundzüge der Phonologie (Travaux du Cercle Lin- 



10.2. Procedure: Phonemes Occurring Together Share a Component 

We divide phonemes into simultaneous components in such a way that 
phonemes occurring with each other have a component in common.' 
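
The requirement just stated can be sketched as a consistency check between a proposed assignment of components and the observed occurrence of phonemes with each other. This is a rough illustration under strong assumptions: occurrence is treated as an unordered relation between two distinct phonemes, a component is a bare label, and the phonemes x, y, z, the component names, and the function name are all hypothetical.

```python
from itertools import combinations

def mismatches(components, attested, phonemes):
    """List the pairs for which sharing a component and co-occurring disagree."""
    bad = []
    for x, y in combinations(sorted(phonemes), 2):
        occurs = frozenset((x, y)) in attested
        shares = bool(components[x] & components[y])
        if occurs != shares:
            bad.append((x, y))
    return bad

phonemes = {"x", "y", "z"}
attested = {frozenset(("x", "y"))}  # x occurs with y, but not with z

good = {"x": {"A"}, "y": {"A"}, "z": {"B"}}
poor = {"x": {"A"}, "y": {"A"}, "z": {"A"}}

print(mismatches(good, attested, phonemes))  # no disagreement
print(mismatches(poor, attested, phonemes))  # z wrongly shares A with x and y
```

An assignment with no mismatches states the same restrictions as the list of attested sequences, but in terms of elements which are themselves unrestricted.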

What we seek is not a division into components for their own sakes, 
but an expression of phonemic restrictions. Given a phoneme, we know 
that certain other phonemes occur next to it, and certain ones do not. 
The phoneme is therefore not independent of its environment. We seek 
these dependences of phoneme on environment, over short stretches,^ 

guistique de Prague 7, 1939); R. Jakobson, Kindersprache, Aphasie und 
allgemeine Lautgesetze (1941); Charles F. Hockett, A system of de- 
scriptive phonology, Lang. 18.3-21 §5.31 (1942). For the distribu- 
tional considerations leading to long components, and for the methods 
employed in various situations, see Z. S. Harris, Simultaneous compo- 
nents in phonologj', L.\ng. 20.181-205 (1944). For a new field of possibili- 
ties in componential notation, along the lines of chords in musical nota- 
tion, see Charles F. Hockett, Componential analysis of Sierra Popoluca, 
Int. Jour. Am. Ling. 13.258-267 (1947). 

* As will be seen below, this affords an expression of the limitation in 
distribution among the phonemes: if x occurs with y but not with z then x 
is to that extent limited in distribution (limited to occurring with y as 
against z). The componental indication of this is to say that x has a long 
component in common with y but not one in common with z (i.e. there 
is a long component one part of which occurs in x and another part of 
which occurs in y, but there is no long component shared by x and z). 
Stating the occurrence of long components is thus equivalent to stating 
limitations of phonemic distribution; but the long components can be 
dealt with much more conveniently than the statements about distribu- 

* Long dependent sequences are generally too complicated to be repre- 
sentable by components. I.e. expressing the limitations of distribution 
of a phoneme in respect to long environments would not in general yield 
new elements with greater freedom of occurrence. The limitations in re- 
spect to long environments are utilized in chapter 12, in setting up mor- 
phemes. One case, however, in which long components are established 
over long stretches is the extraction of contours in chapter 6. There we 
dealt with the limitation of distribution of, say, high and low toned vow- 
els throughout an utterance, and expressed the limitations by saying 
that all the vowels in the utterance shared in a single long component 
(a contour of various heights of tone), each vowel in the utterance bear- 
ing its respective portion of the contour. The difference between chap- 
ters 10 and 6 is comparable to that between 4.22 and 4.21 : in each case, 
the two different sections apply the same fundamental operation. But 
just as it would have been difficult to know where to apply the substitu- 
tion test of 4.22 if we had not first carried it out on repetitions of an ut- 
terance in 4.21 (see chapter 4, fn. 8), so we would have been lost trying 


and will express them by long components extending over the length of 
the dependence (phoneme and environment). Since these long compo- 
nents express the dependence, they themselves will not be subject to it, 
as will be seen below. In this way we will at one time express the re- 
strictions and also obtain elements which are themselves less restricted.^ 

The basic technique, then, is to note what sequences of phonemes 
do not occur, i.e. how each phoneme is restricted so that it does not occur 
in certain environments. These non-occurring sequences are matched 
with sequences that do occur. If phoneme X occurs with Y (XY occurs), 
but does not occur with U (XU does not occur), we say that there is a 
restriction on X (its distribution is limited so as not to include / — U/), 
and that X is partially dependent upon Y (since / — Y/ is one of the 
limited number of environments in which X occurs). It is this partial 
dependence which is expressed by long components. 
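
The bookkeeping behind this basic technique can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `restrictions` and the toy stock of four phonemes (with only the clusters /sp/ and /zb/ taken to occur) are assumptions introduced here, not part of the procedure as stated.

```python
def restrictions(phonemes, occurring):
    """For each phoneme X, split the stock into the phonemes that occur
    after X and those that do not (the restriction on X)."""
    table = {}
    for x in phonemes:
        occurs = {y for y in phonemes if (x, y) in occurring}
        lacking = set(phonemes) - occurs
        table[x] = (occurs, lacking)
    return table


# Toy stock (assumed): only the sequences /sp/ and /zb/ occur.
phonemes = ["s", "z", "p", "b"]
occurring = {("s", "p"), ("z", "b")}
table = restrictions(phonemes, occurring)
```

Here `table["s"]` records that /s/ occurs in / — p/ but not in / — b/, which is the kind of partial dependence the long components are meant to express.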

The general operation is as follows: Suppose we have, say, four pho- 
nemes, X, Y, W, U, which are such that the sequence XY occurs, and 
the sequence WU occurs, but the sequence XU does not occur. Then we 
extract from the sequence XY (or from X and Y separately) a single long 
component a which is common to both X and Y. We now say that WU 
does not contain this component, and that the sequence XY consists 
of the sequence WU plus the component a. The component a is defined 
as spreading over the sequence XY, i.e. as having a length not of one 
unit segment but of two, and it is this definition that expresses the limi- 
tation of distribution of the phoneme. For it is now no longer necessary 
to say that XU does not occur: X contains a, and a extends over two 

to decide how to break down our original segments. The extraction of long 
(contour) components from whole utterances, which was relatively easy, 
enabled us to group our segment-remnants into relatively few phonemes. 
And the limitation of distribution of these phonemes shows us how to 
extract smaller long components which will escape these limitations. 

' Determining what are the independent successions of phonemes is 
similar to the operation of chapter 6, but the different conditions here 
lead to different methods of application. Since the contours of 6 were 
dependent sequences over whole utterances, the number of successive 
segments was usually too great to make a detailed check of all depend- 
ences among the segments; instead, we sought those components for 
which only a very few sequences occurred, and by experience turned 
primarily to tone components. In the present case, we have no such guide 
as the preponderance of tone and stress among utterance contours: al- 
most any feature may occur in short dependent sequences. On the other 
hand, the sequences over which we seek dependence are conveniently 
short: two phonemes, three, and the like. 


unit lengths; therefore a extends over the phoneme following X, and if 
U follows X we obtain not a simple U, but U + a (which we define 
as Y). The length of the long component a may become clearer if the 
component is symbolized instead by a bar extending over its length.* 

If XY occurs              E.g. since /sp/ occurs in English 
and XU does not occur     and /sb/ does not occur 
and WU occurs,            and /zb/ occurs, 
we define XY = W̄Ū,        we define /sp/ = /z̄b̄/, 
          X = W̄,                    /s/ = /z̄/, 
          Y = Ū.                    /p/ = /b̄/. 

The long component is defined as the difference between XY and WU. 
In terms of articulation, we may say that the difference between /sp/ and 
/zb/ is one of voicelessness or fortisness.' 
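
The matching step and the resulting rewriting can be sketched as follows. This is a minimal sketch under stated assumptions: the names `matchable` and `extract`, and the representation of an extracted sequence as a list of residues plus a yes/no flag for the bar, are introduced here for illustration only.

```python
from itertools import product


def matchable(phonemes, occurring):
    """Yield (X, Y, W, U) such that XY and WU occur but XU does not."""
    for x, y, w, u in product(phonemes, repeat=4):
        if (x != w and y != u
                and (x, y) in occurring
                and (w, u) in occurring
                and (x, u) not in occurring):
            yield (x, y, w, u)


def extract(sequence, residue_of):
    """Rewrite a phoneme sequence as (residues, has_bar).
    `residue_of` maps each phoneme containing the component to its residue."""
    has_bar = any(p in residue_of for p in sequence)
    residues = [residue_of.get(p, p) for p in sequence]
    return residues, has_bar


# Toy stock (assumed): only /sp/ and /zb/ occur.
phonemes = ["s", "z", "p", "b"]
occurring = {("s", "p"), ("z", "b")}
quadruples = list(matchable(phonemes, occurring))

# Following the text: /s/ = barred /z/, /p/ = barred /b/.
residue_of = {"s": "z", "p": "b"}
sp = extract(["s", "p"], residue_of)
zb = extract(["z", "b"], residue_of)
```

On this toy stock the search also returns the mirror-image match (with /zb/ as XY and /sp/ as WU); which of the two sequences is treated as carrying the component is a choice of the analyst, made in the text in favor of voicelessness.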

10.3. Properties of Components 

10.31. Various Lengths in Various Environments 

The number of unit lengths over which a component extends may vary 
in different environments. E.g. if English ¯ extends over all successive 
consonants (up to # or vowel), it will occur over one consonant in /z̄ey/ 
say, and over two in /z̄d̄ey/ stay.^" 

* For a somewhat different approach: if phoneme U does not occur in 
environment X (e.g. /b/ does not occur after /s/), we select a phoneme Y 
which does occur there (e.g. /p/, as in spin), and a phoneme W in whose 
environment U does occur (e.g. /z/ in asbestos). Then we say that the 
sequence XY (/sp/) contains a long component which stretches over two 
unit-lengths, and which WU (/zb/) does not contain (/sp/ contains 
voicelessness or fortisness, lacking in /zb/). We can also say that when 
this component is extracted from XY, the residue is WU: when we add 
the voicelessness component (which we mark with ¯) to /zb/ we get 
/sp/, i.e. /z̄b̄/ = /sp/. 

^ The operation of 10.2 thus enables us to select the feature of speech 
which is distributionally relevant in the distinction between /p/ and /b/: 
it is that feature which we can say is also represented in the distinction 
between /s/ and /z/. That feature would be voicelessness or fortisness 
rather than aspiration, thus supporting our assignment of the unaspirated 
[p] segment which occurs after /s/ to /p/ rather than to /b/. 

^" When XY = W̄Ū, if X also occurs by itself, we may say that X = 
W̄ and that the bar component extends over the next unit length but 
without effect since there is no segment there. In many cases X also oc- 
curs in the environment of segments which we do not wish to analyze as 
being dependent on X, i.e. XZ occurs where we do not wish to analyze Z 
as equalling some other phoneme V plus the bar component. This may 
be the case when there is no V to spare (such that XV does not occur) 
or when there is no convenient distributional connection between Z and 


10.32. Various Definitions over Various Segments 

The speech-feature definition of a component may vary over different 
parts of its length. E.g. if ¯ extends over all successive consonants, it is 
defined as representing voicelessness only over what we may call stops 
and spirants. When it extends over a cluster which includes /r, l, m, n, 
w, y/, it is defined as indicating zero (i.e. no phonemic difference) over 
those phonemes: if we write /z̄d̄r̄ey/ for stray and /drey/ for dray, we 
have /r̄/ = /r/ whereas /d̄/ does not equal /d/." 
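
The two properties just described (a length that depends on the environment, and a definition that differs over different segments) can be sketched together. This is an illustrative sketch only: the function `realize`, the small residue tables, and the ASCII spelling of the forms are assumptions introduced here.

```python
# Assumed tables: where the bar means voicelessness, and where it means zero.
VOICELESS = {"z": "s", "b": "p", "d": "t", "g": "k"}
ZERO = set("rlmnwy")
VOWELS = set("aeiou")


def realize(residues, bar_starts):
    """Spell out phonemes from residues plus bar components.
    `bar_starts` holds the indices at which a bar begins; each bar extends
    over all successive consonants, up to a vowel or the end of the form."""
    out = list(residues)
    for start in bar_starts:
        i = start
        while i < len(out) and residues[i] not in VOWELS:
            r = residues[i]
            if r not in ZERO:
                # Voicelessness over stops and spirants; zero elsewhere.
                out[i] = VOICELESS.get(r, r)
            i += 1
    return "".join(out)


say = realize("zey", [0])      # bar over one consonant
stay = realize("zdey", [0])    # bar over two consonants
stray = realize("zdrey", [0])  # bar over three, with zero effect on /r/
dray = realize("drey", [])     # no bar
```

The length of the bar is never stated in the call; it follows from the environment, which is the point of defining the component by its domain.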

If a language has consonant or vowel clusters of fixed length, a long 
component could indicate the limits of the cluster by extending over it, 
and having in the last unit-segment of the cluster the definition of 'clus- 
ter end.''^ 

the proposed V. E.g. /s/ (our X) occurs next to all the vowels (our Z); 
but there are no phonemes which don't occur with /s/ and which we could 
identify as vowels minus the component (there is no V such that V̄ = Z), 
because all the phonemes which don't occur next to /s/ have already 
been matched up with consonants that occur next to /s/: /p/ = /b̄/, 
etc. In such cases we say either that the bar component stops when it 
gets to Z, so that XZ = W̄Z (without the bar extending over), or else 
that the bar has zero effect over Z, so that Z̄ = Z and XZ = W̄Z̄. It 
may be noted here that the environment of a component is not only the 
phonemes or components next to it, but also the components (or seg- 
mental remnants) with which it is simultaneous. 

" In this case, /r/ indicates different particular segment members of 
the /r/ phoneme, since the members of /r/ which occur immediately 
before or after voiceless consonants (and which would therefore have 
the ¯ extending over them) are devoiced toward their end or beginning. 
If components are extracted directly from the various segments, without 
going through a prior complete grouping into phonemes, the partial de- 
voicing would be the definition of ¯ when it is over [r]. In general, it is 
not essential that the speech-feature representations of a long component 
be identical in all portions of its length. It is essential only that the speech 
feature represented in one portion be limited in its occurrence to the 
occurrence of the speech feature represented in the next portion of the 
component. It is, of course, easier to recognize this limitation when the 
features in question are identical, i.e. when the long component records 
the presence of an observed speech feature such as voicelessness through- 
out its length. 

'^ The fact that English has morpheme-medial clusters like /rtr/ (par- 
tridge) but never like /trt/, could be expressed by saying that all English 
consonants contain a long component which extends over all successive 
consonants (within a morpheme), and which is defined to indicate 
'vowel' when it is preceded by any continuant which is in turn preceded 
by a stop. I.e. any unit segment over which this component extends, and 


10.33. Extension of a Component 

The succession of unit segments over which a component is defined 
may be called its extension (or domain, or scope). 

Long components may extend from one juncture to another. Thus if a 
Navaho word has any of the phonemes /š, ž, č, ǯ, č'/ it will have none of 
/s, z, c, ʒ, c'/, and vice versa: zas 'snow', §5."2 'joint', 31-ca h 'he is big', 
''ai-d'a"h 'it has fallen in the fire'. We extract ˇ as a component extend- 
ing over all the phonemes between any two word junctures, and de- 
fined as indicating tongue blade approach (to the alveolar ridge) on 
/s, z, c, ʒ, c'/ and zero on all other phonemes. Then we have /#zas#/, 
/#sa-zv#', #3i-cah#/, ,'#'?az-c'^"hv#/." 
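
A component whose domain runs from one word juncture to the next can be sketched as below. This is a hedged illustration: the function `realize_utterance`, the symbol table `BLADE`, and the second toy word are assumptions introduced here, and the forms are simplified stand-ins rather than Navaho transcriptions.

```python
# Assumed mapping: residue -> phoneme when the word-long component is present.
BLADE = {"s": "š", "z": "ž", "c": "č", "ʒ": "ǯ"}


def realize_utterance(utterance, marked_words):
    """`utterance` is a string of residues with '#' as word juncture;
    `marked_words` is the set of word positions that carry the component.
    The component acts on the sibilant residues and is zero elsewhere."""
    words = utterance.strip("#").split("#")
    out = []
    for i, word in enumerate(words):
        if i in marked_words:
            word = "".join(BLADE.get(ch, ch) for ch in word)
        out.append(word)
    return "#" + "#".join(out) + "#"


# Two word domains; only the second carries the component.
result = realize_utterance("#zas#saz#", {1})
```

Because the domain is bounded by junctures, marking the component once per word is enough, and its presence in one word says nothing about the next.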

It is of course convenient if the extension of a component is from one 
juncture to another, for the statement of the boundaries of the com- 
ponent is then simpler.'* 

which is preceded by stop + continuant, can only indicate some vowel. 
Such a component, included in all the segments of the morpheme, would 
admit consonantal indication to the segment which follows the /r/ of 
curtain, or those which follow the /t/ of ostrich or of partridge (the latter 
two are preceded by continuant + stop). But it would require the seg- 
ment which follows the /tr/ of mattress to indicate a vowel; hence the 
sequence /tr/ + consonant will not occur. 

'^ See Harry Hoijer, Navaho Phonology 11-4 (University of New Mex- 
ico Publications in Anthropology 1945). The domain of the ˇ component 
is only within word boundary; compare #ca''aszi"'*#bi70sigi'-'44^/ 'a 
yucca, whose spines . . .' where the '^ affects the s of the second word, 
but not the phonemes of the first word (for the form, see Edward Sapir, 
Navaho Texts, edited by Harry Hoijer 46 (1942)). Extracting the ˇ elimi- 
nates some morphophonemic statements, since we have members of the 
same morpheme appearing with and without "^ depending on the presence 
of '^ elsewhere in the word: dez-ba'' 'he has started off to war'; dez-'^ki 
'they have started off.' We write #dez-ba"'# and #dez-'^a'z^#, . 
The morphophonemic considerations give the preference for place of writ- 
ing the ˇ mark, since although the ˇ extends over the whole word, it is a 
phoneme of the last morpheme in the word which contains any of /s, z, 
c, ʒ, c'/. (I.e. the phonemes of this set which occur earlier in the word as- 
similate to the ˇ or lack of ˇ of the last morpheme in the word.) 

^* A similar case occurs in Moroccan Arabic, where a word containing 
/š/ or /ž/ will not have /s/ or /z/ within it, and vice versa. We extract ˇ 
as a component extending over all the phonemes between any two word 
junctures, and defined as indicating tongue-curving on /s, z/ and zero on 
all other phonemes. Then we have /# iams#, 'yesterday', #^suf# 
for the previous phonemic suf/ 'see', , ^t'^ssr'zam^/ for our previous 
/§§r'J8m/ 'the window', #''sft#''ssr'z9m#iams#/ for phonetic 
[saft assar'iam iams] 'I saw the window yesterday'. Since the component 
is set up to express a limitation in phonemic distribution, and since it is 


10.4. Complementary Long Components 

Various long components may be found to be complementary to 
each other, and may then be grouped into one long component in a man- 
ner analogous to the Appendix to 6.5 and to 7.3. 

This is involved in the very frequent cases of a component which is 
present in a whole class of substitutable phonemes. Thus one component 
was extracted from the /sp/ of spill, etc. (where there is no /sb/); an- 
other component was independently extracted from the /st/ of still, 
etc.; and a third from the /sk/ of skill, etc. The environments of these 
three components (in this case, the segmental remnant occupying their 
second unit length) are complementary. It is therefore possible to group 
the three components into one component element (indicated by a bar), 
marking that element by an identical bar in all the cases: /sp/ = /z̄b̄/, 
/st/ = /z̄d̄/; /sk/ = /z̄ḡ/.'^ 

A slightly less trivial grouping of complementary components is seen 
if we extract from English clusters (primarily in morpheme-medial 
position) a component of voicelessness. This is possible because mixed 
voiced-and-voiceless clusters (of consonants which have voiced-voiceless 
homo-organic pairs) do not occur here: we have /bd/ in hebdomadal and 
/pt/ in apt, but no /bt/ or /pd/ within a morpheme. It is therefore 
possible to analyze /pt/ as /b̄d̄/. This component is complementary to 
the one which differentiates /sp/ from /zb/, etc., and may therefore be 
grouped with it. We thus obtain a voicelessness component the domain 
of which is all consonant clusters which do not cross morpheme boundary. 
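
The condition that makes this extraction possible, namely that no cluster mixes voiced and voiceless members of the homo-organic pairs, can be checked mechanically. The sketch below is illustrative only: the pair table and the function names `is_mixed` and `extractable` are assumptions introduced here, and the cluster lists are toy data.

```python
# Assumed voiced-voiceless homo-organic pairs (voiceless -> voiced).
VOICED_OF = {"p": "b", "t": "d", "k": "g", "f": "v", "s": "z"}
VOICELESS = set(VOICED_OF)
VOICED = set(VOICED_OF.values())


def is_mixed(cluster):
    """True if the cluster mixes voiced and voiceless members of the
    pairs; consonants outside the pairs (e.g. /r/) are ignored."""
    paired = [c for c in cluster if c in VOICELESS or c in VOICED]
    has_voiceless = any(c in VOICELESS for c in paired)
    has_voiced = any(c in VOICED for c in paired)
    return has_voiceless and has_voiced


def extractable(clusters):
    """A voicelessness component over whole clusters is possible only if
    no occurring cluster is mixed."""
    return not any(is_mixed(c) for c in clusters)


attested = ["bd", "pt", "tr"]          # hebdomadal, apt, and a cluster with /r/
ok = extractable(attested)
not_ok = extractable(attested + ["bt"])  # a hypothetical mixed cluster
```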

Any English component which extends only over vowels, or over con- 
sonant clusters which do not contain consonants which are members of 
the voiced-voiceless homo-organic pairs, or over clusters which always 
cross morpheme boundary, would be complementary to the voicelessness 
component and could be grouped with it, and marked by the same bar. 

defined to extend over the whole succession of phonemes any of which 
are involved in this limitation, it follows that the occurrence of a com- 
ponent in one domain is independent of its occurrence in any other do- 
main. Thus /#'^sft#^ssr'zam#iams#/ contains three successive do- 
mains of ˇ. In the first two domains ˇ occurred, and was independently 
noted; in the third it did not. 

'•■^ It is advisable to establish the identity of the three original com- 
ponents in this way, even though they all have the same speech-feature 
definition, because we cannot test the substitutability of these com- 
ponents after the manner of 4.22 since these components are only features 
of segments, not whole segments. 


10.5. Reducing Whole Phonemic Stocks into Components 

Given the stock of phonemes, each with its limitations of occurrence, 
for a particular language, we can proceed to extract the long components 
by asking what sequences of phonemes which occur can be matched with 
non-occurring sequences.'* From each such occurring sequence, or from 
each of the more general types (where a whole series of occurring se- 
quences is matched with a corresponding series of non-occurring se- 
quences), we extract a long component. The phonemes from which the 
component has been extracted have thereby lost part of their speech- 
feature definition; and two phonemes which were previously differen- 
tiated only by this feature are now identical. If we originally had four 
phonemes /s, p, z, b/, and if we extracted the voicelessness component 
from the sequence /sp/ (and from /s/ and /p/ when they occur without 
each other''), so that /sp/ = /z̄b̄/, we no longer have four elements 
but three: /z, b, ¯/. 

The number of post-extraction phonemic elements, i.e. components 
and segmental remnants (which may be termed residues) is thus smaller 
than the number of original phonemes. 
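
The count can be made concrete with a small sketch. The function name `reduce_stock` and the use of the mark "¯" as a list element standing for the component are assumptions introduced here for illustration.

```python
def reduce_stock(phonemes, residue_of, component="¯"):
    """Return the post-extraction elements: the distinct residues, plus
    the component itself if it was extracted from any phoneme."""
    elements = []
    for p in phonemes:
        r = residue_of.get(p, p)
        if r not in elements:
            elements.append(r)
    if any(p in residue_of for p in phonemes):
        elements.append(component)
    return elements


# Four phonemes; /s/ and /p/ are rewritten as barred /z/ and barred /b/.
elements = reduce_stock(["s", "p", "z", "b"], {"s": "z", "p": "b"})
```

The saving grows with the number of pairs the component distinguishes: each further pair removes one more phoneme while adding no new element.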

When we have expressed by means of components the restrictions 
upon distribution of all the phonemes of a language, we may find that all 

'* We may approach the problem by asking what speech features (or 
sequences of speech features) are such that if they occur over one pho- 
neme (in a particular position) they will always occur over its neighbor, 
too (or over some farther-removed phoneme). E.g. in English consonant 
clusters within a morpheme, if one component is voiceless so will the 
others be (not counting those which have no voiceless counterpart); but 
if one of them is a stop the others will not necessarily be so (there is /pt/ 
in apt, but /ft/ in after). Therefore, voicelessness will be representable by 
a long component, while the stop feature will not. 

In seeking which limitations of distribution may best be expressed by 
components, it is often convenient to begin with the more obvious limita- 
tions of clustering, vowel harmony, and the like. Useful signposts may be 
found in relations between the morphemic alternants of chapters 13 and 
14, such as are included under the terms morphophonemics, assimilation, 
and dissimilation. 

" One of the major difficulties in deciding whether to extract a com- 
ponent is the requirement that if we extract a component from the se- 
quence /XY/ by saying that it equals /W̄Ū/, we must extract it from 
/X/ and from /Y/ even when they are not in this sequence. I.e., we must 
always replace /X/ by /W̄/ and /Y/ by /Ū/. We can do this with the 
aid of such techniques as are mentioned in 10.31-2 and fn. 10 above; but 
it will often be hard to decide how much is gained or lost from writing 
/X/ as /W̄/ even when it occurs alone. 


or most of the phonemes have been reduced completely to components, 
each component representing one or more of the distributional charac- 
teristics of the phoneme. English /s/ may no longer be component ¯ 
plus residue /z/. Rather, /s/ may be a simultaneous combination of 
various long components, and /z/ may be another combination; with /s/ 
and /z/ differing from each other at least in that /s/ contains the voice- 
lessness component while /z/ does not.'* 

In many cases it may be impossible to express all the restrictions in 
terms of components. Some of them may conflict with each other in such 
a way that componental treatment of one precludes the other. In this 
case a number of special statements would have to be made about the 
restrictions on certain phonemes or residues (or combinations of com- 
ponents) which remain after the components have been extracted. 

10.6. Result: Components of Various Lengths 

We now have a group of new elements, long components formed from 
the phonemes on the basis of their restrictions. These elements represent 
features of speech, and have the length of more than one unit segment, 
but not necessarily of a whole utterance. Combinations of components 
plus their residues, or unit-length sections of combinations of these com- 
ponents alone, equal our previous phonemes.'^ 

A single component often supplants several phonemes. E.g. the use of 
¯ eliminates /p, t, k, f, θ, s, š/, since /s/ = /z̄/, etc. Each component 
eliminates at least one phonemic restriction, since it is on that basis 
that the components are set up. E.g. the writing /z̄b̄in/ for spin and 
/æz'bez̄d̄əz̄/ for asbestos eliminates the need for the statement that /b/ 
does not occur after /s/: there is now no /s, p/; and /z, b/ occur freely, 
with ¯ free to occur or not occur over them. The gain, of course, lies in 
the fact that if ¯ occurs at all it must occur over the whole sequence 
(of one or more consonants). The new requirement of having to state the 
length of a component (not only in number of unit lengths but in terms 
of explicit domain) is the cost of eliminating the phonemes or restric- 

The usefulness of componental analysis is not that it yields a new, and 

'** For the reduction to components of the consonant stock of a lan- 
guage, see the Appendix to 10.5. 

'^ I.e. particular combinations of component-marks (in stated environ- 
ments) identify particular phonemes, and the components are given such 
speech-feature definition as will make the coincidence of their representa- 
tions equal the speech-feature definition of the particular phoneme which 
they identify. 


more complicated, method of indicating each phoneme (as a particular 
combination of components), but rather that it yields a system of base 
elements in terms of which the distribution of descriptively distinct 
sound features can most simply be identified. The test of usefulness of 
the analysis is that phonologic statements about utterances should be 
much simpler when couched in componental terms than in phonemic 
terms. This is possible because of the way components have been set up. 
The components replace not only the phonemes but also the limitations 
of phonemic distribution. They do so by the manner in which they are 
defined. The phonemes had been defined to represent particular segments 
in particular environments, the relation of segment to environment being 
always the same: the occupancy of a unit length within a succession of 
unit lengths. In contrast, the long components are also defined to repre- 
sent particular residues or components in particular environments, but 
the relation of element to environment is no longer the same in all cases: 
the element may be any number of unit lengths, and it occurs simul- 
taneously as well as sequentially in respect to other elements. In this 
way, the definition of each component expresses the distributional rela- 
tion of one element to the environmental elements, which is thus elimi- 
nated from further discussion. 

Components are therefore useful primarily when they are fully defined 
as to their various lengths and domains in various environments, and 
when utterances written componentally take full advantage of all ab- 
breviations permitted by the component definitions, rather than spell 
out the successive phonemes in componental representation. For ex- 
ample, if ¯ is defined as a cluster-long devoicing component, we do not 
have to specify its length in each environment, since the length is deter- 
mined by the environment: if it is more convenient, we can as readily 
write /æz'bez̄dəz̄/ as /æz'bez̄d̄əz̄/ for asbestos.^° 

^^ Components can also be so set up as to make phonemically different 
alternants of one morpheme turn out to be componentally identical. 
When written componentally, then, the morpheme does not have differ- 
ent alternants, and a morphophonemic statement is thus avoided. An 
example of this is seen in fn. 13 above, where the basis for identical com- 
ponental writings is the fact that a component of one morpheme is so 
defined as to extend over another morpheme which itself does not con- 
tain the component. (The domain of the Navaho "^ is the word, and since 
de'z has the alternant de'i only when the last morpheme of its word con- 
tains V, it is possible to leave the occurrence of "^ in de'^ unmarked, thus 
writing it identically with de'z). For somewhat different cases, see Z. S. 
Harris, op. cit. in fn. 4 above, pp. 195-6. 


Appendix to 10.2: Phonemic Status of Long Components 

The methods of chapter 10 show that the long components, like the pho- 
nemes, are determined not on the basis of any absolute considerations, but 
relatively to each other. The components may indeed be viewed not as new 
elements, but as symbols for relations among phonemes, much as pho- 
nemes are symbols of relations among segments. When we supplant the 
stock of phonemes of a language by a smaller stock of long components, 
we have in effect broken down the distributional interrelations (mutual 
restrictions) of each phoneme into partial restrictions (in respect to par- 
ticular other phonemes) which are independent of each other and the 
sum of which constitutes the total limitations of occurrence of that 
phoneme (in respect to all other phonemes).^' 

The original grouping of segments into phonemes was designed to 
express the contrasts among the segments. Distinct phonemes were to 
represent contrasting segments. However, we often find that there are 
fewer contrasts in one position than in another: /p/ contrasts with /b/ 
after # but not after /s/ (pin, bin, spin). This is the source of the re- 
strictions upon phoneme distribution. It is therefore a step forward to 
redefine phonemes in such a way that /A/ is distinct from /B/ only in 
the environments where [A] and [B] contrast. This was done, for ex- 
ample, in the Appendix to chapters 7-9, section 2.6 (paragraph before 
last), where /y/ was redefined as /i/, and was thus distinguished from 
/i/ only in those positions where [y] and [i] contrast (these being the only 
positions where /'/ was defined). Such redefinitions are readily obtained 
by means of components: Swahili /d/ and /r/ are distinguished compo- 
nentally only in those positions in which they contrast (fn. 27, 29 be- 
low); Navaho /z/ is not distinguished from /ž/ in the same word (10.33); 
/p/ is not distinguished from /b/ after /s/ (10.2). 

The setting up of long components is equivalent to making distribu- 
tional analyses of sequences of segments, rather than of single segments. 
If unit length had not been determined in chapter 5, and if we had been 
willing to deal with any segments of any length, we might then have 
considered restrictions of succession among these segments, and have ar- 
rived at long components supplanting the varied original segments. This 
would have been a far more complicated task than following the series 
of intervening procedures presented here. 

^' Just as the phonemes of a language could be described as marking 
the independent distinctions among utterances (4.31), so the long com- 
ponents of a language can be described as marking the independent re- 
strictions among these phonemes. 


When all the phonemes of a language have been expressed as combi- 
nations of components, the components constitute a distributionally pre- 
ferred set of basic elements for linguistic description. Defining /z, b, 
d, ¯/ (or a reduction of these residues to components) as the new ele- 
ments, in the place of /z, b, d, s, p, t/ makes for a simpler spelling of forms 
like asp and asbestos: /æz̄b̄/ and /æzbez̄d̄əz̄/ instead of /æsp/ and 
/æzbestəs/. 

The components differ from the phonemes (or residues) both in the 
variety of their lengths and in the fact that various numbers of them may 
occur over any one unit length. In writing utterances by means of pho- 
nemes, or in discussing the distribution of phonemes, only the sequence 
of phonemes mattered: it followed from the definition that every pho- 
neme occupied only one unit length and that in each unit length only one 
phoneme occurred. In the case of components, there is a choice of meth- 
ods of combining (e.g. that every unit length shall have not more than 4 
components over it). If we state that in a given language all combina- 
tions of the components occur, we must specify within which method 
of combining this holds. 

Appendix to 10.5: Component Analysis of Swahili 

As an example of how the whole phonemic stock of a language may 
be supplanted by a smaller stock of less restricted components, we con- 
sider the phonemes of Swahili as obtained in 2.8 of the Appendix to 7-9. 
The representation below is only one of many possible ways of analyzing 
the phonemes into components. '^^ 

The list of phonemes given there contains 25 segmental phonemes (5 of 
them vowels, and 5 of the consonants being restricted to words of Arabic 
origin), 1 juncture, and 6 suprasegmental phonemes. 

The suprasegmental phonemes are the result of extracting tone and 
stress features from all the segmental phonemes (in particular, from the 
vowels), and consonant features from two of the vowels in certain posi- 

The major privileges of occurrence of the segmental phonemes are: 

All consonants occur in /# —/, /V —/, /— V/. 

"^^ The particular distribution and number of Swahili consonants (ex- 
cluding most of the Arabic ones) makes a complete componental analysis 
clearer and easier than in many other languages. In the following anal- 
ysis, geometrical marks are used instead of alphabetic or numerical 
marks, not out of any attempt at a 'visible speech' writing but only in 
order to show the varying lengths of our components, indicated by 
lengthening the geometrical marks. 


All except /θ, δ, t, '/ occur in /m —/. 

/t, d, k, g, s, z, l, r, n/ occur in /n —/. 

/t, d, k, g/ occur in /r —/. 

/t, k, l/ occur in /s —/. 

Unique clusters (in single Arabic words): /sh', lg, lz, lm, lh', fs, ft, 
fr, bd, bl/. 

All vowels occur in /#—/, /C—/, /V—/, /—#/, /—C/, /—V/. 

The major limitation in the freedom of occurrence of segmental pho- 
nemes is in consonant clustering.'-^^ Therefore, we first seek a representa- 
tion for the consonants and their limitations of distribution. Of the 20 
consonant phonemes, it is convenient to omit from first consideration 
the two (t, ') which are probably restricted to the relatively few speakers 
who know some Arabic. Of the remaining 18, all but two, /θ/ and /δ/, 
occur after /m/. We consider first the remaining 16. All of these^'' occur 
as second members of a cluster, but only 4 of them occur also as first 
members, hence we may best consider the clusters in groups depending 
on what consonant is the first member. There are four such groups, with 
/m, n, r, s/ as first members respectively. For each of these 4 cluster 
types we will want to have a long component (or a combination of long 
components) which will extend over the whole cluster. Since there are so 
few first members, we can have these 4 consonants marked by the long 
components alone, without residue,^^ while the consonant which follows 
them in the cluster is marked by the component (or combination of 
components) which indicates the first member, plus a differentiating 

vSince /m/ occurs before all 16 consonants, it should have a component 
in common with all of them. However, we want the components not only 
to express the clusters which occur, but also to differentiate the pho- 
nemes. A component which occurred over all 16 consonants could not 
serve to differentiate one from the other. Therefore, we mark /m/ by 
the component zero,26 defined to extend over the whole consonant cluster:
componental /V V/ represents /VmV/, and componental /V CV/
represents /VmCV/.

23 E.g. there is no limitation in /* — V/, where every C occurs; but
there is great limitation in /* — C/, where only four consonants occur.

24 The only phoneme restricted to Arabic words which remains in
this group is /γ/.

25 This does not mean that each of these four will be marked by a different
single component. Some of them can be marked by special combinations
of the components which singly mark the others.

26 Writing this zero component by a space between letters will not conflict
with the space usually printed between words, for in the present

Since /n/ occurs before about half of the 16 consonants, we represent
/n/ by just one component, and say that that component extends over 
the whole cluster in which it occurs: /V V/ represents /VnV/, and 
/V CV/ represents /VnCV/. This bar component can eliminate 8 of
the 16 consonants, since we can differentiate half of the 16 from the
other half by use of it: e.g. we can write /p/ instead of /k/. Since 8 phonemically
different consonants occur after /n/,27 we match these 8 (which
include /n/) against the other 8, and say that the 8 which occur after 
/n/ include the bar component plus a residue; the 8 residues can be 
simply the 8 phonemes which do not occur after /n/. Then /n, s, z, l,
d, t, g, k/ will all contain the bar component. Following 10.2 we write
/nk/ = /l)/; since /np/ does not occur and /mp/ occurs, and since
we are writing /m/ componentally as zero, this is equivalent to saying 
that /nk/ = /mp/ + the bar component. Similarly /VnnV/ = /V V/. 
In componental terms there is no distinction between /p/ and /k/, or 
between /m/ and /n/, when these occur after /n/; since /p/ and /k/, 
or /m/ and /n/, are now identical except for the bar, and the bar is 
necessarily present in the position after /n/. This is as it should be, since 
/p/ and /k/ did not contrast, nor did /m/ and /n/, in the environment

Swahili analysis the phonemic juncture between words is marked not by 
space but by \. It may seem peculiar to use space not for word boundary
but for a sound. However, our marks are phonemic, not phonetic, and 
are therefore selected so as to express phonemic relations. In Swahili, it 
is /m/ that has least environmental limitations (and is marked by zero), 
whereas word boundary has greater restrictions. 

27 A problem arises here since /n/ occurs before 9 phonemes. However,
it happens that /nr/ occurs only initially and never in our material be- 
fore /o/, while /nd/ occurs chiefly medially, with its only initial occur- 
rence in our material being before /o/. If this difference is not erased by 
later material, therefore, /nd/ and /nr/ do not contrast. In phonemic 
writing it is convenient to distinguish them, since /d/ and /r/ contrast 
otherwise. However, the analysis into components is designed to show 
exactly what sequences occur, so that it is permissible to identify /nd/ 
and /nr/ in this analysis, and thereby reduce the number of phonemes 
after /n/ to the desired 8. This reduction is supported by the consider- 
able similarity between /nd/, representing the segments [nd] and [nd'], 
and /nr/ which represents [n'^r]. We thus have /V d/ = /Vnd/ = 
[Vnd];/#— do/ = /#ndo/ = [#nd'o]; /#-de/ = /#nre/ = [ttn-^re]. 


We next consider clusters of /r/ plus consonant, /r/ occurs before 
4 consonants, all of which also occur after /n/. We therefore write pre- 
consonantal /r/ as a combination of 2 components ~ and say that this 
combination extends over the whole cluster.28 Then /t, d, k, g/, which
follow /r/, must also contain this combination. Since /d/ does not occur 
before any other consonant, and since we have given the same com- 
ponental writing to /nd/ and /nr/, we can write /d/ as ™ and say that 
when these components are the first part of a cluster they represent /r/,
and when they occur before a vowel they represent /d/.29 If /v/ is represented
by the component / and /g/ by /, we say that ^ followed by

28 But " when not accompanied by does not extend.

29 In keeping with 10.2, /r/ should contain any component which
is common to all the consonants that follow it. One of these components 
is , since each consonant which occurs after /r/ also occurs after /n/. 
Since the bar component differentiates the 8 post-/n/ consonants 
from the 8 non-post-/n/ ones, and since cluster-initial (pre-consonantal) 
/r/ has to contain this component, then pre-consonantal /r/ should be 
identified with one of the post-/n/ consonants. But /r/ is not listed as 
one of the phonemes which occur after /n/, since /nr/ has been com- 
ponentally identified in fn. 27 with /nd/; i.e. /r/ after /n/ has been writ- 
ten /d/. We now see that the /r/ which occurs as first member of a cluster 
must also be componentally equated to the /r/ = /d/ which occurs 
after /n/. This is possible because /r/ and /d/ contrast neither after /n/ 
nor before consonant. However, /r/ and /d/ do contrast after vowel or # 
or /m/, and in that position they must be written differently from each 
other. Therefore, we cannot write /r/ in /# — /, /V — /, /m — ■/ with the 
sign used for /r/ and for /d/ in /n — ■/, / — C/. All this is only an apparent 
confusion, due to the fact that in this case the components require a dif- 
ferent grouping of segments than did the phonemes. The grouping of [r] 
and [d] segments directly into components is relatively simple. If we 
write/-^/ for [d] in /#— /, /V— V/, /m— /, and /"/ for [r] in these 
same environments (where the two segments contrast), and if we define 
the component combination /"/ as extending over a whole con- 
sonant cluster, we may then write this same /"/ for the following seg- 
ments in the following environments (in none of which it contrasts with 
the previously defined /~/): for [r] in /V — CV/; for [d] in / — ■, 
~~ — / (i.e. after /n/ or after itself); for [d"'] in /# — o/; for [''r] in
/# — V'/ (V' = vowels other than /o/). Thus / / represents segments
which had been grouped into the /r/ phoneme; while /'^/ represents
segments which had been grouped some in the /r/ phoneme and
some in the /d/ phoneme, but which were complementary in distribution. 

Phonemically, /r/ = /d/ after /n/ or before C, while it equals /r/ in
other positions. This partial overlapping in phonemics would have led to 
morphophonemic statements, since prefix n- plus -refu 'friend' would 
have had to be written /ndefu/ (pronounced n''refu). But no morpho- 
phonemic statement is required by our new writing, since followed 
by " would in any case be =^^^=^^ i e. /nr/ and /nd/ are identical. 


either or y is in either case '^ = /rg \ so that /rv/, which does not 
occur, cannot be written.30 -^ followed by zero is ^"^^, i.e. /rd/.31

/s/ occurs before 3 consonants, all of which also occur after /n/. We
write /s/ as the combination ^ , which extends over the whole cluster,
and which must also be contained in the component writing for /l, t, k/.
If /v/ is , and /l/ is / then " followed by / or / will in either
case yield the sequence / = /sl/, so that a distinct /sv/ can not occur

in component writing. Since ~ before zero yields which should 

indicate the non-occurring sequence /ss/, we may define the two-unit 
to indicate the sequence /sh'/ which occurs in a single morpheme.32

We now have three components, , "~, . extends over the whole 
cluster, and so do ~" and ' when they occur with it. These suffice to 
indicate all the limitations upon consonant clusters,33 but they do not
suffice to differentiate all the 16 phonemes, if each component is to occur
not more than once over any unit length. Thus /t/ and /k/ must
each include the components -^ because they each occur after , ^, ~. 
One of them must have a residue to distinguish it from the other. Simi- 
larly, /d/ and /g/ must each include ^, since each occurs after , =, 
but one of these two must have a residue to distinguish it from the other. 
Since the residue distinguishing /t, k/ is complementary to that dis- 
tinguishing /d, g/ (the former occurs with ~- and the latter with ~), 
we may use one mark for both residues. One unit-length residue, together 

30 Because of the extension of " over a whole cluster, /rg/ is ~ +
p? _ ~-p;^ and /rv/ is ~ + / = y Hence, /rv/ in components is
identical with /rg/ and the distinction between the two (which does not
exist, since /rv/ does not occur) cannot be made componentally.

31 I.e. /rd/ = /nn/ + '~ '; or /mm/ + /™/. Clusters like /rn/ or
/rm/ which do not occur cannot be written.

32 /'/ is one of the phonemes excluded from the selected 16.

33 The clusters /lz, lg, lm/ occur once each in the material on which
this analysis is based. If it was found desirable to include them, we could
define the combination /, which is necessarily the representation for
/l/, as extending over a whole cluster (i.e. ; would extend over a cluster,
but only when it occurs with ~). Then / + zero would yield the
two-unit length '^ (which could be defined to represent /lm/ instead of
/ll/), and / + ~~ would yield Jp (which could be defined to represent
/lg/ instead of /lk/). No other distinct combinations would be possible,
since / plus anything else would equal one or the other of these
two: e.g. / + 7 would merely yield ^^ over again. Therefore, /lz/
could be taken instead of /sx ' as the definition of the two-unit-length
'^^ noted above.


with the three long components, will suffice to differentiate all 16 consonants.

We can represent all 16 as different combinations of these four ele- 
ments (each occurring once or not at all over each unit length) : one pho- 
neme will be zero (no component), 4 phonemes will be one different com- 
ponent each, 6 phonemes will be composed of different combinations of 
2 components, 4 phonemes will consist of 3 components each, and one 
phoneme will consist of all four components together. This means that 
whatever combination of components we can make will represent some 
phoneme for each unit length. 
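
The count just given (1 + 4 + 6 + 4 + 1 = 16) can be verified by enumerating every combination of four elements. The sketch below is illustrative only; the labels `c1`, `c2`, `c3`, and `residue` are hypothetical stand-ins for the three long components and the unit-length residue, not the geometrical marks used in the analysis.

```python
from itertools import combinations

# Hypothetical labels for the three long components and the residue.
COMPONENTS = ("c1", "c2", "c3", "residue")

def all_combinations(components):
    """Yield every subset of the components, from none up to all of them."""
    for size in range(len(components) + 1):
        for combo in combinations(components, size):
            yield frozenset(combo)

combos = list(all_combinations(COMPONENTS))

# Number of combinations containing 0, 1, 2, 3 and 4 components.
counts_by_size = [sum(1 for c in combos if len(c) == k) for k in range(5)]

print(len(combos))       # 16
print(counts_by_size)    # [1, 4, 6, 4, 1]
```

Since each component occurs once or not at all over a unit length, the 16 subsets are all distinct, which is why every possible combination can be assigned to some phoneme.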

Starting with phonemes like /t, k/ whose components (except for one 
differentiating residue) are determined by the clusters into which they 
enter, and ending with phonemes like /v, b/ to which we can assign any 
combination which does not include , ™, ~ (since /v, b/ do not occur 
after these), we can identify the phonemes as follows:34

/m, n, r, s/ can now occur before any consonant ; but after /m/ all C re- 
main unchanged, after /n/ 8 of them become identical with another 8 
(so that only 8 different consonants appear after /n/), and after /r, s/ 12 
of them become identical with the remaining 4. 
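
The merging described here follows directly from letting the first member's long components extend over the whole cluster. A minimal sketch, under stated assumptions: the labels `bar`, `x`, `y`, and `res` are hypothetical names for the component that alone marks /n/, the two further long components, and the residue; /r/ is taken as `bar` plus `x` and /s/ as `bar` plus `y`, in line with the statement that each is a combination of two components including the one that marks /n/.

```python
from itertools import combinations

# Hypothetical labels: "bar" alone marks /n/; "x" and "y" are the two
# further long components; "res" is the unit-length residue.
ALL = ("bar", "x", "y", "res")

# The 16 consonants, each a distinct combination of the four elements.
CONSONANTS = [frozenset(c) for k in range(5) for c in combinations(ALL, k)]

# First members of clusters carry long components only (no residue).
FIRST = {
    "m": frozenset(),
    "n": frozenset({"bar"}),
    "r": frozenset({"bar", "x"}),
    "s": frozenset({"bar", "y"}),
}

def after(first, consonant):
    """Writing of a consonant once the first member's components extend over it."""
    return consonant | FIRST[first]

def distinct_after(first):
    """How many different consonants can be written after this first member."""
    return len({after(first, c) for c in CONSONANTS})

for f in "mnrs":
    print(f, distinct_after(f))
# m 16
# n 8
# r 4
# s 4
```

Under these assumptions a non-occurring cluster simply has no writing of its own: it coincides with the writing of a cluster that does occur.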

We now turn to the vowels. Since only vowels occur before phonemic 
word juncture, we write word juncture with the component \ and say 
that when that component occurs alone it extends backward one space. 
Then we write, arbitrarily:35

i ^ e \ a ^ 

" ^ o \ 

The remaining consonants in words borrowed from Arabic can be 
written by some of the remaining combinations of / and \ with the other 

34 In this list d refers to /d/, or to any /r/ that is next to a consonant;
r refers to /r/ in all other positions. 

35 Before word juncture above would indicate /e/, since + \ = 7-
In that position /e/ does not contrast with /n/, and can indicate / /
here and /n/ elsewhere. Other combinations than these five (or ten, with
and without \) would not occur before \, even though the componental 
writing does not make it impossible to write them (as it made it impos- 
sible to write non-occurring consonant clusters). 


components. They would be differentiated from the vowels by containing
/ and from the post-/m/ consonants by containing \.

We can now make a general statement about the sequences of phonemes,
or rather of combinations of components, which occur in Swahili.
Before or after a combination which has been defined as not extending
beyond its unit length36 there occurs only a combination which does extend
(including zero), or a combination which includes \ and not /.37
Before a combination (including zero) which has been defined as extending,
there always occurs a combination including \ and not /.38 All
other sequences occur. Furthermore, except for some consonantal com- 
binations of \ which do not occur, we have all combinations of com- 
ponents. With these limitations, therefore, all sequences of all combina- 
tions of the 4 components occur. This means that after a juncture or a 
vowel we may have another juncture or vowel, or any consonant. If the 
consonant is zero, or any extending combination, any other consonantal 
combination or a vowel may occur after it. If the consonant is other than 
an extending combination, only a vowel occurs after it.39

Since these components identify phonemes, they represent features of 
speech and can be identified with articulatory movements or with fea- 
tures of sound waves. Their simultaneous combinations would be identi- 

36 Extension over a cluster will now be defined as extension over any
combination not including \. 

37 Except for the unique clusters of /l, f, b/ + C, which occur each in
only one morpheme. These may not occur among most Swahili speakers.
It is possible to include the lC clusters in this system by using the method
of fn. 33 or otherwise, but this was considered undesirable since they
are on a par with the other unique clusters, which have not been in- 
cluded in this system. 

38 I.e. a vowel. Even this last limitation upon the random occurrence
of all sequences can be eliminated by various devices. We could add to all 
phonemes a component which would have consonantal value in the posi- 
tions where consonants occur and vocalic value otherwise. Or we could 
symbolize the components by numbers rather than geometrical marks, 
and define the values of the numbers in such a way that no sequence of 
consonants could contain more than the one non-extending combination. 

39 It is noteworthy that zero (written as space), the absence of all
components, indicates not juncture but /m/. Juncture is quite free in 
respect to what follows it (C, CC, or V), but is highly restricted in respect 
to what precedes it (only the 5 V phonemes), /m/ is followed by every 
phoneme except juncture (and most of the Arabic consonants); it is pre- 
ceded by V, m, or juncture. Had juncture been assigned the simplest 
mark (space), that mark could not be used to express restrictions of post- 
vocalic (i.e. word-medial) clusters. 


fied with the total articulatory movement or sound waves of the phoneme 
they represent. In particular: 

" " indicates unvoicing, except when alone or with / or \; all un- 
voiced segments contain " ; when alone ' indicates voiced velar spirant ; 
with y it indicates lateral; with \ it indicates non-front position. 

Zero indicates labial nasal. indicates in general retraction in mouth: 
with "^ it indicates palatal, otherwise dental position. With \ and not / it 
indicates far front or far back position. 

"^ indicates mouth closure, except that when alone it indicates the 
special closure of a flap, and with '" it indicates maximum mouth opening.
With \ it indicates the least open position with which \ correlates.

/ indicates front position, except as retracted by other components. 

\ indicates considerable mouth opening, except when with \. 

Appendix to 10.1-4: Unit-Length Components; Tone Phonemes 

Quite independently of the setting up of long components as a new 
set of less restricted elements, it is possible to break each phoneme or 
segment into unit-length components. Such analysis results not from the
purpose stated in 10.1 but from other and only indirectly related considerations.

One of these considerations is the compound character of the sounds 
represented by our segments, whether these sounds are observed articu- 
latorily or acoustically. Various organs of the speaker are in motion 
while he pronounces any one sound, and the resulting sound wave can 
be described as the resultant of waves of various frequencies. It would 
thus be possible to set up elements representing individual movements of 
organs involved in speech, or simple waves of various relative frequencies, 
and identify the phoneme as a simultaneous combination of these elements.40
Furthermore, such elements could be so defined that each phoneme
should not be composed of unique elements, but rather should consist
of a different combination of a few out of a limited stock of these elements.
Thus, the English sounds represented by [v] and [z] normally involve
vibration of the speaker's vocal chords, which is not the case for
[f] and [s]; [v] and [z] have a feature in common which is absent in [f]
and [s] and which is noticeable, for instance, in the fact that [v] and [z]

40 Various attempts have been made to represent these several articulatory
factors in a speech sound, without regard to phonemic analysis.
Cf. for example, the analphabetic system in Otto Jespersen, Lehrbuch der 


can be heard at much greater distances (everything else being equal)
than [f] or [s]. 

Another consideration is the availability of simultaneity, in addition 
to successivity, as a relation among linguistic elements.41 The possibility
of having elements occur with each other is left open by the previous 
procedures (except in chapters 6, 10), where the operations involve only 
the relation of segments being next to each other (in determining independence,
length, environment, etc.). The consideration of elements
among which there obtains the relation of simultaneity involves removing 
the limitation to one dimension from linguistic analysis. Removal of this 
limitation is all the easier in view of the ease of arranging letters on paper 
two-dimensionally, and of the ready availability of mathematical terminology
for two-dimensional relations.42

Identifying each segment as a combination of unit-length components 
will not in general eliminate the limitations of occurrence of the segments. 
However, when the same components are extracted from a group of 
phonemes, we can reduce the number of elements, since any number of 
different elements can in general be identified as different combinations 
of a smaller number of elements.43 In terms of speech features, we could
represent by components such features as we can find in various segments.

In certain situations, the extraction of unit-length components from
segments reduces the number of distinct elements so greatly as to have 
become a regular practice in the setting up of linguistic elements. This 
has occurred chiefly for features like tone and stress which differ considerably
from the other speech features (such as tongue position).44

41 This was investigated explicitly by F. de Saussure in his Cours de
linguistique générale, and by the linguists who followed him.

42 For the simplest transfer from one-dimensional to two-dimensional
elements: if we can identify utterance A as xz over ab (where letters written
above and below each other indicate features which are simultaneous
with each other), and utterance B as xz over cd, we say that A and B are identical
in one set of their simultaneous components, namely xz.

43 We can take any elements A, B, C, D, E, F, G, and define a smaller
number of new elements m, n, o in such a way that A = m, B = n,
C = o, D = m + n, E = m + o, F = n + o, G = m + n + o.
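
The assignment in this note can be generated mechanically: three new elements have exactly seven non-empty combinations. The following is a small illustrative sketch; the variable names `old`, `new`, and `definition` are hypothetical.

```python
from itertools import combinations

old = ["A", "B", "C", "D", "E", "F", "G"]
new = ["m", "n", "o"]

# Non-empty combinations of the new elements: singles, then pairs, then the triple.
combos = [c for k in range(1, len(new) + 1) for c in combinations(new, k)]

# Pair each old element with one combination of the new ones.
definition = dict(zip(old, combos))

for name, combo in definition.items():
    print(name, "=", " + ".join(combo))
# A = m
# B = n
# C = o
# D = m + n
# E = m + o
# F = n + o
# G = m + n + o
```

In general n new elements provide 2^n - 1 non-empty combinations (or 2^n if zero is also used), so the saving grows quickly with the size of the stock.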

44 And which have come to be regarded as distinct because they appear
as long components and morphemic contours in many languages. 


Thus a language may have several contrasting vowels, among which 
there are no dependences which can be expressed by long components, 
and which are differentiated in tongue position, tone, length, etc.: e.g. 
high tone e, ^, o; low tone e, ae, o, or e, a, d. Instead of assigning these 
to 6 different phonemes, it is customary to extract the tone difference, 
resulting in 4 phonemic elements: high tone ('), and toneless e, ae, o (low
tone being marked by absence of ').45 We then consider each segment to
be the simultaneous combination of two unit-length elements: e is written
e, while e is written e, and so on.46 Extraction of this component
under these circumstances will not in general yield elements of less re- 
stricted distribution (e.g. ' on one vowel can not be used to determine 

45 Languages in which such independent distributions of differently
pitched vowels occur are called tone languages; and the phonemic tones 
which are extracted from the variously pitched vowels are often called 
tone phonemes or tonemes. For some examples of tone analyses, see 
Kenneth L. Pike, Tone Languages. Cf. chapter 9, fn. 2 above. 

46 Analysis of tone in this distribution is to be distinguished from that
used in other types of distribution. In the present case, the tones occur 
independently of each other and independently of any other phonemic 
feature, except for the fact that each tone is restricted to occur with a 
vowel. The recognition that one segment may be analyzed as containing 
two elements (cf. also 9.2), therefore merely leads to setting up new 
componental (suprasegmental) phonemes indicating tone and limited to 
occurring over vowels. This limitation may be expressed, somewhat after 
the manner of 10.1-6, by regarding tone as a general vowel indicator, and 
using marks like e, a only to indicate differences in vowel quality. In a 
second type of distribution, successive tones (or degrees of stress) are de- 
pendent on each other, in that only certain sequences of them occur. 
(These various sequences may correlate with various morphological con- 
structions or with various meanings.) The independent sequences of such 
tones or stresses are set up as contours if they occur over whole utter- 
ances (including such utterances as single words: cf. chapter 6), and as 
long components if they occur only over stretches shorter than any ut- 
terance. The fact that these tones are also restricted to vowels (if they 
always are) may not be expressed in the case of a contour, although in 
the case of a long component the domain of the tone (if it is marked) may 
be used to indicate the position of vowels (or the vowels may be used to 
indicate the domain of the long tone component). Finally, there is a 
third type of distribution, in which the occurrence of a particular tone
or stress depends upon the position of a morphological boundary: e.g. 
every word end may have a loud stress on the second vowel before it. In 
this case, the tone or stress is used as the speech feature definition of a 
juncture, and usually the juncture is marked instead of the tone. (It is not 
necessary to give the tone any additional mark, if penult vowels before 
various occurrences of the juncture all have the same tone.) 


whether the next vowel has '), but will in some cases simplify morphophonemic
statements.47
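
The saving obtained by extracting tone can be shown with a small sketch. It is illustrative only: each vowel segment is modeled as a hypothetical (quality, high-tone) pair, and the function name `lower` is an assumption introduced here.

```python
QUALITIES = ["e", "ae", "o"]

# Six contrasting vowel segments, each a (quality, has_high_tone) pair.
segments = [(q, high) for high in (True, False) for q in QUALITIES]

# Extracting tone leaves the three toneless qualities plus one tone element.
elements = set(QUALITIES) | {"'"}

def lower(segment):
    """One statement, "' to zero", in place of three separate vowel changes."""
    quality, _ = segment
    return (quality, False)

print(len(segments), len(elements))        # 6 4
print([lower(s) for s in segments[:3]])
# [('e', False), ('ae', False), ('o', False)]
```

The single function stands for the one morphophonemic statement that replaces three separate vowel changes once tone is written as an element of its own.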

The extraction of unit-length components in the rare type of case described
in 9.21 (where [n] was considered as simultaneous /n/ + /t/),
and the extraction of long components from whole utterances (chapter 6)
or from shorter stretches (10.1-6), were so designed as to reduce the num- 
ber of elements and to reduce the limitations upon their random occur- 
rence relative to each other. However, the extraction of unit-length com- 
ponents which is described here would have the effect of reducing the 
number of elements, and of providing elements which can more con- 
veniently represent observed independent features of articulatory move- 
ment or of sound waves. 

Appendix to 10.1-5: Unit-Length Components of a Whole Phonemic Stock

Aside from such special cases as a group of tonally different vowels, the 
analysis into unit-length components is of interest to linguists only when 
it is carried out for all the phonemes of a language. Only then can the 
components be so selected as to yield the simplest set of new elements 
identifying and supplanting the phonemes. 

A preliminary to this supplanting of phonemes by a set of unit-length 
components is the classification of phonemes by the speech feature repre- 
sentations which they have in common. In this classification, each pho- 
neme is considered as representing a combination of articulatory or 
acoustic speech features (e.g. /p/ may be considered to represent a labial
position and a stop articulation), and a given feature is represented by 
several phonemes (e.g. /f/ may represent labial position with continuant 
articulation). Such classification becomes of interest to the descriptive 
linguistic analysis of a particular language only when it is based not on 
absolute phonetic categories (such as particular tongue positions, or even 
tongue position in general), but on relative categories determined by the 
differences among the phonemes of that language.48

47 E.g. morphemes beginning with é, ǽ, ó in one environment may have
variants beginning with e, ae, o in another; instead of stating this as three
changes (é to e, etc.) we state it as one (' to zero).

48 The classification of phonemes in most traditional grammars into
labial, dental, etc. or stop, spirant, trill, etc., and the like is usually based 
largely on traditionally accepted absolute categories. Nikolai Trubetzkoy 
and several other linguists of the Prague Circle paid much greater atten- 
tion to the relative differences as determined by the phonemic stock of 
the language in question. The important point of basing the analysis 


The center of interest is shifted from the phonemes of a language to 
their classification, when the relations of classification among the pho- 
nemes are studied; in such work the investigation is directed toward dis- 
covering what are the differences among the phonemes in terms of the 
relative speech-feature categories.49 However, the final stage of this development
is the setting up of the relative categories as the new elements
of the language, with the various phonemes identified as various simultaneous
combinations of them.50

The unit-length components (or the classifications of phonemes) are 
relative to each other in that they are based on the contrasts among seg- 
ments in each environment. The components are set up in such a way 
that various combinations of them express all the contrasts in a compact 
way. The contrasting phonemes are grouped in such a way that all the 
phonemes in one group may be said to represent some stated feature of 
speech which the phonemes in another group do not represent. The pho- 
nemes in the first group are then said to include, among other things, a 
component representing this feature. Then other groupings are made, 
cutting across the first one, and each leads to the extraction of a com- 
ponent common to the members of the group. This is continued until 
every phoneme can be differentiated from every other one in terms of the 
combination of components which it equals.51

upon relative considerations (cf. 2.1 above) is, however, most fully 
brought out in such work as Roman Jakobson's Observations sur le 
classement phonologique des consonnes, in Proceedings of the Third In- 
ternational Congress of Phonetic Sciences at Ghent, 34-41 (1938). 

49 This work was done largely by Trubetzkoy, most fully in his Grundzüge
der Phonologie (Travaux du Cercle Linguistique de Prague 7, 1939).
Analyses of this kind do not have to be done with the particular logical 
categories used by Trubetzkoy; and new developments in laboratory 
work in linguistic acoustics may yield much more exact information than
has heretofore been available. In any case, however, the comparison, 
classification, and componental reduction of the phonemes of a language 
is descriptively relevant only if it is based on relative considerations. 

50 This work has been done in ms by Roman Jakobson.

51 It is for this reason that linguists engaged in such phonemic classification
or analysis are interested primarily in binary contrasts (binary oppositions).
Each binary contrast between groups of phonemes can be expressed
by the (contrast between the) occurrence and non-occurrence of a
particular component in the unique combination of components which 
will indicate each of these phonemes. Suppose we are able to state not 
that there are, say, four stop positions /p, t, k, q/ in a language, but 
that there is one binary contrast between /p, t/ and /k, q/, and another


This extraction of components can be performed directly upon the 
contrasting segments in each environment, independently of how they 
are formed, and can indeed serve as a criterion for grouping complementary
segments into phonemes. This is so because once a particular
component, in combination with others, replaces a phoneme, that com- 
ponent will occur in every position in which the phoneme had occurred, 
and in each of these positions it will indicate the occurrence of the speech 
feature it represents; these unit-length components would not be set up 
with various definitions in various positions (i.e. with positional variants) 
because the chief purpose in setting them up is to indicate which speech 
features occur characteristically every time a phoneme occurs. 

Thus in Danish [t], [d] occur in word initial, and [d], [ð] in word medial
position. Since only two of these segments are contrasted in any position,
only two phonemes need be set up (7.41). Each phoneme will have
to contain some component which represents a speech feature characteristic
of that phoneme in both positions. We must therefore select a
feature represented by both [t] and medial [d] as against [ð] and initial
[d], or by [t] and [ð] as against any [d]. The suggested solution is to group
[t] and medial [d] into one phoneme, and to say that this phoneme includes
a component representing some feature such as 'relatively stronger
air pressure' (in the mouth): initial [t] may be said to contain this component
as against initial [d] which lacks it, and so medial [d] as against
medial [ð].52 Since the two [d] represent by definition descriptively substitutable
segments, but are nevertheless differentiated here in their components,
it follows that the speech features represented by the components

between /p, k/ and /t, q/. Then each contrast can be expressed by a component,
such that one contrasting group has the component while the
other lacks it. In Travaux du Cercle linguistique de Prague 4.97 (1931) 
Trubetzkoy indicated such contrasts as a relation of a to a + b, where b
was called the 'Merkmal', or the differentiating element between two 
elements which were otherwise considered identical. (This b would be the 
component in the present case, where each Merkmal of phonemic con- 
trast is set up as a new component element indicating the phonemes 
which it differentiated.) If, in the example above, the first binary contrast 
is marked 1 (as against absence of 1) and the second contrast marked 2 
(as against absence of 2), then /p/ = 1 + 2, /t/ = 1, /k/ = 2, /q/ = zero
(i.e. whatever other components it contains, it includes neither of these
two). The speech-feature definition of component 1 will, of course, have
to be something which is involved in the sounds represented by /p, t/
and lacking in those represented by /k, q/.
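
The two binary contrasts of this note can be turned into component combinations mechanically. The sketch below is illustrative; the names `CONTRASTS`, `components`, and `table` are hypothetical, and each contrast is recorded simply as the group of phonemes that has the component.

```python
PHONEMES = ["p", "t", "k", "q"]

# Each binary contrast names the group that has the component;
# the remaining phonemes lack it.
CONTRASTS = {1: {"p", "t"}, 2: {"p", "k"}}

def components(phoneme):
    """The set of components whose group includes this phoneme."""
    return frozenset(c for c, group in CONTRASTS.items() if phoneme in group)

table = {ph: components(ph) for ph in PHONEMES}

# The contrasts suffice if no two phonemes receive the same combination.
all_distinct = len(set(table.values())) == len(PHONEMES)

for ph in PHONEMES:
    print(ph, sorted(table[ph]))
print(all_distinct)
# p [1, 2]
# t [1]
# k [2]
# q []
# True
```

Two binary contrasts thus do the work of four separately listed stop positions, with /p/ containing both components and /q/ neither.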

52 The example and the final solution were given by Roman Jakobson
in a lecture at the University of Chicago in 1945. 


need not be statable in terms of absolute measurements, but may be 
relative differences in measurements, as compared with phonemes which 
are componentally identical except for the component in question. 

As may be seen from the operations of setting them up, the unit- 
length component analysis of a whole language differs in purpose, pro- 
cedure, and result from the analysis into long components. Combinations 
of the two techniques may be possible in some languages, if it is desired 
to set up elements which can express both the distributional limitations 
of 10.1-6 and the speech feature characteristics of this appendix. In any 
case, when an analysis into unit-length components is carried out, it is 
desirable to do so on segments from which the greatest possible number 
of restrictions on occurrence have already been removed, by the opera- 
tions of 6-9. 


11.1. Purpose: Phonological Constituency of Utterances

We represent every utterance in our corpus of data in terms of the 
phonological elements defined in 3-10.

The elements which have been set up for a language have been defined 
in such a way that when a stretch of speech is represented by them
anyone acquainted with their definitions would know what descriptively
relevant speech features occurred in that stretch; i.e., he would be able 
to pronounce the written representation, to produce a stretch of speech 
descriptively equivalent to that which was originally represented by the 
writing. However, we may also wish to have a compact statement of how 
these elements occur in any utterance of the corpus, so that we can make 
general statements not only about the elements but also about the utter- 
ances which we represent by these elements. 

11.2. Procedure: Stating What Combinations Occur 
11.21. Not All Combinations Occur 

If all combinations of our elements occurred, there would be nothing 
to say except a listing of the elements and the statement that all combi- 
nations of them occur, with the specification that we would upon occasion 
find zero, one, or more of them simultaneously, and zero, one or more of 
these simultaneous combinations in succession, down to any number. 

However, it is almost impossible for all sequences of all simultaneous 
combinations of all the elements (in all degrees of repetition) to occur, in 
any language. Even if we can describe consonant clusters as any sequence 
of consonant phonemes (or any sequence of any combination of con- 
sonant components), there will still be a limit to the number of con- 
sonants in the clusters; and we may be unable to describe the vowels by 
equally unrestricted phonemes or components. And even if we can de- 
scribe all sequences of consonants and vowels as equaling all possible 
combinations of a number of elements, we will usually find that junctures 
occur only in restricted places (will there be utterances consisting of /d/ 
alone?), and that the contour elements are something else again. If we
need say nothing more than that every utterance consists of some non- 
contour elements and some contour elements, we have already a state- 
ment of limitations. 



11.22. Utterance Formulae 

Our statement of all the combinations of elements which occur in any 
utterance of the corpus is shorter than an actual list of all the utterances 
in it: first, because we do not distinguish between sequences which are 
composed of the same elements in the same order; and second, because 
all elements which occur in the same environment are included in the 
same general statement of occurrences, and may be indicated by the 
same mark. If each of /p, b, t, d, k, g/ occurs before each of /a, i, u/ we 
write the phoneme-class mark /S/ for any one of the six stops, /V/ for 
any one of the three vowels, and say that /SV/ occurs.' The statement 
that /SV/ occurs is then equivalent to the statement that /pa/ occurs, 
/pi/ occurs, /pu/ occurs, /ba/ occurs, etc. 
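
The equivalence between the class statement and the individual statements can be made concrete with a small sketch. It assumes only the six stops and three vowels named above; the function name `expand` and the dictionary `CLASSES` are illustrative and not part of the original notation.

```python
from itertools import product

# Phoneme classes as given in the text: six stops and three vowels.
CLASSES = {
    "S": ["p", "b", "t", "d", "k", "g"],
    "V": ["a", "i", "u"],
}

def expand(formula, classes=CLASSES):
    """Return every phoneme sequence covered by a string of class marks."""
    choices = [classes[mark] for mark in formula]
    return ["".join(seq) for seq in product(*choices)]

sequences = expand("SV")
print(len(sequences))   # 6 stops x 3 vowels
print(sequences[:4])
```

The single statement "/SV/ occurs" thus stands for eighteen separate statements of occurrence.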

We now try to find a sequence of phoneme classes which is constantly 
being repeated, so that we can say that every utterance^ and the whole 
succession of utterances in our corpus is merely a repetition many times 
over of this one sequence. 

Thus for Yokuts it is possible to state the following formula:' 

      C        C
#[CV( · )] CV( · )#

where # indicates utterance juncture and any utterance contour over
the preceding stretch, up to the next #; C any consonant, V any vowel,
· the length phoneme; items written above and below each other are
mutually exclusive (i.e. if one of them occurs the other does not);'' sec-

' Using the one symbol S for all six consonants is, of course, quite
different from using phonemic symbols for the various segment members
of a phoneme. Each member (allophone) is defined as occurring in a
particular environment. By itself, the phoneme mark indicates all the
members included in that phoneme. But when the phoneme mark occurs in a
particular phonemic environment, it indicates only the particular
segment member which has been defined as occurring in that
environment: in the sequence /#peyr/ pair the /p/ phoneme indicates only the
[pʰ] member of that phoneme. On the other hand, the capital letters
which mark classes of phonemes in 11.22 indicate, in each environment,
any one whatsoever of the phonemes which they represent: e.g. in the
/SV/ example. The segments indicated by a phoneme mark never
contrast (occur) in any environment in which the phoneme occurs; the
phonemes (and their respective members) indicated by a phoneme-class
mark are precisely the ones which contrast in the environment in which
the phoneme-class occurs.

^ Except perhaps interrupted utterances, which would in many lan- 
guages be indicated by incomplete contours. 

' This summarizes the analysis in Stanley Newman, Yokuts Language 
of California, Chap. 3 (1944). 

* We might have used a new phoneme-class mark, say C', to indicate
the occurrence of either C or ·, but since these two occur in the same


tions in parentheses ( ) sometimes occur and sometimes do not; the
section in square brackets [ ] occurs any number of times from zero up.*
E.g. ki 'this', biivi.nelsc.nit 'from one who is made to serve.' Repeating
this entire formula any number of times, and substituting for each mark
any phoneme (or in the last analysis any segment) which that mark
represents,*' we would obtain any utterance of Yokuts. Conversely, all
Yokuts utterances can be represented by this sequence repeated the
required number of times.
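
A formula of this kind can be read as a pattern against which a transcription is tested. The sketch below restates the Yokuts formula as a regular expression. The consonant and vowel inventories are hypothetical placeholders, since the Yokuts phonemes are not listed in this section, and the name `fits_formula` is illustrative only.

```python
import re

# Hypothetical toy inventories; the section does not list the Yokuts phonemes.
CONSONANTS = "ptkbdgmnlswyh"
VOWELS = "aiou"
LENGTH = "·"  # the length phoneme

# One stretch CV(C or ·): the final position holds C, or ·, or nothing.
STRETCH = f"[{CONSONANTS}][{VOWELS}](?:[{CONSONANTS}]|{LENGTH})?"

# # [CV(C/·)] CV(C/·) # : the bracketed stretch repeats zero or more times.
UTTERANCE = re.compile(f"#(?:{STRETCH})*{STRETCH}#")

def fits_formula(transcription):
    """True if the transcription, written between # junctures, fits the formula."""
    return UTTERANCE.fullmatch(transcription) is not None

print(fits_formula("#ki#"))      # a single CV stretch
print(fits_formula("#ka·tin#"))  # CV· followed by CVC
print(fits_formula("#ik#"))      # begins with a vowel, so it does not fit
```

The character class for the optional final position plays the part of the above-below relation of the marks: C and · exclude each other there.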

11.3. Result: A Representation of Speech 

We now have a summarized statement that all utterances in our cor- 
pus consist of such and such combinations of classes of such and such 
elements,^ the definitions of each element being given by the preceding 
operations, and the combinations which occur being indicated by the 
formula, diagram, or verbal statement. From this definite statement 
about our corpus of data in the language in question, we derive a state- 
ment about all the utterances of the language by assuming that our cor- 
pus can be taken as a sample of the language. We are thus able to make 
a compact and quite a general statement about what we have been ob- 
serving, namely the descriptively relevant speech features which occur 
when the language is spoken. 

Appendix to 11.22: Utterance Diagrams 

If the facts are too complicated for a formula, we can obtain a more de- 
tailed representation of an utterance by making greater use of the above- 
below relation of the marks.* E.g. that part of an English utterance up to 

position (enter into the same phoneme-class) only once, it seems simpler 
to represent their mutually contrastive, or mutually exclusive, relation 
by the above-below relation of the marks, which is not otherwise utilized 
in this formula. 

* I.e. it sometimes does not occur. 

^ If we substitute segments directly in the place of the class marks, we 
must add here: in that environment. 

' These phonemically identifiable elements, whose definition and whose 
distinctiveness from each other or equivalence to one another is given 
by the preceding operations, may be phonemes, junctures, contours, 
phonemic components, or the intermittently present phonemic features 
of the Appendix to 4.3. (These last may be pauses, as in 8.21, or com- 
ponents, as in fn. 7 of the Appendix to 7-9, and so on.) 

« Cf. fn. 4 above. 


and including the first vowel can be diagrammatically represented as 








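[Diagram not reproduced: English utterance-initial phoneme sequences, from # up to and including the first vowel.]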




















» For those American English dialects in which tune is /tuwn/ (i.e. no 
initial /ty, dy, ny, sty/), when is /wen/ (no /hw/), and in which certain 


In this diagram the sequence of phonemes in any utterance (from # up
to and including the first vowel) is indicated by a line going through the
diagram from left to right and never crossing a horizontal bar (except along
the broken line in the diagram); the line may go up and down but may
never go backwards (i.e. to the left). Thus /#hyu/ occurs (hue, heuristic)
and /#pyu/ (pure), /#gli/ (glimmer), /#spli/ (split), /#spru/ (spruce),
/#skwa/ (squire), /#e/ (elm), /#sne/ (snail), /#tu/ (too), /#ðe/
(then), etc. Any sequence of phonemes which is traversed, from left
to right, by a line that does not cross the horizontal bars or go leftward,
is a sequence which occurs at the beginning of some English utterance.
And no English utterance exists but that its beginning can be
represented by one of the lines which may be drawn through this diagram.'"

The diagram offered here is not entirely satisfactory. In the first place, 
many phonemes are indicated twice, once in the top section and once in 
the bottom. In the second place, the sequence /(s)kw/ is not permitted 
by the definition of the diagram and must be indicated by a special 
broken line. The improvement of such diagrams is largely a matter of 
ingenuity, although it can be reduced to procedural considerations. Es- 
sentially, of course, it is a correlation between the relation of phonemes 
in utterances and the relation of geometric boundaries. For this reason, it 
is desirable that each phoneme occur only once in the diagram, its
geometric relations to all other phonemes being equivalent to its sequential
relations to all other phonemes. It is often impossible to do this com- 
pletely in two dimensions, where the horizontal axis indicates time- 
succession, and the vertical axis mutual exclusion. Diagrams of this type 
may be useful both because they permit graphically rapid inspection, and 
comparison with analogous diagrams, and because they enable us to see 
immediately whether or not a particular sequence occurs; we test this 

foreign words like Pueblo and names like Gwen do not occur. /š, č, ǰ/
are taken here as unit phonemes. In the diagram, V represents vowels
other than u. /sfi, sfe/ (sphere, spherical) is omitted from the chart.

"* For a somewhat different type of diagram, of the monosyllabic word 
in English, see Benjamin Lee Whorf, Linguistics as an exact science. The
Technology Review 43.4 (1940). Whorf gives a chart which has some of 
the features of a formula and some of those of the diagram above. He 
does not try to have each phoneme occur only once, and obtains a neat 
representation by using commas as well as a vertical relation between 
sequences that are mutually exclusive, and plus signs as well as horizontal 
relation for sequences that occur after each other. 


by trying to draw a line through that sequence without breaking the 
rules of the diagram." 

The formulae and diagrams may be somewhat simpler when they de- 
scribe the combinations of components rather than of phonemes, because 
a larger proportion of the components will be similar to each other in 
sequential relations. However, graphic provision would have to be made 
for the fact that components can be combined simultaneously as well as 
successively. And the inclusion of the contour components will usually 
involve an addition to the formula or diagram independent of the rest 
of the representation. 

" It must nevertheless be recognized that diagrams of this type do not 
lead to any new results or symbolic manipulation of data. They serve 
only as compact summary statements of our results. 


12.0. Introductory 

This procedure divides each utterance into the morphemes which it
contains.

12.1. Purpose: Phoneme Distribution over Longer Stretches 

I.e. stating the limitations of occurrence of linguistic elements over 
long stretches of speech. 

The argument which follows will attempt to show: first, that when the 
distribution of phonemes is considered over long stretches of speech they 
are found to be highly restricted; second, that we have established no 
method for stating simply what are the limitations of occurrence of a 
phoneme when taken over long stretches; third, that we can best state 
these limitations by setting up new (morphemic) elements in which the 
phonemes will have stated positions, the elements being so selected that 
we can easily state their distribution within long stretches of speech. 

The phonemes or components which have been obtained in chapters 
3-11 are elements in terms of which every utterance can be identified. 
They have been so selected that as many as possible of their combina- 
tions should occur in one utterance or another. 

However, even if we have obtained elements having no restrictions of
occurrence in respect to their immediate neighbors, we will always find
that there still are heavy restrictions in respect to their farther
neighbors. We find /tip/ tip, /pit/ pit, /siŋiŋ/ singing, etc. We may say that
almost any element will occur sometime before or after an /i/, both
immediately and at any particular distance: /l/ occurs right after /i/ in
/pil/ pill, and fifth after /i/ in /'iŋkwel/ ink-well. But we cannot say
that every sequence of /i/, /l/ etc., over long stretches, occurs. We have
the sequence /'iŋkwel/, and we can come close to another sequence of
the same phonemes in /'weliŋ/ welling. But while we can get the sequence
/'weliŋ/ in the environment consisting of the longer stretch /hər-'ayz-
wər—/ Her eyes were welling, we can hardly get the sequence /'iŋkwel/
in that environment, not to mention such a sequence as /welik/.^

' Similarly, we will hardly find in any English utterance the sequence
/#kæt ðə owvər.#/ (presumably Cat the over.). The phoneme /i/
occurs between /w/ and /v/ in /sw—vəl/ swivel; but we will not find it



Furthermore, the preceding operations have not even given us any 
simple method of discovering and stating limitations of occurrence over 
long stretches, such as that of /k/ after Her eyes were —.

What type of further analysis can we perform that will enable us to 
treat these long-stretch restrictions? 

If we examine these restrictions, we find that in most cases they do not
apply to particular phonemes singly, but to particular sequences of
phonemes. Not only did /'iŋkwel/ not occur after /hər-'ayz-wər—/ but
even various sub-sections of /'weliŋ/ such as /'we/, /'eli/ did not occur;
only particular sequences appear there: /'wel/ (Her eyes were well),
/'kowld/ (cold), etc. The restriction on distribution of phonemes which
is evident in this long stretch can therefore best be described as a
restriction on sequences of phonemes: /'eli/ excluded, /'weliŋ/ and /'wel/
permitted.

More generally, we are asking here how the occurrence of a phoneme
varies as its total environment^ varies. And we find that for a change in
the environment, we usually get not a change of our one phoneme but
a change of a whole sequence of phonemes. If we ask how the occurrence
of /e/ in /hər-'ayz-wər-'weliŋ./ changes when we add an /l/ to the
environment at some particular point, we find that not merely the /e/
drops, but the whole sequence /'weliŋ/ is replaced by some other
sequence, e.g. /'abviyəs/ in /hər-'layz-wər-'abviyəs/ Her lies were obvious.

We therefore seek a way to treat sequences of phonemes as single 
longer elements. 

12.2. Procedure: Independent and Patterned Combinations 

We determine the independent phonemic sequences in each utterance 
as its morphemic segments. A necessary but not sufficient condition for 
considering an element to be independent in a particular utterance is if 
that utterance can be matched by others which are phonemically identi- 
cal with the first except that the element in question is replaced by an- 
other element or by zero. 

between /w/ and /v/ in the following environment: /ðə kæt ǰəmpt
ow—vər ðə muwn/ The cat jumped o—ver the moon. Clearly, therefore,
we cannot say that all sequences of our elements occur, except (in
some languages) over very short stretches. If we want to be able to pre- 
dict what long sequences of our elements may occur in the language, or 
if we want to say exactly what long stretches of elements occur in our 
corpus, the statement of chapter 11 is insufficient. 

^ The term total environment will be used for environment over a long 
stretch of speech. 


In the following paragraphs we will recognize elements which occur 
by themselves, and will then divide longer utterances into the elements 
of which they are composed. In determining this new segmentation of 
utterances into longer elements, it will be found necessary: first, to carry 
the segmentation farther (i.e. down to smaller subdivisions) than would 
be morphologically useful; and then, to narrow the operation down, i.e. 
to reunite some of these smaller subdivisions into such divisions as can 
most conveniently have their distributional interrelations stated. 

12.21. Free Morphemic Segments 

Every utterance contains at least one morphemic segment (since the 
whole utterance can be substituted for another).^ In many languages 
there are relatively short utterances which contain only one segmental 
morpheme, i.e. only one morpheme composed of successive segmental 
phonemes, not counting any simultaneous contour morpheme: e.g. Yes.
Now? Come! Book. Connecticut. These are utterances which cannot be 
divided into more than one morpheme by the procedures below. ^ 

12.22. Upper Limit for Number of Morphemic Segments in an
Utterance

A phoneme sequence, say /'ruwmər/ in That's our roomer, may
contain more than one morphemic segment if and only if one part of the
sequence occurs without another part, in the same total environment:
/ruwm/ also occurs in That's our room; /ər/ also occurs in That's our

Formulaically: If, in total environment — X, the combination AB oc- 
curs, and AD occurs, and CD occurs (where A, B, C, D are each pho- 
nemically identifiable portions of speech), then whether CB occurs or 
not,^ it is possible to recognize A, B, C, and D as being each of them 

* Except for a few gestural utterances, like Tut tut, which may not be 
considered utterances of the language. There are, of course, cases of in- 
terrupted utterances which break off in the middle of a morpheme. If we 
cannot immediately recognize the special status of these utterances, we 
may include these broken morphemic segments among our elements. 
Later, we will find that statements true of other morphemes are not true 
of these, so that the interrupted morphemes will be treated as residues
excluded from our regular description. In many cases, too, these residues 
will correlate with special contours (e.g. intonations of hesitation and 

^ Leonard Bloomfield, Language 161. 

' As to the status of CB: if CB does not occur and if C does not 
occur in — X, but EB and ED occur in — X (i.e. we have 


(tentatively, subject to 12.23) discrete morphemic segments in the
environment —X: the difference between B and D is established from the
difference between ABX and ADX; the difference between A and C is
given by matching ADX and CDX.^
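
The AB, AD, CD condition can be sketched as a search over cut points. In the sketch below, ordinary spelling stands in for phonemic transcription, the set `FILLERS` is a hypothetical list of forms observed in the single frame That's our —, and D is allowed to be zero, as the footnote on frames permits. The function name `tentative_cuts` is illustrative.

```python
# Hypothetical forms observed in the one total environment "That's our —".
FILLERS = {"room", "roomer", "board", "boarder"}

def tentative_cuts(filler, fillers=FILLERS):
    """Cut points i at which filler[:i] (A) and filler[i:] (B) pass the
    AB / AD / CD test within a single total environment."""
    cuts = []
    for i in range(1, len(filler)):
        a, b = filler[:i], filler[i:]
        # Every D (possibly zero) such that AD occurs and D differs from B.
        ds = {f[len(a):] for f in fillers if f.startswith(a)} - {b}
        for d in ds:
            # Some C, other than A and not zero, such that CD also occurs.
            if any(f.endswith(d) and f[:len(f) - len(d)] not in ("", a)
                   for f in fillers):
                cuts.append(i)
                break
    return cuts

print(tentative_cuts("roomer"))   # the cut falls between room and er
print(tentative_cuts("boarder"))  # the cut falls between board and er
```

The test only licenses a cut; it does not require one. The discussion that follows explains why passing it is necessary but not sufficient.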

This is a necessary condition for morphemic segmentation of utter- 
ances, because we clearly could not divide a phoneme sequence into two 
morphemes if in a given environment neither part ever occurred without 
the other; they could not then be considered independent parts. ^ How- 
ever, this is by no means a sufficient condition; if it were, it would permit 
us to take all phoneme sequences occurring in a given environment and 
containing a particular phoneme (e.g. bag, rug, bug in Where's the —?)
and say that their common phoneme (in this case, /g/) is an
independent morphemic segment.

The criterion is not sufficient because it gives only the upper limit to 
the number of morphemic segments which we may set up in each partic- 
ular utterance. Each particular utterance cannot have more morphemic 
segments than the criterion of tentative independence permits. 

also EBX, EDX), we call B (and E) independent, as A and D are too, 
but C partially independent (since it is partially dependent on D). 

" B may also be zero. We may say that A — X is the environment or 
frame in which D can substitute for B or for zero. 

Some validation is required if we are to speak of frames, or of a part
of one utterance occurring in another. For how do we know that the frame
remains unchanged while various would-be morphemes are substituted
in it (cf. the Appendix to 4.22), and how do we know that when we test the
/beriy/ of boysenberry in the utterance blueberry we have the 'same'
/beriy/? It is necessary therefore to agree that a phonemic sequence will
be considered unchanged as to its morphemic segmentation if part of its
environment, in one utterance, is replaced by another stretch, making
another utterance; that is, given the utterance XY (where X, Y, Z are
each phonemic sequences), if we substitute Z for X and obtain the
utterance ZY, we will consider that in these two utterances the two Y's
are morphemically identical. We may call Y the frame in which X and Z
are mutually substitutable. This statement says nothing about the
morphemic content of the phonemic sequence Y in any utterance other
than XY and ZY. For a discussion of what constitutes the 'same'
morpheme, see Y. R. Chao, The logical structure of Chinese words, Lang.
22.4-13 (1946).

' This consideration is similar to the criterion of phonemic
independence of segments (chapter 4), except that in the case of phonemic
segments the environment in question was usually the immediately
neighboring segments: e.g. we might consider if [p] and [ʰ] are mutually
dependent in the environment # — V. Here, on the other hand, the
environment is usually the whole utterance.


12.23. Lower Limit for Number of Morphemic Segments in an
Utterance

In many languages it will not prove convenient, from the point of view
of economy of statement, to consider every independent sequence a
morpheme, as in the case of the /g/ of bag. For if we sought to state the
relations between such morphemes, we would find few, if any, broad
generalizations.* It is therefore necessary to find some additional
criterion, one which will say that under such and such conditions it does not
pay to consider a particular phoneme sequence as a morphemic segment.
We therefore restrict the application of 12.22 by saying that we will
consider particular tentatively independent phonemic sequences as
morphemic segments only if it will turn out that many of these sequences
have identical relations to many other tentatively independent phonemic
sequences. Given several such sequences, A, B, etc., we would accord
morphemic status to A, B, C, if, for example, A, B, and C all occur
sometimes after morphemes D, E, or F, but never after G or H, where
D, E, and F are precisely the only morphemes which occur in
environment X — (i.e. if D, E, F constitute a distributional class as against
G, H). Meeting this criterion may thus prevent us from carrying out, in
any given utterance, some of the segmentations which 12.22 would
permit, but for which we cannot find other segments having similar
distributions. It thus reduces the number of morphemes recognized in each
particular utterance.

12.231. For free forms (i.e. for forms which sometimes constitute an
utterance by themselves). As an example, we consider many sequences
ending in /s/ (e.g. books, myths) in the environments My — are old, Take the —.
We match identical sequences without /s/ (book, myth) in the environments
My — is old, Take the —. Clearly, the /s/ is independent both of the
preceding free form, e.g. book, and of anything else in the utterance. We now find
that almost every sequence which ever occurs after The —, The good —,
The old —, etc. also occurs in the environment The — s, The good — s, The
old — s, etc., whereas this is not true of sequences such as very which
occur in The — good, The — old, etc. We conclude that /s/ (or /z/) is not
merely a very common phoneme (so common that countless sequences
which don't end with /s/ can be matched by otherwise identical sequences
which do), but rather that the /s/ is an element added on to any one of a
positionally particular group of sequences. Hence both the bound /s/ (or
* For examples of such generalizations and the lack of them, see below 
and in the Appendices. 


/z/) and the various free forms to which it is added are separate elements,
or morphemic segments. The morphemic status of this /s/, however, does
not extend to the /s/ of box, even if we find the sequence without /s/ as
in Bock, and even though we can match I'll take the box with I'll take the
Bock, because box occurs in My — is old, rather than in My — are old.
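
The contrast between books and box can be sketched as a check on frames. The transcriptions below are rough stand-ins (buk, buks, miθ, miθs, baks for box, bak for Bock), and the table `OBSERVED` is a hypothetical toy corpus rather than recorded data. The function name `separable_s` is illustrative.

```python
# Hypothetical toy corpus: frame -> forms observed filling the blank.
OBSERVED = {
    "My — is old":  {"buk", "miθ", "baks", "bak"},
    "My — are old": {"buks", "miθs"},
}

def separable_s(form, observed=OBSERVED):
    """True if the final /s/ patterns as a separate segment: the form fills
    the 'are' frame and the same form without /s/ fills the 'is' frame."""
    return (form.endswith("s")
            and form in observed["My — are old"]
            and form[:-1] in observed["My — is old"])

print(separable_s("buks"))  # books: /s/ is a separate segment
print(separable_s("baks"))  # box: occurs in the 'is' frame, so /s/ is not
```

Matching baks against bak is not enough; what decides the case is the frame in which each form occurs.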

12.232. For bound forms (i.e. for forms which practically never
constitute an utterance by themselves). A more complicated case occurs in the
pairs of words or utterances conceive-receive, concur-recur, confer-refer, etc. 
We never get ceive by itself, but every phoneme sequence with which ceive 
occurs, appears also with other bound forms' which in turn occur with one 
or more of the sequences with which ceive occurs: perceive, deceive; deduct, 
conduct; perjure, conjure; persist, desist, consist, resist, assist. There is thus 
discovered a family of initial bound forms (prefixes) and a family of non-in- 
itial bound forms (stems), between the members of which families this rela- 
tion in general holds. Each one of these prefixes occurs with several of 
these stems, and vice versa. This gives us a preview of the exact relation 
which may obtain among these sequences if we break them up in the 
above manner. On the basis of such a preview we decide that it is worth 
considering con-, re-, -ceive, -cur, etc., as each a distinct morphemic segment."^
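
The preview described here, in which each prefix occurs with several stems and each stem with several prefixes, can be sketched as a count over candidate cuts. Ordinary spelling stands in for phonemic transcription, the word list is the one cited above, and the scoring rule (number of tails for the head multiplied by number of heads for the tail) is an assumption made for illustration, not a procedure stated in the text.

```python
from collections import defaultdict

WORDS = ["conceive", "receive", "concur", "recur", "confer", "refer",
         "perceive", "deceive", "deduct", "conduct", "perjure", "conjure",
         "persist", "desist", "consist", "resist", "assist"]

def best_split(word, words=WORDS, minimum=2):
    """Return the (head, tail) cut whose parts recur most widely, or None
    if no cut has a head with several tails and a tail with several heads."""
    tails = defaultdict(set)  # head -> tails it occurs with
    heads = defaultdict(set)  # tail -> heads it occurs with
    for w in words:
        for i in range(1, len(w)):
            tails[w[:i]].add(w[i:])
            heads[w[i:]].add(w[:i])
    candidates = []
    for i in range(1, len(word)):
        head, tail = word[:i], word[i:]
        if len(tails[head]) >= minimum and len(heads[tail]) >= minimum:
            candidates.append((len(tails[head]) * len(heads[tail]), head, tail))
    if not candidates:
        return None
    _, head, tail = max(candidates)
    return head, tail

print(best_split("conceive"))
print(best_split("resist"))
print(best_split("assist"))  # as- occurs with only one stem in this list
```

With so short a list, as- is left unsegmented because it recurs with no second stem. A larger corpus would be needed before the families of prefixes and stems stabilize.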

12.233. Summary. The criterion of 12.23 may thus be satisfied by the 
following procedure: Given a tentatively independent sequence of pho- 
nemes A (from 12.22) in a particular total environment, we seek some 
distributional feature which correlates with the distribution of this pho- 
neme sequence; i.e. we ask what other utterance position, or the neigh- 
borhood of what other tentatively independent phoneme sequence, 
characterizes all the sequences B, C which substitute for our given se- 
quence A, or all the sequences M, N which occur with (before, etc.) our 
given sequence A. If we find such, we define our given phoneme se- 
quence A, in the environments in which we have considered it, as a 
morphemic segment. The fact that a phoneme sequence is recognized as 
a morpheme in one environment therefore does not make it a morpheme 

' I.e. with other sequences which don't occur by themselves. See 
Bloomfield, Language 160. 

'"The two sets con-, re-, per-, etc., and -ceive, -cur, -sist, etc. are de- 
pendent on each other as sets; but any one member of the first set is not 
dependent on any one member of the second. When we segment an ut- 
terance, we find only particular members of each set in the utterance, 
not the 'class' as a whole. Therefore, we must regard the particular mem- 
bers of one set as being independent of each other and of the particular 
members of the other set (though not independent of the set as a whole). 
Hence, they are all separate segments.


in another environment (er is a morphemic segment in governor but not 
in hammer; the total environment in which it is a morphemic segment is 
e.g. The — is no good, but not The — ing stopped). 

This criterion of distributional patterning may be approached in a
somewhat different way. Given the phoneme sequence /'boyliŋ/
boiling,¹¹ we ask if it is to be analyzed as consisting of one morphemic
segment or two; and if two, where is the dividing line. To test whether there
are two morphemes present, we take boiling in some environments in
which it occurs (say, It's — now. I'm — it now.), and see if boiling can
be replaced there by some other phoneme sequence which is partially
identical (e.g. /'stapiŋ/ stopping).¹² We can now say that boil and stop
are tentatively independent phoneme sequences, since we find them in
the environment I'm — ing it now as well as in other environments (e.g.
I'll — it.). Since there are many other phoneme sequences (take, etc.)
which occur in these two environments and in precisely the other
environments of boil, stop, we consider the criterion of 12.23 satisfied and
regard boil, stop, take, etc. as morphemic segments.¹³

¹¹ To use the example discussed in B. Bloch and G. L. Trager, Outline
of Linguistic Analysis 53.

¹² If boiling can be replaced in that environment by any other phoneme
sequence (including zero), then boiling as a whole is tentatively independ- 
ent in that environment. If the phoneme sequences which substitute for 
boiling are partially identical with it (e.g. stopping) we can say that the 
identical portions in the various replacements are part of the environ- 
ment, and that it is only the non-identical portions which replace each 
other. (In general, the environmental frame is that which is invariant 
under the various substitutions; the substituting stretches are those 
which differ for each substitution.) We can now go further and say not 
only that boiling is independent of its environment, but also that boil 
is independent of its environment, too (as is -ing). 

¹³ Alternative divisions of boiling into morpheme segments would not
equally satisfy the criterion of 12.23. If we seek substitutes for boiling
which have some partial identity with it other than -ing, we will not
find any whose non-identical parts enter, together with the part of boiling
for which they substitute, into clear distributional classes. For example,
we cannot substitute princeling or boys for boiling, since these do not
occur in I'm — it now. The partial identity of these is of no use in analyzing
boiling, since these two are not distributionally equivalent to boiling in
the first place. If we substitute trailing, we might try to say that /trey/
trai- replaces /boy/ boi-. We might further say that /trey/ and /boy/
also replace each other in other environments so as apparently to satisfy
the criterion of 12.23: five trays, five boys. However, other substitutes of
/trey/ and /boy/ in I'm — ling it now will not substitute for them in
five — s; /mey/ (from mailing) rarely (five Mays); /sey/ (from sailing),
/se/ (from selling), /kə/ (from culling) never. Hence we cannot divide


No matter how we go about the dividing of an utterance into its 
morphemes, this much will in any case be involved: The morpheme 
boundaries in an utterance are determined not on the basis of con- 
siderations interior to the utterance, but on the basis of comparison with 
other utterances. The comparisons are controlled, i.e., we do not merely 
scan various random utterances, but seek utterances which differ from 
our original one only in stated portions. The final test is in utterances 
which are only minimally different from ours. 

Having established in what way our utterance differs minimally from 
others, we choose that manner of distinguishing our utterance from the 
others which has the greater generality; i.e., we define the elements that 
distinguish our utterance in such a way that general things can be said 
about the distribution of those elements. 

For example, note and notice have some environments in common and 
some not: e.g. both occur in That's worthy of — ; but only note occurs in 
A man of — ; and only notice in The boss gave me a week's — . Similarly, 
walk and walked have some environments in common and some not: e.g. 
both occur in I always — slowly; but only walk in I'll — with you; and 
only walked in I — yesterday. The same is true of such pairs as talk-talked, 
go-went. But the important consideration for our purposes is that the 
environmental difference that applies to walk-walked also applies to 
talk-talked and to go-went; while the environmental difference in note- 
notice does not recur in other pairs. It is not merely that talk-talked, go- 
went occur in the same environments as were given for walk-walked 
above. Even when the environments of go-went differ from those of 
walk-walked, the difference between the environment of go and that of 
went will be the same as the difference between the environment of walk 
and that of walked: I'll go crazy with you; I went crazy yesterday (this 

boiling (in environments of the type mentioned) into boi (boy) and ling, 
but only into boil and ing. 

We do not always find such extreme cases as the adequacy of boil-ing 
compared with the inadequacy of boi-ling. It is therefore often con- 
venient to make the division into morphemic segments first in the case 
of those utterances and parts of utterances in which the difference in 
adequacy among various alternative segmentations is extreme. The less 
obvious choices of segmentation can then be decided with the help of 
the classes of morphemic segments which have already been set up. Even 
then, new data may lead us to rescind some of our previous segmenta- 
tions in favor of alternative ones which pattern better with the new data. 
Cf. Charles F. Hockett, Problems of morphemic analysis, Lang. 23.321- 
343 (1947). 


total environment does not occur with walk; but the difference between 
the two environments is will vs. yesterday, and this is the very difference 
we found for walk-walked). 
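
The comparison described above can be sketched as a small computation. The Python fragment below is only an illustration under stated assumptions: the frames are taken from the examples in the text (with the slot written as _), while the table ENVIRONMENTS and the helper names are hypothetical, and talk, talked, go, went are simply assumed to occur in the frames given for walk, walked.

```python
from collections import Counter

# Hypothetical toy data: each form is listed with the frames in which it is
# assumed to occur. Giving talk/talked and go/went the same frames as
# walk/walked is an assumption made for the sake of the illustration.
ENVIRONMENTS = {
    "walk":   {"I always _ slowly", "I'll _ with you"},
    "walked": {"I always _ slowly", "I _ yesterday"},
    "talk":   {"I always _ slowly", "I'll _ with you"},
    "talked": {"I always _ slowly", "I _ yesterday"},
    "go":     {"I always _ slowly", "I'll _ with you"},
    "went":   {"I always _ slowly", "I _ yesterday"},
    "note":   {"That's worthy of _", "A man of _"},
    "notice": {"That's worthy of _", "The boss gave me a week's _"},
}

PAIRS = [("walk", "walked"), ("talk", "talked"), ("go", "went"), ("note", "notice")]


def environmental_difference(a, b):
    """Return (frames peculiar to a, frames peculiar to b)."""
    env_a, env_b = ENVIRONMENTS[a], ENVIRONMENTS[b]
    return frozenset(env_a - env_b), frozenset(env_b - env_a)


def recurrence_of_differences(pairs):
    """Count how many pairs exhibit each environmental difference."""
    return Counter(environmental_difference(a, b) for a, b in pairs)


counts = recurrence_of_differences(PAIRS)
for a, b in PAIRS:
    # A count above 1 means the same difference recurs in other pairs.
    print(a, b, counts[environmental_difference(a, b)])
```

On this toy data the difference found for walk-walked recurs in three pairs, whereas that of note-notice occurs in one pair only; it is the recurrence, not the particular count, that bears on the segmentation.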

Hence, if we say that walked, talked, went each consist of walk, talk, go 
respectively plus an additional morpheme, we will be able to make 
broadly applicable statements about that morpheme. However, if we say 
that notice consists of note plus some other morpheme, we will not be able 
to make such general statements about the new morpheme; it will always 
occur after note. We may therefore prefer to consider notice as one mor- 

12.3. Phonemic Identification of Morphemic Segments 

Nothing in the operations of 12.2 requires that the morphemic seg- 
ments consist of added phonemes, or phonemes in succession. It is only 
required that the morphemic segments be identifiable in terms of the 
phonological elements set up in 6-10, since the utterances which are here 
segmented into their component morphemes are represented in terms of 
these elements.14 In many languages, the carrying out of the operations 
of 12.2 leads to morphemic segmentations which are not mere sequences 
of phonemes.15 

14 It would be possible to add the requirement that each morphemic 
segment consist of a whole phoneme or an unbroken succession of whole 
phonemes, and to carry out the operations of 12.2 only in so far as they 
do not lead to segments which fall outside this restriction. However, such 
a restriction would in general yield small simplifications in the relation 
between morphemes and phonemes, at the cost of increased complexities 
in the morphological statements. 

15 In many of the cases described below it would be possible to main- 
tain a simpler segmentation, into consecutive phoneme sequences, at the 
cost of more complicated morphological statements later. However, the 
phonemic constituency of each morpheme is in any case a matter of 
detailed listing, and is usually subject to few regularities and generaliza- 
tions. Therefore it is in general convenient to include as many individual 
facts as possible about each morpheme in its phonemic constitution 
(since that has to be given individually for each morpheme), and to leave 
the general facts for the morphological statements (which will then be 
statements about groups of morphemes rather than about individual 
ones). Thus, in the case of the Hidatsa command (12.333), if we analyzed 
/cixic/ as /cix/ 'jump' plus /ic/ 'he did', we would have to discuss /cix/ 
individually on two occasions: once to state its phonemic constituency; 
and again, to state that when it occurs next to the 'he did' morpheme 
the vowel of that morpheme is /i/ rather than some other. There would 
be several forms for 'he did', and each would occur after particular stems. 
However, if we analyze /cixic/ as /cixi/ plus /c/, we mention /cixi/ in- 
dividually only once, and /c/ 'he did' only once, in each case giving the 


12.31. Contiguous Phonemic Sequences 

The vast majority of morphemic segments, in most languages, consist 
of phonemes in immediate succession: e.g. /ruwm/ room, /ər/ -er. 

12.32. Non-contiguous Phonemic Sequences 

In relatively few cases, the procedure of 12.2 leads us to set up mor- 
phemic segments consisting of non-contiguous phonemes, i.e. consisting 
of phonemes not in unbroken succession but interrupted by the pho- 
nemes of other morphemic segments. 

12.321. Staggered phonemes. One type of such mediate sequences 
may be seen in the root-morphemes16 and vowel-pattern-morphemes of 
Semitic. In Arabic, for example, we have such utterances as kataba 'he 
wrote', kadaba 'he lied', katabtu 'I wrote', kadabtu 'I lied', kaddaba 'he 
called (someone) liar', ka·taba 'he corresponded', ka·tabtu 'I correspond- 
ed', from which we extract the following as independent morphemic 
segments: k-t-b 'write', k-d-b 'lie', -a 'he', -tu 'I', -a-a- 'perfective', repeti- 
tion of second consonant 'intensive', -·- (i.e. added mora17 of length 
after the first vowel) 'reciprocal'. The phonemes of k-t-b and k-d-b, and 
those of -a-a-, are staggered with respect to each other. 
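
The staggering of root and vowel pattern can be sketched as follows. This is a minimal Python illustration, not a statement of method: the function names interleave, intensive, and utterance are hypothetical, the hyphenated notation follows the segments listed above, and the 'reciprocal' length mora is left out for brevity.

```python
def interleave(root, vowels):
    """Stagger root consonants with pattern vowels: C V C V C."""
    consonants = root.split("-")               # "k-t-b" -> ["k", "t", "b"]
    pattern = vowels.strip("-").split("-")     # "-a-a-" -> ["a", "a"]
    if len(pattern) != len(consonants) - 1:
        raise ValueError("expected one vowel between each pair of consonants")
    out = []
    for i, consonant in enumerate(consonants):
        out.append(consonant)
        if i < len(pattern):
            out.append(pattern[i])
    return "".join(out)


def intensive(root):
    """'Intensive' segment: repetition of the second consonant of the root."""
    consonants = root.split("-")
    consonants[1] = consonants[1] * 2
    return "-".join(consonants)


def utterance(root, vowels, suffix):
    """Combine root, vowel pattern, and a person suffix such as -a or -tu."""
    return interleave(root, vowels) + suffix.lstrip("-")


print(utterance("k-t-b", "-a-a-", "-a"))             # kataba
print(utterance("k-d-b", "-a-a-", "-tu"))            # kadabtu
print(utterance(intensive("k-d-b"), "-a-a-", "-a"))  # kaddaba
```

Each segment is stated once, and the utterance results from combining the segments; none of the three segments occupies an unbroken stretch of the result.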

12.322. Broken sequences. Another type of non-contiguous se- 
quence occurs in morphemic segments like the Yokuts na'as . . . al 
dubitative (with the verb morpheme coming between the two parts); 
whenever one part occurs the other also occurs.18 

12.323. Repetitive sequences. Non-contiguous sequences, repeated 
over a stated portion of the utterance, express what is often called 
grammatical agreement. If we consider Latin filius bonus 'good son,' 
filia bona 'good daughter,' we are led to the morphemic segments fili 

phonemic constitution. The second method does, however, involve a 
small cost: in the first method, we would say that /cix/ 'jump!' consists 
of the one morphemic segment /cix/; whereas in the second we would 
have to say that /cix/ 'jump!' consists of two segments, /cixi/ 'jump' and 
/drop mora/ 'command'. 

16 The term morpheme will sometimes be used for either morphemic 
segment or the full morphemes defined in chapter 13 if the difference be- 
tween them is irrelevant in the context, or if it is entirely clear from the 
context which of the two is meant. The morphemic segments of chap- 
ter 12 are the morphemes, or the alternants (variant members) of mor- 
phemes, of chapter 13. 

17 A mora is a unit-length of vowel (e.g. a short vowel, or the first or 
second part of a two-unit long vowel). 

18 Stanley Newman, Yokuts Language of California 120. 


'human offspring,' bon 'good.' The remaining phonemes in the first ut- 
terance, . . . us . . . us, are not two independent sequences in the sense 
of 12.2; the two parts are clearly dependent on each other and together 
constitute one broken morpheme, meaning male, essentially as in the case 
of 12.322. Similarly . . . a . . . a is a single morphemic segment, mean- 
ing female.19 In victrix bona 'good victor (f.)' (as against victor bonus 'good 
victor (m.)') we have . . . ix . . . a as a morphemic segment20 meaning 
female; the two separated parts of the morphemic segment need not be 
identical.21 

In Moroccan Arabic we find bit kbir '(a) large room' and lbit lkbir 
'the large room' occurring in such total environments as had — dial 
zuia 'This is — of my brother's'; lbit kbir 'The room is large' never oc- 
curs in such environments, nor (for many environments) does bit lkbir 

19 In many other words, e.g. hortus parvus 'small garden', mensa parva 
'small table', the meaning of these morphemes is, of course, zero, or in 
any case not male and female. 

20 Disregarding, for the purposes of the present discussion, the seg- 
mentation of the ix into ic and s, which would be based on comparison 
of other utterances (containing victricis, victoris, etc.). 

21 The discontinuous repetitive segments differ morphologically from 
the type of 12.322, as is clear from the conditions in which they were set 
up. Thus, filius occurs together with bonus and also without bonus. 
When filius occurs by itself in an utterance, it is segmented into fili and 
us, so that in such environments the single us is itself a morphemic seg- 
ment (while in filius bonus the combined . . . us . . . us is a single seg- 
ment). In contrast: we may, in the manner of 12.322, consider German 
ge . . . en and ge . . . et (in gefangen 'captive', geeignet 'suitable') as each 
constituting a discontinuous morpheme (the relation of the part ge to 
the parts en, et being similar to the relation of s to he, she in 12.324). It 
would then appear that there is a difference between . . . us . . . us 
and ge . . . et: for a single us occurs as a morphemic segment on its own, 
whereas ge never occurs as a morphemic segment by itself. However, 
when we recognize the single us as a morphemic segment, it is not by 
taking off part of the . . . us . . . us sequence and giving it independent 
status, but rather by segmenting us in environments where there is only 
one (where filius occurs without bonus). This situation never arises in 
the case of ge . . . et, since ge never occurs by itself in any comparable 
environment (i.e. we have Xus Yus, and in the same total environment 
we have Xus alone; but while we have geXet, we never find geX or geY 
in the same total environment). Therefore, there are some Latin utter- 
ances in which . . . us . . . us is a morphemic segment, and comparable 
ones in which us is; and there are some German utterances in which 
ge . . . et is a morphemic segment, but no comparable ones in which 
ge is. (For the relation of us to . . . us . . . us, see 13.422). It is clear 
from this example that the environment (or domain) of each form of 
discontinuous morpheme has to be exactly stated. 


'the big shot's room.' We set up the morphemic segments bit 'room,' kbir 
'large;' and since the l 'the' occurs in this environment either twice or 
not at all, we recognize a morphemic segment l . . . l . . . 'the.' 

12.324. Partially dependent non-contiguous sequences. Follow- 
ing this method, if we compare I think so with He thinks so, We want it 
with She wants it, it is clear that the -s is not an independently occur- 
ring morpheme. The -s occurs only when he, she, it, Fred, my brother, or 
the like, occur with think, take, or the like. It does not occur if he, etc., is 
lacking (as in We want it) or if think, etc., is lacking (as in He or she, 
which?), or if both occur, but will, might, etc., intervene (as in He will 
want it). In the particular environment considered above, we can say 
either that the second morphemic segment, in the position after he, etc., 
is thinks, wants, or else that the first morphemic segment, in the position 
before think, etc., is he . . . s, she . . . s, Fred . . . s, etc.22 The first alterna- 
tive gives us an unbroken morphemic segment, and the second alternative 
a broken one.23 

In all these cases we see that grammatical features which are usually 
called agreement can be described as discontinuous morphemic segments 
whose various parts are attached to various other morphemic segments.24 

12.33. Replacement of Phonemes 

12.331. Among individual phonemes. If we compare take, took, shake, 
shook, we would be led to extract take and shake as morphemic segments, 
and also a morphemic segment consisting of the change of /ey/ → /u/ 
and meaning past. The morpheme sequence take plus /ey/ → /u/ yields 
took, exactly as walk plus /t/ yields walked.25 
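
The parallel between an added segment and a replacement segment can be made concrete by treating each morphemic segment as an operation on the phonemic form of the stem. The Python sketch below is illustrative only: the helper names add_suffix and replace_phonemes are hypothetical, and the transcriptions /teyk/, /šeyk/, /wɔk/ are assumed for the purpose of the example.

```python
def add_suffix(suffix):
    """Morphemic segment consisting of phonemes added after the stem."""
    def apply(stem):
        return stem + suffix
    return apply


def replace_phonemes(old, new):
    """Morphemic segment consisting of the change old -> new within the stem."""
    def apply(stem):
        if old not in stem:
            raise ValueError(f"/{old}/ does not occur in /{stem}/")
        return stem.replace(old, new, 1)
    return apply


# Two segments with the meaning 'past', differing only in phonemic make-up.
PAST_T = add_suffix("t")
PAST_EY_U = replace_phonemes("ey", "u")

print(PAST_EY_U("teyk"))   # take + /ey/ -> /u/
print(PAST_EY_U("šeyk"))   # shake + /ey/ -> /u/
print(PAST_T("wɔk"))       # walk + /t/
```

In both cases the past form is stated as stem plus one further segment; only the phonemic identification of that segment differs.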

22 This does not conflict with the fact that the second morphemic seg- 
ment in I think, is just think, and that the first morphemic segment in 
He will not is he, for the utterance environment is different in these cases. 

23 Later considerations (13.4) will enable us to choose between these. 

24 Other dependences are too complicated to be expressed in this man- 
ner. In What did you say — him? and What did you steal — him? we 
know that to would occur in the first utterance, and from in the second. 
Nevertheless, we do not say that in this environment to is dependent 
upon say and constitutes one morpheme with it, because both say and to 
occur independently of each other in so many other environments, and 
because on rare occasions we might get other forms here, e.g. near, in- 
stead of to. For other types of grammatical concord, also expressed by 
long or discontinuous elements, see chapter 17. 

25 We use this analysis here rather than that of 12.321 and the Appen- 
dix to 12.233 which would yield morphemes t-k, sh-k, and /ey/ pres- 
ent, /u/ past. The latter is not convenient for English because in the 


12.332. Among classes of phonemes. The interchange may be be- 
tween any phoneme of one class and the corresponding phoneme of an- 
other class matched with the first: house — to house, belief — to believe, 
life — to live, etc. In these examples, replacing a final voiced consonant 
/z, v/ by the homo-organic voiceless one /s, f/ (sometimes with attend- 
ant vowel change) constitutes a morphemic segment, meaning noun.26 

12.333. Replacement by zero. The interchange may be between 
any phoneme in a given position and zero in that position; i.e. it may con- 
sist of omitting a phoneme.27 If we compare French /fermyer/ fermière 
'farm-woman', /müzisyen/ musicienne 'woman musician', /šat/ chatte 
'female cat', with /fermye/ fermier 'farmer', /müzisyẽ/ musicien 'musi- 
cian', /ša/ chat 'tom-cat', we would say that the last three have each a 
single phonemic feature constituting the same morphemic segment in 
each of these environments: the replacement of the final consonant by 
zero, changing the meaning from female to male.28 Similarly Hidatsa has 
cixic 'he jumped,' cix 'jump!', ikac 'he looked,' ika 'look!'. We segment 
these utterances into cixi 'jump,' ika 'look,' -c 'he did (or does)', and 
/omission of final vowel mora/ indicating command.29 
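
A morphemic segment consisting of omission can be sketched in the same way as one consisting of addition. The Python fragment below is a simplified illustration: the function names are hypothetical, each character of the transcription is assumed to stand for one phoneme, and the nasalization in the masculine of musicienne is not modeled.

```python
def omit_final_phoneme(form):
    """Segment identified by replacement of the final phoneme by zero."""
    if not form:
        raise ValueError("empty form has no final phoneme to omit")
    return form[:-1]


def add_c(form):
    """Hidatsa -c 'he did (or does)', an ordinary added segment."""
    return form + "c"


# French: the shorter form is stated as the longer form plus the omission segment.
print(omit_final_phoneme("fermyer"))  # fermye
print(omit_final_phoneme("šat"))      # ša

# Hidatsa: the stem cixi with -c, and with omission of the final vowel mora.
print(add_c("cixi"))                  # cixic
print(omit_final_phoneme("cixi"))     # cix
```

The omitted consonant or vowel is thus stated once, as part of the longer form, and a single omission segment serves for all the forms.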

It might be argued that we can avoid having to use the omission of a 
phoneme to identify the morphemic segment in question if we took 

great majority of cases, the English morpheme for 'past' does not re- 
place the vowel of the present-tense verb but is added on to the whole 
verb as it appears when used for present time: walk, walked. For addi- 
tional reasons, see the Appendix to 12.233. 

26 Considering the /z/ → /s/ interchange and the /v/ → /f/ as two 
variant members of one morpheme can more properly be done in chap- 
ter 13. 

27 Instead of considering the omission of phonemes as a special case of 
phonemic interchange we can consider the interchange of phonemes to 
consist of omitting one phoneme and adding another. In this case the 
morphemes considered hitherto would all consist of the addition or sub- 
traction of phonemes in respect to the rest of the utterance. 

28 F. Beyer and P. Passy, Elementarbuch des gesprochenen Fran- 
zösisch 96 (1905). This analysis ceases to be applicable if we take into 
consideration forms in which the 'mute e' is pronounced, as in poetry; if 
we consider forms in which the final consonant is pronounced, as in 
liaison, it is the masculine pre-consonantal form which is derived from 
the masculine pre-vocalic form, e.g. /move/ mauvais 'bad' from /movez/ 
mauvais. Cf. R. A. Hall, Jr., French Review 19.44 (1945). 

29 R. H. Lowie, Z. S. Harris, and C. F. Voegelin, Hidatsa Texts (In- 
diana Historical Society Prehistory Research Series 1) 192 fn. 38 (1939). 
Note also the mora-omitting morpheme 'jussive' in Z. S. Harris, Lin- 
guistic Structure of Hebrew, Jour. Am. Or. Soc. 61.161 No. 11 (1941). 


/fermye/ as 'farmer', /müzisyẽ/ as 'musician', /ša/ as 'cat,' and /r/, 
/n/ (or /˜/ → /n/), and /t/ as various morphemic segments for 'female.' 
However, we would find almost every consonant phoneme in French oc- 
curring as a morphemic segment for 'female,' and each occurring only 
after some few particular morphemes (/t/ after /ša/ and some others; 
etc.). Similarly, if we chose to take Hidatsa cix as 'jump, jump!', ika as 
'look, look!,' and ic, c as 'he did,' we would find (in various utterances) 
occurrences not only of these ic and c forms but of every vowel mora, 
including length, followed by c, constituting morphemic segments for 'he 
did.' In such cases, when many phonemes (/t/, /r/, etc.) in one position 
(fem.) alternate with zero in another (masc.), it is simpler to consider 
the various consonants or vowels as part of the various morphemic seg- 
ments; the shorter (masc.) forms are then analyzed as consisting of two 
morphemic segments: the longer (fem.) morpheme plus a single omit- 
phoneme morpheme. 

12.34. Suprasegmental Elements 

12.341. Components. If we break phonemes up into components, the 
devoicing morpheme in house — to house, belief — to believe (12.332) would 
consist not in the interchanges of one phoneme for another, but in a 
single devoicing component: believe + morphemic segment consisting of 
devoicing component = belief. 

12.342. Contour change. If we compare a convict — to convict, we 
must distinguish a morphemic segment convict /kən'vikt/ with verb 
meaning, and a distinct morphemic segment consisting of change of stress 
contour, meaning noun. Then the verb /kən'vikt/ + the morpheme con- 
sisting of stress change = the noun /'kanvikt/. In a table — to table we 
have only one morpheme /'teybəl/, having verb or noun meaning ac- 
cording to its position in the utterance; here the stress contour is only 
phonemic and does not have morphemic status because it is not replace- 
able, in such environments as a — , by another stress contour.30 

12.343. Morpheme-length contours. A clearer case of a contour 
morpheme is the extra loud stress which may occur on almost any mor- 
phemic segment of an utterance (as in No! Tell HIM to throw the red one 

30 As in the case of contours, we may also find segmental phonemes 
which have morphemic status in one environment and not in another 
superficially similar environment. For example, the final /t/ in /kəpt/ 
constitutes a morphemic segment meaning past (I cupped. I cup my 
hands.), whereas in /kət/ the final /t/ is not a morphemic segment by 
itself but part of the morphemic segment cut (I cut my hand). 


or No! Tell him to THROW the red one, etc.). The two utterances can be 
recognized as distinct from each other, and must therefore differ pho- 
nemically and morphemically. The difference between the /θrow/ of 
the first utterance and the /"θrow/ of the second is the phonemic /"/ 
whose independent occurrence correlates consistently with the presence 
of a contrastive or emphatic meaning such as 'throw and not drop!'. /"/ 
is therefore a morphemic segment, with contrastively emphatic meaning. 
12.344. Utterance contours. Contours can, of course, satisfy the 
conditions for being considered morphemic, not only when they apply 
over some one other morpheme (as in convict and him above) but also 
when they extend over any number of morphemes. Many of the utter- 
ance-long components of chapter 6, such as the rising pitch which is 
marked /?/, are independent of the rest of the phonemes of the utterance, 
since the rest of the utterance (whether represented phonemically or 
morphemically) occurs with other contours as well as with this one: 
He's going? as compared with He's going.31 This test of morphemic inde- 
pendence will enable us to discover any other morphemic contours which 
we may have failed to obtain from the procedure of 6.32 

12.35. Combinations of the Above 

In rare cases we may wish to set up morphemic segments consisting 
of a combination of some of the features mentioned above. E.g. in French 
utterances having liaison -t- we may well recognize a morphemic segment 
consisting of intonation contour and liaison -t- (and, if we wish, also the 

31 Replacing one contour by another in various sequences of mor- 
phemes nets a parallel change in meaning for all the sequences: wherever 
/?/ occurs, it adds the meaning of 'question' to the utterance. 

32 The criterion of independence is involved also in breaking down such 
long contours as can be shown (6.4) to be successive repetitions of the 
same shorter contour (or of a few different short contours). We may find 
a short utterance with one short contour over it (I'm not coming.), a 
longer utterance with the same short contour given twice over it (I'm 
not coming. It's too late.), a longer utterance yet with the same contour 
three times over it, and so on. The number of occurrences of the short 
contour depends on the length of the utterance and on the repetition of 
a particular sequence of morpheme classes (construction, see Chapter 18) 
under each of the short contours. We therefore say that the two-fold 
and three-fold repetitions of the short contour are not morphemically 
independent. For each utterance length and utterance construction 
there is only one set of independent short contours (. and ? and so on), 
and repetitive successions of these are not new independent morphemic 
elements, but merely sequences of the original contour elements. This 
can be done by the operation of chapter 13. 


act of rearrangement in the order of the utterance). From On donne. 'they 
are giving' and Donne-t-on? 'are they giving?' we would recognize the 
morphemes on and donne. The -t- does not have to be set up as an inde- 
pendent morphemic segment; but together with the change in contour 
(and morpheme order) it constitutes a morphemic segment meaning ques- 

12.4. Result: Elements with Stated Distributions over Utterances 

We now have a list of morphemic segments into which any utterance 
can be segmented, each of these being uniquely identifiable in terms of 
phonemic elements, and occurring in stated environments of other mor- 
phemic segments (or in stated utterances).34 An intrinsic part of the defi- 
nition of each morpheme is the environment for which it is defined: 
/siyliŋ/ by itself is undefined as to consisting of one morpheme (ceiling) 
or two (sealing). But in We are going — . /siyliŋ/ is defined as consisting 
of two morphemes, while in That — is made of plaster, /siyliŋ/ is de- 
fined as one morpheme. We can now consider the utterances of the lan- 
guage as consisting entirely (even including the liaison -t-) of these 
morphemic segments, i.e. of phonemic elements among which morphemic 
boundaries are placed. 
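
That the environment is part of the definition of each segmentation can be shown with a small lookup. The Python sketch below is an illustration under assumptions: the table SEGMENTATIONS and the function segment are hypothetical, the slot is written as _, and the division of sealing into /siyl/ plus /iŋ/ is assumed.

```python
# Segmentations are keyed on (environment, phonemic form), never on the form alone.
SEGMENTATIONS = {
    ("We are going _.", "siyliŋ"): ("siyl", "iŋ"),          # sealing: two segments
    ("That _ is made of plaster.", "siyliŋ"): ("siyliŋ",),  # ceiling: one segment
}


def segment(environment, form):
    """Return the morphemic segments of form in environment, or None if undefined."""
    return SEGMENTATIONS.get((environment, form))


print(segment("We are going _.", "siyliŋ"))
print(segment("That _ is made of plaster.", "siyliŋ"))
print(segment("", "siyliŋ"))  # no environment stated: segmentation undefined
```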

These morphemic segments serve the purpose of 12.1. 

The operation of 12.22 assures that there will be fewer limitations 
upon the occurrence of a morphemic segment within a long environment 
than upon the occurrence of a phoneme. For each time a new morphemic 
segment is recognized, with a certain stated phonemic constituency, it 
becomes unnecessary to state elsewhere that the particular phonemic 
sequence represented by that morpheme occurs in the environment in 
question, while other phonemic sequences not represented by any mor- 
pheme do not occur there.35 

33 Equivalently, the intonation could be set up as a morphemic seg- 
ment, with the -t- as an automatic part of it, or vice versa. 

34 In a general presentation of linguistic method, morphemes are de- 
fined as the result of the operations of 12.2. In effect, the morphemes are 
those phonemically identifiable elements in terms of which the inter- 
element relations can be most simply stated. However, in any description 
of a particular language, the morphemes are defined by a list of individual 
morphemes. Relations among these stated morphemes can then be 
studied no matter how the morphemes had been determined: for some 
listings of morphemes the relations will appear more complicated than 
for others. 

35 E.g. if our morphemic elements will be later classed in Verbs, Nouns, 
Adjectives, etc., we will be able to say that after /ðə/ in an utterance 


The operation of 12.23 assures that the morphemic segments will be 
such elements in terms of which convenient distributional statements 
can be made. For each morphemic segment which is recognized there will 
be others having partially similar distribution. 

These new segments of utterances therefore include in their constitu- 
tion some of the limitations of occurrence of the phonological segments; 
and their distribution (i.e. their privileges of occurrence) within long ut- 
terances can be more easily stated than that of the phonological elements. 

This does not mean that these segments are those elements whose dis- 
tribution within utterances can be stated most simply. The operations 
carried out in chapters 13-17 will define, in terms of the present mor- 
phemic segments, more inclusive elements whose distribution in utter- 
ances can be stated far more simply. In the course of defining such more 
inclusive elements in terms of the present segments, it may appear that 
some of the segmentations are not as convenient as the others for the 
setting up of these new elements. The work of 12 may then be considered 
as a first approximation, to be corrected whenever we wish to have a 
somewhat different segmentation as a basis for defining the more in- 
clusive elements.36 

In considering the results of 12 as material for general statements, it 
must be remembered that each morphemic segment has been defined 
for a particular environment, even though this was done with an eye to 
phonemically identical segments in other environments. The operations 
of 12 tell us primarily how many morphemic segments there are in any 
given utterance, or what parts of each utterance are constituents of what 
morphemic segments within it. These operations do not as yet give us a 
compact set of morphemic elements occurring in various environments. 

Morphemic Segments Correlate with Features of Social 

If each utterance is correlated with the social situation, i.e. the cul- 
tural environment and the interpersonal relations, in which it occurs. 

there is a certain positive probability of a member of Noun or Adjective 
occurring in a given position, but zero probability for the occurrence of 
any element in the Verb class. Furthermore, if all these elements are 
defined as sequences of phonemes we could then state the probability for 
any phoneme occurring at various points within that position. 

36 There is no conflict between the approximation of chapter 12 and 
any corrections made upon it later (chapters 13, 17, 18), since both the 
criterion of 12.23 and the basis for any later correction would be identi- 
cal : the setting up of morphemic segments in such a way that the simplest 
distributional statements can be made about them. 


it will be possible to correlate the morphemic segments of the utterance 
with features of the social situation. In some cases this is rather simple, 
as in correlating the segment five with a feature of the social situation in 
such utterances as It's five o'clock now, I got some three-by-fives for you. In 
other cases it may be quite complicated, as in correlating five with a fea- 
ture of the social situation in such utterances as Mr. five-by-five, I'll be 
back in five or ten minutes. 

When the results of descriptive linguistics are used in other linguistic 
and social investigations, one of the chief desiderata is the correlation of 
utterances and their morphemic segments on the one hand with social 
situations and features of them on the other.37 This is comparable to the 
correlation of the phonological segments with features of sound. In the 
case of the phonological segments, it was possible to use this correlation 
in the very establishment of the segments, since we could substitute one 
segment for another (whether impressionistically or by sound track) and 
test for identity or difference in native response (e.g. by the method of 
4.23). It was therefore possible to say that the phonological segments 
represent particular features of sound. 

In contrast, the correlation of morphemic segments with features of 
social situations cannot be used in establishing the segments.38 There is 
at present no way of determining meaning differences as exactly as one 
can measure sound differences, and there are no morphological tests (of 
hearer's response to meaning) comparable to the phonological test of 
4.23 (of hearer's response to sound). 

Since meaning was not used as a criterion in setting up the morphemic 
segments, the segments resulting from 12.2 will not always be identical 
with those which might be desired from the point of view of meaning 
analysis. However, these segments will be the most convenient ones for 
morphological description, and where considerations of meaning are at 
variance with this segmentation special note can be made. 

12.5. Correlations between Morphemes and Phonemes in Each 

Since we now have two independent sets of linguistic elements, pho- 
nemes and morphemes, it is convenient to ask what correlations may be 

37 In dictionaries, morphemic elements are defined as a correlation be- 
tween morphemically segmented phonemic sequences and features of 
social situations (meanings). 

38 It is for this reason that meaning was not used as a criterion of mor- 
pheme segmentation in 12.2. For justification of this disregard of meaning 
in the procedure, see the Appendix to 12.41. 


discovered between the two sets in any particular language under con- 
sideration.39 Are there any phonemic features peculiar to morpheme 
boundaries or to whole classes of morphemes? Given the phonemes of a 
stretch of speech, can we predict anything about its morphemes? 

12.51. Phonemic Combinations in Morphemes 

In many languages it is possible to make general statements about the 
phonemic composition of particular sets of morphemic segments.40 The 
language may have many morphemes (stems) of four or more segmental 
phonemes plus a stress phoneme, as against other morphemes (affixes) 
all of which contain only one or two segmental phonemes and no stress 
phoneme.41 Certain phonemes or phoneme sequences may occur only in 
these short morphemes, or in other morphemes which are marked by 
some special distributional or phonemic feature.42 

12.52. Intermittently Present Pause 

We may also find in some languages that loose contact and division 
between breath-groups of phonemes occur at morpheme boundaries, or 
that pauses occur sometimes (though not always) at morpheme bound- 
ary, but practically never within a morpheme. Such pauses could not 
be included in the phonemic content of the morpheme, except as inter- 
mittently present features in the utterance. They are free variants, and 
do not occur every time morpheme boundary occurs. But in some oc- 
currences of the morpheme sequences those pauses would constitute ob- 
servable evidence of morpheme boundary.43 

In many cases, however, these pauses come at points containing 
phonemic junctures (12.53-4). At these points in the utterance we find 
segments which occur only at utterance boundary or at points of in- 
termittently present pause, and which are phonemicized into junctures 
or into sequences of some phoneme plus juncture. We can then say that 

'^ Here we correlate the known phonemes of a language with its known 
morphemic segments, whereas in 2.62 we compared the general method 
of discovering phonemes with the general method of discovering morphemes.

" See for example the Yawelmani case in International Journal of
American Linguistics 13.55 (1947). 

•" See, e.g. Marcel Cohen, Travaux du Cercle Linguistique de Prague 
8.37 (1938). 

■•^ E.g. in English learned (foreign) vocabulary: cf. Leonard Bloom- 
field, The structure of learned words, in A Commemorative Volume Is- 
sued by the Institute for Research in English Teaching (Tokyo, 1933). 

■*' For an explicit use of intermittently present ('facultative') pause, 
cf. chapter 4, fn. 10 above. 


the pause (when it occurs) is an occasionally-occurring free variant of 
the phonemic juncture. 

12.53. Adjusting Junctures as Morpheme Boundaries 

In chapter 8 and in the Appendix to 9.21 we saw various segments 
which contrasted with all the phonemes recognized in the language, and 
which would therefore have to be set up as constituting new phonemes. 
It was, however, possible to include these segments in some of the pre-
viously recognized phonemes if we defined, wherever these segments oc-
curred, a zero phoneme called juncture. This technique could be used in
many cases, but was primarily useful if the juncture phoneme could be 
made to occur precisely at the boundary between morphemes; for then 
the juncture could be used not only as part of the phoneme sequence (in 
that its presence is taken into consideration when we wish to determine 
the segment members of the neighboring phonemes) but also as a mark 
of morpheme boundary. In the previous sections it was possible to use 
only indirect methods and tentative guesses in deciding whether in a 
particular case it was desirable to set up a juncture and thereupon assign
a segment to some previously recognized phoneme. Now we can see if 
these junctures fall on morpheme boundaries or can be adjusted to do so. 
Segments which had not been phonemically assigned with the aid of 
junctures, may now be so treated if the knowledge of morpheme bound- 
aries which we now have makes this desirable in cases where we had 
neglected to do it.'''' 

In some languages it may be possible to say that certain contours 
change from morpheme to successive morpheme, or that certain con- 
tours, long components such as vowel harmony, phonemes, or phoneme 
sequences (clusters, etc.) correlate with the boundary of a morpheme. 
Thus, in English long consonants do not occur within a morpheme; how-
ever, they occur across morpheme boundary, e.g. the /nn/ in pen-knife.
Any sound feature whose occurrence is limited in terms of a morphologi-
cal segment (e.g. one morpheme) can be indicated by a juncture phoneme
or an automatically placeable boundary mark which will indicate both 
the feature in question and the morphological boundary.''^ 

''■' Note the relevance of morphemic boundaries to phoneme distribu- 
tion, in Leonard Bloomfield, Language V.^'.^. The palatal [g] of standard 
German Zuge 'progress, pull' and the front [g] of zugestehen 'grant' can 
be included in one phoneme /g/ only if a phonemic juncture separates
the /u/ from the /g/ at the morphemic boundary in the second form. 

^■'Several features may be limited in terms of a particular boundary;
in that case they are all indicated by the boundary mark. Often we will


12.54. New Phonemic Junctures

Establishment in 12.2 of the boundaries between morphemic segments 
enables us therefore to decide where it is convenient to set up junctures. 
This may lead to some changes in the junctures of chapter 8 and in the 
assignment of segments to phonemes in 7-9. These changes in phonemici- 
zation may be so designed as to make phonemically identical two mor- 
phemic segments which were phonemically different before the reassign- 
ment of segments. 

In contrast with this, replacing a contour or other phonemic element 
by juncture in the case of partial dependence (8.222) will not lead to a 
phonemic juncture. If loud stress occurs on the penult vowel of every 
word or morpheme, but if the distribution of consonants is such that we 
cannot state a phonemic basis for exact placing of the juncture, we can- 
not obtain a phonemic juncture. Knowledge of the morpheme bound- 
aries will enable us to place these boundaries in each utterance, and 
we will then be able to dispense with the phonemic stress: if we write
CVCVC#VCCVCV# with the morphological boundaries, we know that
the stress is on each penult V before #. But the representation with #
instead of ′ is not one-one, for when we hear CV́CVCVCCV́CV we
do not know if we should write CVCVC#VCCVCV or CVCV#
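
The loss of one-one correspondence described here can be checked mechanically. What follows is a minimal sketch and not part of the original text: it assumes that C and V stand for any consonant and any vowel, marks stress with a combining acute accent, and uses the hypothetical helper name `stress_penult`. The second segmentation, CVCV#CVCCVCV, is an assumed placement of the alternative boundary, chosen only for illustration.

```python
VOWEL = "V"
STRESS = "\u0301"  # combining acute accent, used here as the stress mark


def stress_penult(words):
    """Stress the next-to-last V of each word, then drop the boundaries."""
    out = []
    for word in words:
        vowel_positions = [i for i, ch in enumerate(word) if ch == VOWEL]
        if len(vowel_positions) >= 2:
            penult = vowel_positions[-2]
            word = word[:penult + 1] + STRESS + word[penult + 1:]
        out.append(word)
    return "".join(out)


# Two different boundary placements (the second one is an assumption).
first = stress_penult("CVCVC#VCCVCV".split("#"))
second = stress_penult("CVCV#CVCCVCV".split("#"))

# Both boundary placements are heard as the same stressed sequence,
# so the hearer cannot recover the position of # from stress alone.
same_when_heard = first == second
```

In this toy case both writings give the same stressed string, which is the sense in which writing # in place of the stress mark fails to be one-one.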


find that some morphemes have juncture phonemes at their boundaries, 
while others do not: e.g. English morphemes ending in /ay/ are marked 
with juncture since /ay-/ represents the segments [a:y] whereas /ay/
is [a.y] or [ay] depending on the following consonant: /slay-nas/ for
[sla:ynas] slyness; /maynas/ for [maynas] minus. In such cases we try
to find other phonemic characteristics for the boundaries of other mor- 
phemes, so that if possible all morpheme boundaries are recognizable 
by juncture or by peculiar phoneme sequences. Even if we attain only 
partial success, we can say that the juncture (in this case morpheme 
juncture) has several effects, depending on the neighboring phonemes: 
various phonemes have different types of segment members near it (e.g. 
/ay/ is [a:y] before /-/; /tr/ is [tr] when /-/ occurs between them, 
as in night-rate, as against nitrate which lacks the juncture; cf. chap- 
ter 8, fn. 17). In some cases we find that a certain phonemic feature cor- 
relates with morpheme boundary, but has varying positions or values 
within various morphemes (e.g. free or partially bound word stress), so 
that from the sound feature alone we cannot tell exactly where among 
the phonemes the boundary must occur. And if the phonemic feature 
always occurs a fixed number of times (e.g. once, for main stress) within 
each morpheme or construction (e.g. word), we can tell from the number 
of occurrences of that feature in our utterance how many morphemes (or 
constructions) there are in the utterance. Cf. the discussion of Grenz-
signale in N. S. Trubetzkoy, Grundzüge der Phonologie 243 ff. (Travaux
du Cercle Linguistique de Prague 7). 


Appendix to 12.22: Partial and Seeming Independence 

Special consideration has to be given to cases of partial and superficial- 
ly apparent independence. 

Partial independence occurs when one phonemic sequence is independ- 
ent of another, while the other is not independent of the first. If we break 
a sequence into two parts, e.g. boysenberry into (/boysan/ and /beriy/), 
and find that in given utterances only one of these parts ever occurs 
without the other part, we can nevertheless say that each part is a mor- 
phemic segment by itself. Thus given /boysanberiy/ in That's a rotten 
boysenberry, we find That's a rotten blueberry, and That's a rotten berry, 
but we do not find /boysan/ next to some sequence other than /beriy/ 
in this total environment. However, we do find /bluw/ blue next to some 
sequence other than /beriy/: That's a rotten bluepoint. In That's a rotten
blueberry we therefore recognize blue and berry as being two independent
segments. Hence, in That's a rotten boysenberry, which differs from the 
preceding utterance only in having /boysan/ instead of the independent 
/bluw/, we must still consider /beriy/ as an independent element. Hav- 
ing done so, we now also recognize /boysan/ as a separate element too, 
since we do not wish to have any sequence of phonemes left over that is 
not assigned to one element or another. We want to be able to describe 
a stretch of speech exhaustively as a sequence of morphemes. 

Furthermore, our present operation is one of segmentation, and if we 
segment the utterance Give Tom a boysenberry in such a way that give, 
Tom, a, and berry (all of which can be shown to be independent) are sepa- 
rated, then unavoidably boysen has also been segmented off. Within the 
frame That's a rotten — berry, /boysan/ substitutes for various mor-
phemes and for zero.
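
The test of partial independence can be sketched in a few lines. This is only an illustration under stated assumptions: the list `OBSERVED` is a toy record built from the examples above, the empty string stands for zero (That's a rotten berry), and the function names are hypothetical.

```python
from collections import defaultdict

# Toy record of two-part sequences in the frame "That's a rotten ___".
# "" stands for zero, as in "That's a rotten berry".
OBSERVED = [("boysen", "berry"), ("blue", "berry"), ("", "berry"), ("blue", "point")]


def neighbor_tables(pairs):
    """Map each first part to the second parts it occurs with, and vice versa."""
    followers = defaultdict(set)
    preceders = defaultdict(set)
    for first, second in pairs:
        followers[first].add(second)
        preceders[second].add(first)
    return followers, preceders


def independent(neighbors):
    """A part counts as independent if it occurs next to more than one sequence."""
    return len(neighbors) > 1


followers, preceders = neighbor_tables(OBSERVED)

berry_independent = independent(preceders["berry"])    # occurs after boysen, blue, zero
blue_independent = independent(followers["blue"])      # occurs before berry and point
boysen_independent = independent(followers["boysen"])  # occurs only before berry
```

Here berry and blue come out independent while boysen does not. The text nevertheless segments boysen off, since once berry is separated nothing else can be done with the residue.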

For a set of segments some of which have only partial independence, 
we compare there, then, thither, this, that, where, when, whither, why, 
what, etc., in various environments. We can consider th- and wh- to be 
independent segments, and so also -ere, -en, -ither; a few phonemic se- 
quences like -y, -is, are only partially independent, but get to be seg- 
mented off when we extract the th- and wh-. Some of the segments also 
occur with h- in the place of th-, wh-, as in hither, hence. 

A more difficult problem arises with such sets as slide, slither, slick, 
sleek, slimy, etc., or glow, gleam, glimmer, glimpse, glance, glare, etc. There 
is no adequate distributional basis for separating the sl- or gl- from what
follows them. These initial sequences do not occur with a set of non- 
initial sequences which have some other characteristic in common (such 


as that of occurring with a particular set of initial morphemes).'"' In the
case of slide, glide, where the residue would be the same, we could say
that the formulaic statement of 12.22 permits us to separate sl- and gl- off.
However, the great majority of non-initial residues of these two initial
sequences are not the same. If there were no such case of identical residue,
as in slide, glide, we could not set sl- and gl- up as morphemic segments.
When there are one or two such cases out of a great number of non-identi- 
cal residues, we can set them up as tentative segments, but the criterion 
of 12.23 will in most cases reject them. 
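
The comparison of residues can be illustrated with a short sketch. It is an approximation under loud assumptions: it works on ordinary spelling rather than on phonemic writing, it uses only the words cited above, and `residues` is a hypothetical helper name.

```python
# Words cited in the text, in ordinary spelling (an approximation of the
# phonemic sequences the procedure actually operates on).
SL_WORDS = ["slide", "slither", "slick", "sleek", "slimy"]
GL_WORDS = ["glow", "gleam", "glimmer", "glimpse", "glance", "glare", "glide"]


def residues(words, initial):
    """Return what is left of each word after removing the initial sequence."""
    return {word[len(initial):] for word in words if word.startswith(initial)}


sl_residues = residues(SL_WORDS, "sl")
gl_residues = residues(GL_WORDS, "gl")

# Residues that occur with both initial sequences.
shared = sl_residues & gl_residues
# Residues that occur with only one of the two.
unshared = sl_residues ^ gl_residues
```

With these words only one residue is shared against ten that are not, which is the situation in which the tentative segments are set up but are then likely to be rejected.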

An example of phonemic sequences and changes which are found not 
to be independent may be seen in the dynamic vowel processes of Yo- 
kuts. These are changes which occur in root morphemes when these roots 
are followed by suffixes. Each root may have one such vowel change when
one suffix occurs after it, and another vowel change when another suffix 
occurs after it. The changes are thus independent of the roots. Each 
vowel change occurs with several suffixes, there being fewer different vow- 
el changes than suffixes. However, each suffix occurs with only one vowel 
change, no matter what root precedes it. It would seem at first blush as 
though the relation of a given vowel change to a particular suffix is the
same as the relation of berry to boysen (berry occurs with boysen, blue, etc., 
but boysen occurs only with berry). But this is not the case, because the 
sequences, such as blue, which could replace boysen before berry were in- 
dependent on their own merit and could occur before environments other 
than berry. In contrast, the suffixes which can replace each other after 
a vowel change are not independent of the vowel change; none of them 
would occur after some other vowel change than the one in question. We 
therefore include each suffix and the vowel change with which it occurs 
in one morphemic segment. The fact that various suffixes begin with 
the same vowel change is comparable to the fact that various roots begin 
with /d/."' 

*^ Gl-, sl-, and -eam, -imy, etc. do not constitute two interdependent
families such as we find in conceive, receive, concur, etc. (12.232).

"Stanley Newman, Yokuts Language of California 23-4, 33 (1944). 
Newman does not take these vowel changes to be independent mor-
phemes, but considers them processes operating upon the root when par-
ticular suffixes are added. This is equivalent to considering them as parts
of the phonemic definition of the suffix morphemes. For a discussion of 
these two ways of stating grammatical relations see Z. S. Harris, Yokuts 
structure and Newman's grammar, Int. Jour. Am. Ling. 10.196-211 


Appendix to 12.23: The Criterion of Similar Distributions 

The condition imposed in 12.23 amounts to a requirement of distribu- 
tional patterning. We set up as morphemic segments only those tentative- 
ly independent phonemic sequences which have distributional similari- 
ties with other tentatively independent morphemic segments. That is, we 
determine our elements in such a way that it will be possible to make 
simple and compact statements about their distribution. The task of the 
present procedure is to offer techniques for finding which segmentations 
will yield such elements. 

In this appendix there will be shown examples which satisfy 12.23, 
and others which fail to satisfy it.'*^ 

We consider the utterance The announcer is no good. The sequence /er/ 
is independent (or rather, every sequence in the utterance is independent 
of er), because we can substitute ment for er. Is there any generality of 
distribution which justifies us in considering this er a morpheme? If 
we test, by substitution, what sequences precede er, we find govern, 
assign, reinforce, etc., which also occur with ment replacing the er. We 
next ask if there is anything else that characterizes the group of se- 
quences announce, assign, reinforce, as compared with other sequences 
(say, is, very) which do not occur before er. We find that all the pre-er

■** Y. R. Chao points out the similarity between this criterion and the
'substitution by isotopes' proposed by C. W. Luh in his published preface
to an unpublished vocabulary of Peiping monosyllabic words Kuo²-yü³
tan¹-yin¹-tz'u² tz'u²-hui⁴ (Peiping 1938), pp. 7-15. Chao writes: 'If for
example, the question is whether shuo¹ hua⁴ 'talk' 'speak speech' is
one or two words, he seeks some isotopes keeping hua⁴ unchanged, then
others keeping shuo¹ unchanged. If it is possible to do so, e.g.:

shuo¹ hua⁴ 'speak speech'          shuo¹ hua⁴ 'speak speech'
t'ing¹ hua⁴ 'listen to speech'     shuo¹ meng⁴ 'speak about dream'
chiang³ hua⁴ 'talk speech'         shuo¹ shu¹ 'speak (story) book'

then shuo¹ and hua⁴ are two words and not one. Naturally the substituted
forms chosen have to be isotopes, and not just any substitution. Thus,
fei⁴ hua⁴ 'wasted words' or shuo¹ hsiao⁴ 'talk (and) laugh' would not be
isotopes of shuo¹ hua⁴. Luh does not give the exact criteria for recognizing
isotopes, but expresses the belief that most people would agree as to when
parts of utterances are or are not isotopes.' It would appear that the de-
termination of what is an 'isotope' depends upon the distributional simi-
larity (in other utterances than the one in question) of the morphemes
that are to be substituted in a frame. Thus t'ing¹ has a distributional
similarity to shuo¹ (in environments other than — hua⁴) which fei⁴ does
not have. The criterion is thus equivalent to that of 12.23.


sequences occur in the environments I cannot —, Let's try to —, etc./'
while those sequences (e.g. very) which occur in various other environ- 
ments such as He is — old do not occur in The — er is no good. It is now 
possible to make a long list of different utterances, in which all the se-
quences which precede er substitute for each other; I cannot —. The —er
is no good, etc. We define the morphemic segment er as occurring in some 
of these utterances and as following any one of these mutually substitut- 
able sequences. 

It follows that hammer does not contain this morphemic segment, be-
cause it occurs not only in The — is no good, but also in I can't stand this
—ing, He —ed away, where the other pre-er sequences cannot substitute
for /haem/.^" 

It follows further that /sow/ in He is so old is not a member of the 
mutually substitutable group which precedes er, and is not identical 
with the morpheme /sow/ in The sower is no good, I cannot sow; because 
the other sequences which precede er cannot substitute in He is — old. 
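
The criterion applied to er can be sketched as a check on shared frames. The table `FRAMES` below is a set of hypothetical judgments modeled on the examples in this appendix, with `___` marking the blank and `hamm` standing in for /haem/; the function names are likewise assumptions. A sequence is taken to pattern with the pre-er group only if it also fills the other frames that are filled exclusively by members of that group.

```python
# Hypothetical judgments: frame -> sequences that can fill the blank.
FRAMES = {
    "The ___er is no good": {"announce", "govern", "assign", "reinforce", "sow", "hamm"},
    "I cannot ___": {"announce", "govern", "assign", "reinforce", "sow"},
    "Let's try to ___": {"announce", "govern", "assign", "reinforce", "sow"},
    "He is ___ old": {"very", "so"},
}


def characteristic_frames(test_frame, frames):
    """Other frames whose fillers all belong to the group found in test_frame."""
    group = frames[test_frame]
    return [frame for frame, fillers in frames.items()
            if frame != test_frame and fillers and fillers <= group]


def patterns_with_group(sequence, test_frame, frames):
    """True if the sequence also fills every frame characteristic of the group."""
    shared = characteristic_frames(test_frame, frames)
    return bool(shared) and all(sequence in frames[frame] for frame in shared)


er_frame = "The ___er is no good"
announce_ok = patterns_with_group("announce", er_frame, FRAMES)
hamm_ok = patterns_with_group("hamm", er_frame, FRAMES)
```

Under these assumed judgments announce passes and hamm fails, which parallels the conclusion that hammer does not contain the morphemic segment er. When no characteristic frame exists at all, as with the /r/ of tear, pair, share discussed next, the check returns False for every candidate.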

A contrast with these morphemic divisions is afforded by the sequences 
tear, pair, share, in utterances like His — arrived just in time. Since we 
also have tea, pay, (one-horse) shay in these utterances, the /r/ is ten- 
tatively independent and we are in position to consider its possible mor- 
pheme status. However, when we seek some other feature which charac- 
terizes the sequences which occur before /r/ as against sequences which 
don't (e.g. book, wife) we fail to obtain results: tea, pay, etc. do not have 
a regular difference in distribution as compared with book, wife, etc. 
Furthermore, there is no common feature to all the sequences which oc-
cur before /r/ in His —r arrived just in time: Some of them, e.g. tea, occur
without the /r/ in His — arrived just in time; others, e.g. /čey/ from
chair, occur only before a few other phonemes, as in chain, etc. There are 
few positions, aside from the very one in His — r arrived just in time, 
where tea, pay, shay, chai can substitute for each other. 

Similarly, the /g/ of bag, rug, bug is independent by the criterion of 
12.22, but does not correlate with any other distributional feature as is 
required by the criterion of 12.23. 

*^ In the case of er, a great number of the sequences which occur in
I cannot —. also occur before er. If we had been considering ment we
would have found that only a few of the sequences which occur in I can-
not —. also occur before ment (e.g. teach occurs before er but not before
ment; preach occurs before both). However, this difference in quantity is
irrelevant here.

^^ I.e. the morpheme er was not defined as occurring before ing, ed, or 
before another er as in hammerer. 


Appendix to 12.233: Alternatives in Patterning 

The considerations of patterning are especially complicated in seg-
menting phonemic sequences which cannot be matched with other, par-
tially identical, sequences. For example, if we consider run in I'll — over
for it, we can substitute walk, stay, etc., and may thus set up each of these
phoneme sequences as a morphemic segment. If we consider walked in
I — over before you came, we can substitute stayed, ran, etc. We can sepa-
rate the -ed morphemically by setting up the environment I —ed over
before you came, in which walk, stay, etc. replace each other. This does not
enable us to segment ran. However, we may analyze ran on the basis of
walked, stayed: since ran substitutes for walked, etc., and run for walk,
etc., and since walked has been segmented into walk and ed, we may seg-
ment ran into run and /a/ → /ae/ (replacement of /a/ by /ae/).

Independently of this, we may compare run and ran in I — slowly, and
note that these are partially identical as to r — n. Hence we may say that
there are two morphemic segments, /a/ and /ae/, which replace each
other in the environment I r — n slowly. A formulation of this type avoids
giving precedence to run over ran (for we could as well say that run is
segmented into ran plus /ae/ → /a/, as the other way around). However,
the precedence of run over ran is given by the fact that run replaces the
one-morpheme walk while ran replaces the two-morpheme walked.

We could eliminate this precedence by saying that (on the analogy of
run — ran) walk too contains two morphemes: walk plus zero; and that
zero and ed are two segments which replace each other in the environ-
ment I walk — slowly. Even if we did this, and listed r — n, /a/, /ae/, and
walk, zero, ed, all as morphemic segments, we could morphemically
identify either one of /a/, /ae/ with r — n, and either one of zero, ed with
walk on the basis of the fact that r — n never occurs without either /a/
or /ae/, and walk never occurs without either zero or ed (see chapter
18, fn. 35).

All these methods of segmenting are equivalent. Beyond the general
criteria of independence and patterning (12.22-3), the choice among these
methods depends upon how we choose to treat zero segments and the
voiding of elements (Appendix to 18.2). The choice does not depend
upon any absolute criterion of denying ran precedence over run, because
the morphemic segmentation of all forms (and so the question of whether
a form run contains one morpheme while ran contains two) depends on
the total environments of the forms and on the other forms which sub- 
stitute for them. 


Appendix to 12.323-4: Complex Discontinuous Morphemes

Even complicated parts of the morphology of a language may turn
out to involve completely dependent parts of utterances, and so to be 
expressible by repeated portions of a discontinuous morpheme. Thus the 
grammatical noun classes of the Bantu languages are usually treated as 
a subdivision of the noun vocabulary into classes, each of which 'agrees' 
with particular affixes, particles, etc., elsewhere in the utterance. In this 
treatment, the class markers are prefixes which occur before particular 
nouns and then 'agree' with other prefixes in the utterance. 

Instead of this, we can say that the class markers are discontinuous 
morphemes composed of various parts each of which is prefixed to any 
noun, adjective, verb, demonstrative, etc., that occurs within a stated 
section of the utterance. That is, we state a portion of an utterance (a
domain) consisting of: demonstrative — — noun — adjective — adjectiv-
izer — verb (or consisting of any portion of this, e.g. just the sequence
— noun — verb, which often occurs without the rest of the domain in an
utterance). We then state that the class markers are discontinuous mor-
phemes, parts of which occur in each position indicated by —. If the only
part of this class-marker domain which occurs in a particular utterance
is demonstrative — — noun, then the class marker in that utterance is a
discontinuous morpheme having two sections (separated by juncture).
Thus Swahili hiki kiti 'this chair' is segmented into hi 'this' (which oc-
curs with other class markers too), ti 'chair' (which occurs only with the
ki class marker), . . . ki ki . . . class marker for 'things' (the dots indicate
the domain in the particular utterance — in this case, consisting of two 
morphemes, demonstrative and noun). If more of this domain appears 
in an utterance, the class marker in that utterance has a greater number 
of discontinuous parts: in hiki kiti kizuri kimevunjika 'this fine chair
broke (lit. it-got-broken)' the class marker^^ is . . . ki ki . . . ki . . .
ki . . . . If the noun is of a different class (not the 'thing' class), then a
different class marker will appear in all the positions of the domain 
which are available in the particular utterance in which the noun occurs. 
All other portions of the utterance, except those included in the com-
plete domain above (demonstrative — — noun — adjective — adjectivizer
— verb), do not contain the class marker of the noun of that domain;
and even the morphemes within the domain are not affected by the class 
marker (except that in some cases there are morphophonemic variations 

•'■' The relation among these various ki markers (i.e. among ki, . . . ki 
ki . . . , etc.) is discussed in 13. 


due to the contact between a particular morpheme in the domain and a 
portion of the class marker). 

Since each noun morpheme occurs always with only one class marker, 
whereas the other morphemes in the domain occur now with one marker 
and now with another (depending on the marker of the noun in whose 
domain they are), it follows that each noun is dependent on some par- 
ticular class marker. If we ask why it is that the class marker ki (in kiti 
'chair') or . . . ki ki . . . (in hiki kiti), rather than some other class 
marker, occurs in the given utterance, we could say that it is because 
the noun in the domain is ti rather than some other noun. Had the noun 
been ke 'woman' we would have had the class marker for 'persons' : m (in 
mke) or ... yu m .. . (in hiyu mke 'this woman'). However, we can avoid 
this partial restriction, of noun to marker," by saying that the markers 
are not independent morphemes but portions of each noun morpheme. 
In hiki kiti, the morphemic segmentation (based on independence of oc-
currence) gives us hi 'this' and . . . ki kiti 'chair'; and indeed we can re-
place hi while keeping . . . ki kiti constant, or we can replace . . . ki kiti
while keeping hi constant. '••' In hiki kiti kizuri 'this fine chair' the mor- 
phemic segmentation yields hi 'this,' zuri 'fine,' and . . . ki kiti ki . . . 

When the segmentation is carried out in this way, without morphemi-
cally independent class markers, we no longer have a domain containing
a noun into which a discontinuous class marker is inserted (the particu- 
lar marker depending on the noun). Instead, we say that the first two 
(or fewer) phonemes of a noun are repeated^^ in as many of the follow- 
ing positions as occur around the noun in the utterance: demonstra- 
tive — * — adjective — adjectivizer — verb (where the asterisk indicates 

=^ And of marker to a particular group of nouns: for while -ti occurs
only with ki-, ki- occurs only with a limited number of nouns, and never
with such nouns as -tu 'man', -ke 'woman' (which occur only with class
marker m-).

■'■■'Since the dependence here is only partial, we can replace -ti while 
keeping hiki ki- constant (and obtain, e.g., hiki cio 'this school', with 
a c variant of ki); although we would later have to separate hi off. But 
we cannot replace . . . ki ki . . . while keeping hi- -ti constant, since -ti 
does not occur except with ki-. 

^* The repetitions of these first phonemes have variant forms in various 
environments. This can be avoided by specifying the environment in 
sufficient detail, and saying that in some cases the discontinuous por-
tions of the noun morpheme are not repetitions of the first phonemes, but
consist of other stated phonemes. 


the position of the main body of the noun). In the environment hi — *
'this —' the form of 'chair' is . . . ki kiti; in the environment hi — * — zuri
— mevunjika 'this fine — broke' the form of 'chair' is . . . ki kiti ki . . .
ki . . . , yielding hiki kiti kizuri kimevunjika.}^
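
The second analysis, in which the repeated portions belong to the noun morpheme itself, can be sketched as follows. This is an illustration only: the lexicon holds just the two nouns cited above, the table of variant forms after the demonstrative is a hypothetical simplification of the variation mentioned in the footnote on variant forms, the verb is spelled mevunjika as an assumed reading of the printed form, and `realize` is an invented name.

```python
# Noun morphemes from the examples in the text: (repeated portion, main body).
NOUNS = {"chair": ("ki", "ti"), "woman": ("m", "ke")}

# Form taken by the repeated portion after the demonstrative hi-
# (a hypothetical simplification; real forms vary by environment).
AFTER_DEMONSTRATIVE = {"ki": "ki", "m": "yu"}


def realize(noun, demonstrative=None, adjective=None, verb=None):
    """Spell out a noun morpheme over whichever positions of the domain occur."""
    repeated, body = NOUNS[noun]
    words = []
    if demonstrative:
        words.append(demonstrative + AFTER_DEMONSTRATIVE[repeated])
    words.append(repeated + body)
    if adjective:
        words.append(repeated + adjective)
    if verb:
        words.append(repeated + verb)
    return " ".join(words)


alone = realize("chair")
with_demonstrative = realize("chair", demonstrative="hi")
full = realize("chair", demonstrative="hi", adjective="zuri", verb="mevunjika")
other_class = realize("woman", demonstrative="hi")
```

The number of repeated portions grows with the number of domain positions present in the utterance, while hi, zuri, and the verb stay the same whatever noun is chosen.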

Appendix to 12.3-4: Order as a Morphemic Element

We can now assign every phoneme or component in an utterance to
some morphemic segment or other. However, we may still find between 
utterances which are identical in their morphemic segments, differences 
in form that correlate regularly with differences in environment and 
meaning: The man has just killed a bull is not substitutable in longer con-
texts and in social situations for A bull has just killed the man. In the
same way You saw Fred? differs from Fred saw you? This difference in 
form-^* between the members of each pair has not been included among 
our morphemic segments because it does not consist in adding or sub- 
tracting phonemic sequences (as do all the cases in 12.3), but rather in 
the order of morphemic segments, i.e. in the relative position in which 
the phonemic sequences are added." 

In other cases, there is no contrast between two arrangements of a 
given set of morphemic segments, but only one of these arrangements oc- 
curs: The man. occurs, but Man the. does not, either in the same or in 
other contextual environment or social situation. Nevertheless, in these 
cases too order of morphemic segments must be noted in describing the 
utterance, in order to exclude from the description the arrangements 
which do not occur. =* 

Finally, there are cases where the order of morphemic segments in an 
utterance is free; i.e. the morphemes occur in any order, with no attend-
ant difference in the larger contextual environment or in the social situation.

''* For a somewhat more detailed statement of these Bantu class mark-
ers, and for other examples of this analysis, see Z. S. Harris, Discontinu-
ous Morphemes, Lang. 21.121-7 (1945).

'"'^ Such features of arrangement are called taxemes in Leonard Bloom- 
field, Language 166, 184. 

^' Cases of this type, where there is a contrast between two arrange- 
ments of morphemic segments, will be referred to in this Appendix as 
contrasting order. 

^^ Cases of this type will be referred to here as restricted order.

" Cases of this type will be referred to here as descriptively equivalent
order.


The most obvious way to indicate the arrangement of morphemic seg- 
ments within an utterance is to say that each utterance can be completely 
and uniquely identified not simply as a sum of segments, but as an or- 
dered set of segments. However, this method is not entirely satisfactory. 
In the first place, the three types of cases cited above are not all ordered 
in the same sense; the third, descriptively equivalent, type may in fact 
be described as not being ordered.''" In the second place, the differences 
of arrangement often have a relation to the neighboring morphemes and 
to the social situation comparable to the relation which morphemic dif- 
ferences may have to the neighboring morphemes and social situation. 
In some cases differences in morphemes substitute for differences in arrangement.

For these reasons we may find it convenient, in particular languages, 
to treat order on a par with morphemic segments, i.e. as just another ele- 
ment in the morphemic constitution of the utterance.^^ Then instead of 

*° The differences among the types of ordering in these three types are
comparable to differences of occurrence among morphemes and pho-
nemes, so that we can describe these types in the vocabulary established
for segment occurrence. Contrasting order is comparable to contrasting
morphemes: you¹ + saw² + Fred³ + order 1, 2, 3 contrasts with you¹ +
saw² + Fred³ + order 3, 2, 1 in the same way that you + saw + Fred +
come contrasts with you + saw + Fred + go. Restricted order is compa-
rable to limitations of morpheme occurrence: the occurrence of The man.
to the exclusion of Man the. can be treated like the occurrence of He did go
to the exclusion of He used go. (Just as do does not contrast with use in
this environment, so the order 1, 2 does not contrast with the order 2, 1 in
the environment of the¹ + man² + /./). Descriptively equivalent order
is comparable to free variants of one segment: books, papers is substitut-
able for papers, books before and magazines in the same way that released
[k] is for unreleased [k'] before juncture, or that /ekanamiks/ is for
/iykanamiks/ (13.2).

•*' It might be most convenient to define morphemes as phonemically
identifiable elements in particular positions relative to particular other
such elements. Elements in different positions need not be identified as
the same element, just as homonymous elements in different environ-
ments need not be. Thus, we have Bengali verb plus na for the negative
of a verb, but na plus verb for the negative in a subordinate clause (even
when there is no subordinate particle). We have here a correlation be-
tween change of order and the omission of a subordinate particle, such
as occurs in subordinate clauses. We may express this by saying that na
in verb — is 'negative', but that na in — verb is 'negative plus subordinat-
ing particle' (thus taking the place of the particle which we expect in a
subordinate clause). I am indebted to Charles A. Ferguson for the forms.

" This would yield morphemic elements differing in constitution from 
those recognized in 12.3, but the difference would not be as great as 


identifying an utterance as an ordered set (a particular permutation) of 
particular morphemic segments, we would identify it as a set (a combina- 
tion) of particular morphemic elements. If the segments in question have 
contrasting arrangements, one of these elements will consist not of adding 
some phonemes to the utterance, but of adding arrangement among the 
sets of phonemes. In that case, we may say that the arrangement is 
morphemic.⁶³ If the segments have restricted order, the utterance will
contain no morphemic element of order, but when the relation among 
these segments is stated (16-7) these particular segments will be defined 
as having a particular automatic order among them. If the segments in 
question have various descriptively equivalent orders, no morphemic 
element of order is required in identifying the utterance, nor is any state- 
ment about order involved in the definition of the segments. If we do 
this, every formal difference between utterances that correlates with
differences in contextual environment and in social situation would have 
been assigned to some morphemic element or other. 
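
The three cases distinguished above (contrasting arrangement, restricted order, and descriptively equivalent order) can be illustrated with a minimal Python sketch. It assumes a toy corpus in which every attested arrangement of a set of segments is recorded together with labels for the social situations in which it was observed. The function name `classify_order`, the situation labels, and the data are hypothetical illustrations, not a procedure stated in the text.

```python
def classify_order(segments, corpus):
    """Classify the order relation among `segments` in a toy corpus.

    `corpus` maps each attested utterance (a tuple of segments) to the
    set of situation labels in which it was observed (assumed data).
    """
    target = sorted(segments)
    # Keep only utterances built from exactly these segments, in any order.
    attested = {u: s for u, s in corpus.items() if sorted(u) == target}
    if not attested:
        return "unattested"
    if len(attested) == 1:
        # Only one arrangement occurs: the order is automatic.
        return "restricted"
    situations = list(attested.values())
    if all(s == situations[0] for s in situations):
        # Several arrangements occur with no correlated difference.
        return "descriptively equivalent"
    # Different arrangements correlate with different situations.
    return "contrastive"


corpus = {
    ("you", "saw", "Fred"): {"hearer is actor"},
    ("Fred", "saw", "you"): {"Fred is actor"},
    ("the", "man"): {"reference to a man"},
    ("books", "papers"): {"listing reading matter"},
    ("papers", "books"): {"listing reading matter"},
}

print(classify_order(["you", "saw", "Fred"], corpus))
print(classify_order(["the", "man"], corpus))
print(classify_order(["books", "papers"], corpus))
```

Only in the first case would an element of order have to be counted among the morphemic elements of the utterance; in the other two the order is either automatic or irrelevant to the identification.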

Appendix to 12.41: The Criterion of Meaning 

The procedure of 12.2 yields elements in terms of which we can de- 
scribe what utterances occur. But it leaves unstated many facts about 
these utterances, correlations between these utterances and phenomena 
not described or identified by current descriptive linguistics, such as 
might enable us to state more exactly what is said or what are the proba- 
bilities of occurrence for a given utterance or utterance-feature. For 
example, certain utterances may be characteristic of the speech of young 
children in a given language community. If we wish to know what the

might at first appear. Each of the morphemic segments of 12.3 consisted 
of the adding, omitting, or interchanging of an ordered set of phonemes, 
components, or contours, in respect to the other morphemic segments 
of the utterance. The new morphemic elements of order would consist of 
the relative order of these morphemic segments (i.e. of these additions, 
omissions, and interchange of phonemes) relative to each other in the 

⁶³ I.e. not automatic for the morphemic segments in the utterance.
In the same sense, we call any phonological segment or element phonemic
if it is not automatic in respect to the other phonological segments and
elements of the utterance. The morphemic element of order would be
given a dictionary meaning, based on the social situations with which its
occurrence correlates, just as is the case with all the other morphemic
elements. The order in You saw Fred? and in Fred saw you? has the meaning
roughly of 'actor' for the position before saw and 'object of action'
for the position after it.


probability is that some utterance will be of this type, we need but discover
the ratio of children's utterances to adults' utterances.⁶⁴ Discovery
of correlations of this type is by definition excluded from the procedures
presented here. Such correlations could be treated in investigations of
social-dialect features, or personal sub-phonemic characteristics, and
would in general be included in the relation of language to culture and 

One of these types of correlation is, however, so universally included 
in descriptive linguistics as to require special consideration here. This is 
the meaning of utterances, or, in the last analysis, the correlation of utterances
with the social situation⁶⁵ in which they occur.

If we consider this correlation, we find that there are major limitations 
upon the occurrence of phonemic sequences, depending on the social 
situation. The sequences of phonemes are not random in respect to the 
social situation in which they occur. Our investigation of phoneme distribution
may show that the sequences /"tuw-'θriyz,'pliyz./ Two threes,
please. /'wats-ðə'reyt-fər-'ðis?/ What's the rate for this? /kən'sidər-ðə-
'klæs-'strəkčər-in-'rowm./ Consider the class structure in Rome, all occur.
But it will not show that the first two will occur in particular social
situations, in which the third will not.

More generally, our previous investigation may tell us that sooner or 
later, in some situation or other, the sequence /'pliyz/ will occur, but it 
cannot tell us when, in what particular social situations, it has a higher 
probability of occurring. 

If we try to correlate each phoneme or component with the social 
situations in which it occurs, we will obtain no high correlation, except
in rare cases.⁶⁶ The phoneme /k/ occurs equally in an angry command

⁶⁴ I.e. the percentage of young children in the community, and the
average number of utterances spoken by a child and by an adult.

⁶⁵ This term, used as the equivalent of 'meaning', is taken in its broadest
sense, but will not be defined here because the whole discussion of this
section is not at present given to exact statement. It should be noted that
even when meaning is taken into consideration there is no need for a
detailed and involved statement of the meaning of the element, much less
of what it was that the speaker meant when he said it. All that is required
is that we find a regular difference between two sets of situations (those
in which s occurs and those in which it does not). Of course, the more 
exact, subtle, and refined our statement of this difference is, the better. 

⁶⁶ Occasionally we may find a phoneme which occurs in so few morphemes
or types of social situation as to permit of such correlation: e.g.
initial /ð/ in English (which occurs in a few morphemes: the, there, then,
etc.). In some cases morphological elements may be coterminous with


to hurry (Make it snappy!), in a discourse on social change (communism),
and so on. We cannot in general correlate these phonological elements 
with the reaction of the hearer or with the whole social situation in which 
the speaking takes place. 

More generally we can ask: How does the phonemic content of an 
utterance vary as the social situation in which the utterance occurs 
varies? If we record at quarter hour intervals, from 9 a.m. to 3 p.m., the 
greetings exchanged between formal business associates, we find as the 
day passes no gradual phonemic change in the greeting but first many 
occurrences of the sequence /gud'morniŋ./ Good morning., and later a
complete replacement by the sequence /gudæftər'nuwn./ Good afternoon.
The change in social situation correlates with the change of a
whole sequence of phonemes at once, together. 
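
The contrast drawn in this paragraph can be sketched in a few lines of Python. The sketch assumes that each recorded greeting is stored as a string of phoneme symbols and that the recordings have already been sorted into two situations; the function name `occurrence_rate`, the counts, and the exact transcriptions are hypothetical illustrations.

```python
def occurrence_rate(unit, utterances):
    """Fraction of utterances (strings of phoneme symbols) containing `unit`."""
    return sum(unit in u for u in utterances) / len(utterances)


# Assumed toy recordings: four greetings before noon, four after.
morning = ["gud'morniŋ"] * 4
afternoon = ["gudæftər'nuwn"] * 4

# A single phoneme such as /g/ occurs equally often in both situations ...
print(occurrence_rate("g", morning), occurrence_rate("g", afternoon))

# ... whereas the whole sequence is present in one and absent in the other.
print(occurrence_rate("gud'morniŋ", morning),
      occurrence_rate("gud'morniŋ", afternoon))
```

On this toy data the single phoneme shows no difference between the two situations, while the full sequence separates them completely, which is the sense in which the change of situation correlates with a whole sequence of phonemes at once.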

It would appear, then, that if we wish elements which will correlate 
with meanings, we must seek them in general not among single phonologi- 
cal elements but among combinations and sequences of these. The at- 
tempt to set up sequences of phonemes which correlate highly with 
features of social situations meets, however, with great technical diffi- 
culties. There are at present no methods of measuring social situations 
and of uniquely identifying social situations as composed of constituent 
parts, in such a way that we can divide the utterance which occurs in 
(or corresponds to) that social situation into segments which will corre- 

phonological elements. Thus the morphemic contours of 12.344 may in 
some cases be identical with the phonemic contours of chapter 6, if no 
reduction of the contours of 6 into constituent phonemic elements (Appendix
to 6.6) proves possible. Similarly, if we include among the phonological
elements those segments which represent so-called gestural and
onomatopoetic sounds, and which combine only rudimentarily with other 
segments, we will find that these segments are in effect also morphological
elements: e.g. if the tongue-tip click written tsk in English is considered
a phonological element of English, we will find that it is restricted
to a relatively small number of social situations and meanings. In all 
these special cases a phonological element will be found to have high 
correlation with classes of social situations. More generally, we can say 
that every phoneme has some elementary meaning in that it differentiates 
one meaning-correlated morpheme from another: we can say that /t/ 
correlates with the meaning difference between short and shorn, shore, 
etc., and between take and lake, ache, etc., and so on. In linguistic systems 
in which phonemes are restricted as to their neighbors, it is also possible 
to say that the phonemes have certain expectation value: after the English
phonemes /par/ adding the phoneme /č/ permits us to expect the
phoneme d and the utterance parched, or the phoneme /m/ and the
morpheme parchment, and so on, but not, say, the phoneme /z/


spond to the constituent parts of the situation. In general, we cannot at 
present rely on some natural or scientifically ascertainable subdivision 
of the meaning range of the local culture, because techniques for such 
complete cultural analysis into discrete elements do not exist today; 
on the contrary, language is one of our chief sources of knowledge about 
a people's culture (or 'world of meaning') and the distinctions or divisions 
which are made in it. 

This is not to say that meaning differences or apparent identity of 
meaning cannot be used in the course of the search for larger-than- 
phoneme segmentations of utterances. Since both the distributional seg- 
mentation of 12.2 and the meaning-correlating divisions of the preceding 
two paragraphs involve a segmentation of utterances into parts generally 
larger than one phoneme each, it is possible for the two segmentations 
to be frequently identical; and linguists often use apparent differences 
or identities of meaning (or of translation) as hints in their search for 
morphemic segments. However, these hints must always be checked with 
the operations of 12.2 if the resulting segmentation is to satisfy the purposes
of our procedures, so that meaning never functions as a full-fledged
criterion for morpheme segmentation, on a par with the criteria
of 12.2. By the same token, the morphemes resulting from these procedures
are not necessarily exact correspondences for such distinctions
as are made in the culture in question;⁶⁷ and if in two situations the same
morphemes or utterances occur, we cannot derive therefrom that the
two situations are not culturally distinguishable.⁶⁸

All that is possible, then, in terms of the methods used in these pro- 
cedures, is to set up the morphemic segments purely on the basis of the 
relative distributional criteria of 12.2. Entirely independent investiga- 
tions, using techniques quite different from those of current descriptive 
linguistics, might then seek to correlate these segments with features of 
social situations.⁶⁹ For the purposes of descriptive linguistics proper,

⁶⁷ Nevertheless, there is in general a close correspondence between the
morphemic division which we might establish on a meaning basis and 
that which results from our distributional criteria. This is so because in 
general morphemes which differ in meaning will also differ in their en- 
vironments, if we take sufficiently long environments and enough of 

⁶⁸ If two types of basket are named with the same word, we cannot
say without further investigation that the two types are equivalent or 
indistinguishable in that culture. 

⁶⁹ Dictionaries usually combine the listing of each distinct morpheme
or word (short sequence of morphemes within the limits of some stated 


when it is desirable to connect its utterances and elements with social
situations, it suffices to define 'meaning' (more exactly 'difference in
meaning') in such a way that utterances which differ in morphemic con- 
stituency will be considered as differing in meaning, and that this differ- 
ence in meaning is assumed to indicate differences in the social situations 
in which these utterances occur. Then the meaning of each morpheme in 
the utterance will be defined in such a way that the sum of the meanings 
of the constituent morphemes is the meaning of the utterance.⁷⁰ In essence,
the method here is to compare two partially different utterances
having partially different meanings (e.g. Take my book. Take my books.) 
and attribute the difference in meaning to the difference in morphemic 
content (s is an element meaning 'plural'). This is purely a convention;
it is based on no new information about the morphemes and gives no new 
information about them, but merely enables us to speak of the meaning 
of morphemes.⁷¹
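
The comparison described here can be sketched as a difference between two collections of morphemes. The following Python fragment is a minimal illustration, assuming that each utterance has already been segmented into a list of morphemes by the distributional procedures; the function name `morphemic_difference` is hypothetical, and the sketch ignores morphemic elements of order and contour.

```python
from collections import Counter


def morphemic_difference(first, second):
    """Return the morphemes found only in `first` and only in `second`.

    Counter subtraction keeps repeated morphemes apart, so an utterance
    containing the same morpheme twice is handled correctly.
    """
    a, b = Counter(first), Counter(second)
    return sorted((a - b).elements()), sorted((b - a).elements())


only_first, only_second = morphemic_difference(
    ["take", "my", "book"],        # Take my book.
    ["take", "my", "book", "s"],   # Take my books.
)
print(only_first, only_second)
```

Under the convention, whatever difference of meaning is observed between the two utterances is attributed to the morphemes returned here; nothing in the computation supplies that meaning, which must come from the social contexts of the utterances themselves.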

This convention concerning morphemic meaning in descriptive lin- 
guistics does not yield a simple single meaning for each morpheme in all 
of its occurrences. Show me the table may be said in connection with both 

morphological relation), plus some indication of its morphological classi- 
fication, with a very rough approximation to the meaning or social situa- 
tion correlation. Any serious investigations in this field will have to be 
much more subtle and detailed than dictionaries can be at best. It would 
be necessary to relate all differences among utterances with differences 
in culturally, including interpersonally, relevant features of the social 
situations; and in so doing it would be necessary to note not only in what 
situations utterances differ but also in what situations talk does not 

⁷⁰ Or rather, that the difference in meaning between two utterances
is the sum of the meaning of the morphemic elements (including order)
which are included in the first utterance but not in the second and those 
which are in the second but not in the first. 

⁷¹ Our only data is the meaning (i.e. social context) of each utterance;
the identification of morpheme meanings with features of social situa- 
tions is merely a matter of further operations upon this data. Any in- 
vestigations that are designed to go beyond this would have to reconsider 
the data in greater detail, and in terms of the morphemes (or small se- 
quences of them, e.g. words). Such investigations might seek to discover 
the meaning of each morpheme imbedded within an utterance. For work 
in this direction, see Edward Sapir, Grading, a study in semantics, Philosophy
of Science 11.93-116 (1944); Totality, Language Monographs 6
(1930); E. Sapir and M. Swadesh, Expression of the ending-point rela- 
tion in English, French, and German, Language Monographs 11 (1932); 
note also the use of meaning in Otto Jespersen's Modern English Gram- 


a piece of furniture and a chart exhibiting data. We therefore define a
range of meaning for each morpheme, which includes its meaning in each
occurrence. In some cases, different meanings within this range occur in
slightly different environments: book indicates a list of bets in book-maker,
but a volume in book-binder.⁷² In other cases there is no difference
in environment: I bumped into a pole. can be said after a minor accident
or after a chance meeting with an East European.⁷³ In either case we
note roughly the ranges of meanings and the linguistic environmental
differences, if any, of the single morpheme.⁷⁴

It is possible to seek a single factor of meaning common to all the oc- 
currences of a morpheme, so that the range of meaning can be stated in 
terms of a meaning element which always occurs with the morpheme, 
plus added variations in various environments. However, this will not 
necessarily yield a more convenient or compact set of statements in 
every case.⁷⁵

It may be more useful for descriptive linguistics to treat the range of 
meaning of a morpheme as consisting of several environmentally-restricted
meanings, the environmental ranges to which each meaning is
restricted being selected in the manner of chapter 15.⁷⁶
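
One way to picture such environmentally restricted meanings is to group the environments of a morpheme by the set of other morphemes that can replace it there. The Python sketch below is only an illustration under simplifying assumptions: the frames and their fillers are toy data modelled on the /siy/ example given in the accompanying footnote, the blank in each frame is written `___`, the name `meaning_ranges` is hypothetical, and grouping by exactly identical substitution sets is a cruder criterion than the selection procedure of chapter 15.

```python
from collections import defaultdict


def meaning_ranges(morpheme, frames):
    """Group the frames in which `morpheme` occurs by substitution set.

    `frames` maps an environment to the set of morphemes attested in
    its blank (assumed data). Frames sharing the same set of
    alternatives fall into one group, to which one meaning is assigned.
    """
    groups = defaultdict(list)
    for frame, fillers in frames.items():
        if morpheme in fillers:
            others = frozenset(fillers - {morpheme})
            groups[others].append(frame)
    return dict(groups)


frames = {
    "He can't ___ the fellow": {"siy", "catch", "stop", "please"},
    "Why do you want to ___ it?": {"siy", "catch", "stop", "please"},
    "I ___ the point": {"siy", "get", "understand"},
    "Do you ___ why I want it?": {"siy", "get", "understand"},
    "The ___ is calm": {"siy", "ocean", "water", "woman"},
}

for alternatives, environments in meaning_ranges("siy", frames).items():
    print(sorted(alternatives), environments)
```

Each of the three resulting groups would receive its own meaning statement, so that differences usually called idiomatic or homonymous are stated on distributional grounds rather than by appeal to meaning.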

⁷² The meaning differences may not be obvious at first: e.g. -ize in
neutralize means 'to render (neutral)', in minimize 'to claim or make
something seem (small)'.

⁷³ We would not attempt to make a morphemic division to fit the
meaning differences: e.g. to say that pole 'pillar' was two morphemes
(say, po and l) and so different from Pole morphemically though not
phonemically. To do so would have given us two new morphemes which
always occur together, against the condition of 12.22; and if we had tried
to say that one of the proposed morphemes, say po, also occurred elsewhere,
as in poster, we would have difficulties in satisfying the distributional
similarity requirement of 12.23.

⁷⁴ Cases where the linguistic environment is very different will be
treated in 13.41. 

⁷⁵ This attempt would have the merit (from the point of view of diachronic
linguistics and of culture analysis) of stating what meaning is
common both to the bulk of the occurrences of a morpheme and to new
or idiomatic uses of it. But no such interest would attend any attempts
to state a common meaning to homonymous morphemes (chapter 13,
fn. 6).

⁷⁶ In that case, one common meaning would be assigned to all the occurrences
of a morpheme within a large set of environments where it
is replaceable by a large set of other morphemes; then another meaning 
would be assigned to all the occurrences of the morpheme in some smaller 
set of environments in which it is replaceable by some small set of other 


In assigning meanings to each morphemic element, we will come upon 
various special cases, for some of which meanings of the dictionary type 
are usually not stated. 

Morphemic contours, such as the rising intonation marked /?/, the 
extra loud stress marked /"/, can be assigned meanings ('question' for 
/?/ and 'contrastive emphasis' for /"/) which indicate the social-situation
correlation even though they differ somewhat in type from the
more simply 'referential' meanings of cat, hate, please. The usual mean- 
ing of some morphemes, e.g. please, approaches the type of meaning of 
these contours. 

Some morphemes, frequently including the morphemic element of or- 
der (Appendix to 12.3-4) have so-called grammatical meanings, e.g. the 
Kota echo-word giX which means 'and other things like what is referred 
to in the preceding morpheme'.⁷⁷

Difficulties of stating meaning also occur in the case of morphemes of 
unique environment, which are segmented off when an utterance has a 
unique residue after all morphemes have been divided off. In boysen- 
berry, berry is certainly a morpheme. Therefore, boysen is also a mor- 
pheme, having as its meaning the differentia between boysenberries and 
other berries. 

Similarly, in there, then, thither, this, that, etc., we obtain by the Appendix
to 12.22 a segment /ð/ with demonstrative meaning, plus various
residue elements with unique meanings (/is/ 'near', /æt/ 'yonder', etc.).
In where, when, whither, which, what, why, etc., we obtain an element
/hw/⁷⁸ with interrogative (and in some linguistic environments, relative

morphemes; and so on. Thus we might assign one meaning to /siy/ see
in the many environments such as He can't — the fellow, Why do you want
to — it? where it is replaceable by catch, stop, please, etc. And we might
assign another meaning to see in the particular environments I — the
point, Do you — why I want it? etc., where it is replaceable by far fewer
morphemes (get, understand, etc.). And, of course, yet another meaning
to /siy/ in environments such as The — is calm, where it is replaceable
by other morphemes: ocean, water, woman, etc. In this way so-called
idiomatic and homonymous differences of meaning and the like can be
separately stated on distributional grounds (cf. chapter 13 fn. 6, 13.41, and
the Appendix to 15.32).

⁷⁷ The X indicates all the phonemes of the preceding morpheme except
for its initial consonant plus vowel. Cf. M. B. Emeneau, An echo-word
motif in Dravidian folk-tales, Jour. Am. Or. Soc. 58.553-70 (1938).

⁷⁸ In some dialects, /w/. This analysis is abetted by the rarity of
initial /ð, hw/.


or resumptive) meaning, plus various second elements with unique 
meanings (/at/ 'object reference', /ay/ 'reason', etc.). 

A more involved problem arises in the case of such seemingly independent
sequences of phonemes as the sl-, gl- of slide, slimy, glide, gleam,
etc. (Appendix to 12.22). The chief reason for raising the question of the
morphemic status of sl-, gl- is the partial similarity in meaning among
the words beginning with sl- and gl- respectively;⁷⁹ and no adequate distributional
basis can be found for supporting this segmentation. But
even if we follow out the only correlation, that of common meaning, 
which brings together the gl- words, we find that it gives us no basis for 
deciding what gl- words do not have this common meaning. Is glide in- 
cluded in the set? Is glimmer? Or glass, glen, gloss, glory, gloom, glad, globe, 
gladiator? Furthermore, it might be possible to form such sets of slightly 
similar meaning with partially similar form for almost any connected or 
broken sequence of phonemes. What of the /ðər/ of brother, mother,
father? One could even argue for some connection in meaning among
plant, plank, plow, (to) pluck, plot (of ground), plum, perhaps including
plodder, plebeian. Note also the ump of jump, bump, trump. 

Difficult as it may be to argue for morphemic status for sequences like 
gl-, it is also unsatisfactory to leave unstated the fact that so many se- 
quences beginning with gl- have partial similarity in meaning. The solu- 
tion is not, of course, to cast a deciding vote one way or the other, but to 
relate this situation, precisely as it is, to the other facts about the language.
The sequence gl- is not a distributionally separable element;⁸⁰
therefore it is not a morpheme in the definition which applied to -er, 
-ceive, con-, yes. But gl- exhibits, in many morphemes, a correlation be- 
tween meaning and phonemic form, of the type which is also true for 

⁷⁹ For example, we would not ask whether the tr of try, tree, trick,
train may not be segmented off as a morpheme.

⁸⁰ The exact distributional difference between sequences like gl- and
our procedurally recognized morphemes is as follows: All our morphemes 
occur usually or always (or at least sometimes) next to other sequences 
which are independent morphemes on their own merits (by 12.22): in 
boysenberry, berry occurs next to a unique element, which would not have 
been considered an independent morpheme were it not that berry was so 
considered; but elsewhere we have berry in blueberry, a fine berry, com- 
parable to bluebell, a fine analysis, where the neighbors of berry are clearly 
independent. Sequences like gl-, sl-, however, occur only next to other
sequences which are themselves unique, and which do not in turn occur
next to independent morphemes: sl- next to ither, eek, etc. Occasional
identical neighbors like -ide after sl- and gl- hardly suffice to change the


most of the distributionally separable morphemes as a whole.⁸¹ At some
point in our organization of the linguistic data, therefore, e.g. at the
point where we say that most or all the morphemes have assignable
meaning, or at the beginnings of the gl-, sl-, and other such entries in the
dictionary, we would state that very many of the morphemes beginning
with gl (perhaps a majority of those having only one vowel, or one
stressed vowel plus shwa) have some reference to light, etc.; and so for
the other sets.⁸²

A more unusual case is that of the Bantu class markers.⁸³ If each noun
morpheme is set up so as to contain the repeated class marker as part of 
its phonemic constitution, e.g. if the morpheme for 'man' is not tu but 
mtu, mtu m . . . , and the like, and the morpheme for 'woman' not ke but
mke, . . . yu mke a . . . , and the like, then it follows that each noun
morpheme includes in its phonemic constitution some one out of about 
six discontinuous phonemic sequences. Thus, both mtu and mke have 
the m- sequence. Many of the morphemes which include the same dis- 
continuous sequence are also partially similar in meaning: mtu and mke 
both indicate persons. It is therefore possible to say that each of these 
discontinuous sequences is associated, in many of its occurrences,⁸⁴ with

⁸¹ A result of this meaning correlation, which might be included in
our descriptive statement if we broaden the base of our description to include
many speakers or a short duration in time, is the fact that sequences
like gl-, sl- are productive. Occasional new forms are composed with
them, the other part of the form being arbitrary (onomatopoetic, etc.), 
or extracted from some other morpheme. However, occasional new forms 
are also formed with sequences that do not occur in sets; e.g. conflations 
of parts of two morphemes. We must therefore grant that almost any 
part of any morpheme may become productive (note also Jespersen's 
"metanalysis" in his Language 385, and Bloomfield, Language 414), al- 
though sequences like gl- which occur in meaning-sets, are more fre- 
quently productive, and our formal morphemes more frequently yet. In 
view of all this, we must say that while our formal procedures yield only 
the morphemes of 12.2, the correlation with meaning, which is true of al- 
most all morphemes, is also true of certain identical parts of several mor- 
phemes (e.g. gl-); and that the potentiality for productivity, which is 
true of certain morphemes, is also true of these same identical parts of 
several morphemes, and in rare cases also of other parts of individual 

⁸² R. S. Wells points out that sequences like gl- can, alternatively,
be included in the grammar and dictionary as a special sub-class of mor- 
phemes, in which case the linguist would have to decide what sequences 
to recognize as satisfying the conditions for this sub-class. 

⁸³ Cf. the Appendix to 12.323-4, and fn. 55 above.

⁸⁴ The diagnostic environment, which determines whether a given occurrence
of the discontinuous sequence is associated with this meaning,


a stated general meaning (such as 'person'). Here is an important meaning
distinction which does not correlate with morphemic segmentation.⁸⁵

Appendix to 12.5: Relation between Morphologic and Phonologic 

Although we can make statements about the language with the aid of 
morphemes such as we could not by the use of phonemes, there is an im- 
portant parallel between the two types of element. Since the morphemes 
are sequences of phonemes, they represent features or portions of the 
flow of speech. But the phonemes and components were also precisely 
that. What then are the differences or similarities between the phono- 
logical and the morphological methods of segmenting utterances? 

Phonological elements are independent. We have seen in 1-10 that 
the phonological elements of a language can be determined by segmenta- 
tion and operations upon the segments, all of this entirely unrelated to 
the morphological analysis, and involving no knowledge of the mor- 
phemes of the language. 

Morphological elements are independent. It is also possible to deter- 
mine the morphological elements of a language without relation to the 
phonemic contrasts and with no prior knowledge of them. In order to 
do this we would take the unique, unanalyzed, complete speech events 
and apply to them procedures analogous to those of 3-5. The only dif- 
ference is that in determining the limits for the original segment lengths, 
we would use the criteria of 12.2, in reference to total utterance environ- 
ment, instead of the criteria of 5, in reference to immediate environment. 

From morphological to phonological elements. As was noted in 2.61, 
these phoneme-less unanalyzed morphemes would be adequate for mor- 
phological identification. No morphological information would be added 
by breaking them down into phonemes.⁸⁶ Nevertheless, it is an empirical
fact that various such morphemes would be similar in sound over part
of their length. If we wanted to express these similarities and to find a
convenient way of writing the morphemes, we could break them down
into segments and group these segments into phonemes. We would merely
have to cut each morpheme into smaller segments by matching differ-

is the remaining phonemes of the morpheme in which the discontinuous 
sequence occurs. 

⁸⁵ Cf. chapter 15, fn. 21, for the common meaning of whole classes of

⁸⁶ Except for the ability to note partial phonetic similarities among
complementary morpheme variants (i.e. except for morphophonemics). 


ent morphemes and noting the segments in which one morpheme differs
from another.⁸⁷

From phonological to morphological elements. Instead of the prohibi- 
tively cumbersome method of the preceding paragraph, the usual method 
in linguistics is to determine the phonemic distinctions first (in order of 
rigorous analysis, not of time), and then to determine the morphemes 
in terms of the phonemes. This requires placing morpheme boundaries 
in among the phonemes, which can be done only by applying a new cri- 
terion (that of 12.2). Morphemes cannot therefore be derived from pho- 
nemes merely by application of logical operations such as were used in 
7-10. We can determine the morphemes (or the points of morpheme 
boundary) in a language only by utilizing additional information, such 
as that indicated in 12. Even in languages where all morphemes are of 
the same length, so that every so many phonemes constitute a morpheme 
(in Annamese: every sequence of consonant plus vowel plus tone), the 
morpheme is not entirely derivable from the phonemes: for whence did
we know in the first place that all morphemes in this language are of the 
same length, and that that length is so and so many phonemes? 

⁸⁷ This would yield segments which are differentiated in such a way as
to present the minimum difference among morphemes, whereas the meth- 
od of 1-10 has been to obtain segments whose differences present the reg- 
ular differences between utterances. The two methods are not identical 
or even necessarily equivalent. However, the minimum differences 
among morphemes usually correlate closely with the regular differences 
among utterances. Therefore, these two methods can in general be made 
equivalent to one another in their results. 


13.0. Introductory 

The following chapters present a series of operations designed chiefly
to reduce the number of elements for linguistic description.¹ The first
procedure groups sets of complementary morphemic segments into morphemes.²
It thus covers regular and irregular phonological alternation,
sandhi, morpholexical variation,³ suppletion, reduplication, and other
types of morpheme variants.

13.1. Purpose: Reducing the Number of Elements 

We seek to obtain fewer elements having fewer restrictions on occur- 

¹ There have been fewer investigations into morphology than into
phonology, in recent years. Attention may be drawn in particular to the
morphological sections in Edward Sapir, Language, and Leonard Bloomfield,
Language. Cf. also Ferdinand de Saussure, Cours de linguistique
générale; B. Bloch and G. L. Trager, Outline of Linguistic Analysis;
Vladimir Skalička, Zur ungarischen Grammatik, Facultas philosophica
Universitatis Carolinae Pragensis 39 (1935); B. Trnka, Some thoughts
on structural morphology, in Charisteria Guilielmo Mathesio 57 (1932);
Bernard Bloch, English verb inflection, Lang. 23.399-418 (1947); and
the discussion Quelles sont les méthodes les mieux appropriées à un exposé
complet et pratique de la grammaire d'une langue quelconque, with
reports by R. Jakobson, S. Karcevsky, N. Trubetzkoy, Ch. Bally,
A. Sechehaye, F. Hestermann, V. Mathesius, in Actes du premier congrès
international de linguistes 1928.33-63 (1930).

Several treatments of morphological theory and methods have been 
written, some fairly parallel to the methods presented here and others 
less so. Cf. Otto Jespersen, The System of Grammar (1933); Otto Jespersen,
Analytic Syntax (1937); Viggo Brøndal, Morfologi og Syntax
(1932); Louis Hjelmslev, Principes de grammaire générale, Kgl. danske
Videnskabernes Selskab. Historisk-filologiske Meddelelser 16.1-363 
(1928); Louis Hjelmslev and H. J. Uldall, An outline of glossematics, 
Humanistisk Samfunds Skrifter 1 (Aarhus 1939). Many relevant articles 
have appeared in Lingua, in Studia Linguistica, in the Travaux du Cercle 
Linguistique de Copenhague, in Acta Linguistica, in the Acts of the 
International Congresses of Linguists, in L.vnguagk, an 1 in other lin- 
guistic periodicals and volumes of essays. 

² Cf. Leonard Bloomfield, Language 164, on the alternant forms of a 
morpheme. Cf. also C. F. Voegelin, A problem in morpheme alternants 
and their distribution, Lang. 23.245-254 (1947). 

³ For the term, see Leonard Bloomfield, Menomini morphophonemics, 
Travaux du Cercle Linguistique de Prague 8.105-15 (1939). 



The operations of chapter 12 leave every utterance in our corpus 
segmented into morphemic elements (including non-segmental ones such 
as contours and order). If we are to develop a compact representation 
of our utterances, we cannot keep each of these segments as a distinct 
element, but must find ways of identifying segments of one utterance 
with segments of another (or with other segments of the same utterance). 

In doing so, it is easiest to identify segments which occur in identical 
environments, but it will be found possible also to identify segments 
which occur in different environments. It is also easy to identify seg- 
ments which have identical phonemic constitutions, but it will be found 
possible to identify segments having different phonemic constitutions. 

The problem of obtaining elements which have the fewest limitations 
upon freedom of occurrence relative to each other can be met in the man- 
ner of chapters 4 and 7. 

13.2. Preliminary Operation: Free Variants in Identical Environments 

Before dealing with segments having complementary environments, 
we may consider those which occur in identical environments.⁴ If we find 
that, say, /ˌekə'namiks/ economics and /ˌiykə'namiks/ economics occur 
in completely identical environments in all cases, i.e. that a speaker, or 
all the speakers in our corpus, substitute one for the other in all linguistic 
environments (even if not in all social situations), we call these two free 
variants of each other. They are then morphemically equivalent, though 
not phonemically so. 

Cases of this type are relatively rare, but the same treatment may be 
accorded to such phonemically different sequences as /æz/ as and /z/ 
as in, for example, /ˌnæw — ay wəz 'seyiŋ/ Now as I was saying. These 
are, of course, cases of slow and fast speech, or of stylistic, personal, or 
social dialect differences in manner of talking. But if we choose to 
disregard such differences for the purposes of our investigation, and if the 
phonemic sequences in question occur throughout in identical morphemic 
environments,⁵ it will do no violence to the remaining procedures to con- 

⁴ We may begin with the trivial step of considering all phonemically 
identical morphemic segments in identical environments to be repetitions 
of each other, i.e. to constitute various occurrences of the 'same' mor- 
pheme (cf. chapter 12, fn. 6). We then proceed to morphemic segments 
which are partially distinct phonemically but have identical distribu- 
tion throughout. 

⁵ We can thus carry out this identification for several morphemic 
segments in one utterance and several morphemic segments in another. We 


sider these different phonemic sequences as morphemically equivalent, 
each being a free variant of the other in terms of these procedures. 

This operation makes it possible to provide a unified morphological 
treatment for dialects which vary in the phonemic constitution of other- 
wise equivalent morphemic segments. 
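
The test for free variants can be sketched in code. The following Python fragment is only an illustration under assumed conventions, not part of the procedure as stated: the function name, the underscore that marks the position of the segment, and the toy environments are all hypothetical. It groups together segments whose recorded sets of environments are identical.

```python
from collections import defaultdict


def free_variant_groups(occurrences):
    """Group segments whose sets of environments are identical.

    `occurrences` is an iterable of (segment, environment) pairs; an
    environment is any hashable description of the rest of the utterance.
    """
    envs = defaultdict(set)
    for segment, environment in occurrences:
        envs[segment].add(environment)

    groups = defaultdict(list)
    for segment, environment_set in envs.items():
        groups[frozenset(environment_set)].append(segment)

    # Only groups of two or more segments contain free variants.
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Hypothetical toy corpus; "_" marks the position of the segment.
corpus = [
    ("ekənamiks", "I study _"),
    ("iykənamiks", "I study _"),
    ("ekənamiks", "_ is hard"),
    ("iykənamiks", "_ is hard"),
    ("histəriy", "I study _"),
]

print(free_variant_groups(corpus))
```

On a real corpus the requirement of complete identity would rarely be met exactly, so a practical version would need some tolerance for accidental gaps; that refinement is left out of this sketch.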

13.3. Procedure: Equating Unique Morphemic Segments 

We group mutually complementary morphemic elements into mor- 
phemes. This requires that we first list the morphemic elements of our 
corpus, and note the environments to which each is limited. 

13.31. Phonemically Identical Segments 

We can agree to group all morphemic segments which are phonemically 
identical into one morpheme. Then all the segments /yuw/ you in 
I see you, Are you coming?, Were you coming?, It's yours, etc., would be 
members of the same morpheme. However, this operation would also 
group all the segments /tuw/ into one morpheme, in no matter what 
environment they occurred, e.g. in two for a nickel, two plus two, you two, 
Then too he's pretty old, you too, But I want to. 

We could proceed on this basis, but it will be found that in some cases 
we obtain by this method elements which are different in distribution 
from any other morphemes of the language. We may therefore decide to 
sacrifice some of the freedom of distribution of this morpheme, say 
/tuw/, by assigning some of its occurrences to one /tuw/ morpheme 
(two) and some to a second /tuw/ morpheme (too), in order to obtain 
morphemes having distributions similar to others.⁶ This is desirable be- 

can consider /ˌwel, bi'siyn-yə/ Well, b'seein' ya and /ˌwel, biy 'siyiŋ yuw/ 
Well, be seeing you, to be morphemically identical, calling the difference 
between the two utterances free variations from the point of view of our 
procedures. In equating such a sequence of morphemic segments in one 
utterance with a sequence in another, we may have to note that the whole 
sequence varies together: /siyn-yə/ substitutes for /siyiŋ yuw/, but /yə/ 
does not substitute for /yuw/ after /siyiŋ/, since /siyiŋ yə/ does not 
usually occur. 

⁶ When this is done we obtain what is called homonyms, i.e. 
phonemically identical distinct morphemes. If all segments /tuw/ are grouped 
into one morpheme there would be no homonyms, since no other mor- 
pheme would have any member phonemically identical with /tuw/. If 
we select the method that yields homonyms here, we will obtain them 
only in different environments (13.41). This is equivalent to partial over- 
lapping in the case of phonemic elements (chapter 7, fn. 14). The reliance 
here on distributional similarity to other morphemes, stated more for- 
mally in 13.4, parallels the considerations of distributional patterning 


cause our final objective in these procedures is not a race to see which 
element comes out with the widest distribution and which is next, but a 
compact description in which a few statements can be made about many 
utterances. It is therefore convenient to let the various morphemes of the 
language have identical distributions. If all the occurrences of /tuw/ are 
grouped into one morpheme, we will have a morpheme of very wide, but 
unique, distribution. If some of these occurrences are assigned to one 
morpheme (two) and some to another (too), and so on, we will have two 
or more morphemes each of which will have a more restricted distribu- 
tion, but one similar to the distribution of other morphemes. It will be 
simpler to state the restriction on each of the more limited morphemes, 
than to state the lesser but unique restriction on the one original morpheme. 

The criteria for deciding which groupings are distributionally simpler 
will be discussed in 13.4. 

13.32. Phonemically Different Segments 

Since the morphemic segments have been set up on the basis of dis- 
tributional criteria, and their phonemic composition is a matter now of 
arbitrary definition in each case, there is no loss to our operations if we 
group together not only phonemically identical but also phonemically 
different segments into one morpheme.⁷ If we see that one morphemic 
segment (say, knive-) occurs only in one environment (before -s), while 
another morphemic segment (say, knife) occurs only in some other 
environment (never before -s), we may group the two segments into one 
morpheme {knife}.⁸ The morpheme {knife} then has two members: 
knive occurring before -s, knife elsewhere. In this way, we can group 
three or more complementary segments into one morpheme: be, am, are, 

in 12.23. For the problem of what is the 'same' morpheme, see Y. R. 
Chao, The logical structure of Chinese words, Lang. 22.4-13 (1946). 

⁷ In the latter case, the different phonemic sequences which are 
included as members of the same morpheme must occur in different en- 
vironments; or if they occur in the same environment they must be free 
variants of each other (13.31). Otherwise, if different sequences occurring 
in one environment were included in one morpheme, we would not know, 
when that morpheme occurred in that environment, which sequence it 
represented (compare phonemic overlapping chapter 7, fn. 14). 

⁸ Braces { } will be used to indicate morphemes containing one or 
more members (morphemic segments). 


/i/ of is, /əz/ and /ər/ of was, were⁹ are all complementary to each other 
and can be put into one morpheme {be}. 

Here again it is necessary to select criteria which will determine what 
complementary segments should be grouped into one morpheme. 

13.4. Criteria for Grouping Elements 

As follows from the nature of these procedures as a whole, the funda- 
mental criterion in grouping morphemic elements is to set up such mor- 
phemes in terms of which compact general statements concerning the 
composition of the utterances of our corpus can be made. It is therefore 
desirable that the morphemes should be made as distributionally similar 
to each other as possible, or that there should be groups of morphemes 
having identical distributions. This is not a difficult criterion to satisfy, 
because we will find in a great many cases (as a result of 12.23) that there 
are whole groups of morphemic segments each of whose total distribu- 
tion (as recognized in 13.31) is almost identical with that of every other 
segment in the group. 

13.41. Matching Environments of Phonemically Identical Elements 

We consider the full range of environments (i.e. the complete distribu- 
tion) of one morphemic segment, as it is determined in 13.31, and match 
it with the full range of environments of other morphemic segments. If 
we find one other segment (or any number of other segments we wish to 
require) having a distribution completely identical with that of the first 
segment, we consider all the occurrences of each of these segments to be 
morphemically identical. If hotel occurs in Are you looking for a —?, 
My — is right over here, Several —s were destroyed, etc., and if rug, tavern, 
etc. occur in virtually the same environments,¹⁰ we say that all the 
occurrences of the segment hotel are included in one morpheme hotel, all the 
occurrences of rug are put into one morpheme rug, and so on. 

⁹ The /w/ of was, were would by the same token be grouped in one 
morpheme with the -t 'past': -t never occurs next to the was, were 
sequences (or next to any segment of the {be} unit), while w- occurs only 
there. In determining complementariness, the total environments have 
to be carefully stated: e.g. are occurs after you (You are late), but not if 
I let (or the like) precedes (I let you be the hero). 

¹⁰ In many languages almost no two morphemes occur in a completely 
identical range of environments. E.g. hotel and tavern would occur in 
I'm staying at the —, but rug would hardly occur there. Such minor 
differences are treated in the Appendix to 15.2. 


If we fail to find any (or the requisite number of) environmentally 
matching segments, we try to match the environmental range of our 
original segment with the sum of the environment ranges of two or more 
groups of morphemes. Thus we may find no segment with a range of 
environments matching that of /tuw/. But we may find that some of the 
environments of /tuw/ are virtually identical with the full range of 
environments of three, six, etc. Other environments of /tuw/ may be 
matched by the range of environments of also. Other (or all the 
remaining) environments of /tuw/ may be matched by the range of 
environments of from, for, etc. If these partial matchings cover the total 
distribution of /tuw/, or if the residue of environments of /tuw/ can be described 
in the manner of 13.422,¹¹ we set up several morphemes with the 
phonemic composition /tuw/. The occurrences of /tuw/ where it is 
substitutable for three, four, are assigned to one morpheme /tuw/ two. The 
occurrences of the segment /tuw/ which are substitutable for also are 
assigned to a second morpheme /tuw/ too. The occurrences of /tuw/ where 
it is replaceable by from are assigned to a third morpheme /tuw/ to. 
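
The partial matching described for /tuw/ can be pictured as follows. This is a rough Python sketch under stated assumptions, not a statement of the procedure itself: the function name, the toy environments, and the rule that an environment claimed by no model or by more than one model goes to the residue are all hypothetical.

```python
def split_by_models(segment_envs, models):
    """Assign each environment of a segment to the model that covers it.

    segment_envs: set of environments of the segment under analysis.
    models: dict mapping a label to the set of environments of a model
            morpheme (or of a group of similarly distributed morphemes).
    Returns (assignment, residue).
    """
    assignment = {label: set() for label in models}
    residue = set()
    for env in segment_envs:
        labels = [label for label, rng in models.items() if env in rng]
        if len(labels) == 1:
            assignment[labels[0]].add(env)
        else:
            residue.add(env)  # matched by no model, or by several
    return assignment, residue


# Hypothetical toy data; "_" marks the position of the segment.
tuw_envs = {
    "_ for a nickel",
    "he is old _",
    "send it _ them",
    "I forgot _ come",
}
models = {
    "two": {"_ for a nickel", "I have _ more"},    # range of three, six
    "too": {"he is old _"},                         # range of also
    "to": {"send it _ them", "I came _ there"},     # range of from, for
}

assignment, residue = split_by_models(tuw_envs, models)
print(assignment)
print(residue)
```

In this toy data the residue is the single environment between two verbs, which corresponds to the kind of unmatched occurrence that the text refers to 13.422.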

If we cannot (at this stage of our analysis) find other morphemes whose 
range of environments is virtually identical with that of the segment we 
are considering, or if we cannot find morphemes the sum of whose dis- 
tributions is virtually identical with the distribution of our segment, we 
set our segment up as constituting the same morpheme in every posi- 
tion. That is, all occurrences of our segment are tentatively included in 
the same morpheme. Later comparison of morphemic distributions, in 
chapter 15, may lead to some adjustments in this assignment.¹² Thus 

¹¹ That is, if we can describe the unique residual occurrences as being 
restricted to certain environments recognized elsewhere in our 
morphological analysis. (The residual occurrences of /tuw/ are limited to the 
environment of certain morpheme classes, e.g. between two verbs, as in 
I forgot to come.) Cf. fn. 16 below. 

¹² The considerations of chapters 15-6 may lead to changes even in the 
assignment of matched groups of segments. Thus we may here set up many 
morphemes like hotel, rug (indicated by N), and many other morphemes 
like think, weep (which occur in She was —ing, Don't —, etc., and will be 
indicated by V), and many more like book, slip, cut, which occur in a larger 
range of environments and will be indicated by G. In 15-6 it may be 
shown that the range of environments of the G morphemes is virtually 
equal to the sum of the environments of the N and of the V morphemes; 
and it may be convenient to treat segments of the G type as being 
included in one morpheme book, slip, cut when they are in the N 
environments (Several —s), and in other though phonemically identical 
(homonymous) morphemes book, slip, cut when they are in the V 
environments (I'll — it). 


we may at this stage consider not, which occurs in a unique range of 
positions, as an environmentally unmatched morpheme in all its occur- 
rences. If the segment which was considered in the previous paragraph, 
whose environments were matched by the sum of the environments of 
several other morphemes, has a residue of environments matched by no 
other morpheme, that residue could also be set up tentatively as a mor- 
pheme on the grounds of this paragraph. 

13.42. Phonemically Different Elements 

13.421. Matching environments. We now consider two or more 
complementary morphemic elements and match the sum of their 
distributions (as determined for each in 13.31) with the distribution of some 
morpheme set up in 13.41. We group knife and knive (13.32) into one 
morpheme {knife} because the single morpheme rug occurs in the 
environments of both of these segments: A knife was destroyed, Several 
knives were destroyed; A rug was destroyed, Several rugs were destroyed.¹³ 

Similarly, all the segments included above in {be} fill a range of 
environments which is also filled by such single morphemic segments as 
fail, slay. 
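
The matching test of 13.421 can be sketched as a pair of set conditions. The Python below is a hedged illustration only: the function name, the underscore slot, and the toy environments (with the plural -s stripped from the frame) are hypothetical simplifications. The candidate members must have no environment in common, and their environments taken together must equal the range of a single-member model.

```python
def can_group(members, model_range):
    """True if the members are mutually complementary and their
    environments together equal the range of the model morpheme."""
    seen = set()
    for envs in members.values():
        if seen & envs:       # a shared environment: not complementary
            return False
        seen |= envs
    return seen == model_range


# Hypothetical toy data; "_" marks the position of the segment.
rug = {"A _ was destroyed", "Several _ were destroyed"}
knife_group = {
    "knife": {"A _ was destroyed"},
    "knive": {"Several _ were destroyed"},
}
from_four = {
    "from": {"I came _ there"},
    "four": {"I have _ more"},
}

print(can_group(knife_group, rug))   # knife and knive fill out rug's range
print(can_group(from_four, rug))     # complementary, but no model matches
```

The second call mirrors the case of from and four: the two segments are complementary, yet no single morpheme has their combined range, so the grouping is refused.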

In carrying out this operation of matching, it is often convenient to 
start with morphemic segments all of which have occurred in certain 
environments and some of which are lacking in other environments. Thus 
idea, job, life, wife, have some occurrences and restrictions in common: 
all occur in My — is involved, and none in I want to — here. However, 
some of the morphemes in this set have additional restrictions which the 
others of the set do not have: idea, nation, etc. occur also in such similar 
utterances as Our —s are involved, but life, wife do not.¹⁴ In order to 
eliminate this difference in freedom of occurrence between idea, job, and 

¹³ However, we would not group into one morpheme the segments 
from which occurs in such utterances as I came from there, and four which 
occurs in such utterances as I have four more. These two segments are 
complementary, but there are no two or more segments which occur 
both in I came — there and I have — more, and in the other environments 
of these segments. Even if a single segment did occur in all these 
environments, e.g. the segment /tuw/ (to, two), we would not accept it as 
sufficient precedent, but would require (on the basis of 13.41) several cases 
of single segments having this total distribution. 

¹⁴ Of course, live-, wive- do occur in this utterance, but they are 
different morphemic segments from life, wife, since they have different 
phonemic composition. 


life, wife, we seek other segments (live, wive), which occur only in those 
environments of idea, job in which life, wife did not.¹⁵ 

In some cases there may be no one morpheme whose environments 
match the sum of those of two or more complementary segments under 
consideration. But there might be several other sets of complementary 
segments, the sum of environments of each set equalling that of the set 
under consideration. In such cases, too, we would group the segments 
of each set into one morpheme. 

13.422. Simplifying environmental differentiations. In the 
absence of the above criteria, we may nevertheless group complementary 
morphemic elements into morphemes if the environment of each of the 
elements is differentiated by features which are not otherwise dealt with 
in the morphology of the language in question, and if the sum of the 
environments of all the elements together is not differentiated by these 
features.¹⁶ For example, in I'm coming. or There isn't. there is a contour 
morpheme meaning assertion and consisting of the pitch sequence 120. 
In Marge is coming. there is one having the same meaning but consisting 
of 1020; and in All of us are coming. there is 100020. Other sets of 
contours can be found which similarly consist of slightly different forms for 
utterances of different lengths (chiefly, containing different numbers of 
vowels): There isn't? with 123, She is coming? 1123, etc. 

¹⁵ More generally: Given a segment a (say, knife), we note its range of 
environment X (Get my —, etc.), then seek other segments b, c, etc., 
which also occur in X (bag, hotel, etc.; in Get my —). We find the full 
distribution of b, c, etc., and discover that b, c, occur not only in X but also 
in the environment Y (Get my —s, etc.). We then look for some segment 
z which should occur in Y but not in X (knive in Get my —s, but not in 
Get my —); and we group a and z into one morpheme {a} whose full 
range of environment is X + Y, a range identical with that of b, c, etc. 

¹⁶ This is essentially analogous to the consideration used in forming 
phonemes, in 7.43. Like the alternative criterion of 13.421, this helps us 
organize our morphemic segments in terms of environments which we 
have to recognize for other reasons, i.e. it tries to avoid setting up new 
classes of elements and environments. In this sense, 13.421 is just a 
special case of 13.422, since each morpheme we set up defines a 
differentiating environment (the total environments in which that unit occurs, and 
which we would have to state), so that the more morphemes we can form 
(out of various members) which have the same total environment as 
other morphemes have (whether these are formed out of one or several 
members), the fewer environments need we differentiate. For this reason, 
the considerations of 13.422 would be used also to separate the 
occurrences of a morpheme in certain environments X from the occurrence of 
the same morpheme in environments Y, if there are many morphemes 
which occur in X and many others which occur in Y but relatively few 
(or no others) which occur both in X and in Y (cf. fn. 6, 12 above). 


The environment for the morphemic segments 120 and 123 is 
utterances of three vowels the last of which has zero stress (to be exact: 
utterances whose vowels are successively 'V'VV or ˌV'VV). The 
environment for 1020 and 1123 is utterances of four vowels with only the odd 
numbered ones stressed ('VV'VV or ˌVV'VV). Nowhere else do we have 
to differentiate such utterances; i.e. utterances with various numbers of 
stressed vowels do not correlate with any other of our distinctions. If we 
group the complementary elements, we find that the environment of the 
{.} morpheme whose members are 120, 1020, etc., is all utterances of any 
length which end in 20; similarly for the environment of the {?} 
morpheme whose members are 123, 1123, etc. This permits us to speak about 
utterances without specifying the number of vowels and stresses. 
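
The grouping of contours can be made concrete with a small sketch. The Python below is an illustration under assumptions that go beyond the text: it treats each pitch digit as belonging to one vowel, names the two contour morphemes by their shared endings, and uses hypothetical function names. It groups the observed pitch sequences and checks that no two members of a group compete for utterances of the same length.

```python
def contour_morpheme(pitches):
    """Name the contour morpheme by the ending shared by its members."""
    if pitches.endswith("20"):
        return "{.}"
    if pitches.endswith("23"):
        return "{?}"
    return None


def group_contours(observed):
    """Map each contour morpheme to its members, keyed by vowel count."""
    groups = {}
    for pitches in observed:
        members = groups.setdefault(contour_morpheme(pitches), {})
        vowels = len(pitches)  # assumption: one pitch digit per vowel
        if members.get(vowels, pitches) != pitches:
            raise ValueError("two members in the same environment")
        members[vowels] = pitches
    return groups


groups = group_contours(["120", "1020", "100020", "123", "1123"])
print(groups)
```

Once the members are grouped, a statement about the assertion contour no longer has to mention the number of vowels; the key of the inner dictionary records the only place where that number still matters.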

A special case of this criterion is the grouping into one morpheme of 
various segments which are complementary in the number of their 
separated parts (12.323). In Latin filius 'son' we have a morphemic segment 
us 'male'. In bonus 'good (m.)' we have the same us. In filius bonus 'good 
son' we have a morphemic segment . . . us . . . us 'male'. The two 
morphemic segments us and . . . us . . . us are complementary: which of 
them occurs depends on the number of morphemes of the type fili, bon 
that occur in the domain. If we keep them in separate morphemes we 
will have morphemes whose total environment will be one morpheme but 
not two, or two morphemes but not one, of the type fili, bon. It is 
therefore preferable to group us and . . . us . . . us (and so also . . . us . . . us 
. . . us, etc.) into one morpheme which occurs with sequences of 
morphemes like fili, bon as many of them as are present (within a stated 
domain).¹⁷ 

¹⁷ A corollary of this criterion prevents us from keeping apart the 
various morphemic segments which have identical phonemes. The 
matching operation of 13.41 might have led us, say, to group into one 
morpheme berry all the occurrences of /beriy/ except those after 
/boysən/ (in boysenberry). We could say that this new berry (without 
the boysenberry occurrences) has as wide a distribution as any comparable 
morpheme, since no other morpheme occurs after /boysən/. However, 
we will not by so doing satisfy the demands of 13.422 as regards the 
/beriy/ which appears after /boysən/. This latter /beriy/ occurs only 
after /boysən/, a highly restricted environment. We can make its 
distribution similar to that of other segments by grouping it with some other 
unit (preferably the other /beriy/, by 13.43), and by saying that all 
sequences /ˌberiy/ which occur after a segment with the /'/ stress (e.g. 
/'boysən, 'bluw/) are included (together with /'beriy/) in the morpheme 
{berry} just as all sequences /ˌwind/ which occur after a segment with /'/ 


The criterion of 13.422 is in the last analysis identical with the 
preceding criteria which involved matching the distribution of one element 
or set of elements with that of another. All the criteria serve to group 
the elements in such a way that environmental differentiations which 
are unusual for the morphology of the language in question are replaced 
by environmental differentiations which are common for that language. 
And where the type of environmental differentiation is equally 
characteristic for the language, the effect of these criteria is to replace elements 
having less freedom of occurrence by elements having greater freedom. 

In particular, the effect of the first operation of 13.421 is to raise the 
more restricted segments (like wife and wive) to the status of the less 
restricted ones (like rug), so that after chapter 13 it will be possible to 
deal with all the new morphemes as though none of them had greater 
limitations than single-member morphemes such as rug.¹⁸ 

are included together with /'wind/ in the morpheme {wind}: east-wind, 
the wind. 

If we do not wish first to set up some morphemes (those whose occur- 
rences are all phonemically identical) as models, and then to match the 
others to these, we can set up simultaneously all the morphemes which 
have similar environments by applying the considerations of 13.422 and 
13.43 to all the segments we have and assembling them into the most 
compact set of morphemes we can. 

¹⁸ It may be argued that some violence has been done to the meaning 
correlation of elements in thus grouping them into one morpheme. One 
could say that knive carries an implication of plural meaning, such as 
knife does not, and that this is lost when both are identified as a single 
morpheme {knife} which means 'knife (singular)'. However, since the 
morpheme s 'plural' appears whenever knive occurs (except in explicitly 
linguistic discussions where knive is the name of the segment /nayv/), it 
is possible to correlate all the plural meaning with the s, leaving knive 
free to be grouped with knife. It is as between whole utterances, e.g. 
Did you get my knife? and Did you get my knives?, that we get the 
meaning difference of singular and plural. There is no reason to correlate 
this difference with more than one morphemic difference, and that mor- 
phemic difference would be most simply the presence of the s which 
correlates with 'plural' in other utterances, and which could be assigned 
no other meaning correlation in this utterance. All this is not to say that 
when knive occurs it does not carry, for a person acquainted with the 
language structure, the implication of 'plural'. The occurrence of every 
restricted element may be said in this sense to imply the occurrence of 
any of the elements (in this case, the only element) with which it occurs. 
Cf. chapter 12, fn. 66, end. The methods of descriptive linguistics do not 
reveal or express everything which a speaker can communicate to a hearer 
in less than a whole utterance, since the universe of discourse is an ut- 
terance (2.32). Even these facts, however, appear indirectly in descriptive 
linguistics: for example, in the statement that the member knive of 
{knife} occurs only with the plural {-s}. 


13.43. Choosing among Complementary Elements 

In cases where only a unique group of mutually complementary 
segments will together equal the total distribution of the comparable 
one-segment morphemes, there is no problem as to which segments should be 
grouped into one unit: given any segment of the {be} group, e.g. /əz/, 
only the sum of all the other segments which have been included in {be} 
can fill out the difference between its environment and that of {fail}, 
{slay}, etc. 

Even when our search for complementary segments is limited by 
13.421 to those which would fill out the range of environments of other 
morphemes, it may often be possible to find more than one 
complementary morphemic segment. E.g. not only knive, but also live, wive, which 
occur only before -s, are complementary to knife. Which of these three 
shall we choose to group with knife into one unit? It is almost always 
possible to decide this on the basis of all the total environments in which 
each of these occurs. Thus knive, live, wive, all occur in Our —s are dull, 
paralleling knife in Our — is dull. But of the three, only knive occurs 
(except, perhaps, jocularly) in I'll sharpen my —s on the whetstone, 
paralleling I'll sharpen my — on the whetstone.¹⁹ 

The problem of selecting a complementary segment is usually simpler 
than this. Usually, in a situation in which there are several segments 
z¹, z², z³ (knive, live, wive) all complementary to our a (knife; see the 
symbols in fn. 15), we will find that there are not one but several segments 
a¹, a², a³ (life, wife) to which these various z segments are 
complementary. We thus have a¹, a², a³, each occurring in environment X and each 
complementary to each of z¹, z², z³ which occur in environment Y. Our 
problem is now no longer to decide which z may be best grouped with 
our original a¹, but what is the best way of pairing each a segment with 

¹⁹ If we wish to obtain new utterances and test the elements 
complementary to knife, we can try to work this out with an informant. We 
would then group with knife the segment which replaces knife when 
we alter the utterance only enough so that it includes the complementary 
environment (the environment in which knife does not occur). If we have 
knife in My — broke, we alter the environment to My — s broke, and see 
what we get as a repetition of the utterance under the changed circum- 
stances. There are various ways in which we can obtain this minimally 
altered repetition. If a native speaker says My knife broke, we may ask 
him "How would you say it if there were several of them?" (if that is a 
successful way of asking him questions); or we may turn the conversa- 
tion into a situation where "several of them" are involved. In effect, 
this means that we try to hold everything constant in the social situation, 
except whatever change in it correlates with the addition of the mor- 
phemic segment -s. 


some one z segment. The differences in distribution which had to be 
judged relatively in only one direction, can now be judged both ways.²⁰ 
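
One way to picture the pairing of a segments with z segments is to compare the frames in which each occurs. The Python below is a sketch under several assumptions that are not in the text: frames are taken to be already normalized (the plural -s and verb agreement stripped, an underscore marking the slot), the amount of overlap between two sets of frames is used as the measure of similarity, and a pair is accepted only when each segment is the other's best match. All names and data are hypothetical.

```python
def best_match(frames, candidates):
    """Return the candidate whose frames overlap most with `frames`."""
    return max(candidates, key=lambda c: len(frames & candidates[c]))


def pair_members(a_frames, z_frames):
    """Pair each a segment with a z segment, judging the match both ways."""
    pairs = {}
    for a, frames in a_frames.items():
        z = best_match(frames, z_frames)
        # Accept the pair only if a is also the best match for z.
        if best_match(z_frames[z], a_frames) == a:
            pairs[a] = z
    return pairs


# Hypothetical normalized frames; "_" marks the position of the segment.
a_frames = {
    "knife": {"our _ dull", "sharpen my _ on the whetstone"},
    "life": {"our _ dull", "save my _"},
    "wife": {"our _ dull", "meet my _"},
}
z_frames = {
    "knive": {"our _ dull", "sharpen my _ on the whetstone"},
    "live": {"our _ dull", "save my _"},
    "wive": {"our _ dull", "meet my _"},
}

pairs = pair_members(a_frames, z_frames)
print(pairs)
```

The shared frame alone cannot separate the three candidates; it is the frames peculiar to each segment that decide the pairing, which is the point made about the whetstone example.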

13.5. Relations among the Members of a Morpheme 

When we consider all the morphemes of our corpus, and the relations 
among the members within each of them, we may find that some of the 
relations among the segments included in a particular morpheme are 
similar to the corresponding relations among the segments of another 
morpheme, and that other relations differ from their corresponding ones 
in the other morpheme. 

The most interesting relations for current linguistics, in terms of the 
operations of chapter 13, are: the relation between the environment to 
which one member is restricted and the environment to which other 
members are restricted; the difference in phonemic composition among 
members; the phonemic similarity between one member and the 
environment to which it is restricted, as compared with the phonemic similarity 
between the other members and the environments to which they are 
restricted. 

13.51. The Environments of Each Member 

13.511. Phonemically differentiable. The environment in which 
one member of a unit rather than another occurs may in some cases be 
differentiable in terms of phonemes. 

In Attic Greek reduplication prefixes meaning 'perfect aspect' we have 
/me/ occurring only before morphemes beginning with /m/, /le/ only 
before morphemes beginning with /l/, etc. (/me'mene·ka/ 'I have 
remained,' /'leluka/ 'I have loosed'). We group them all into one 
morpheme {Ce}²¹ and can tell from the consonant following it which member 
occurs, i.e. which consonant replaces the C. 
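
The selection of a member of {Ce} from its phonemic environment can be sketched in a few lines. The Python below is only an illustration: the function name is hypothetical, the stems are written without stress or length marks, and the sketch covers nothing beyond the simple case described here, in which C is the first consonant of the following morpheme.

```python
VOWELS = set("aeiou")


def perfect_prefix(stem):
    """Return the member of {Ce} that occurs before `stem`.

    C is taken to be the first consonant of the following morpheme.
    """
    first = stem[0]
    if first in VOWELS:
        raise ValueError("this sketch covers consonant-initial stems only")
    return first + "e"


# Simplified spellings of the two forms cited in the text.
print(perfect_prefix("meneka") + "meneka")
print(perfect_prefix("luka") + "luka")
```

Because the member is fully determined by the following phoneme, no list of individual stems is needed; this is what distinguishes the case from the morphemically differentiable alternations discussed later.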

In English, /əl/ and /æl/ al are members of one morpheme, /əl/ oc- 

²⁰ In general, we would try to group the segments in such a way as 
to require the fewest and simplest statements concerning the interrela- 
tions within each resultant morpheme. Although the only criterion rele- 
vant to our procedures will be the distributional one, we may find that 
the grouping based on distributional grounds will in most cases also in- 
volve least difference in phonemic composition and social situation corre- 
lation among the members of each morpheme. For an example of an Al- 
gonquian language (Delaware) in which a large part of the grammar can 
be expressed in terms of morpheme alternants, see Z. S. Harris, Struc- 
tural Restatements II, Int. Jour. of Am. Ling. 13.175-86 (1947). Cf. also 
Bernard Bloch, English verb inflection, Lang. 23.399-418 (1947). 

²¹ C being defined as the first consonant of the next morpheme, i.e. 
the consonant following the /e/. 


curring when the morpheme is zero-stressed: national, nationality; so are 
/telə/ and /təle/ in telegraph, telegraphy. In all such alternations, the 
full vowel occurs where there is no zero stress, so that given the 
morpheme, {al} or {tele}, we can tell from the stress phonemes in the 
utterance which member of the morpheme occurs in the utterance. 

A special case of environments which can be phonemically differen-
tiated is that of the /v/ → /f/ morphemic segment 'noun' which oc-
curs after believe, live, and which is complementary to the /z/ → /s/
'noun' after to house, to the /ð/ → /θ/ after breathe, etc. (12.332). We
can group all these complementaries into one morpheme {unvoicing}.
Similarly the morphemic segment 'male' which consists of dropping /r/
occurs after fermière (12.333), while that which consists of dropping /t/
occurs after chatte, poulette. All these are complementary as to the mor-
phemes after which they occur; and we can recognize from the phonemic
form of the preceding morphemic environment which member of the
'male' morpheme occurs: if the preceding morpheme ends in /t/, the
morpheme 'male' consists in dropping /t/, and so on. In the morpheme
{drop final phoneme} 'male,' we can tell which member segment occurs
in each environment from the phoneme which precedes that morpheme.²²

13.512. Morphemically differentiable. In other morphemes, how-
ever, we cannot tell from the phonemic composition of the environment
which member of the morpheme occurs in that environment. In {wife},
we cannot say that the member wive occurs regularly before the pho-
neme /s/, for we also find the other member, wife, before /s/ in His
wife's job. The environments of wive can therefore not be easily dis-
tinguished phonemically from the environments of wife. We can only
say that wive occurs before the morpheme -s 'plural,' and wife before
all other morphemes, including 's 'possessive' and 's 'is.' In all such cases
we have to state what morphemes in the environment distinguish the
distribution of one member of a morpheme from that of the other mem-
bers of that morpheme.²³
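A minimal sketch of why the selection in {wife} has to be stated in terms of morphemes: a lookup keyed on the following phoneme cannot separate the two members, while a lookup keyed on the identity of the following morpheme can. The morpheme labels and the table layout are illustrative assumptions, not part of the procedure itself.

```python
# Sketch: the phonemic shape of the following morpheme (s) is the same
# for 'plural', 'possessive' and 'is', so only the identity of the
# following morpheme tells us which member of {wife} occurs.
FOLLOWING_SHAPE = {"plural": "s", "possessive": "s", "is": "s"}

SPECIAL_MEMBERS = {"plural": "wive"}  # wive occurs only before -s 'plural'

def member_of_wife(following_morpheme):
    """Select the member of {wife} from the following morpheme's identity."""
    return SPECIAL_MEMBERS.get(following_morpheme, "wife")

for label in ("plural", "possessive", "is"):
    print(label, FOLLOWING_SHAPE[label], member_of_wife(label))
```

All three environments have the same phonemic shape, yet the output member differs, so no statement in terms of phonemes alone would be adequate here.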

²² Comparable methods can be used in grouping the various pitch se-
quences, e.g. 120, 1020, etc. (13.422) into one contour morpheme on the
basis of the number of vowels. Given the contour morpheme {.} (in-
cluding all contours ending in 20), we know from the number of vowels
(and final voiced consonants) that the pitch sequence (member of this
contour morpheme) in Marge is coming. is 1020. We may say that the
contour morpheme covers a particular domain (the interval between suc-
cessive {.}s), and that the pitch of each vowel may be determined from
its stress and position within the interval.

²³ The identity of morphemes in the environment is also a determining
factor in grouping with the {.} contours all the contours like 2031 in


A special case of environments which can only be recognized in terms
of morphemes is the environment of the morphemic segments whose pho-
nemes are repeated in the utterance (agreement morphemes). In 12.323
it was seen that in utterances like Moroccan Arabic lbit lkbir 'the large
room,' luld lkbir 'the big child,' we have a morphemic segment l . . . l . . .
'the.' In lbit lkbir lundl 'the first large room' we similarly have a mor-
phemic segment l . . . l . . . l . . . 'the,' and in lbit 'the room' we have l
'the.' These three are complementary: the number of occurrences of l de-
pends on the number of morphemes. But the l does not occur before
every morpheme: we have lbit lkbir dial buia 'the large room of my
father.' We therefore group all the complementary l sequences into one
morpheme {l} 'the,' and say that this morpheme occurs over a particular
exactly-stated domain (namely, sequences of stated morphemes includ-
ing kbir, bit, but not including dial): portions of the {l} morpheme occur,
if at all, before every morpheme of this particular domain. When the
morpheme {l} occurs, we have l before every morpheme which is
included in the domain.²⁴
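The statement that {l} occurs over an exactly stated domain can be sketched as a small procedure. The domain listing below contains only the two morphemes named in the text, and the function name is hypothetical; a full description would list every morpheme of the domain.

```python
# Sketch: one morpheme {l} 'the' whose portions appear before every
# morpheme of a stated domain, and before no morpheme outside it.
DOMAIN = {"bit", "kbir"}  # assumed partial listing; dial is excluded

def with_the(morphemes):
    """Realize {l} over the domain: prefix l to each in-domain morpheme."""
    return " ".join("l" + m if m in DOMAIN else m for m in morphemes)

print(with_the(["bit"]))                          # lbit
print(with_the(["bit", "kbir"]))                  # lbit lkbir
print(with_the(["bit", "kbir", "dial", "buia"]))  # lbit lkbir dial buia
```

The number of occurrences of l follows from the number of in-domain morphemes, so the three complementary l sequences need not be listed separately.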

13.52. Phonemic Differences among the Members 

13.521. Slight difference. In many morphemes the difference
among the members is slight, e.g. a single component (or interchange of
closely related phonemes) in wife-wive, or various distributions of 0 and 1
pitch before the final 20 of the {.} contour.²⁵

Where's he coming?, 20031 in Where will you take it?. These latter con-
tours are complementary to each other on the basis of utterance length.
In addition, however, they are all complementary to the {.} contours
since they occur only on utterances beginning with the {wh-} 'interroga-
tion' morpheme. We can therefore include these contours (often marked
by /l/) in one unit with the {.} group, since the other contours included
in {.} never occur on utterances containing the {wh-} morpheme.

It is impossible to group the ^ contours with the ? contours (e.g. 123
in There isn't?), because the two are not complementary. The ? contour
occurs in utterances containing {wh-} as in When's he coming {?} with
pitch sequence 1234, meaning 'Are you asking when he will be coming?'

²⁴ Cf. 13.442.

²⁵ Sometimes the differences among members of a morpheme consist in
some special relation such as the assimilation of two phonemes which had
been separated in one member but became contiguous in another mem-
ber. E.g. one of the members of the Yokuts morpheme for 'girl' is
goyo.lum; before the plural morpheme, which consists of i plus certain
vowel changes (dropping the second vowel of the preceding morpheme,
and changing the third vowel to /a/), another member of 'girl' occurs:
goyyam (after the vowel changes associated with the following i). We


13.522. Partial identity. In some morphemes there are many mem- 
bers all of which have certain similarities and differences in common. In 
the Attic Greek reduplication morpheme, the members consist of some 
one consonant (whichever follows the morpheme) plus /e/. In the French 
suffix morpheme for 'male' the members consist of dropping some one 
consonant (whichever precedes the morpheme). 

13.523. No identity. In some morphemes there is no phonemic re-
semblance among the members. Thus /gud/ and /bet/ are members of
the same morpheme {good}, the second form (in better) occurring only
before the morpheme {er}.²⁶

In Yokuts certain bases are reduplicated. Comparing *giyi 'touch,'
*giyigiyi 'touch repeatedly,' * 'swallow,' * 'swallow re-
peatedly,' etc., we say that the repeated giyi, etc. all mean 're-
peatedly.' The repeated parts are also all mutually complementary: giyi
occurs only after *giyi, etc. Hence we group them all into one morpheme
whose members have no phonemic identity with each other, although
there is a similarity among them in that they can all be described as {x},
where x indicates the ordered phonemes of the preceding morpheme.²⁷
Other Yokuts bases, which are never reduplicated, occur with a suffix
-da. 'repetitive' which is thus complementary to the reduplications and
may be included in their morpheme. The new member -da. is com-
pletely different, both in phonemes and in relation to the preceding
phonemes, from the reduplication members.²⁸

say that the second member differs from the first by having /y/ instead
of /l/, in a position where the /l/ would have been contiguous with the
preceding /y/; the other change, the dropping of the second vowel, which
brought the two /y/s together in the second member, is described as
being part of the i 'plural' morpheme, and is thus part of the environ-
ment in which the second member occurs.

²⁶ Members having such complete phonemic differences among them
are called suppletives.

²⁷ Up to this point the similarity among the members of the morpheme
is of the type described in 13.522.

²⁸ An extreme example may be seen if we consider the position-mor-
pheme (tagmeme) of the Appendix to 12.34, which adds the 'object'
meaning to you in I saw you. In You saw him as compared with He saw
you, the object position for he appears together with a phonemic change
(from he to him). We may express this by saying that the member of the
'object' morpheme which occurs after the morpheme he is the position
plus the change /y/ → /m/: /hiy/ + /y/ → /m/ = /him/. When we
want to add the 'object' morpheme to he we add both the position and
the /y/ → /m/.


13.53. Similarity between Member and Its Environment 

13.531. No similarity. In many cases there is no phonemic similarity
between the various members of a morpheme and the environments in
which each member occurs: in good-better /bet/, which occurs before er,
is no more similar to /ər/ than is /gud/.

Sometimes, however, we find similarities between a member and the 
environments in which it occurs: we may say that the phonemic form 
of the member depends upon the phonemic form of some part of the en- 
vironment. This is more frequent when the environments of a member 
are identifiable by a common phonemic characteristic (13.511). 

13.532. Identity in phonemic feature. The similarity may consist
in some features of a phoneme, i.e. in some components. Of the morphe-
mic segments /s/ and /z/ which are included in {s} 'plural,' and of the
segments /t/ and /d/ which are included in {ed} 'past,' we find the voice-
less member occurring after morphemes ending in voiceless phonemes,
and the voiced member after voiced phonemes: /buks/ books, /waynz/
wines, /bukt/ booked, /waynd/ wined. Even the members /əz/ and /əd/
which are included in these two morphemes have a relation, though a
dissimilational one, with their environments. These occur only after
voiced or voiceless consonants homo-organic with their own:²⁹ /'gæzəz/
gases, /'glæsəz/ glasses, /'ædəd/ added, /'məlktəd/ mulcted.³⁰
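The dependence of these members on a phonemic feature of the environment can be sketched as a selection on the final phoneme of the preceding morpheme. The phoneme sets below are partial and are assumptions made for the illustration; the function name is hypothetical, and the unique members (such as the en of oxen) are deliberately left out.

```python
# Sketch: members of {s} 'plural' and {ed} 'past' selected by the
# voicing (or homorganic character) of the preceding final phoneme.
VOICELESS = set("ptkfθsš")       # assumed partial list
SIBILANTS = set("szšž")          # assumed partial list
ALVEOLAR_STOPS = set("td")

def member(morpheme, preceding):
    """Return the member of {s} or {ed} occurring after `preceding`."""
    last = preceding[-1]
    if morpheme == "s":
        if last in SIBILANTS:
            return "əz"
        return "s" if last in VOICELESS else "z"
    if morpheme == "ed":
        if last in ALVEOLAR_STOPS:
            return "əd"
        return "t" if last in VOICELESS else "d"
    raise ValueError("only {s} and {ed} are covered in this sketch")

for stem, m in [("buk", "s"), ("wayn", "s"), ("buk", "ed"),
                ("wayn", "ed"), ("glæs", "s"), ("æd", "ed")]:
    print(stem + member(m, stem))
```

The same two-way voicing test serves both morphemes, which is the generalization over more than one morpheme that the text points to.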

13.533. Identity in phonemes. The similarity may also consist in 
identity of a whole phoneme, as in the cases cited in 13.523. In the Attic 
Greek reduplication, the consonant of each member is identical with the 
initial consonant of the environment-constituting morpheme which fol- 
lows it. In the Yokuts reduplication, all the phonemes of each member are 
identical (in the same order) with the phonemes of the preceding mor- 
pheme which constitutes the environment determining the occurrence 
of the particular member. 

13.6. Result: Classes of Complementary Morphemic Segments 

We now have, instead of our morphemic segments, a smaller number
of morphemes, each of these being a class of one or more complemen-
tary morphemic elements.³¹ In all our further morphological statements
we will deal with these new elements; for the remaining analysis it will
not be necessary to deal with the differences among the members within
each morpheme.³²

²⁹ The similarity between each member and its environment would be
even clearer in terms of components, because /s/ and /z/ would have all
but one component in common, in English.

³⁰ Much of the assimilation or dissimilation which can be spotted in
a synchronic description of a language will appear as relations of this
type between a member and its environment. For another situation in
which the historical event of assimilation can transpire descriptively, see
fn. 25 above.
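The condition that the members of one morpheme be complementary can be sketched as a check that no two segments share an environment. The environment labels and the two small distributions below are illustrative assumptions; the function name is hypothetical.

```python
from itertools import combinations

def are_complementary(distribution):
    """True if no two segments occur in a common environment.

    `distribution` maps each segment to the set of environments
    (here: labels of neighboring morphemes) in which it was observed.
    """
    return all(
        env_a.isdisjoint(env_b)
        for env_a, env_b in combinations(distribution.values(), 2)
    )

# wife / wive: disjoint environments, so they may share a morpheme.
wife = {"wife": {"possessive", "is", "(none)"}, "wive": {"plural"}}

# Two segments found after the same stem contrast with each other.
contrasting = {"-i": {"after stem X"}, "-e": {"after stem X"}}

print(are_complementary(wife))         # True
print(are_complementary(contrasting))  # False
```

Complementarity is only a necessary condition in this sketch; the further criteria of the chapter decide which complementary segments are actually grouped together.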

This procedure brings out the morphemic status of various formal
features. Thus we might note that both had as compared with have, and
French fermier 'farmer' as compared with fermière 'farm-woman', con-
tain phoneme omission. However, in /fermye/ the replacement of the
phoneme by zero constitutes a morpheme 'male' (12.333), while in the
/hæ/ of had it is merely the difference between two members of a mor-
pheme (/hæ/ before -d and -s, /hæv/ elsewhere).³³

Appendix to 13.42: Zero Members of Morphemes 

In some cases the carrying out of this procedure may lead us to set up
zero members of particular morphemes. For example, we can group to-
gether various segments, /ən/ (after particular morphemes: taken), /t/
(after voiceless phonemes: walked), /d/ (after voiced: bagged), /əd/
(after /d, t/: added, slated), into one morpheme {en} 'participle', on
the distributional model (13.421) of -ing which occurs after all these
morphemes (walking, bagging, etc.). This does not, however, equate the
distributions of {en} and {ing}, because we have ing after cut, but no
member of {en} is defined for that environment. If we compare I'll take
it, I'm taking it, I've taken it, with I'll cut class, I'm cutting class, I've cut
class, we find that cut has nothing following it in the position where take
and all the other morphemes are followed by {en}. We can add to {en}
a member which would occur after cut, and whose phonemic content
would, as we have just seen, be zero. This may be desirable in view of

³¹ Not all of these new element groups need be called morphemes. It
is only required that they be treated as single elements for purposes of
the following procedures, and that they be defined as classes of comple-
mentary elements. The term morpheme is often used particularly for
groups of complementary segments (not contours or order) which satisfy
13.3-4 and which are quite similar to each other in phonemic composition.

³² These differences will, of course, be important for speaking the lan-
guage. The speaker must know which member of a morpheme occurs in
a particular environment just as he must know which member of a pho-
neme occurs in each environment.

³³ Similarly, in knife-knives the /f/-/v/ difference marks members of
a morpheme, whereas in belief-believe it indicates the addition of a mor-
pheme 'noun'. Note that belief and believe are in fact not complementary
in a few (somewhat forced) cases, e.g. in Belief? What for? and Believe?
What for?


13.422, since the environment of {en} hitherto has been 'any morpheme
which occurs before -ing, except for cut'. With the addition of the zero
member, it becomes 'any morpheme which occurs before -ing': this is a
class of morphemes previously recognized because of -ing and because of
various other distributional features.

The use of a zero member here does not destroy the one-one character
of our representation. Given the morphemes {en}, {take}, {cut}, {will},
{have}, we know when to use the zero member of the above selection:
only in {cut} + {en}. And given actual utterances, e.g. I'm cutting it,
I'll cut it, I've cut it, we know where the zero member of {en} occurs: not
after cutting, nor even after 'll cut, but only after 've cut. This ability to
place the occurrence of a zero morphemic segment comes, of course, from
the presence of {have} + {cut} which are environmentally correlated
with it. However, that does not mean that {en}, including its zero mem-
ber, is merely part of the {have} morpheme, for the {en}, including its
zero member, also occurs without {have} but with {be} or in other recog-
nizable positions (e.g. without preceding noun):³⁴ It is taken, It is cut,
When taken into the light, . . . , When cut open, . . . The saving grace is
always the presence of some formal feature (morpheme, arrangement,
etc.) which correlates with the occurrence of the zero member.

A more difficult situation arises when the morpheme which has a zero
member substitutes for absence of morpheme (among other things), i.e.
occurs in an environment in which there may also be no morpheme at all.
For example, the morpheme {ed} has various members (/əd/ in added,
/t/ in walked, /ey/ → /u/ in took, etc.), and occurs after all morphemes
which occur before -ing, except cut. At first blush we might try to treat
it as we did the {en} morpheme, and to include in it a zero member which
occurs after cut. This would satisfy the criteria of 13.3, 13.422, but would
run afoul of our commitment to a one-one representation. For if we com-
pare I missed it yesterday, I cut it yesterday, with I miss it these days, I cut
it these days, the only indication that the zero member of {ed} is present
after the first cut and not after the second is the occurrence of yesterday.
In I cut it, we cannot tell whether the morpheme {ed} is present or not,
although in I missed it, I miss it, we can recognize the presence or absence
of the morpheme {ed}. This means that we have here a many-one corre-
spondence between our elements and the utterance, i.e. in one direction:

³⁴ In chapter 16 it will be seen that all these positions or environments
in which {en}, including its zero member, occurs are syntactically identi-
cal; but that is irrelevant to the present discussion.


given our elements, we can reconstruct the utterance; given {I} + {cut}
+ {ed} + {.} we construct I cut., since the member of {ed} after {cut}
is zero. But given I cut, we cannot say uniquely what elements it con-
tains, i.e. whether or not it contains the morpheme {ed}.³⁵ If we wish to
retain a one-one representation at this stage of our analysis, we must
abjure any recourse to zero members of morphemes under this condition.
The distribution of the morpheme {ed} would then remain 'after any
morpheme which occurs before -ing, except cut,' and the distribution of
{cut} would be identical with that of {walk}, {take}, etc., except that {cut}
would never occur before {ed}.³⁶
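The loss of the one-one character can be sketched by realizing morpheme sequences as utterances and then inverting the mapping. The sketch assumes, for the sake of the demonstration, the zero member of {ed} after cut that the text goes on to reject; forms are given in ordinary spelling rather than phonemically, and all names are hypothetical.

```python
from collections import defaultdict

# Assumed members of {ed}; the empty string stands for the zero member.
ED_MEMBERS = {"miss": "ed", "walk": "ed", "cut": ""}

def realize(morphemes):
    """Build the utterance from a sequence of morphemes."""
    words = []
    for m in morphemes:
        if m == "{ed}":
            stem = words[-1]
            words[-1] = stem + ED_MEMBERS[stem]
        else:
            words.append(m)
    return " ".join(words)

analyses = [
    ["I", "miss", "it"], ["I", "miss", "{ed}", "it"],
    ["I", "cut", "it"], ["I", "cut", "{ed}", "it"],
]

# Invert: which morpheme sequences lie behind each utterance?
by_utterance = defaultdict(list)
for analysis in analyses:
    by_utterance[realize(analysis)].append(analysis)

for utterance, sources in by_utterance.items():
    print(utterance, len(sources))
```

Each element sequence yields exactly one utterance, but I cut it is reached from two sequences, which is the many-one correspondence described above.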

Appendix to 13.43: Alternative Groupings 

The conditions of 13.42-3 are sufficiently elastic as to permit in some 
cases more than one arrangement of particular segments into mor- 
phemes. Two linguists, working on the same material and seeking to 
satisfy the same criteria, may not come out with the same morphemes.³⁷

There are other cases in which the procedure of 13.3 permits various 
groupings, depending on how many environments we take into account. 
Thus in Bengali we find the following segments (with their meanings in 
each of four environments):³⁸

       after stem          after -l- past      after -t- conditional  after -b- fut.

-um    --                  1st person          1st person             --

-is    2nd inferior        --                  2nd inferior           --

-i     1st person          2nd inferior        --                     2nd inferior

-o     2nd ordinary        3rd ordinary³⁹      3rd ordinary           1st person

-e     3rd ordinary        2nd ordinary⁴⁰      2nd ordinary           2nd, 3rd ordinary

-en    2nd, 3rd honorific  2nd, 3rd honorific  2nd, 3rd honorific     2nd, 3rd honorific

³⁵ And whether or not it means 'past time', unless the environment
tells us.

³⁶ If we decide against a zero member of {ed} for the reasons stated
above, we would need, in addition to the two special distributional state-
ments here indicated, also a third special statement (which would belong
in chapter 16): cut = V + {ed}, i.e. cut can be substituted for verb plus
{ed} (as well as being substitutable for a verb by itself).

³⁷ In such cases, each linguist might indicate the existence of other pos-
sible groupings than the ones he has chosen. In any event, all such alter-
native groupings could be easily translated one into the other, for if the
morphemes in the two arrangements are different, so are, correlatively,
the segments which define each unit. Any difference between such al-
ternative groupings will usually not constitute any general difference in
the morphology. The opportunity to reconsider our groupings of mem-
bers, if one grouping turns out to be more consonant with our generaliza-
tions or more convenient for our description, is offered in 14.6.

³⁸ I am indebted to Charles A. Ferguson for the Bengali forms used




E.g. jai 'I go,' jao 'you (ord.) go,' jabo 'I will go,' nabo 'you (ord.) de-
scend,' nabbo 'I will descend.' We could consider, say, the occurrences of
{-o} in each of its environments which are differentiated here (or in any
other breakdown of its distribution) as a distinct segment; but as a first
approximation toward setting up morphemes we would undoubtedly
group all these segments into one unit {-o}, by 13.41.⁴¹

If we thus leave the morphemes as they appear in the table above,
{-um}, {-o}, etc. with one member but several meanings for each mor-
pheme, we do so on the basis of the fact that there is no second formal
feature which correlates with the formal environment division arranged
by meaning which was indicated in that table: i.e. we have utterances:

dori tano 'you (ord.) pull it.'
dori tanbo 'I will pull it.'
dori tanlo 'he pulled it.'

From these utterances all we can derive is that the meaning of -o varies
as stem, -b, or -l, appears before it. We cannot derive several -o mor-
phemes. If we further compare such utterances as

bhalo aci. I'm well. kothae thakte. Where did you live? 
bhalo ace. He's well. kothae thakto. Where did he live? 

we can set -i 'I' and -e 'he' up as different morphemes,⁴² and -e 'you' and

³⁹ Except after some transitive verb stems.

⁴⁰ And third person ordinary after some transitive verb stems.

⁴¹ The meaning differences within each of these units, though seeming-
ly important and clear-cut, would not suffice to make us do otherwise.
True, an exceptional situation exists here in that the meaning shifts cor-
relate with more or less the same environmental differences in each of
these morphemes {-o}, etc., so that each meaning (first person, second
ordinary, etc.) is in each environment uniquely represented by one or
another of these morphemes. However, we are unable to utilize the pat-
tern of meaning shifts at this point in our analysis.

⁴² Phonemic sequences which contrast (i.e. occur in the same position)
but are not repetitions of each other constitute different morphemes.
However, those which are complementary, as are the various occurrences
of -o, do not necessarily constitute either identical or distinct morphemes.


-o 'he' as different morphemes. But this still does not enable us to say 
that -e 'he' and -e 'you' are different morphemes, or that -e 'he' and -o 
'he' are the same morpheme. 

The basis for a rearrangement of these morphemes appears when we 
compare utterances such as : 

ami thaktum 'I stayed' 

tui thaktis 'you (inf.) stayed' 

Here we have an environment in respect to which -um 'I' and -is 'you
(inf.)' are automatically restricted and complementary. The same en-
vironment appears in:

ami korbo 'I will make it'

tui korbi 'you (inf.) will make it'

This environment is then the formal feature which correlates uncondi-
tionally with the meaning 'I', 'you', etc.⁴³; it does not correlate uncon-
ditionally with our previous units {-o}, etc.⁴⁴ but rather with a combina-
tion of the -l-, -t-, -b- morphemes and the -um, -is, etc. Without depend-
ing on meaning we can now say (following 13.43): all complementary
segments which occur with ami shall be included in one morpheme 'I'.⁴⁵

Finally, if we fail to consider the import of the last utterances pre-
sented above, and keep the morphemes as they appear in our original
list, the whole issue will be thrown open again when we consider the rela-
tions among morphemes (chapter 15). We will then find that -e occurs in
the same utterance with 'he' when -b- 'future' precedes it; and that -o
occurs in one utterance with ami 'I' when -b- 'future' precedes it, but in
one utterance with 'he' when -l- 'past' or -t- 'conditional' precede it, and
in the same utterance with 'you' (ord.) when none of these three pre-
cedes it.

We thus have a complicated set of restrictions, all involving the same
few forms in different combinations:⁴⁶ -um, -is, etc., and -l-, -t-, -b-.

⁴³ I.e. ami means 'I' not only after -l- or when there is no -l-, but when-
ever it occurs.

⁴⁴ Since sometimes -o occurs with it, and sometimes -um or -i.

⁴⁵ In chapter 17 it will be seen that this is ground for setting up a single
component including both ami and the various suffixes which occur with it.

⁴⁶ Whenever there are a number of restrictions involving the same ele-
ments it is a good heuristic principle to reconsider what the elements are
composed of, in order to see if a rearrangement of their component parts
may not give us elements not subject to these special restrictions.


At this point we reconsider our morphemes and find that all these re-
strictions can be eliminated if we regroup the segments as follows:

-i 1st person; with variant members -um after -l-, -t-, and -o after -b-.

-is 2nd person inferior; with variant member -i after -l-, -b-.

-o 2nd person ordinary; with variant member -e after -l-, -t-, -b-.

-e 3rd person ordinary; with variant member -o after -l-, -t-.

-en 2nd and 3rd persons honorific.

Any meaning considerations which have formal correlation⁴⁷ will there-
fore be necessarily brought into play. If they do not appear at this stage
they will appear later, whenever the second formal feature which corre-
lates with the first turns up, at which time we will go back and correct
our previous work to satisfy our later considerations.⁴⁸
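The regrouped morphemes can be sketched as a lookup in which each person morpheme has one basic member and a few variant members conditioned by the preceding -l-, -t-, or -b-. The person labels, the table layout, and the function names are assumptions made for the illustration; the stems are those of the utterances cited above.

```python
# Sketch of the regrouping: basic member plus variants per preceding marker.
BASIC = {"1": "i", "2inf": "is", "2ord": "o", "3ord": "e", "hon": "en"}

VARIANTS = {
    "1":    {"l": "um", "t": "um", "b": "o"},
    "2inf": {"l": "i", "b": "i"},
    "2ord": {"l": "e", "t": "e", "b": "e"},
    "3ord": {"l": "o", "t": "o"},
    "hon":  {},
}

def ending(person, marker=None):
    """Member of the person morpheme after `marker` (None = after stem)."""
    return VARIANTS[person].get(marker, BASIC[person])

def verb(stem, person, marker=None):
    """Stem + optional -l-/-t-/-b- + selected person member."""
    return stem + (marker or "") + ending(person, marker)

print(verb("thak", "1", "t"))     # thaktum
print(verb("thak", "2inf", "t"))  # thaktis
print(verb("kor", "1", "b"))      # korbo
print(verb("kor", "2inf", "b"))   # korbi
print(verb("tan", "3ord", "l"))   # tanlo
print(verb("tan", "2ord"))        # tano
```

Under this arrangement each person has a single morpheme, and the restrictions between the endings and ami, tui, etc. no longer need separate statement.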

⁴⁷ I.e. any meaning difference which coincides with the correlation
between two formal features: in this case the ending -o and the restriction
to ami 'I'. We cannot go by coincidence of meaning with a single formal
feature: since we don't rely on meaning, the formal feature could not be
tabulated into a meaning pattern. The differentiation of -o segments
into several morphemes was distributionally possible only because the
occurrence of -b-, -t-, etc. before -o correlated with the occurrence of
ami, etc., respectively.

⁴⁸ This is analogous to the situation in phonemics, when we grouped
segments into phonemes without benefit of morphemic knowledge, and
provided for changes in the grouping on the basis of our later grouping
of morphemic segments.


14.0. Introductory 

In this section the more common irregular phonological alternations 
in a language are marked by means of morphophonemic symbols. 

14.1. Purpose: Identical Constitution for All Alternants of a 

We seek to define elements, as constituents of morphemic segments, 
in terms of which all the segment members of a given morpheme would 
be identical. 

This purpose was served by the phonemic elements until we began to 
group phonemically different segments into the same morpheme (13.32). 
Now that some morphemes have phonemically different members, it is 
of interest to know whether we can recapture the state of having all 
members of a morpheme identical. By definition, this could not be done 
in terms of the phonemic composition of the members, so that the prob- 
lem becomes one of setting up new elements, replacing the phonemes, 
which will satisfy this requirement. These new elements would represent 
the features common to the various members of the morpheme for which 
they are defined. 

In general, the setting up of such new morphophonemic elements will 
be easier, the greater the phonemic similarity among the members of a 
morpheme. And over the whole corpus, if more of the morphemes have,
in identical environments, identical alternations among their members, 
fewer morphophonemic elements will be set up; for then the morpho- 
phonemes set up for one morpheme will also serve for many other 
morphemes. It is therefore important to discover which alternations oc- 
cur in many morphemes. 

14.2. Preliminaries to the Procedure: Morphemes Having Iden- 
tical Alternations among Their Members 

From the fact that each morpheme is a class of one or more comple- 
mentary or freely varying morphemic segments, it follows that only two 
facts are essential for each morpheme: what its segment members are
(each being identified by its phonemic composition); and in what environ-
ments each member occurs. It was seen in 13.53 that morphemes also
differ in the phonemic similarity between each member and its environ-
ment. Some of those alternations, both in phonemic composition and in



environmental differentiation, among members of a morpheme occur in 
only one morpheme; others occur identically in many morphemes. 

14.21. Unique Alternations 

Some alternations of segments are unique to a particular morpheme. 
E.g. the morpheme {s} 'plural' has a member en after {ox}. No other
morpheme has, in a special position comparable to that of occurring
after {ox}, a special member having the same relation to the other mem-
bers of that morpheme which en has to the other members of {s}. E.g.
the unit {'s} 'possessive' has no special member after {ox} or after any
other single morpheme of that class. In cases of this type little generaliza-
tion or simplification is possible. We simply state that {s} has such and
such members in such and such environments.¹

14.22. Alternations Generalizable within Morphemically Defined 

In some cases a number of morphemes have analogous phonemic al- 
ternations among their members in corresponding environments, with 
such limitations that either the morphemes or the differentiating environ- 
ments have to be identified morphemically: i.e. either the morphemes in 
which the alternation occurs cannot be distinguished by a common pho-
nemic feature that is absent from all the morphemes in which the alter-
nation does not occur (14.222); or else the environments of one alternant
do not have a common phonemic feature that is absent from all the envi-
ronments of the other alternant (14.223; or 14.221 where the environment
of the alternant -ren is not differentiable phonemically from the environ- 
ments of the alternants /s/, /z/ which contain among them the family 
name Childe: We had dinner with the children. We had dinner with the 

14.221. Identity of part of the alternation. Some alternations oc-
cur analogously in several units. E.g. the occurrence of a member /s/ 

¹ Another example is the morpheme {be} (13.32). The differentiating
environments of its various members are individual morphemes, such as
I for the member am; you, we, they, {s} 'plural,' or any other plural indi-
cation (cf. B. L. Whorf, Grammatical Categories, Lang. 21.1 (1945))
plus {-ed} 'past' (here represented by its member w-) for the member
/ər/; etc. (13.512). The differences among the members, /æm/, /i/,
/ər/, etc., cannot be generalized short of actually listing each member
(13.523). No regular phonemic similarity can be shown between each
member and the environment in which it occurs, as between I and am
(13.531). And no other morphemes have an analogous alternation of
members in corresponding environments.


generally after morphemes ending in voiceless consonants, and a mem-
ber /z/ after voiced, is true of the morphemes {s} 'plural,' {'s} 'posses-
sive,' {'s} 'is', though these units differ in their other members. If we
generalize the alternation to one of voiceless-voiced members (not just
/s-z/), we find it also in {ed} 'past.' However, there are other alterna-
tions of members in which {s} 'plural' differs from {'s} 'possessive': e.g.
children, child's.

14.222. Identical alternation in phonemically undifferenti-
able morphemes. We often find sets of morphemes all of whose members
vary analogously in corresponding environments: e.g. {knife}, {wolf},
{wife}, all have only two members, the second differing only in having
/v/ before {s} 'plural.' In this case there are, however, other morphemes
having the same phonemic form in the relevant respect (final /f/), but
not having this alternation of members; {fife} has only one member in
all environments (fife, fifes).

It is impossible to differentiate the morphemes which have this al- 
ternation from those which do not by any phonemic feature common to 
the former and lacking in the latter. 

14.223. Alternations in phonemically differentiable mor- 
phemes in phonemically undifferentiable environments. In some 
cases all the morphemes which have some particular phonemic form in 
common have also an analogous alternation of members in the neighbor- 
hood of a particular morpheme. E.g. all morphemes which have members 
ending in /k/ when not before ity, have members ending in /s/ instead 
when before ity: opaque-opacity; {ic} in electric-electricity.²

In such cases, it is possible to say that all morphemes which occur be- 
fore ity will in that position have members differing in certain phonemes 
from the other members of the respective morpheme: in particular, if the 
other member (the one not before ity) ends in /k/, the member before ity 
ends in /s/; if the other member contains /eyC/ or /ayC/, the member
before ity contains /æC/ or /iC/ respectively (sane-sanity); and so on.
This statement has now become a statement about ity rather than about 
electric, sane, etc., since the alternation does not occur before other mor- 

2 Note that in this example the alternation of the two morphemes is
similar only in respect to the /k/-/s/ interchange: the member of
opaque before ity also has /æ/ instead of the /ey/ of the other member
of the unit. This vowel difference appears in all other units having /eyC/:
when they occur before ity they are represented by a member having /æ/
instead of /ey/, as in sane-sanity. I am indebted for this point to Stanley
S. Newman's analysis of English.


phemes which can be considered phonemically similar to ity (e.g. we have
no alternation before al, er which also begin with /ə/: electrical, saner).
We may consider /k/ → /s/, /ey/ → /æ/ to be parts of the phonemic
content of the {ity} morpheme, operating whenever their domain ap-
pears, i.e. whenever there appears a preceding /k/, /ey/, etc. In terms
of morphemes and their members, we may say not that {ic} has two
members (/is/ before {ity} and /ik/ otherwise), but that {ity} has
several members: /k/ → /s/ + /ətiy/ after all morphemes ending in
/k/; /ey/ → /æ/ + /ətiy/ after all morphemes ending in /eyC/; etc.
We then do not need to say that {opaque}, {ic}, etc. have a special mem-
ber before ity: they have their one member, and the changes are part of
the following morpheme.³

14.224. Summary. In all these cases some generalization of the al-
ternation over more than one morpheme is possible, but it must be
couched at least in part in terms of morphemes rather than phonemes.
We say that the morpheme {-s} 'plural' has various members after cer-
tain particular morphemes (/ən/ after ox, /s/ after die, /æw/ → /ay/
after mouse, louse, etc.); and that otherwise all the morphemes which
have a member /s/ (this includes the plural, possessive, and is) have the
member /əz/ after /s, z/, /z/ after voiced phonemes, /s/ after voiceless
consonants.⁴ Similarly, we say that a few one-vowel morphemes, which
end in /f/ when the {s} 'plural' does not follow, have members ending
in /v/ when it does; and we list the morphemes {knife} etc.
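
The phonemically stated part of this summary, together with the listed /f/-/v/ morphemes, can be sketched as a small selection function. This is only an illustration: the function names, the ASCII-style transcriptions, and the set of voiceless consonants are assumptions made here, and the environment /s, z/ is taken literally from the statement above.

```python
# Illustrative sketch; names and phoneme sets are assumptions, not the book's notation.
VOICELESS = {"p", "t", "k", "f"}  # assumed voiceless consonants other than /s/

def s_member(final_phoneme: str) -> str:
    """Member of a morpheme like {s} 'plural' after a stem ending in final_phoneme."""
    if final_phoneme in {"s", "z"}:
        return "əz"
    if final_phoneme in VOICELESS:
        return "s"
    return "z"  # after all other (voiced) phonemes

# Morphemes that must be listed individually because no phonemic feature
# separates them from morphemes like {fife}.
LISTED_F_TO_V = {"nayf": "nayv", "wayf": "wayv", "wulf": "wulv"}

def plural(stem: str) -> str:
    """Join a stem with the plural, using the listed member where one exists."""
    stem = LISTED_F_TO_V.get(stem, stem)
    return stem + s_member(stem[-1])
```

The lookup table stands for the part of the statement that has to be made in terms of morphemes; everything else is decided by the final phoneme alone.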

14.23. Alternations within Phonemically Differentiable Mor-
phemes and Environments

Finally, in some cases all the morphemes which had some particular 
phonemic feature have analogous phonemic alternation among members 

3 This is in effect a correction of the morphemic segmentation of chap-
ter 12. Considerations of this type apply whenever all morphemes whose
members outside of the neighborhood of morpheme {X} show phonemic
features A, have members with phonemic feature B instead of A when
they are in the neighborhood of {X}. The phonemic form of {X} now
includes the change of A to B whenever A appears in its neighborhood.
We have to make this change of A to B (e.g. /ey/ to /æ/) part of the
{X} (e.g. of ity) rather than part of the morphemes which contain A and
B (e.g. of sane), because from the point of view of morphemes like sane
the change of /ey/ to /æ/ in them cannot be defined in terms of their
phonemic environment: the /ey/ remains before er (saner), etc. However,
from the point of view of ity we can state the alternation in terms of the
phonemic environment of ity: it happens in any /eyC/.

4 The analogous alternation in {ed} could be included in this state-


in the neighborhood of all other morphemes which have some particular 
phonemic feature. In Kota, all morphemes which, when they do not oc- 
cur before a morpheme beginning with /k/, have members ending in 
/ky/, have otherwise identical members without the /ky/ when they
occur before a morpheme beginning with /k/: /aky/ 'husked grain,'
/kayl/ 'female stealer,' /aki.l/ 'female stealer of husked grain.'⁵ The same
alternation occurs for /t/ and /n/: /katac/ 'knife and stick,' but /katy/
'knife' and /tac/ 'stick'; /kuno.tlk/ 'to look for bees,' but /kuny/ and
/no.tlk/ separately.⁶

We can then describe the alternation of members of the morphemes 
in question in terms of phonemes rather than in terms of morphemes, 
and thus obtain a simpler statement. For the Kota example we would 
say: /C'y#C'/ is always replaced by /C'/ (/C'/ = /k, t, n/). I.e. the
sequence /C'y#C'/ never occurs, as our statement of phoneme distribu-
tion (in the manner of chapter 11) should show, if it has been sufficiently
detailed; and wherever there is such a succession of morphemes as would
result in that sequence⁷ we have /C'/ instead.⁸
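
A minimal sketch of this phonemic statement as a rewriting rule follows. It assumes that `#` marks the morpheme boundary and that the same consonant (k, t, or n) stands on both sides of it, as in the Kota examples above; the function name is invented for illustration.

```python
import re

# Assumption: the consonant before "y#" and the one after the boundary are
# identical, as in katy + tac and kuny + no.tlk.
KOTA_RULE = re.compile(r"([ktn])y#\1")

def join_kota(first: str, second: str) -> str:
    """Join two morphemes, replacing C'y#C' by C' wherever it would arise."""
    return KOTA_RULE.sub(r"\1", first + "#" + second)
```

With this rule `join_kota("katy", "tac")` gives `katac`. For `aky` + `kayl` it gives `akayl`, which still differs in its vowel from the attested /aki.l/; that difference is the subject of a separate general statement and is not modeled here.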

Since the whole point of such statements is to achieve generality, we 
give them in the broadest form possible. If the same alternation of mem-
bers occurs before /C'/ and before /#C'/ (i.e. irrespectively of whether

5 M. B. Emeneau, Kota Texts 1.17-8 (1944). The difference in vowel
between the two members of the {kayl} morpheme is described by an-
other general statement.

^ We often meet situations of this type when the determining mor- 
pheme, in whose environment our morphemes have their alternant mem- 
bers, is a contour longer than one morpheme (e.g. word, phrase, utter- 
ance). Since these contours depend on sections of the utterance or con- 
structions larger than one morpheme, they are necessarily independent of 
each morpheme and therefore constitute part of its environment. E.g. 
the position of zero stress in national-nationality, telegraph-telegraphy (see 
13.511), depends upon English word contour and is independent of the 
morpheme. Therefore, certain vowels in any morpheme will be /ə/ when
that vowel occurs in the zero stress position of the word contour. (This
applies only to particular dialects of American English.) In terms of mor-
phemes and their members: Each time a morpheme occurs, it will be
represented by a member whose vowels will be /ə/ in the zero posi-
tions of the particular word contour within whose length the morpheme
falls. The morpheme {tele} will have a member /telə/ when the concur-
rent word contour has zero stress over the second vowel of the unit, and
the member /təle/ when the contour has zero stress over the first.

^ If we consider the morphemes without members adjusted to this en- 
vironment . 

8 Statements of this type are often called regular phonology or auto-
matic morphophonemics.


the morpheme beginning with /C'/ has close or open juncture before it, 
whether it is in the same or the next word), we implicitly or explicitly 
note the irrelevance of the juncture in our statement and say /C'y#C'/⁹
is replaced by /C'/. This is the case here, where we have /katy/ 'knife,'
/tayr/ 'to cut,' but /kati.r/ 'knife to cut.'¹⁰

The simplicity of such statements varies according to the phonemic 
relation between the members in and out of the environment in question 
(13.52), and between each member and its environment (13.53), and ac- 
cording to the complexity of the differentiating phonemic characteristic 
for the morpheme (14.23) and the environment (13.511). In all these 
cases, the essential condition for a statement in terms of phonemes 
rather than of morphemes is that the set of morphemes in which the al-
ternation takes place, and the set of determining environments,¹¹ be each
identifiable phonemically, and that the phonemic difference between the
members in the environment in question and those outside it be the same
for all the morphemes.¹²

14.3. Procedure: Interchanging Phonemes among Alternants of 
One Morpheme 

We group together into one morphophoneme the phonemes which re-
place each other in corresponding parts of the various members of a
morpheme.

If we want each morpheme to have only one form or member (thus 
eliminating the distinction between morpheme and its segment member) 
we can, on the basis of the Appendix to 12.5, divide each member of a 
morpheme into a given number of successive (or simultaneous) parts, 
and assign one mark to the first part of all the members, another mark to 
the second, and so on.¹³ E.g. the two members knife and knive would be

^ Where « indicates close juncture, i.e. the lack of any phonemic junc- 
ture before the second morpheme. Regular phonology across word or 
phrase juncture is often called external sandhi. 

10 Emeneau, ibid. 18.

11 Including what kind of juncture if any. There is always, of course,
a morpheme boundary involved, since we are speaking of what member
of one morpheme occurs in the neighborhood of another; but not always
is there a phonemic juncture at the boundary.

12 Additional differences between the members may exist in some of
these morphemes (e.g. the /i./ for /ay/ in fn. 5), but these may be de-
scribed by additional automatic or non-automatic alternations which
affect these particular morphemes.

13 The difference between the division of morphological elements into
phonological ones in the Appendix to 12.5, and that proposed here, is the 


divided into four parts: the first part of both would be written /n/,¹⁴
the second /a/, the third /y/, the fourth /F/. Both members are now
identical: /nayF/. The same marks would be used of course in all mor-
phemes having comparable relative parts: the two members wife and
wive would both be written /wayF/, while the one-member morpheme
{fife} would be written /fayf/ since its last part does not represent /v/
before {-s}. The translation from writing to speech is still unique: when
we see /nayF/ before {-s} 'plural' we pronounce it /nayv/; otherwise
we pronounce it /nayf/. But the one-one correspondence from speech
to writing is lost: when we hear knife and fife we cannot tell that the
former is /nayF/ while the latter is /fayf/: merely by hearing sounds we
can assign them to phonemes (if we know the phonemic system of the lan-
guage) but we cannot tell how they will be replaced, if at all, in other
members of the morpheme which we heard.
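
The asymmetry between the two directions can be sketched as follows. The code is illustrative only: `F` stands for the morphophoneme of /nayF/, `"PL"` is a hypothetical label for {-s} 'plural', and both function names are invented here.

```python
from typing import Optional, Set

def pronounce(spelling: str, next_morpheme: Optional[str] = None) -> str:
    """Writing to speech: unique. F is /v/ before the plural, /f/ otherwise."""
    value = "v" if next_morpheme == "PL" else "f"
    return spelling.replace("F", value)

def candidate_spellings(heard: str) -> Set[str]:
    """Speech to writing: not unique. A heard final /f/ may be f or F."""
    candidates = {heard}
    if heard.endswith("f"):
        candidates.add(heard[:-1] + "F")
    return candidates
```

`pronounce` always returns one form, whereas `candidate_spellings("nayf")` returns two writings; choosing between them requires knowing which morpheme was heard.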

14.31. Status of the Morphophonemic Symbols 

We take a morpheme written as a combination of symbols which do 
not change no matter what the environment. In each environment, each 
symbol represents the phonemic composition which the part of the mor-
pheme occupied by the symbol has in that environment.¹⁵ Such a symbol
is called a morphophoneme.¹⁶ Thus in /nayF/, the F represents the

fact that in the former we considered the division of each morphemic 
segment into such parts as would also appear in other morphemic seg- 
ments (and hence obtained phonemes), while here we divide each mor- 
pheme into such parts as will represent the corresponding sections of each 
of its members, and will as far as possible also appear in other morphemes 
(and hence obtain the phonemes and morphophonemes of 14.3). The re-
striction to corresponding parts is made in order to avoid considerations
of relative order in the definition of each morphophoneme. 

14 For convenience of later analysis, the new parts set up here will be
written between the diagonals used for phonemes. When these parts will 
turn out to be not identical with phonemes they will be written with small 
capitals or other distinguishing letters. 

15 Each morphophonemic symbol thus represents a class of phonemes
and is defined by a list of member phonemes each of which occurs in a
particular environment (in particular morphemes, e.g. /nayF/, when
next to particular other morphemes, e.g. s 'plural'). This is analogous to
the phonemic symbol of chapter 7, which represented a class of segments,
and was defined by a list of member segments each of which occurred in
a particular environment.

16 This is equivalent although not identical with the definition or use
of morphophonemes by most linguists: Edward Sapir and Morris Swa-


phonemic composition of the last unit-length segment of that mor-
pheme, in whatever environment the morpheme is; hence F represents
/v/ when the morpheme {knife} is before {s} 'plural', and it represents
/f/ otherwise.

Our whole work of establishing a single representation for each com- 
plete morpheme means merely that we refer the differences between the 
member segments back to the phonological parts of the morpheme: in- 
stead of having a morpheme with several morphemic segment members 
in various environments, we have a morpheme with only one member; 
but the phonological elements of which that member is composed are 
variously defined (to represent various phonemes) in those very environ- 
ments. There might seem to be little advantage in this, but the gain be- 
comes apparent when we realize, from the generalizations of 14.2, that 
there may be many morphemes whose members differ from each other 
identically.¹⁷ The morphophoneme /F/ which serves in {knife} can also
serve in {wolf}, {wife}, etc. We can therefore use one symbol, one statement
of the /f/ - /v/ alternation, in the single spelling of each of these mor-
phemes, instead of repeating the statement in the member alternation
of each morpheme: instead of {knife, knive}, {wife, wive}, etc., we have a
single statement of the two representations of /F/, plus /wulF/, /nayF/,
/wayF/ etc.¹⁸

14.32. Several Morphophonemes in One Alternation

When generalized statements are made for all the morphemes in 
which a particular alternation occurs, we may have certain morphemes 
which are referred to in more than one statement, i.e. morphemes whose 

desh, Nootka Texts 236-9 (1939); N. S. Trubetzkoy, Sur la morphonolo-
gie, Travaux du Cercle Linguistique de Prague 1.85-8 (1929); Henryk
Ułaszyn, Laut, Phonema, Morphonema, ibid. 4.35-61 (1931).

17 Each new morphophonemic symbol is therefore a generalization of
member alternation in several morphemic units. Even apart from this 
generalization, morphophonemes are linguistically useful in that they 
indicate a special relation among phonemes. For example, as indicated 
in 2.61, the sounds (in morphemes) which are usually indistinguishable 
from each other for the speakers of the language are those which are mem- 
bers both of one phoneme and of one morphophoneme. For examples of 
native response to phonemically identical but morphophonemically dif- 
ferent sounds, see Selected Writings of Edward Sapir 54-5. 

18 The new one-spelling morphophonemic writing of the previously
plurimembered unit is sometimes called the base form or theoretical 
form, from which the phonemically written members are derived. 


alternant members can be described as the result of more than one gen-
eral phonemic alternation. Thus before {ous},¹⁹ {odium} has a member
without the /əm/ (odious), while {outrage} has a member with pre-suffixal
stress (outrageous). In {decorum} we have yet a third alternation, in the
member /də'kor/ which occurs in decorous (and in decor). But this alter-
nation can be described as merely the result of the operation of both of
the two previous ones: dropping of /əm/ and changing of stress. We can
state this by including {decorum} both in the list which contains {odium}
and in that which contains {outrage}.²⁰

The whole method of 14.3 is to arrange the facts of member alternation 
within morphemes in terms of the alternation rather than in terms of the 
morphemes. Whereas in 14.2 we were able to group together only those 
morphemes which had identical alternations of whole members, here we 
can notice identities in parts (segments) of the alternation. In terms of 
morphophonemes, we can notice if two alternations have some morpho- 
phoneme in common (i.e. are identical during part of their length), even 
if the remainder of the alternation differs. As a result, some alternations 
which differ from any other one, may be found to be sums of other known 

14.33. Types of Alternation Represented by Morphophonemes²¹


When it is possible to differentiate phonemically between the morphemes 
in which an alternation occurs and those in which it does not, and be- 
tween the environments of the one member of the alternation as against 

19 Cf. Leonard Bloomfield in B. Bloch and G. L. Trager, Outline of
Linguistic Analysis 65 (1941).

20 Or by marking decorum both with the morphophonemic mark used
with odium and with the one used with outrage.

21 We compare statements about alternations with statements about
morphophonemes. The alternations of 14.23 can be stated as follows: All
the morphemes which have phonemic feature y¹ (when they are in vari-
ous environments), have members with phoneme y², y³ (instead of y¹) in
the environment of phonemes z, w, respectively. (We call these B mor-
phemes.) But not every morpheme which has y² when it is in the en-
vironment z will be found to have y¹ in the other environments; some
of them (C morphemes) have y² (or some other phoneme) even in non-z
environments (where the previous morphemes had their y¹). In terms of
morphophonemes, we merely write the former (B) morphemes with the
morphophoneme y¹ in all environments, even in the neighborhood of z or
w, and define this morphophonemic y¹ in such a way that the morpho-
phonemic sequence y¹z represents /y²z/ and the morphophonemic se-


the environments of the other (14.23), it is not necessary to introduce a
new morphophonemic symbol, since both the morphemes and the en-
vironments can be identified by means of their phonemes. In such cases
we merely define a morphophoneme, symbolized by the characteristic
phonemes of the morpheme, in such a way that it represents in each en-
vironment of the morpheme the phonemes which occur in that environ-
ment.

In cases where a given alternation occurs in all morphemes which have 
a particular phonemic feature and in all occurrences of a phonemically 
stateable environment (14.23), we can consider all the members to con- 
sist of morphophonemes based on the phonemically most complicated 

quence y¹w represents phonemic /y³w/. The other (C) morphemes we
may write with y², both in the environment of z and elsewhere.

The alternations of 14.221-2 can be stated as follows: The morphemes
D have members with phoneme y² instead of y¹ in environments z, and
with phoneme y³ instead of y¹ in environment w. But other morphemes
E have y¹ even in environment z; and morphemes F which have y² in
environment z, also have y² (and not y¹) elsewhere. In terms of morpho-
phonemes, we write D with morphophoneme y, and define y as repre-
senting phoneme y² in the environment z, y³ in the environment w, and
the y¹ otherwise; y occurs in morphemes D.

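[Table not recoverable from the source. Surviving cells: column heading "Other Environments"; entries "C, F" and "B, D, C, F".]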



The alternations of 14.223 can be stated: All the morphemes which
have phonemic feature y¹ (when they are in various environments), have
phonemic feature y², y³ (instead of y¹) in the environment of morphemes
z, w respectively. But there is no phonemic feature common to all the
morphemes z, or to all w, which is absent in all the morphemes in the en-
vironment of which y¹ remains (and is not replaced by y² or y³). In terms
of morphophonemes we add to morphemes z, w a morphophoneme (which
may be a morphophonemic juncture) ', and define 'z to represent z +
change of preceding y¹ to y² (or define y¹'z to represent y²z), and 'w to
represent w + change of preceding y¹ to y³; when not after y¹, ' is zero,
so that a'z represents az. (However, we can define ' to mark some other
morphophoneme when it is not after y¹.)


or arbitrary member.²² Instead of having a Kota morpheme with two
members, phonemically /katy/ and /ka/ 'knife,' we have a morpheme
with one member, morphophonemically /katy/; the morphophonemes
are identical with the phonemes of the first member, but the last two
morphophonemes represent phonemic zero when they occur before /t/;
otherwise they represent phonemic /ty/. Although we do not have to go
here beyond our usual phonemic symbols, we have nevertheless entered
into a one-many correspondence in giving these values to our symbols.
For given the morphemes which are morphophonemically written
/katy-tayr/ we know they are to be pronounced phonemically /kati.r/.
But hearing the phonemes /kati.r/ we have no way of telling whether
the morphophonemes are /katyt . . ./ or /kat . . ./, i.e. whether the
first morpheme would be /katy/ or /ka/ when /t/ does not follow, be-
cause the morphophonemes /ka + t/ and /katy + t/ both would give
the phonemes /kat/.²³

It is also not necessary to introduce a new morphophonemic symbol 
when it is possible to differentiate phonemically between the morphemes 
in which an alternation occurs and those in which it does not, or when the 
environments in which the morphemes have their special member consist 
of a small number of morphemes (14.223). In such cases it may be suffi- 
cient to define a morphophoneme, symbolized by the interchange of 
phonemes which constitutes the alternation in question, and to say that 
that morphophoneme is part of the morphophonemic and phonemic 
composition of the stated environmental morphemes.²⁴ Thus, the mor-
pheme ity of 14.223 has the phonemic form /k → s; ey → æ; ətiy/, the
/k → s/ being understood to apply only to any preceding /k/, and
/ey → æ/ to any preceding /eyC/. For the sake of abbreviation, a new
symbol, say /'/, may be defined to represent these changes before ity,
so that the composition of that suffix becomes /'ətiy/. Whether or not a
new symbol is used for abbreviation, the alternation remains morpho-
phonemic. For even though we include the change of /k/ to /s/ and of
/ey/ to /æ/ as part of the phonemic composition of the morpheme ity,
we will not know, when we hear a morpheme with /s/ or /æ/ before ity
whether that morpheme has /k/ and /ey/ elsewhere, or whether it al-
22 I.e. the member about which fewest (or no) generalizations can be

23 Cf. 14.53 below.

24 Not of the morphemes in which the alternation takes place, but of
the morphemes which constitute the environment in which these mor-
phemes have their special member. Cf. fn. 3 above.


ways has /s/ and /æ/ both elsewhere and before ity. When we see the
morphophonemic writing /seyn/ + /ey → æ; ətiy/, we know that the
phonemic representation is /sænətiy/. But when we hear /sænətiy/ and
/læksətiy/ laxity, we would not know that the morpheme in the first
case is /seyn/ and in the other is /læks/.
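
Treating the changes as part of the suffix can be sketched as a single function that rewrites the end of the preceding morpheme. The function name, the vowel inventory used to recognize a final consonant, and the transcriptions are assumptions made for this illustration.

```python
import re

def add_ity(stem: str) -> str:
    """Attach ity, letting the suffix carry the changes k -> s and ey -> æ."""
    stem = re.sub(r"k$", "s", stem)                      # final /k/ -> /s/
    stem = re.sub(r"ey(?=[^aeiouæə]$)", "æ", stem)       # /eyC/ -> /æC/ (assumed vowel set)
    return stem + "ətiy"
```

Here `add_ity("seyn")` and `add_ity("læks")` both yield forms with /æ/ before the suffix, so the output alone does not show which stem had /ey/ elsewhere. That is the sense in which the alternation remains morphophonemic.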

14.332. New symbols required. When it is impossible to differentiate
phonemically between the morphemes in which an alternation occurs 
and those in which it does not, or between the environments of the one 
member of the alternation as against the environments of the other 
(14.221-2), and when the environments in question are not some small 
class of morphemes which could be treated as in 14.223, then it is most 
convenient to define new morphophonemic symbols to indicate the occur- 
rence of the alternation. 

A common type of regularity is that of Russian /rap/ 'slave', /raba/
'of slave' compared with /pop/ 'priest', /papa/ 'of priest',²⁵ or Ger-
man /bunt/ 'group', /bunde/ '(in) the group' compared with /bunt/
'colored', /bunte/ 'colored ones.' In both cases, all morphemes whose
members end in voiced stops when a vowel²⁶ follows, have members
ending in voiceless stops when silence or phonemic word juncture or cer-
tain morphemes (constituting what we may later call separate words)
follow. If the environment were always recognizable by a constant pho-
nemic feature (e.g. silence, or a phonemic juncture),²⁷ we would be able
to give these morphemes a single phonemic form by saying that mor-
phophonemic /b#/ represents phonemic /p/, and /d#/ represents /t/.
We would write /bund/ 'group,' /bunt/ 'colored' and pronounce
both the same way (when /bund/ occurred before /#/);²⁸ the mor-

25 The Russian analysis here is from G. L. Trager, The phonemes of
Russian, Lang. 10.334-344 (1934).

26 I.e. a morpheme beginning with a vowel.

27 Or by an intermittently present feature such as pause, which we
could observe by obtaining repetitions of the utterances.

28 For a method of writing these two forms identically by the use of
components, see Lang. 20.195-6 (1944). Whether we use letters or com-
ponents, one feature of the exactness of phonemic representation is lost,
while a morphemic distinction is gained. For when we hear [bun] followed
by a voiceless dental stop we would not know whether to write it /bund/
or /bunt/; but the two writings would be equivalent, so that it would
make no difference which we wrote. The only difference would be in
identifying the morpheme; when we hear /bunde/ we know which mor-
pheme is involved: when we hear /bund#/ = /bunt#/ we do not. This is an
inescapable difficulty in the phonemic representation of morphemes, and
is a result of the imperfect correlation between phonemes and mor-
phemes. Not in all cases (nor in all languages) is each member of each


pheme {Bund} would then have only one member both in /bund#/
(= /bunt#/) and in /bunde/. However, if the voiced-stop morphemes
have their voiceless-stop member even in some environments which
can be differentiated not by a phonemic characteristic but only by know-
ing the particular morphemes, it would be impossible to tell, when we
see the morphophonemic /bund/ in a particular environment, whether
to pronounce it phonemically /bund/ or /bunt/. In such a case it be-
comes necessary to add some morphophonemic mark in these environ-
ments so as to indicate that the /d/ has here the value /t/. If we choose
/-/ as this mark, this would mean that not only /d#/ but also /d-/
have the value /t/.
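
The values given to /d#/ and /d-/ can be sketched as a conversion from morphophonemic to phonemic writing. This is an illustrative sketch only: the function name is invented, the pair g/k is added by assumption to the stops named in the text, and the mark `-` is taken to have no phonemic value of its own.

```python
# Assumed voiced/voiceless stop pairs; the text names b/p and d/t.
DEVOICE = {"b": "p", "d": "t", "g": "k"}

def phonemic(morphophonemic: str) -> str:
    """Give a voiced stop its voiceless value before '#' or the mark '-'."""
    out = []
    for i, ch in enumerate(morphophonemic):
        nxt = morphophonemic[i + 1] if i + 1 < len(morphophonemic) else ""
        if ch in DEVOICE and nxt in {"#", "-"}:
            out.append(DEVOICE[ch])
        else:
            out.append(ch)
    # The mark '-' only signals the alternation and is dropped from the output.
    return "".join(out).replace("-", "")
```

Both `phonemic("bund#")` and `phonemic("bunt#")` come out as `bunt#`, while `phonemic("bunde")` is unchanged, which mirrors the loss of the one-one correspondence from speech to writing.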

A slightly different type of regularity is observed in Menomini,²⁹
where every morpheme ending in a non-syllabic has a member with
added /e/ when it occurs before a consonant: morphophonemic /poN/
'cease' plus /m/ 'by speech' plus the suffix /ew/ 'to him' is phonemic
/ponemew/ 'he stops talking to him'. Following the procedure of 14.331, 
we may write morphophonemically /poNmew/ and allow it to serve 
phonemically, i.e. to indicate exactly how the sequence is pronounced. 
This is possible as long as we deal with clusters such as /Nm/ which do 
not occur except across morpheme boundary. However when we find a 
cross-boundary cluster which also occurs within a morpheme (where no 
/e/ would be pronounced between the two consonants), we would have 
no way of knowing whether to pronounce the /e/ or not, unless we are 
told whether a morpheme boundary has been crossed. In such cases we 
would have to insert a morphophonemic mark, say a juncture /-/, which 
would here represent the /e/. Furthermore, when we hear /CeC/, 
only knowledge of the morphemes involved will tell us if the /e/ here 
merely represents a morpheme boundary or is part of the phonemic se-
quence within a morpheme: /ponemew/ is /poN/ + /m/ + /ew/, but
/ponenemew/ 'he stops thinking of him' is /poN/ + /eNem/ + /ew/.

14.4. Result: Morphophonemes as Classes of Substitutable Phonemes

We now have a set of morphophonological elements, no longer in one- 
one correspondence with speech, which can replace our phonological ele- 
ments for the purpose of identifying morphemes. 

morpheme phonemically different from every other member of every
other morpheme: /bund/ differs recognizably from /bunt/ only in cer-
tain utterance positions.

29 Leonard Bloomfield, Menomini morphophonemics, Travaux du
Cercle Linguistique de Prague 8.105 (1939).


Although the morphemic segments had originally been set up in chap-
ter 12 on the basis of limitations of occurrence of the phonemes in respect
to utterance-long environments, we often find after the complementary
grouping of chapter 13 that the phonemes no longer identify the resultant
morphemes in a simple manner. In order to have a single composition,
i.e. a single spelling, for all occurrences of a single morpheme, it is often
necessary to resort to elements which represent the various phonemic
compositions of the morpheme in its various environments.³⁰ In some
cases these elements may be merely redefinitions of phonemic symbols
(redefined as a one-many correspondence); in other cases new symbols
have to be defined.³¹

Each morphophonemic symbol is defined only for the particular posi- 
tions of the particular morphemes (or environments) in which it has been 
set up. A symbol which has not been defined for certain positions can 
therefore be used for any other morphophonemic relation. That is, com- 
plementary morphophonemes can be marked by the same symbol. 

30 Each morphophoneme is itself a class of complementary phonemic
segments. The segments represented by the morphophoneme are those 
which occur in a particular position in all the members of a particular 
morpheme. They are complementary, since for each environment of the 
unit, its morphophonemes indicate the segments in the corresponding 
parts of that member of the morpheme which occurs in that environ- 
ment. A morphophoneme is thus a class of phoneme-length segments, the 
same segments that we had grouped into phonemes, except that into one 
morphophoneme we group segments which are complementary within 
one morpheme (holding the morpheme constant), while into phonemes 
we grouped segments which were complementary without regard to 
morpheme constancy. 

31 There are two types of situation in which morphophonemes could
be used even though the phonemes they represent, and the morphemic
segments in which they occur, are not mutually complementary in en-
vironment.

One is the free variation among phonemically distinct morphemic seg-
ments, as in the case of economics (13.2). We may define a morpho-
phoneme /E/ freely representing /e/ sometimes and /iy/ at other times,
and then write /,Ekə'namiks/ but /,elə'mentəl/. This is more useful if
there are many morphemes in which the identical free variation occurs.

The other is the intermittently present pause or other feature (Appen-
dix to 4.3). If we do not wish to recognize these elements which can be
observed in repetitions of an utterance rather than in a single pronuncia- 
tion of it, we can define a morphophoneme which sometimes represents 
the feature and sometimes represents its absence. 

Each morpheme would then have the same morphophonemic constitu- 
tion in all its occurrences, even though its phonemic constitution varies 


14.5. Reconsideration of the Grouping of Phonological Segments 

The grouping of morphemic segments into morphemes enables us to 
reconsider the efficiency of our previously defined phonemes in their cor-
relation with the morphemes.³²

14.51. Morphophonemic Criterion for Grouping Segments

Wherever the alternation of corresponding segments (e.g. the first seg- 
ment) in the various members of a morpheme satisfies the procedure of 
7.3, we include these segments in one phoneme (not only in one morpho-
phoneme). E.g. if [pʰə'zes] possess and [p'ə'zes] in dispossess are members
of one unit {possess}, and if [pʰ] and [p'] are in general complementary
in their phonemic environment, we group [pʰ] and [p'] in one phoneme
/p/. This means in effect that, for the purposes of the morphology, the 
criteria of 7.4 become secondary to (or are superseded by) the criterion 
of membership in one morphophoneme, i.e. by the criterion that the com- 
plementary segments to be grouped into one phoneme be ones which 
replace each other in various members of one morpheme. It will often 
happen that this new criterion has the same effect as those of 7.4, the 
more so since linguists usually make guessed approximations to 14.51 
(see Appendix to 7.4). The morphophonemic criterion applied here thus
leads to a reconsideration of the phonological segment groupings de-
termined in 7-9.³³

When so reconsidered, our phonemes become the expression of two 
independent relations : primarily the phonemic relation of complementary 

^^ In these procedures we have gone from segments to phonemes to 
morphemic segments to morphemes to morphophonemes. We could also, 
though with much more difficulty, have gone from segments to phonemes 
to morphophonemes to morphemes. In either case knowledge of mor- 
pheme identity would have to be added after the phonemes are set up, in 
order to enable us to proceed. 

'^ As a special case of 14.51 we find segment sequences which may be
assigned here to a phonemic sequence different from the one to which
they had been assigned in 7.9. E.g. we may have previously phonemicized
[Cl̩#] as /Cal#/: simple /simpal/. The considerations of the Appendix
to 14.331, however, show that we could have avoided an extra morphophoneme
/a/ if we had phonemicized it as /simpl/. This is a possible
phonemic analysis, since /l/ does not otherwise occur in /C—#/; i.e.
there is no other segment in that position which had already been assigned
to /l/. The new phonemicization may complicate or simplify our
statement of clusterings, i.e. of the relative distribution of consonants,
but in any case it simplifies our morphophonemic statements. If we
choose to give this criterion precedence over those of 7.4 we will now assign
[Cl̩#] to /Cl#/.


distribution (plus free variation); secondarily the morphophonemic relation
of substitutability in various members of a morpheme. A system of
writing based upon these two relations will usually be most convenient 
for morphological description''' and for use by native speakers of the 

14.52. Phonemicization of Cross-Boundary Alternations

One other type of morphophonemic alternation can be included within 
phonemics by a slight extension of phonemic usage: these are some of the
alternations of 14.23. If a certain alternation occurs in all morphemes 

'* A system of elements based only upon the morphemic substituta- 
bility relation would be that of morphophonemes. If we considered only 
one morpheme at a time, or only morphemes with identical alternations, 
the [f] and [v] segments of knife, knive would be included as alternants 
(members) in one element, the [pʰ] and [p'] above as alternants in one
other, and so on. Within these morphemes these segments are both com- 
plementary and substitutable mutually. To this arrangement, we add 
the phonemic requirement that the segments be complementary (or 
freely variable) in respect to the occurrences of all other phonemic ele- 

On the one hand, this means that no two non-freely-variable alter- 
nants of one phoneme should occur in the same phonemic environment : 
if element v has two alternants, [f] and [v], with [v] occurring before 
environment [z], and [f] occurring in all other environments, we do not 
want [f] to occur before any environment [z] and we do not want [v] to 
occur in any non-[z] environment. That is, we want the environments of 
alternant [f] and the environments of alternant [v] to be completely dis- 
tinguishable from each other when these environments are stated in 
terms of our phonemic elements (and when our two alternants are stated 
in terms of a single phonemic element /v/, so that only the difference between
their environments remains to distinguish them).

On the other hand, the phonemic requirement means that if two seg- 
ments are alternants of each other in a particular morpheme, they should 
also alternate similarly (under the same conditions) in every other 
morpheme which has the same phonemic constitution over such stretches 
as are considered in the phonemics. If this phonemic requirement is extended
to longer stretches, it would cease to differ from the morphophonemic
requirement above, since almost every morpheme differs
phonemically, in its constitution or its environment, from every other
morpheme. If we go far enough out, we can say that [f]-[v] alternate
after /nay/ and /liy/ (knives, leaves), but not after /fay/, /čiy/ (fifes,
chiefs). However, phonemic considerations are usually restricted to shorter
and more manageable stretches: after /ay/ or /iy/ there are cases
where [f] alternates with [v] and cases where it does not. Therefore, when
we add this short-range phonemic requirement we can retain the [pʰ]-[p']
grouping, but not the [f]-[v] one.
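
The contrast between the long-range and the short-range requirement can be sketched in a few lines of Python. This is only an illustration: the function name is_predictable, the plain-string transcriptions, and the choice of "the last two symbols" as the short stretch are all assumptions made here, not part of the procedure itself. The check asks whether each stated environment is paired with exactly one alternant.

```python
from collections import defaultdict

def is_predictable(observations):
    """True if every environment is paired with exactly one alternant."""
    alternants = defaultdict(set)
    for environment, segment in observations:
        alternants[environment].add(segment)
    return all(len(segments) == 1 for segments in alternants.values())

# (stretch preceding the segment, segment found before the plural);
# toy data built from knives, leaves, fifes, chiefs.
long_range = [("nay", "v"), ("liy", "v"), ("fay", "f"), ("čiy", "f")]

# Assumed short-range view: keep only the last two symbols of each stretch.
short_range = [(env[-2:], seg) for env, seg in long_range]

print(is_predictable(long_range))   # each long stretch has one alternant
print(is_predictable(short_range))  # "ay" and "iy" each show both [f] and [v]
```

With the longer stretches every environment selects one alternant; once the stretches are shortened, the same environment is found with both, which is why the [f]-[v] grouping cannot be kept under the short-range requirement.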

•■'■' For the latter, see Edward Sapir, La réalité psychologique des phonèmes,
Journal de Psychologie 30.247-65 (1933), and 2.61 above.


having a given (original) phonemic feature, whenever these appear in a 
certain new phonemic environment, there will be left no morphemes 
which preserve the original feature in the new environment. As was seen 
in 14.331, we can continue to write the original symbol even in the new 
environment, and merely state that in the new environment it represents 
a different segment. 

Thus we can continue to write /katy/ even before /tayr/, and say 
that the value of /ty/ before /t/ is zero (i.e. that the value of /tyt/
is /t/). We must check, however, to see whether the sequence /tyt/, 
with /ty/ not representing zero, ever occurs within a morpheme, ^^ 
granted that it never occurs across morpheme boundary. If it does occur, 
then we cannot include this morphophonemic alternation (between /ty/ 
and zero) in the segmental representation of /ty/, because the represen- 
tation zero will contrast with the representation /ty/ before /t/." If the 
sequence /tyt/ never occurs except for the cross-boundary case under 
discussion (where also the segments [tyt] do not occur, but only [t]), 
we can assign to it any definition we please, in this case the segments 
zero plus [t]. 

This will be a frequent situation, since when all morphemes ending 
in X have Y, instead of X, when Z follows (so that we get YZ instead
of XZ), we will often find that XZ never occurs otherwise in the lan- 
guage (i.e. never occurs within a morpheme), so that the substitution of 
the symbols YZ amounted to an avoidance of the otherwise non-occur- 
ring sequence of symbols XZ.^^ 

14.53. Equivalent Phonemic Spellings 

The morphophonemic alternations of 14.23 and 14.52 can be consid- 
ered phonemic if we are willing to permit equivalent phonemic writings. 
In the example of the Appendix to 14.331, /'simplliy/ can only be read
['simpliy]; the latter, however, could also be written /'simpliy/. There
would be no real loss in the one-one property of phonemic representation
if we admitted both forms to phonemic status, for /l/ and /ll/ would be

'" I.e. if /tyt/ ever occurs with a value other than /t/. 

" We can do it only if there is a phonemic juncture involved, for then 
we can say that /ty/ before /-t/ has value zero, while /ty/ before /t/
(i.e. within a morpheme) has some other value. Cf. the last case in 

^* Cf. the 'protective mechanisms', e.g. in Stanley S. Newman, Yokuts
Language of California chapter 1:13, 2:15 (1944). Much the same function
as that filled by the protective mechanisms can be filled by selection
of base forms or basic alternants, as in Leonard Bloomfield, Language 


equivalent in this environment and it would make no phonemic difference
which we wrote.

Similarly, in the example of 14.331 and 14.52, if we define the phonemic
sequence /tyt/ as representing the segment [t] (since the segment se- 
quence [tyt] occurs neither across morpheme boundary nor otherwise), 
we must remember that /t/ alone (aside from the /tyt/ sequence) also 
represents the segment [t]. 

This means that given phonemic /kat . . ./ we can represent it phonemically
as either /kat . . ./ or /katyt . . ./, because /tyt/ = /t/.
Since the two writings are equivalent, and each would be pronounced
only as /'kat . . ./, we can say that we have not really lost the phonemic
one-one correspondence by admitting the two representations. As an indication
of morphemes, however, we have moved from a one-one to a one-many
correspondence, for when we hear /kati-r/ we cannot tell whether
the first morpheme is /katy/ or /ka/.
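
The equivalence of the two writings can be made concrete with a short Python sketch. It is an illustration under assumptions: phonemes are modelled as a list of string tokens, the function name reading is invented here, and the continuation after /t/ is simply taken from /tayr/. The function returns the segments that a spelling represents when /ty/ before /t/ has the value zero.

```python
def reading(spelling: list[str]) -> list[str]:
    """Segments represented by a spelling in which /ty/ before /t/ is zero."""
    segments = []
    for i, phoneme in enumerate(spelling):
        following = spelling[i + 1] if i + 1 < len(spelling) else None
        if phoneme == "ty" and following == "t":
            continue  # /tyt/ has the same value as /t/
        segments.append(phoneme)
    return segments

with_ty = ["k", "a", "ty", "t", "a", "y", "r"]   # /katy/ + /tayr/
without_ty = ["k", "a", "t", "a", "y", "r"]

# Both spellings are read identically, so nothing is lost phonemically.
print(reading(with_ty) == reading(without_ty))
```

Going from either spelling to the segments is unambiguous. Going the other way, one sequence of segments has two admissible spellings, which is the one-many correspondence with respect to morphemes noted above.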

This equivalence of two phonemic writings is not a new extension of 
the phonemic definition. The expression of limitations of phonemic dis- 
tribution by the less restricted components was in part accomplished by 
admitting such equivalent writings. ■''^ 

14.6. Reconsideration of the Grouping of Morphological Segments

The decision as to what morphemic segments would be grouped to- 
gether into one morpheme was made in carrying out the operations of 
chapter 13. However, the comparison of the alternations in the resulting 
morphemes (13.5), and especially the generalizations of these alternations 
(14.2), may show that the relations among the members (i.e. the alter- 
nations) within certain morphemes are very different from the general 
types, whereas a different grouping of the same members into different 
morphemes (yielding morphemes with different alternations) would fit in 
with our generalizations, or at least remove some of the exceptions which 
marked the original morphemes.'"' In such cases we may go back and re- 
group the morphemic segments into different morphemes, and then replace 
the original morphemes by the new ones in the following procedures." 

^^ See fn. 28, above. More regular statements can sometimes be ob- 
tained by employing the technique of descriptive order used in Leonard 
Bloomfield, Language 213. 

*" Cf. for example the Appendix to 13.43. 

" The reconsiderations of 14.5-6 are designed to yield the maximum 
regularity between morphological elements (selected for their having 


Appendix to 14.32: Morphophonemic Equivalent for Descriptive
Order of Alternation 

When the difference between two members of a unit is described as 
the sum of two alternations, i.e. as the operation of two independent
morphophonemes in that morpheme, it is necessary to check whether the 
two alternations can be summed in any order, or whether one must be 
applied first. Thus in Menomini''^ morpheme-final /n/ is replaced by /s/ 
before /e, y/: /esyat/ 'if he goes thither,' /enohnet/ 'if he walks thither.' 
Furthermore, when the vowel of some morpheme occurs at word-final, it 
drops: /asetehsemcw/ 'he lays them so that they overlap,' /aset/ 'in 
return.' When we now meet /os/ 'canoe,' /onan/ 'canoes', we recognize 
that this alternation can be stated as the sum of the previous two. However,
this can be done only if we set up a morphophonemic /on-e/ and
then apply our two alternations in the order in which they are stated 
above; if we first drop the /e/, we will have lost the condition for then 
replacing the /n/ by /s/. 

The effect of this descriptive order of the statements about alternation 
can be obtained alternatively by an exact statement of the representa- 
tion of the morphophonemes. We may say that morphophonemic /V/ 
(including /e/) has phonemic member zero before #- and that morpho- 
phonemic /n/ has phonemic member /s/ before morphophonemic (rather 
than phonemic) /e, y/ (even if # follows). Then morphophonemic 
/on-e-#/ would be phonemic /os/.
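
The dependence on descriptive order can be checked with a small Python sketch. It is offered only as an illustration under simplifying assumptions: forms are plain strings, "-" marks a morpheme boundary, the vowel set is a hypothetical inventory, the final-vowel drop is applied to every form (the text restricts it to some morphemes), and the function names are invented here.

```python
VOWELS = set("aeiou")  # assumed inventory, for illustration only

def n_to_s(form: str) -> str:
    """Replace morpheme-final n (the n written before '-') by s when e or y follows."""
    out = []
    for i, ch in enumerate(form):
        at_boundary = form[i + 1:i + 2] == "-"
        if ch == "n" and at_boundary and form[i + 2:i + 3] in ("e", "y"):
            out.append("s")
        else:
            out.append(ch)
    return "".join(out)

def drop_final_vowel(form: str) -> str:
    """Drop a word-final vowel, together with any boundary mark left dangling."""
    if form and form[-1] in VOWELS:
        form = form[:-1]
    return form.rstrip("-")

def phonemic(form: str) -> str:
    """Remove boundary marks to give the phonemic writing."""
    return form.replace("-", "")

base = "on-e"
in_order = phonemic(drop_final_vowel(n_to_s(base)))        # n to s first, then drop
reversed_order = phonemic(n_to_s(drop_final_vowel(base)))  # drop first, then n to s
print(in_order, reversed_order)
```

Applied in the stated order the sketch yields the attested form; applied in the reverse order the /e/ is gone before it can condition the replacement of /n/, which is the loss of condition described above.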

Similarly, in Kota,*' the value of morphophonemic /ay/ under non- 

simple patterned distributional relations among themselves) and the 
phonologic elements of which they are composed. The ideal is that every
morpheme have only one phonological constitution (spelling), different
from that of every other morpheme. This ideal was in part made unattainable
by the operation of 13.31, which assigns a phonemic sequence
in some environments to one morpheme and in other distributions to 
another morpheme. It was made farther removed by the operations of 
13.32 (and 13.2), which included phonemically distinct morphemic seg- 
ments in one morpheme. The operation of 14.3 recaptures some of the lost 
ground (on a different level) by enabling us to say that morphemes are 
morphophonemically, if not phonemically, identical in all their occur- 
rences. In 14.5-6, we then check back to see if a redefinition of some of 
our phonemes or morphemes would enable us to make this morphopho- 
nemic identity into a phonemic one. 

*'^ Cf. Leonard Bloomfield, Menomini morphophonemics, Travaux du
Cercle Linguistique de Prague 8.105-15 (1939). Bloomfield calls the
necessary order of the statements below "descriptive order." See also
his Language 213.

«■' M. B. Emeneau, Kota Texts I 18.


primary stress is /iy/, and the value of /y/ before vowel is zero. Both of
these values are involved when we deduce that morphophonemic
/mekay/ + /a/ 'he does not get up' is phonemic /mekia/. If we state
this in terms of alternations, they must be applied in the descriptive 
order given above. 

Appendix to 14.33: Alternations Not Represented by Morphophonemes

In some cases there is in general no advantage to identifying mor- 
phemes as composed of morphophonemes instead of phonemes. 

One such case is that of morphemes all of whose members are phonemically
identical (13.31; we may say that such a morpheme has only
one phonemically distinct member). If we apply the operation of 14.3 to 
such a morpheme, we would find that the elements representing its or- 
dered phonological segments in every environment are identical with its 
phonemes : book would be composed of the elements /buk/. We may, if 
we wish, say that this phonemic composition of the morpheme Ls also its 
morphophonemic composition. 

Another such case is that of morphemes which have more than one 
phonemically distinct member, but the alternation among which mem- 
bers is identical with no other alternation (14.21) : e.g. the morpheme con- 
taining be, am, are, etc. Here it does not pay to set up morphophonemes 
in terms of which all the members would be identical, because the alterna- 
tion of phonemes per environment within each morphophoneme would be 
as unique as the alternation of members per environment.''^ No economy
would be gained in replacing the alternation of members in the morpheme
by an alternation of segments in the morphophoneme, since we do not
have here the case of 14.31, where one alternation of phonemes within a
morphophoneme replaces many alternations of members within morphemes.

Therefore, linguists would generally write the various members of 
{be} phonemically: /'æm/ next to I, /i/ before {s}, etc., and leave it for
a special statement or dictionary listing to indicate that these are all 
members of one morpheme. 

For a purely morphological analysis, where reading convenience is not 

'•'' If we wanted one morphophonemic writing for all the members of
the morpheme {be}, we would probably set up two morphophonemes: the
first would represent /i/ when the morpheme occurs before 3rd person
{s} in is, /æ/ when it occurs next to I, etc.; the second would represent
zero when the morpheme occurs before 3rd person {s}, /m/ when it occurs
next to I, etc.


involved, we might choose to write the morpheme with one spelling in 
all environments: {I be} for I am, {gud + er} for better, etc., and let a
special statement or dictionary listing indicate that these morphemes
have the members /æm/, /bet/ respectively in these positions.

The setting up of new morphophonemic symbols is thus in general re- 
served for those alternations which occur in more than one morpheme but 
not in all the morphemes having some phonemic characteristic in a pho- 
nemically identifiable environment (14.23). 

When an alternation appears in very few morphemes, it depends upon 
convenience and upon our purposes whether we indicate it by a morpho- 
phoneme or by a list of members alternating in a morpheme. E.g. we 
could say that {have} is morphophonemically /hæv/, where /v/ is zero
before {ed}, {s} '3rd person', and /v/ otherwise: I have, I had. Alternatively,
we can say that there is a morpheme {have} with members /hæ/
before {ed}, {s}, and /hæv/ otherwise.

Appendix to 14.331: Maximum Generality for Morphophonemes 

If alternations are stated in the most general way possible for the 
case at hand, advantages in economy may be achieved by selecting the 
morphophonemes so as to represent the most general form of the al- 

Thus we consider simple-simply, able-ably, etc., as compared with
moral-morally, cold-coldly and extract the morpheme ly. We may then
say that simple is morphophonemically /simpL/, able /eybL/, where /L/
represents phonemic zero before ly, and represents /l/ before vowels
(simpler), and /al/ otherwise (simple, simpleton). In contrast, the al of
moral would appear to be morphophonemically /æL/, where /L/ represents
phonemic zero before ly, but /l/ otherwise (moralizing, morality).
All this analysis was based on the phonemic identity of ly in all these
occurrences: if we use these morphophonemic spellings, we need no special
morphophonemic writing for ly. We can, however, choose the other
alternative, and consider {ly} as having two members, /liy/ and /iy/.
We would then write {ly} morphophonemically as /Liy/, where /L/
represents phonemic zero after /l/, but /l/ otherwise. In that case al
would have just one relevant form, /æl/, and simple would be /simpal/,
where /a/ represents zero when vowel phonemes follow the /l/ (this
applies also to the morphophonemic /Liy/ whose phonemes are /iy/ in
this case), and represents /a/ otherwise.

The representation would be simpler if we could say that morphophonemic
/ll/ always represents phonemic /l/, so that morphophonemic


/'marælliy/ would represent phonemic /'maræliy/.** However, there are
occurrences of morphophonemic /ll/ which represent phonemic /ll/:
/'gayllas/ guileless. Since all cases of morphophonemic or phonemic /ll/
are across morpheme boundary (no one morpheme contains /ll/), we
can set up a juncture to mark not all morpheme boundaries, but only
those across which phonemic /ll/ occurs. This would morphophonemically
and phonemically differentiate the two segment sequences [l] and [ll],
by saying that [l] is represented by morphophonemic /l/ or /ll/, while
[ll] is represented only by morphophonemic /l-l/. We thus have obtained
a general statement of the type of 14.331, which can be put without reference
to morphemes except for the placement of the morphophonemic
and phonemic /-/ juncture, which occurs before some morphemes and
not before others.''® We write simply /'simplliy/ and guileless /'gayl-las/,
thus permitting each of the morphemes involved to retain in these combinations
the form which it has elsewhere^' and also including in the
phonemic composition of {-less} (which must now be taken as /-las/,
since it was so used here) the fact that it is preceded by internal open
It will be found that this statement holds for all English consonants: 
[nn] (or [n]) in morphophonemic /'pen-nayf/ pen-knife, /'fayn-nas/
fineness; and that the /-/ phoneme can be used to represent many other 
segmental features. 

Appendix to 14.332: Choice of Marking Morpheme, Environment, 
or Juncture 

Since each morphophoneme marks the phoneme alternation which 
takes place in particular morphemes in particular environments, it is 

*'" Morphophonemically, simple would then be /simpal/, ly /liy/, and
morphophonemic /ll/ and /l/ would both have the phonemic value /l/.

*® We can put this phonemically: given the segments [l·] (or [ll]) we
write the phonemes (and morphophonemes) /l-l/, which can only indicate
these segments; given the segment [l] we may write it phonemically /l/
or /ll/, either of which can only indicate this one segment. The two symbols
/l/ and /ll/ would be phonemically but not morphophonemically
equivalent: when we hear ['eybliy] we could deduce the phonemes but
we could not deduce the morphemes; we would not know if there are one
or two /l/ morphophonemically unless we know on other grounds what
morphemes are involved.

*" Following fn. 32, we may phonemicize simpleton as /'simpltan/, the
phoneme /l/ in this position representing the segments [al]. For the ambiguity
as to /l/ and /ll/, both representing [l], see 14.53.

*» In the terms of G. L. Trager and B. Bloch, in Lang. 17.225-226 
(1941). In the original phonemic analysis it was not necessary to mark 
open juncture between like consonants as a phoneme, for the occurrence 
of a double consonant occurred only at open juncture. 


clear that what we have is a relation between certain morphemes and 
certain environments, which can be indicated by a statement listing 
the morphemes and environments, or by a mark on each morpheme or 
each environment involved. Various considerations of simplicity, similar- 
ity to other grammatical features, etc., are involved in deciding whether 
the morpheme or its environment or the juncture between them should be 
marked, and in what way. Thus in the case of {knife}, instead of saying
that the morpheme is /nayF/, with /F/ representing /v/ before {s} 'plural',
we may prefer to say that the morpheme is simply and always
/nayf/, but that one of the members of the {s} 'plural' morpheme is
/voicing + z/, occurring after /nayf/, /wayf/, etc.^^ When to /nayf/ we
add {s} 'plural', necessarily in its /voicing + z/ member, we obtain
/nayvz/ knives. Shifting the burden of this alternation onto the {s}
may be preferable here, because {s} 'plural' has quite a number of other
restricted members, so that less violence to the simplicity of the morphology
may be done thereby than in creating the /F/ morphophoneme.
On the other hand, the /F/ was useful in that it marked for easy notice
the morphemes in which the alternation took place. ^°

When the alternation indicated by a morphophoneme occurs at the 
boundary between the morpheme under discussion and the differentiat- 
ing environment, it may be simplest to set up a morphophonemic junc- 
ture between these two, just as we have previously set up phonemic junc- 
tures. ^* Thus in Nootka,^^ morphemes ending in labialized gutturals and 
velars have forms without labialization before certain morphemes (words, 
and incremental suffixes), e.g. /qahak/ 'dead', /qahakaX/ 'dead now', 

" See Leonard Bloomfield, Language 214. 

^° Even if we prefer the extra {s} member, it would probably be desirable
to put a mark in the dictionary after each word which serves as
the differentiating environment for the /voicing + z/ member, as a reminder
to those who use the morpheme list. For a major example of
complicated morphophonemic analysis (in Tübatulabal), involving various
choices of what to mark, see Morris Swadesh and C. F. Voegelin,
A problem in phonological alternation, Lang. 15.7 (1939); contrast the
slight rearrangement of morphophonemic markers in this example in
Lang. 18.173 (1942).

^' The difference being that phonemic junctures are used for segments 
which occur only at morpheme (or other) boundary, while morphopho- 
nemic junctures are used for features which occur at the boundaries of 
particular morphemes (and which may also occur elsewhere, even in
identical phonemic environment, without the presence of a morpheme 

•'- Edward Sapir and Morris Swadesh, Nootka Texts (1939), especially
p. 230-7.


but with labialization before other morphemes (formative suffixes), e.g.
/qahakʷas/ 'dead on the ground'. None of these features would be
represented by phonemic juncture, because they occur even when no
morpheme boundary is present: /kiskiko-/ 'robin', /kʷiskʷastin/ boy's
name. Since the alternation occurs in all morphemes ending in labialized
gutturals and velars, and only before certain suffixes, it is useful to mark
the particular suffixes. Morphemes ending in other phonemes have members
showing other alternations before these same suffixes: /pisatoi*-/
'play place', /pisatowas/'^ 'playing place on the ground'. It is therefore
not desirable to add to these suffixes a morphophoneme consisting of a
particular letter, since not one but several phonemic alternations are to
be indicated by that morphophoneme. The simplest mark is a special
morphophonemic juncture /-/ which would be the initial part of the
morphophonemic spelling of each of these suffixes, and which represents
various phonemic values when it is next to various phonemes. After some
phonemes, there is no alternation before these suffixes, so that there the
juncture represents zero.^'

It may be noted that only in a rather extended sense can we say that 
this juncture morphophoneme is a class of the phonemes which replace 
each other in the various members of a morpheme (fn. 15, 30). The re- 
placement of phonemes occurs in the preceding morpheme. However, 
this morphophoneme falls within the general definition of 14.31, and is an 
extension of the narrower definition quoted here from fn. 30. This exten- 
sion is involved whenever we mark morphophonemically the differen- 
tiating environment rather than the morpheme under discussion. 

°^ Note that in some cases a non-automatic (hence phonemic) feature 
can as well be marked by a juncture as by a particular phoneme: the 
choice between the special juncture /=/ or pre-final-consonant /a/ to
indicate the only non-automatic position of [a] in Moroccan Arabic (chap- 
ter 8, fn. 11). 


15.1. Purpose: Fewer Morphologically Distinct Elements 

We seek to reduce the number of elements, in preparation for a compact
statement of the composition of utterances (chapter 19). We furthermore
seek to avoid repeating almost identical distributional statements
for many morphemes individually.

If we consider all the utterances of our corpus in which each particular 
morpheme occurs, we will frequently see that many of our morphemes oc- 
cur in much the same environments.' In some cases it is possible to find 
a set of morphemes such that each of them occurs in precisely the total 
environments in which every other one does.^ If we keep the morphemes 
as elements of our morphological analysis, we will have a great many 
identical or almost identical statements of distribution, each dealing 
with a different morpheme. Considerable economy would be achieved 
if we could replace these by a single statement, applying to the whole set 
of distributionally similar morphemes.^ 

More generally, we frequently find morphemes whose distributions are 
partially identical: in some environments all of these morphemes occur, 
but in other environments only particular ones of these morphemes are 
to be found. It would make for economy and for simplicity of system if we 
could state the occurrences of these morphemes with least repetition. 

' It is this fact that led to the criteria of 13.4. There would have been 
little point in grouping complementary segments only in such a way that 
their sums had equivalent distributions if nowhere else in the language 
were there cases of morphemes having equivalent distributions. The par- 
ticular distribution of each morpheme, i.e. the choice of morphemes which 
occur with it in an utterance, is termed the selection of that morpheme in
Leonard Bloomfield, Language 164-9.

• This arises partly from the fact that in grouping morphemic segments
into morphemes we had followed the model of previously recognized morphemes
(13.41), or had assigned the segments to various morphemes in
such a way as to come out with morphemes having equivalent ranges of
distribution (13.42).

•'' The new elements, sets of distributionally similar morphemes, would
be fewer in number than the morphemes, and have more regular distribution.
In a different way, the procedures of chapters 13 and 16-7 yield
elements fewer in number than what they start out with, and having
fewer restrictions upon occurrence.



15.2. Preliminaries to the Procedure: Approximation

If we seek to form classes of morphemes such that all the morphemes in
a particular class will have identical distributions, we will frequently
achieve little success. It will often be found that few morphemes occur
in precisely all the environments in which some other morphemes occur,
and in no other environments.* Furthermore, it will frequently be the 
case that the sum of total environments of a morpheme taken in one 
corpus will differ from the sum of total environments of that very mor- 
pheme taken in another corpus of the same language.* This does not di- 
rectly affect our procedures, since we are treating only the material 
within a particular corpus. However, the interest in our analysis of the 
corpus derives primarily from the fact that it can serve as a predictive 
sample of the language (2.33); and the high probability of variation be- 
tween one corpus and another means that the corpus which up to this 
point had served as a satisfactory sample of the language can no longer 
serve thus in the matter of the exact environments of morphemes. 

It is therefore impossible, in most cases, to effect a great reduction in 
the number of elements by grouping together those morphemes which 
have precisely the same total environments. We will have to be satisfied 
with some approximation to such a grouping.* The desired approxima- 

* Such distributional identity may be true of certain types of personal 
names in English: given a sufficiently large corpus, there may be no
utterance in which Tom occurs which cannot be matched by an equiva- 
lent utterance with Dick (but not Jack which cannot be matched in 
Jack of all trades, or John as in John the Baptist pronounced without in- 
tervening comma). 

'" Our corpus may contain, for the morpheme root, the environments 
Watch it grub for — s. Those — s look withered to me. The eleventh — of 
2048 is two, That's the — of the trouble, etc. Another corpus of material 
taken from the same language may contain the first two, but not the 
last two, and may contain a new environment The square — of 5929
is 77.

* This approximation will not introduce an appreciable element of 
vagueness into our further work, since the only purpose of chapter 15 is 
the reduction of the number of elements, and the approximation merely 
permits a greater reduction than would otherwise be possible. The num- 
ber of morpheme classes would vary according as we use no approxima- 
tion, or little, or much. But the treatment of 16-7 would vary corre- 
spondingly. If less approximation is used here, more equations would be 
required in chapter 16, to state the particular and slightly different range 
of environments of each of the larger number of resulting classes. In any 
case, the summary statement of chapter 19 for the utterances of the lan- 
guage would be the same. 


tion would have to disregard some of the differences in distribution 
among morphemes, i.e. if it groups together two particular morphemes 
into one morpheme class, that would not mean that every one of the total 
environments of the first of these is completely identical with an environ- 
ment of the other; this means that the total distribution of the first is 
not necessarily identical with the total distribution of the second. 

Some of the differences to be disregarded for purposes of classification 
would be such as would not occur in some other corpus of the same lan- 
guage: in our corpus Dick might occur in — 's twelve minutes late, and 
Tom might not; but in another corpus Tom might appear in that environ- 
ment. Disregard of such differences is necessary if our corpus is to serve 
as a sample of the language. Other differences to be disregarded, however, 
might be such as would occur in almost any corpus of the language; in
He left at two —ty sharp, four might appear and seven might not in any
corpus of the language; nevertheless we would put four and seven into 
one morpheme class on the basis of their other occurrences. 

Various procedures can be followed in obtaining such an approxima- 
tion to classes of distributionally identical morphemes. Two of these are 
discussed below. 

15.3. Procedure: Rough Similarity of Environments 

The most direct approximation to classes of identically distributed 
morphemes would seem to be the grouping together of morphemes which 
are identical in respect to some stated large fraction of all their environ- 
ments. To perform this approximation, we take each morpheme and state 
all of its environments within the corpus, where the environment is taken 
to be the whole utterance in which it occurs.7 We then select one
morpheme, and match its range of environments with that of each other
morpheme. We do not expect to find many cases of identical ranges, but
decide instead upon certain conditions; if a morpheme satisfies these
conditions, it will be assigned to the class of our originally selected
morpheme.

The conditions may vary with the language system and with our pur- 
poses. They may be as crude as requiring that 80 per cent of the environ- 
ments of the one morpheme should be ones in which the other also occurs. 

7 In practice, we may begin by using a rather small corpus, containing
relatively short utterances. We may state in detail the environments of 
only some selected morpheme, and then rapidly scan the other mor- 
phemes to see if the range of their environments seems roughly similar 
to that of our selected morpheme. 


Or the conditions may require that particular types of difference apply
among the environments in which the two morphemes do not substitute:
e.g. that the morphemes in which the two environments differ be
themselves members of one class by the present method. Thus if hear and
tear occur, without being substitutable for each other, in I'll — the bell
and I'll — the paper, respectively, our conditions might require that bell
and paper be assignable to one class in terms of this same method of
approximation.
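
The crude 80 per cent condition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy corpus, the function names, and the representation of an utterance as a tuple of morphemes are all invented here, and the test is deliberately one-directional (the share of the selected morpheme's environments in which the other also occurs).

```python
from collections import defaultdict

def environments(corpus):
    """Map each morpheme to the set of whole-utterance frames it occurs in.

    A frame is the utterance with the morpheme's own position marked "_".
    """
    env = defaultdict(set)
    for utterance in corpus:
        for i, morpheme in enumerate(utterance):
            frame = utterance[:i] + ("_",) + utterance[i + 1:]
            env[morpheme].add(frame)
    return env

def classmates(selected, corpus, threshold=0.8):
    """Morphemes occurring in at least `threshold` of the frames of `selected`."""
    env = environments(corpus)
    own = env[selected]
    return {
        m for m, frames in env.items()
        if m != selected and own and len(own & frames) / len(own) >= threshold
    }

# Hypothetical toy corpus (not taken from any real sample of English).
corpus = [
    ("I", "see", "the", "fellow"),
    ("I", "hear", "the", "fellow"),
    ("I", "see", "the", "moon"),
    ("I", "hear", "the", "moon"),
    ("I", "like", "the", "moon"),
]

print(classmates("see", corpus))   # morphemes sharing >= 80% of the frames of "see"
print(classmates("like", corpus))  # the relation is not symmetric
```

Because the condition is stated from the side of the selected morpheme, like is not admitted when we start from see, although see is admitted when we start from like; a two-sided condition would be an equally defensible choice.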

Various simplifications can be utilized in this work. For example, the
results of each classification can be used in all subsequent classifications.
If bell and paper had been previously assigned to one class N, we would
henceforth replace them by that class mark each time they occur. Then
the environments of hear and tear, which were different in the paragraph
above, become identical in the form I'll — the N, and are no bar
to grouping hear with tear in one class V.8
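
The effect of replacing already classified morphemes by their class mark can be shown with a small Python sketch. The function name and the tuple representation of utterances are assumptions made for illustration only; the utterances and the class N are those of the paragraph above.

```python
def frame(utterance, morpheme, class_of):
    """Environment of `morpheme`: its slot is marked "_", and every other
    morpheme that already has a class is replaced by its class mark."""
    return tuple("_" if m == morpheme else class_of.get(m, m) for m in utterance)

hear_utt = ("I'll", "hear", "the", "bell")
tear_utt = ("I'll", "tear", "the", "paper")

# Before any classification the two environments differ.
print(frame(hear_utt, "hear", {}) == frame(tear_utt, "tear", {}))

# After bell and paper have been assigned to N, they coincide.
class_of = {"bell": "N", "paper": "N"}
print(frame(hear_utt, "hear", class_of))
print(frame(hear_utt, "hear", class_of) == frame(tear_utt, "tear", class_of))
```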

In effect, we define the occurrence of each class in respect to the oc- 
currence of every other class, rather than defining each morpheme in 
respect to the occurrence of every other morpheme. Instead of proceeding 
step-wise from morpheme to class, we can say that having considered 
I see the fellow. I hear the fellow. I see the moon. I hear the voice. I like the
moon. I like the voice, we set up simultaneously two classes N and V,
with see, hear, like as members of V, and fellow, moon, voice as members
of N. Then saying that I V the N occurs does not mean that every member
of V occurs with every member of N, but that every member of V
occurs in this construction with some or other members of N, and every
member of N occurs with some members of V.
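
What the statement "I V the N occurs" does and does not claim can be checked mechanically on the six utterances just cited. The following Python sketch (variable names are illustrative assumptions) tests the weak condition that is intended and the strong condition that is not.

```python
utterances = [
    ("I", "see", "the", "fellow"),
    ("I", "hear", "the", "fellow"),
    ("I", "see", "the", "moon"),
    ("I", "hear", "the", "voice"),
    ("I", "like", "the", "moon"),
    ("I", "like", "the", "voice"),
]
V = {"see", "hear", "like"}
N = {"fellow", "moon", "voice"}

# Attested (V member, N member) pairs in the construction I V the N.
pairs = {(u[1], u[3]) for u in utterances if u[0] == "I" and u[2] == "the"}

every_v_with_some_n = all(any((v, n) in pairs for n in N) for v in V)
every_n_with_some_v = all(any((v, n) in pairs for v in V) for n in N)
every_v_with_every_n = all((v, n) in pairs for v in V for n in N)

print(every_v_with_some_n, every_n_with_some_v)  # the condition actually claimed
print(every_v_with_every_n)                      # the stronger condition, not claimed
```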

15.31. Descriptive Order of Setting Up Classes

In many languages, it will be found that some classes of morphemes 
are more easily set up first, the others being set up with their aid. Thus 
in considering a Semitic language (e.g. Modern Hebrew), we may soon 
see that there are a few very frequently occurring morphemes which are 
interrupted sequences of vowels (e.g. /-a-a-/ 'verb past'), many less
frequent vowel sequences (e.g. /-e-e-/ 'noun'), very many interrupted
consonant sequences (e.g. /k-t-v/ 'write'), and several short non-interrupted
morphemes of most frequent occurrence (e.g. /ti/ 'I did', /im/ 'plural').

8 With this simplification, the statement that hear and tear substitute
for each other in the environment I'll — the N means that these two occur
in the same morpheme-class environment, but not necessarily in
the same morphemic environment. It means that hear occurs in this
environment for some members of N, and that tear does for some members
of N, the members being not necessarily identical in the two cases. The
various utterances represented by I'll hear the N are morphologically
equivalent after the N is defined.

We begin with these most frequently occurring morphemes whose number
seems to be small. We find that for practically every utterance containing
/ti/, e.g. /xašavti kax./ 'I thought so', the corpus contains utterances
which are identical except that the /ti/ is replaced by /ta/ 'you
did' (/xašavta kax./ 'You thought so.'), /nu/ 'we did', /tem/ 'ye did',
/u/ 'they did', /a/ 'she, it did', or zero 'he, it did' (/xašav kax./ 'He
thought so.').9 We include all these substitutable morphemes in a
class A.10

We now use utterances containing A as frames for morphemes which
can substitute for /-a-a-/.11 We therefore take the utterance /xašavti
kax./ and find that /-a-a-/ can be replaced by /-i-a-/ in /xišavti kax./
'I figured it so' and by /hi — a-/ in /hixšavti oto./ 'I considered him
important.'12 These three vowel morphemes thus constitute tentatively13
a class B.

We now form a substitution class for /x-š-v/, using any members of
A and of B in the frame. In the utterance /xašavti kax./ we can substitute
/k-t-v/ 'write', /g-d-l/ 'grow', and many other such consonantal
morphemes. We include all these in a class C.
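
The frame-substitution steps that yield A, B, and C can be sketched in Python as follows. The segmentation of each utterance into a tuple (consonant morpheme, vowel morpheme, suffix, remainder) is an assumption made for this illustration, as are the function name and the small list of utterances, which is built from the examples cited above. The sketch also holds the frame exactly constant, whereas the text uses the class A itself in the frame.

```python
def substitution_class(corpus, utterance, position):
    """Morphemes found at `position` in utterances of the corpus that are
    otherwise identical to `utterance`."""
    members = set()
    for other in corpus:
        if len(other) != len(utterance):
            continue
        same_elsewhere = all(
            a == b
            for i, (a, b) in enumerate(zip(other, utterance))
            if i != position
        )
        if same_elsewhere:
            members.add(other[position])
    return members

# Illustrative corpus; each tuple is (consonants, vowels, suffix, rest).
corpus = [
    ("x-š-v", "-a-a-", "ti", "kax"),   # xašavti kax
    ("x-š-v", "-a-a-", "ta", "kax"),   # xašavta kax
    ("x-š-v", "-a-a-", "nu", "kax"),
    ("x-š-v", "-i-a-", "ti", "kax"),   # xišavti kax
    ("k-t-v", "-a-a-", "ti", "kax"),
    ("g-d-l", "-a-a-", "ti", "kax"),
]

start = corpus[0]
A = substitution_class(corpus, start, 2)  # suffixes replacing /ti/
B = substitution_class(corpus, start, 1)  # vowel morphemes replacing /-a-a-/
C = substitution_class(corpus, start, 0)  # consonant morphemes replacing /x-š-v/
print(A, B, C)
```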

9 This zero means at present merely that the frame occurs at times
with no member of the class. The desirability of considering it a zero mor- 
pheme member of the class is considered in the Appendix to 18.2. 

10 We may not have all these utterances in our corpus at the beginning
of the search, but can obtain them in the course of it, by checking; see

11 We use A, rather than each member /ti/, /ta/, etc., in the frame,
since it appears that almost every morpheme which occurs next to one
member of A will also occur next to any other member. Some few morphemes
may appear next to one member of A and not next to another:
e.g. /m-t/ 'die' may occur before /u/ 'they did' but not before /ti/
'I did'. A different type of utterance in which members of A do not
substitute for each other may be seen in /katav — laacmi/ '— wrote to
myself' where only /ti/ 'I' occurs ('I wrote to myself'). For the expression
of this restriction, see the Appendix to 17.33.

12 The last is not a perfect substitution, since the frame was somewhat
different in that case. For the sub-classes resulting from such limitations 
(when they are more systemic than in the present case), see 15.32. 

13 Tentatively not only because of the uncertainty of the change of
frame for /hi — a-/, but also because we have not yet tested them in
other frames and in longer utterances. 


Since we have exhausted the morphemes of our original utterance, we 
may now ask in what other utterances our first three classes occur. We 
will never find A except next to B, and never B except next to C. However,
we find C in utterances which do not contain A or B. E.g. we find
/hu xošev./ 'He thinks,' /hem xošvim./ 'They think,' /haxišuv mahir./
'The calculation is quick,' /haxišuvim mhirim./ 'The calculations are
quick,' /eyze xašivut yeš laze?/ 'What importance does this have?' In
the first two utterances almost every member of C can replace /x-š-v/:
/hu kotev./ 'He writes', etc. In the other utterances only some of C occur:
/hagidul mahir./ 'The growth is fast.' We may however form a class
D of all these morphemes such as /-o-e-/ indicating present action,
/-i-u-/ indicating operation, which occur with all or some C in these
frames. We further form the small class E of morphemes which replace
/im/ 'plural' and which, like it, occur after the sequence C + D. From
this point we proceed to ask with what other morphemes E occurs. We
find /otomobilim/ 'automobiles,' and form a class F of single morphemes
which substitute for /otomobil/: /integral/ 'integral,' /ax/
'brother,' etc. The members of F do not substitute for C (but they do
for C + D); they do not occur with D or B or A, whereas C never occurs
except with some member of D or B.

15.32. General Classes for Partial Distributional Identity 

Once we have found a class of morphemes which substitute for each 
other in one or several frames, e.g. the class B (/-a-a-/, etc.), we must 
check to see if these morphemes substitute for each other in almost all 
other environments as well. E.g. in the class B we would find that all 
three members occur with certain consonant morphemes C, that only 
two of them occur with other C (only /-a-a-/ and /-i-a-/ with /y-š-n/
'sleep'), and only some one of them occurs with the remaining C (only 
/-a-a-/ with /n-t-n/ 'give'). 
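
The check described here amounts to tabulating, for each member of B, the consonant morphemes with which it occurs. A minimal Python sketch follows; the table is built only from the three cases mentioned in the text (with /x-š-v/ standing in, as an assumption, for the consonant morphemes that take all three members of B), and the variable names are illustrative.

```python
# Which members of B are attested with which consonant morpheme C.
occurs_with = {
    "x-š-v": {"-a-a-", "-i-a-", "hi--a-"},
    "y-š-n": {"-a-a-", "-i-a-"},
    "n-t-n": {"-a-a-"},
}

B = set().union(*occurs_with.values())

# Range of each member of B: the C morphemes it is found with.
range_of = {b: {c for c, members in occurs_with.items() if b in members} for b in B}

# The environments common to all members of B.
common = set.intersection(*range_of.values())

for b in sorted(B):
    print(b, sorted(range_of[b]))
print("common to all of B:", sorted(common))
```

The three ranges differ, so the members of B are not distributionally identical over the whole table, although they do share the environments listed as common.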

In view of this, we cannot consider B a single class, since the differences 
of distribution among its members are different or greater than what 
would be admitted by the conditions of 15.3. However, we are unwilling 
to set each member up as a separate class and so lose sight of their sub- 
stitutability in many environments. Such a situation can be expressed 
by setting up each of the three morphemes as a separate sub-class B1,
B2, B3, of a general class B.14 Each sub-class occurs in the environment

14 Indicating the three vowel morphemes as B1, B2, B3 expresses the
relation among them in a way that would not be apparent if we wrote
them out phonemically as /-a-a-/, etc. (The vocalic character by itself 


of particular consonant morphemes C plus any suffix A. The general 
class B occurs in the environment of C + A in general: that is to say, in
the environment of some members of C (+A) all members of B occur, 
and in the environment of other members of C (+ A) some particular 
sub-class of B occurs. In effect, the sub-classes are groupings of mor- 
phemes in respect to all occurrences of the morphemes (and hence to 
their environments in all these occurrences), while the general class B is 
a grouping of morphemes in respect to a selected number of environments 
which they have in common (A + particular members of C). Only the 
sub-classes are therefore morpheme classes proper in the sense of 15.3; 
the general class is a class of morphemes-in-particular-environments 
(15.4), i.e. the particular environments which are common to all the 

15.4. Alternative Procedure: Classes of Morphemes-in-Environments

Instead of beginning with classes of morphemes having almost identi- 
cal environments (with each morpheme belonging to only one class) and 
then adjusting them (as in 15.32) to particular ranges of environments, 
we can devise a procedure of approximation which will from the first
yield groupings of those morphemes which appear in particular ranges 
of environments. 

We begin by selecting a morpheme in one of its utterances. We select 
a few additional morphemes which substitute for our original one in this 
environment, and then select a few additional environments in which all 
these morphemes, both the original one and the additional ones, occur.15
We continue adding to the morpheme list and to the environment list. 
If any morpheme which we seek to add occurs with some but not all of the 
environments which are already in the list, or if any new environment occurs
with some but not all of the morphemes in the list, we drop from the
lists either the new morpheme or environment, or else the old one with 

does not indicate membership in this B class, since other vowel mor- 
phemes, such as /-e-e-/ are not members of B at all.) But it does not re- 
duce the number of elements. In other cases, however, the subclasses 
may contain many morphemes, so that they effect a reduction in the 
number of morphological elements. Cf. class-cleavage and over-differen- 
tiation within a class, in Leonard Bloomfield, Language 204-6, 223, 399. 

15 We select such as we suspect will occur with many other morphemes
or environments which we will want to add to this list. 


which it did not occur.16 Whether we reject the new or the old depends on
considerations of expected generality: we keep the one with the aid of
which we expect to be able to form a larger and morphologically more useful
class. We thus obtain a fairly large class of occurrences of morphemes
in utterance environments, such that each morpheme in the list occurs in 
each environment in the list. 

For example, we may begin with see in Did you — the stuff? We add
tie, find, etc., which are substitutable for see in this environment, and then
add He'll — it later, — them for me, please, etc., which are substitutable
for Did you — the stuff? in the case of all these morphemes. We add many
more morphemes, e.g. burn, lift; and many environments, e.g. I didn't —
the book, —ing pictures is a bit out of my line. Finally we try the environment
Let's — where it was. Some morphemes, e.g. see, find, occur
in this environment ; others do not. Since we already have here a number 
of morphemes having many environments in common, we do not break 
up the growing class by rejecting the old morphemes which do not occur 
in this new environment. Instead, we reject the new environment, and 
continue to add other elements to the original growing class. 
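
The growing of the two lists can be sketched as a simple loop in Python. Several things here are assumptions made for illustration: the function name, the table of attested occurrences (filled in from the description above, not from a real corpus), and above all the simplification that the newcomer is always the one rejected. The text leaves open rejecting an old member instead, on grounds of expected generality.

```python
def grow_class(seed_morpheme, seed_env, candidates, occurs):
    """Grow a morpheme list and an environment list such that every listed
    morpheme occurs in every listed environment.

    `occurs` is a set of attested (morpheme, environment) pairs.
    A candidate that fits only part of the other list is set aside.
    """
    morphemes, envs, rejected = [seed_morpheme], [seed_env], []
    for kind, item in candidates:
        if kind == "morpheme":
            fits = all((item, env) in occurs for env in envs)
            target = morphemes
        else:
            fits = all((m, item) in occurs for m in morphemes)
            target = envs
        (target if fits else rejected).append(item)
    return morphemes, envs, rejected

shared_envs = ["Did you _ the stuff?", "He'll _ it later", "I didn't _ the book"]
verbs = ["see", "tie", "find", "burn", "lift"]

# Assumed occurrence table: all five morphemes in the three shared
# environments, but only see and find in "Let's _ where it was".
occurs = {(m, e) for m in verbs for e in shared_envs}
occurs |= {("see", "Let's _ where it was"), ("find", "Let's _ where it was")}

candidates = [
    ("morpheme", "tie"), ("morpheme", "find"),
    ("environment", "He'll _ it later"),
    ("morpheme", "burn"), ("morpheme", "lift"),
    ("environment", "I didn't _ the book"),
    ("environment", "Let's _ where it was"),
]

morphemes, envs, rejected = grow_class("see", "Did you _ the stuff?", candidates, occurs)
print(morphemes)
print(envs)
print(rejected)  # set aside, to seed or join some other list
```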

Each rejected morpheme or environment is then assigned to some 
other list, in which it is found to fit, or else it is used as the nucleus of a 
new list. Thus a new class may be begun with see and find in the environ- 
ment Let's — where it was. To this class we will be able to add guess,
which would also be added to the previous class. 

When this work is carried out in detail, we obtain a great many classes 
of morphemes in environments. Some of the classes are very large, e.g. 
class 7 of the Appendix to 15.4, which contains all the morphemes that 
occur in a few short environments like They will — . Other classes are 
very small, many containing only one morpheme, e.g. class 6 in the Ap- 
pendix. Many of the classes have morphemes in common (not only 
classes 1-7, but even these classes with class 8 in the Appendix).17 Many
of the classes, too, have environments in common (so for classes 1-7, but 
there would be almost no environments common to classes 1-7 and class 
8). No two classes, however, would have a morpheme in its environment 
(i.e. a whole utterance) in common. 

16 And make up a separate list (representing a special sub-class) of the
morpheme-in-environment which does not fit into the main list. We thus 
have an explicit record of what is left out. 

17 If we set up a class (containing with, to, etc.) for such environments
as I'll go — him., it would probably contain no morpheme which also 
occurs in classes 1-7 of the Appendix. 


15.41. General Classes for Partial Distributional Identity 

The larger classes would be of chief importance for the morphological 
analysis. In particular, we could form for morphological analysis general 
classes containing all those classes which have a large part of their en- 
vironments in common (e.g. the general class 1-7, containing not only 
the morphemes which occur in They will — , but also the other morphemes 
of these classes).18 The very small classes which are included in a general
class, e.g. class 6, would usually be of little interest for morphological
analysis.19 Some classes contain very few morphemes which occur, however,
in very many environments; such classes are frequently quite
important in the morphology.

The general classes are approximations, for when we say that the gen- 
eral class 1-7 occurs after some other general class (say, one including 
class 8), we mean only that each member of 1-7 occurs after some mem- 
bers of the other general class. Our analysis will also lose something in de- 
tailed exactness when we disregard the very small classes which are in- 
cluded in some general class; the saving in work, however, may be very 
great, since most of the proliferation of classes may be of this type. 

15.5. Result: Morpheme-Position Classes 

We now have a number of classes (or general classes and sub-classes) 
of morphemes, or more exactly of morpheme-occurrences. These classes 
are set up in such a way that all the morphemes in a class substitute for 
each other in approximately every environment of that class. Each class 
occurs in a range of environments (itself stated insofar as possible in 
terms of morpheme classes) which is at least partially different from that 
of any other class.20

18 More adequate symbols may be provided by marking the general
class 1-7 as, say, V. The largest included classes which jointly exhaust it
(with as little overlapping morphemic membership as possible) would be 
marked V1, and V2, etc.

19 These may, however, correlate with features of meaning, of the history
of the use of morphemes, and of the history of the culture, etc. Thus
some of the smaller classes in which see occurs may be the results of what
were historically metaphorical extensions. 

20 These classes vary in many respects, not all of which will be fully
utilized in the remaining procedures. Thus some classes will contain not 
segmental morphemes but contours, such as intonation or the stress 
feature of English compound nouns. Some small classes, too, will be 
identical in any corpus of the language (e.g. the class containing with, to,
from, etc.); there is a high probability that such, which are some- 
times called closed classes, will be identical in any corpus taken from the 


For the purposes of further morphological analysis, these classes are
our new elements.21 The distinction among the morphemes within a class is
no longer relevant.22 By representing the major equivalences in distribution,
these classes permit the remaining procedures to deal separately
with the major differences in distribution of morphemes, e.g. such differences
as that between general class 1-7 and class 5.

15.51. Morpheme Index 

The morphemes of the corpus may be listed under their classes in a
morpheme index. Such an index is useful as stating the morpheme stock 

language in the near future. In contrast, some classes (usually large ones, 
called open classes) may have in one corpus of the language several 
members which they did not have in another. For such classes there is 
a greater likelihood that a corpus taken in the future will contain a 
good many new members; i.e. new morphemes develop historically most 
frequently in such classes. 

21 These morpheme classes are elements of the language description
not only by virtue of their definition, but also in the sense that many of 
them are characterized by special features common to all their member 
morphemes. In this sense we may even say that many morpheme classes 
(or, in some cases, sub-classes) have a common class meaning. In many 
languages we find that the distributionally determined classes (of mor- 
phemes, or morpheme sequences) have meanings which we may roughly 
identify as 'noun', 'verb', 'preposition', etc. Even classes of morpheme 
classes may have vague meaning characteristics. For example, in many 
languages the free morphemes (of whatever class) may be said to indicate 
objects, actions, situations, and the like, while the short bound mor- 
phemes (again of whatever class) indicate relations among these, times 
and persons involved, and the like. This is a very rough statement, and 
many exceptions would be found even in the languages for which this 
statement might be made. Note, however, that when it was discovered 
that Eskimo had many suffixes with meanings similar to those of stems 
in many languages, linguists at first considered these to be stems
'incorporated' as suffixes (see S. Kleinschmidt, Grammatik der groenlaendischen
Sprache (1851); M. Swadesh, South Greenlandic (Eskimo), in
H. Hoijer et al., Linguistic Structures of Native America 30-54 (1946)).
Cf. Slotty, Problem der Wortarten, Forschungen und Fortschritte 
8.329-30 (1932). 

22 Systems of marking can be developed which would indicate both the
individual morpheme and the class in which it belongs. C. F. Voegelin, 
and in a somewhat different way W. D. Preston, use Arabic numerals to 
identify morphemes, with indicating the class and various digits for 
the morphemes of that class. E.g. one class may be marked by 100, and 
the morphemes in it by 101, 102, and so on. Cf. C. F. Voegelin, A prob- 
lem in morpheme alternants and their distribution, Lang. 23.245-254 


of the language, and the status that each has in the morphology (indi- 
cated by the class in which the morpheme is contained). If the classes are 
mutually exclusive as to morphemes (as would be generally the case for 
15.3), each morpheme would be listed only once. If the classes of 15.4 are 
followed, where each morpheme occurs in various classes, it may be con- 
venient to go by the general classes, or the few largest classes which 
jointly exhaust the general class, in order to avoid many repetitions of 
various morphemes. 
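
For mutually exclusive classes, a morpheme index of this kind is a simple inversion of the class assignments. The Python sketch below is illustrative only: the function names and the sample assignments are assumptions, and the numbering (100 for a class, 101, 102, and so on for its members) follows the example of class 100 with morphemes 101, 102 given in the note on systems of marking, without claiming to reproduce any particular published scheme.

```python
def morpheme_index(class_of):
    """List the morphemes under their classes, each morpheme once."""
    index = {}
    for morpheme, cls in class_of.items():
        index.setdefault(cls, []).append(morpheme)
    return {cls: sorted(members) for cls, members in sorted(index.items())}

def numbered(index, step=100):
    """Assign a numeral to each class and to each morpheme within it."""
    class_codes, morpheme_codes = {}, {}
    for k, (cls, members) in enumerate(index.items(), start=1):
        class_codes[cls] = k * step
        for j, morpheme in enumerate(members, start=1):
            morpheme_codes[morpheme] = k * step + j
    return class_codes, morpheme_codes

# Hypothetical class assignments, using the English examples of 15.3.
class_of = {
    "see": "V", "hear": "V", "like": "V",
    "fellow": "N", "moon": "N", "voice": "N",
}

index = morpheme_index(class_of)
class_codes, morpheme_codes = numbered(index)
print(index)
print(class_codes)
print(morpheme_codes)
```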

Appendix to 15.2: Culturally Determined Limitations and Pro- 
ductive Morphemes 

A major reason for the use of approximation techniques here is the in- 
adequacy of the usual linguistic corpus as a sample in respect to the dis- 
tribution of morphemes. In many languages, several hundred hours of 
work with an informant would yield a body of material containing all the 
different environments (over short stretches of speech) of the phonemic 
segments. If the operations of 3-11 are carried out for one such corpus 
of the language, and then again for another such corpus of that language, 
no difference in relevant data would appear. It would usually require a 
corpus many times this size to give us almost all the morphemic segments 
of the language, by the operation of chapter 12. That is, only a very large 
corpus would permit of the extraction of so many morphemes that no 
matter how much more material we collect in that language, we would 
hardly ever find any new morphemic segment. However, even a corpus 
large enough to yield almost all the morphemes of the language will, in cases, fail to give us anything like all the environments of each 
morpheme. The number of mathematically statable sequential permu- 
tations of the morphemes of a language is very great. Some of these 
sequences will practically never occur, and such restrictions on occurrence 
will be expressed in 16-8. Other sequences, however, may not occur in one 
corpus and may occur in another (unless the first corpus is larger than 
any linguist could collect). 

The impracticality of obtaining an adequate corpus is increased by the
fact that some utterances are rare not merely because of the great number
of possible morphemically different utterances, but also because of a
special rarity which we may call a culturally determined limitation. 
Many culturally recognizable situations, and the occasion for certain ut- 
terances or the cultural admissibility of them, occur almost never in a 
particular society or language community, even though morphemes in- 
dicating features of these situations (in the sense of 12.41) occur in the 


language. Thus it may 'mean nothing' to say The box will be murdered.
Utterances of this type will be exceptionally infrequent, so that even the
largest corpus will not contain them. We would then have a difference in
the distribution of box and of man in respect to the environment The —
will be murdered. Some of these infrequent utterances may nevertheless
occur in special cultural situations, e.g. in myths and tales, in artistically
chosen turns of phrase, in jocular talk, or in nonsense. Thus a ghost returning
to earth may be described as saying They have killed me, though
that utterance might not otherwise occur except in special situations.23

In view of all this, it would be desirable, in grouping the morphemes 
into classes, to devise such an approximation as would disregard at least 
these culturally determined limitations.24

The argument for using approximations in morpheme classification is 
strengthened by the fact that the predictive usefulness of an exact mor- 
pheme classification need not be greater than that of an approximate one. 
If we could state the phonological elements and their distribution for a 
corpus consisting of all the utterances which have occurred in the lan- 
guage over some adequate period, we could be quite sure that no utter- 
ance occurring in that language for some short time in the future would 
contain a new phonological element or a new position of an old element. 
Thus given the present English system in which /ŋ/ does not occur
initially, the possibility that someone will pronounce an English utterance
containing initial /ŋ/, e.g. in /ŋan/, is very remote. However, if
we could state all the morphemes, each with its exact distribution, for a 
corpus consisting of all the utterances in the language over a period, 
showing that a given morpheme has not occurred in a given environ- 

23 An AP Dispatch from Bolivia, July 10, 1944, includes the sentence:
In a moment of consciousness Arze muttered 'the Nazis have killed me.' 

24 When a grammar which disregards these culturally determined limitations
is used prescriptively as a guide to what one may say in the language,
the user will not be misled into saying these non-occurring utterances,
since although the grammar does not exclude them the user
will by definition find no occasion (due to cultural limitation, taboos, 
etc.) to say them. R. S. Wells points out that since this defense, and the 
whole disregard of culturally-imposed restrictions, depends upon the per- 
sonal judgment of the investigating linguist, it is fraught with uncertain- 
ty as a scientific procedure. The descriptive validity of the remaining 
procedures is limited if the classes which these procedures will treat can 
be made here to hide arbitrarily chosen limitations of distribution among 
morphemes. We can dispense with much of the linguist's judgment if we 
use a sufficiently large corpus and adequate methods of sampling in order 
to discover what is said even in relatively unusual situations. 


ment in any utterance of that language, we would still not be able to pre- 
dict with high probability that that morpheme might not appear in the 
given environment, for the first time in the history of the language, in 
some new utterance soon to be said. 

In part this is true because cultural change, technological and social, 
brings up new interpersonal situations in which the culturally determined 
limitations of yesterday may no longer apply. My run averaged better than 
600 miles an hour is an utterance which may never have occurred before 
airplanes were developed to a particular extent, but may occur several 
times immediately thereafter. 

Furthermore, this is true, even without regard to culturally determined 
limitations, because it appears that new permutations of morphemes 
which may never have occurred hitherto in the history of the language 
are in general more readily made than new permutations (over short 
stretches) of the far fewer phonemes. This applies not only to long and 
complicated whole utterances, but also to brief new combinations of 
morphemes such as de-frost, de-icer, We better re-polish it. 

In the latter case, one can term the morphemes, especially the bound 
morphemes, which occur in new combinations 'productive'.25 However,
the methods of descriptive linguistics cannot treat of the degree of pro- 
ductivity of elements, since that is a measure of the difference between 
our corpus (which may include the whole present language) and some 
future corpus of the language. If we wish the analysis of our corpus to 
differ as little as possible from the corresponding part of any other cor- 
pus of the language, now or in the near future, i.e. if we wish our state- 
ments about the corpus to be predictive for the language, we must devise 
our approximation of morpheme classification in such a way as would 
disregard the variations and innovations noted here. 

Appendix to 15.3: Identical Distribution within Short Environments

The method of approximation most commonly used by linguists today 
is the consideration of environments shorter than the full utterance. A 
limited stretch of each utterance is selected, and morphemes are grouped 
together into a class if they can replace each other in that limited environ- 
ment. Thus we might select the position — ly to yield the class of large, 

25 In general, productivity of a morpheme may be correlated with the
relations among that morpheme, the class in which it belongs, and the
differences in environment of the morpheme, other morphemes of its
class, and partially similar classes.


clean, true, etc. Similarly, the environment the — or the large — might be
used to yield the class of man, auto, life, etc.26

This method, however, may not prove adequate. In many languages 
it may be impossible to devise a procedure for determining which short 
environments over what limits should be set up as the differentiating ones 
for various substitution classes. If we select —ing as a diagnostic environment,
we would get a class containing do, have, see, etc., but not certain.
If we select un— as the environment, we obtain a class with do, certain,
etc., but not have, and with see only if -en or -ing follow. We could obtain
many different classifications of the same morphemes.
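
That different diagnostic environments yield different classes of the same morphemes can be seen from a small Python sketch. The table of attested occurrences records only what is stated in the paragraph above; the conditional case of see after un— (only when -en or -ing follow) is left out, as an explicit simplification, and the names used are illustrative.

```python
# Attested (morpheme, short environment) pairs, as described above.
# "_ing" is the position before -ing; "un_" is the position after un-.
attested = {
    ("do", "_ing"), ("have", "_ing"), ("see", "_ing"),
    ("do", "un_"), ("certain", "un_"),
}

def class_for(environment):
    """Morphemes that occur in the given diagnostic short environment."""
    return {m for m, env in attested if env == environment}

ing_class = class_for("_ing")
un_class = class_for("un_")

print(sorted(ing_class))
print(sorted(un_class))
print(sorted(ing_class & un_class))  # morphemes the two classifications share
```

Neither class contains the other, so there is no single classification to be read off from the short environments alone.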

These different classifications are merely expressions of the relation 
between the particular environments in question and the various mor- 
phemes which occur next to them, or the like. Relations of this kind are 
not to be disregarded, and are discussed in chapter 17; but often they do
not correlate with other relations, so that classifying morphemes on their 
basis would not necessarily lead to a simpler set of new elements. 

Furthermore, the syntactic analyses of chapter 16 would in any case 
require the setting up of morpheme classes based on similarity of dis- 
tribution in respect to the total utterance environments. In many cases 
such classes would cut across the various classes set up in respect to short 
environments, so that the work of classification would have to be re- 
peated independently. We might plan to satisfy all considerations by 
classifying the morphemes on the basis of their short (usually imme- 
diate) environments, while using the utterance-long position as a criterion 
on the basis of which we would decide which immediate environment to 
regard as diagnostic. But in many cases even this will not work out. For 
example, if we decided that the position before -ly was important in respect
to utterance position, we would obtain a class containing not only
large, clean, true, but also man (in manly). In terms of immediate environ- 
ments, we would have no way of rejecting man, because the only straight- 
forward way of separating the -ly of largely from the -ly of manly, goodly 
is based on the position of these two in respect to the whole utterance. 
Similarly, the environment the — admits very, large, etc. as well as man, 
auto. And the environment the large — admits and beautiful as well as man,
auto; and many morphemes which we might wish, on utterance-position 
grounds, to include in the class of man may not occur after large. 

26 This might be called use of morphological criteria, as compared with
the syntactic criteria of 15.3-4. 


Appendix to 15.32: Identical Morphemes in Various Classes 

The classes of 15.3 are mutually exclusive in respect to morphemes. If 
a morpheme is a member of a particular class, which may be included in 
a particular general class, it is not a member of any other class.27

In some cases we will find that the range of environments of one class 
is roughly the sum of environments of two or more other classes. We may 
disregard this for the purposes of our present morpheme classification. 
For the purposes of chapter 16, however, it will be convenient to avoid 
the repetition of environments by breaking the first class up into the 
two or more other classes: we would eliminate the class G of fn. 27, and
would include all its morphemes in N and again in V. This is another
step, past that of 15.32, in the direction of making these into
morpheme-in-environment classes rather than simply morpheme classes. It
would have the new result of permitting several classes (N, V) to have
identical morphemes: the morpheme book would now be a member of N and
also a member of V.

The convenience of defining a class as a sum of other classes is particu- 
larly great when we have not a large class like G, but a class of one mor- 
pheme, e.g. the morpheme /tuw/. This morpheme occurs in a unique 
range of environments, and would therefore have to constitute a class 
by itself. However, it turns out that these environments are roughly 
equal to those of three, four (How much is — plus six?), plus those of
with, at (Don't talk — him.), plus those of also (I'm going —.), plus
certain unique positions (I want — go on.). In such cases we may decide to
assign /tuw/ as a member of the three recognized classes (of three, of

^27 Thus, in terms of the classes of 15.3, we might set up a general class
G comprising the various classes which contain morphemes like book,
walk, tie. We would approach this as follows: These morphemes occur in
positions of classes 1-7 (in the Appendix to 15.4) and also in positions
of class 8. Therefore the operation of 15.3 would place each of them in
some particular class having a wide distribution, roughly equal to that
of classes 1-7 (I'll — it.) plus that of class 8 (Let's take a —.). The
similarity among the environments of these classes (the one containing
book, the one containing walk, etc.) would lead us to set up in 15.32 a
general class G representing their common environments. However, the
classes of morphemes like hotel, wood (which don't occur in environments
like I'll — it.) can be grouped by 15.32 into a general class N. And the
classes of morphemes like think, die (which don't occur in environments
like Let's take a —.) can be grouped by 15.32 into a general class V. The
range of environments of G is roughly equal to that of N plus that of V.
Each of these three classes contains different morphemes.


with, and of also), and as a member of a small class of its own, restricted
to a few types of environment.^28
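The idea of treating the range of /tuw/ as a sum of other ranges can be sketched with sets. This is only an illustration under simplifying assumptions: each class is reduced to a single frame taken from the examples above, the underscore marks the position of the dash, and the names CLASS_FRAMES, TUW_FRAMES and decompose are hypothetical. The text speaks of rough equality; the sketch uses exact set inclusion for simplicity.

```python
# Toy data (assumed for illustration): each class is reduced to a set of frames.
CLASS_FRAMES = {
    "three": {"How much is _ plus six?"},
    "with": {"Don't talk _ him."},
    "also": {"I'm going _."},
}

# Frames in which /tuw/ is taken to occur, following the examples in the text.
TUW_FRAMES = {
    "How much is _ plus six?",
    "Don't talk _ him.",
    "I'm going _.",
    "I want _ go on.",
}


def decompose(target, classes):
    """Split a range of frames into recognized classes plus a unique residue."""
    # A class "covers" part of the target if its whole range lies inside it.
    covering = sorted(name for name, frames in classes.items() if frames <= target)
    covered = set().union(*(classes[name] for name in covering))
    return covering, target - covered


covering, residue = decompose(TUW_FRAMES, CLASS_FRAMES)
print(covering)  # classes to which /tuw/ would be assigned
print(residue)   # frames left over for the small class of its own
```

The residue corresponds to the "few types of environment" for which /tuw/ keeps a class of its own.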

The question of homonymous morphemes is thus somewhat clarified: 
Phonemically identical morphemes in one class are one morpheme as far 
as these procedures are concerned, no matter how different their mean- 
ings (13.41 and chapter 12, fn. 70). Phonemically identical morphemes in 
different classes may be distinguished on the grounds of their different 
environments (e.g. /siy/ V and /siy/ N in I see, the sea).^29 If the classes
of the two phonemically identical morphemes have some environments
in common, utterances may occur in which we cannot distinguish which
morpheme (or class) is present: in dialects where I can tell my horse is
running, and I can tell my horse's running, are homonymous, the hearer
will not know from the utterance alone (if there are no differentiating
neighboring utterances) which is meant. Similarly, we can distinguish
rumor in It is — ed that we'll be leaving soon, from room + er in Did the —
pay his bill? But we cannot distinguish the two in That's just a —.^30

^28 A somewhat different but related problem is that of a morpheme
which occurs in roughly all the environments of one class but in only one
or two environments of another class. E.g. but occurs in the various
environments of and, so (I didn't know it — I asked him.); and in special V
and N positions in But me no buts.

^29 Alternatively we may, if we wish, say that all morphemes, in no
matter what class, which are phonemically identical are 'the same'
morpheme. The various tuw morphemes, two, to, and too, would then be
one morpheme occurring in various classes, as would the book in N
and V. Alternatively we may wish to call book one morpheme, but tuw
three different ones. We might decide to consider phonemically identical
morphemes in various classes as constituting a single morpheme only if
a sufficiently large fraction of the morphemes of these various classes
are phonemically identical, i.e. only if there is a sufficiently large number
(in any case not just one) of such sets of phonemically identical
morphemes distributed in precisely these classes: for N and V we have
book, walk, and many others; for the full range of classes in which /tuw/
occurs we have no other case. (The number or fraction has to be arbi- 
trarily selected, but can be justified on grounds of descriptive simplicity. 
This holds especially for the disregarding of unique sets of morphemes, 
whose class distribution would be equivalent to that of no other mor- 
pheme.) This whole question, however, is essentially terminological and 
unimportant. It does not matter whether sets of phonemically identical 
morphemes are called one morpheme or not, so long as each study is 
internally consistent in this regard, and so long as the phonemic identities 
among the members of various classes are noted somewhere, e.g. in the 
morpheme index. 

^30 This is the case because morphemes can be defined in such a way
that complete overlapping is possible: a phonemic sequence in a single
environment may in some cases indicate either of two morphemes.



Appendix to 15.4: Tabulating Morpheme-Environment Classes 

The work of 15.4 can be arranged in tables, each table representing a 
class. If we begin with see in Did you — the stuff? and continue as in 15.4,
we will obtain the following table, which we will consider as class 1: 

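[Table for class 1. Frames before the dash: Did you; I didn't; They will.
Frames after the dash: the stuff?; it later.; them for me, please.; the
book.; -ing pictures is a bit out of my line. The enclosed column of
morphemes is not legible in this copy.]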

(This chart represents all sentences which consist of any environment 
such as I didn't — the book or — ing pictures is a bit out of my line, with
any word of the enclosed column occupying the place of the dash.) 

When Let's — where it was fails to satisfy this group, we begin a new 
table, representing class 2: 

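[Table for class 2. Frame fragments, in the order preserved: The
magistrate; where it was.; -s it's O.K.; whether he'll run or not; They
will. The enclosed column of morphemes is not legible in this copy.]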

We now test the morpheme stay. This does not occur with the environ- 
ments of class 1. It may occur with the first environment of class 2 but 
not with the second. It also occurs with the segmental morpheme of the 
third environment, but with a different contour (I'll stay whether he'll run
or not usually has /,/ intonation, sometimes with pause, at stay, but
hardly ever after run. In contrast, I'll see whether he'll run or not almost
never has /,/ after see, and sometimes has /,/ after run.)

We therefore reject stay from class 2 and begin a new table representing
class 3:



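[Table for class 3. Frame fragments, in the order preserved: , whether
he'll run or not; May Fred; with me?; You just; where you find a place.;
They will. The enclosed column of morphemes is not legible in this copy.]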




The last morpheme to be tried, die, may not occur in our corpus with the
fourth environment. Rather than reject the morpheme, we may decide
to drop the environment, and begin a new class 4 which will contain
many of the morphemes of class 3, but not all:

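[Table for class 4. Frames before the dash: You just; I used to. Frames
after the dash: where you find a place,; here very often. The enclosed
column of morphemes is not legible in this copy.]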

Returning to the classes containing see, we test the environment Do 
you — the idea? Not all the morphemes of class 1 occur in this environ- 
ment, nor apparently do all those of class 2. We therefore set up class 5: 

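[Table for class 5. Frame fragments, in the order preserved: Do you; the
idea?; the point.; s easy to; what's involved. The enclosed column of
morphemes is not legible in this copy.]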

Several of the morphemes here would be rejected from class 2, so that 
setting up a new class here was justified. 

When we test the utterance I'll see you in hell first, we get a new class 6, 
with a few other morphemes substituting for see: 





[Table for class 6. Frame: I'll — you in hell first. The enclosed column
of morphemes is not legible in this copy.]

If we now compare all the classes, we find that a few very brief
environments, such as They will —, occur with almost all the morphemes.
We can therefore set up a class 7 containing such environments and al- 
most all the morphemes above. 

None of these environments will occur with certain other morphemes, 
which we can list in a class 8: 

[Table for class 8. Frames: That's my —; — is on fire. The enclosed
column of morphemes is not legible in this copy.]

To this class we can add many morphemes which occurred in some of the 
preceding classes too: e.g. tie, bunk. 

Tables of this sort not only arrange the material in a manner that per- 
mits inspection, but also condense it considerably, since each table repre- 
sents each of its morphemes in each of its environments. 
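The condensing effect of such a table can be made concrete with a short sketch. It assumes a table is just a list of morphemes plus a list of frames; see comes from the text, while find and tie are assumed members added only for illustration, and the underscore marks the position of the dash.

```python
from itertools import product


def expand(table):
    """Return every sentence that a class table stands for."""
    return [
        frame.replace("_", morpheme)
        for frame, morpheme in product(table["frames"], table["morphemes"])
    ]


CLASS_1 = {
    # 'see' is from the text; 'find' and 'tie' are assumed members.
    "morphemes": ["see", "find", "tie"],
    "frames": ["Did you _ the stuff?", "I didn't _ the book."],
}

sentences = expand(CLASS_1)
stored_items = len(CLASS_1["morphemes"]) + len(CLASS_1["frames"])
print(len(sentences), "sentences represented by", stored_items, "stored items")
```

The table stores the sum of its rows and columns, while the sentences it represents number their product, which is the condensation referred to above.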


Appendix to 15.5: Correlation between Morpheme Classes and 
Phonemic Features 

In many languages all the members of one class may have in common 
some phonemic feature which is absent in all the members of other 
classes. Thus in Semitic languages all morphemes of class C (15.31) con- 
sist of several consonants, usually three and almost always interrupted; 
all the morphemes of classes B, C consist of an interrupted sequence of 
vowels, rarely with a consonant added.^31 In Tonkawa,^32 verb-theme
morphemes are bound, noun-theme morphemes free. 

These differences may be such as appear only in certain environments 
of the class.^33 There may be differences in contours, in phonemic
junctures, or in morphophonemes. In any case, it is useful to state all such
correlations. We may say that these phonemic characteristics of a class 
have a meaning, as indicating that class of morphemes. 

The considerations may have led us to include phonemically identical 
morphemes in various classes. Such phonemic identity of various mor- 
phemes may be singled out for special mention ; it will in any case appear 
in any alphabetical listing of the morphemes.

^31 Yokuts presents an interesting case of a language in which each of
the various morpheme classes has a characteristic phonemic structure.
Cf. S. Newman, Yokuts Language of California; and ch. 12, fn. 40.

^32 Harry Hoijer, Tonkawa, in Franz Boas, Handbook of American
Indian Languages 3.

^33 For Greek nouns and verbs, see Marcel Cohen, Travaux du Cercle
Linguistique de Prague 8.39 (1939).

16. Morpheme Sequences

16.0. Introductory 

By the terms of this procedure the linguist can set up syntactic
form-classes which indicate what morpheme sequences have identical syntactic
function, i.e. occur in identical environments in the utterance. It thus 
covers a large part of the material usually included in syntax, and some 
of that which is called morphology. The syntactic and morphologic re- 
sults are obtained by the same procedure, so that no distinction is drawn 
between them. Differently from most combinations of syntax with mor- 
phology, this section does not proceed by first dividing utterances into 
large syntactic sections and subdividing these into smaller morphologic 
ones; instead, it begins with morphemes, investigates their syntactic 
function, and builds up from them to ever larger morpheme sequences 
having identical syntactic status. 

16.1. Purpose: Fewer and More General Classes 

We seek to reduce the number of classes which we require when we 
state the composition of each utterance of the language; and to make it 
unnecessary to state in chapter 19 the special restrictions of certain sub- 

In chapter 15 we obtained classes of morphemes, such that each mor- 
pheme in a class could be substituted for other morphemes of that 
class in an utterance in which it occurred. All members of each class were 
thus approximately identical in respect to utterances. In stating the dis- 
tribution of morphemes we can therefore speak in terms of these classes 
instead of the individual member morphemes, with little loss of precision.
In some languages this may represent a considerable reduction in the 
description. However, most languages will still have a large number of
classes after the operation of chapter 15, and the work of description
would be considerably lessened if ways can be found to reduce this
number. To this end, we would want to show that many classes are
distributionally equivalent to one another. This cannot be done directly, because
all single morphemes whose utterance distributions are even approxi- 
mately identical have already been placed in the same class in chapter 15. 
However, we can find new distributional equivalents by considering se- 
quences of morpheme classes instead of single classes. No morpheme 
outside of, say, the class D (which includes morphemes like quite) has 



precisely the same distribution as do the members of D; but the sequence 
composed of a morpheme of the class A (e.g. large) plus the morpheme ly
does have the distribution of D: They're quite new; They're largely new.

We can thus extend the operation of chapter 15 to refer to sequences 
of morphemes as well as single morphemes. In the proposed extension, 
as in the operation of 15, substitutability will be considered in respect 
to whole utterances. The work of 15 thus becomes a special case of 16, 
one morpheme being a particular case of a sequence of (one or more) 

16.2. Procedure: Substitutable Sequences of Morpheme Classes

We equate any two sequences of classes if one of them is substitutable 
for the other in all utterances in which either occurs.^2

If the sequence^3 of A + ly is always substitutable^4 for D, we write
the equation A ly = D.^5 This equation means that the range of
utterance environments of A ly is identical with that of D, or that wherever
we find a member of D we may substitute for it not only some other mem- 
ber of D but also some member of A followed by ly. 
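A minimal sketch of what such an equation asserts, assuming a toy corpus in which every utterance is already segmented into morphemes and each morpheme is replaced by its class mark. The corpus, the CLASS_OF mapping, and the function names are hypothetical; a real test would also draw on native acceptance, not on a fixed list of utterances.

```python
# Assumed class assignments for the toy corpus (class marks as in the text).
CLASS_OF = {"quite": "D", "very": "D", "large": "A", "true": "A",
            "new": "A", "old": "A"}


def classify(utterance):
    """Replace each classified morpheme by its class mark."""
    return tuple(CLASS_OF.get(m, m) for m in utterance)


def environments(seq, corpus):
    """Collect the (left, right) utterance contexts in which seq occurs."""
    seq = tuple(seq)
    found = set()
    for utt in corpus:
        for i in range(len(utt) - len(seq) + 1):
            if utt[i:i + len(seq)] == seq:
                found.add((utt[:i], utt[i + len(seq):]))
    return found


def equatable(x, y, corpus):
    """X = Y if both have the same range of utterance environments."""
    return environments(x, corpus) == environments(y, corpus)


CORPUS = [classify(u) for u in (
    ["they're", "quite", "new"],
    ["they're", "large", "ly", "new"],
    ["it", "is", "very", "old"],
    ["it", "is", "true", "ly", "old"],
)]

print(equatable(("A", "ly"), ("D",), CORPUS))  # A ly against D
print(equatable(("A",), ("D",), CORPUS))       # A alone against D
```

In this toy corpus the sequence A ly shares every environment of D, while A alone does not.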

More generally, given the sequence of morpheme classes X occurring

^1 Nevertheless, it was advisable to carry out the operation of 15 first,
since restricting the sequence to one made the work much simpler; and 
16 utilizes the results of 15, in that 16 does not consider sequences of indi- 
vidual morphemes but sequences of morpheme classes : it does not state 
that large + ly has the same distribution as quite, and that new + ly 
does, and that utter + ly does, etc., but that A + ly has the same 
distribution as D. The work of 16 is therefore greatly shortened by being 
performed after 15. 

^2 A single morpheme class, which may be substitutable for a sequence
of morpheme classes, is considered a special case of a sequence. 

^3 The following morpheme class marks for English are used here:
A (large, true, etc.), N (life, hotel), V (grow, have), D (very, well), T (a,
some), I (I, it), P (in, up), R (do, will), & (but, and), B (if, since), An
(-ness, -th), Na (-ful, -ish), Nn (-eer, -hood), Nv (en-, -ize), Vn (-ment, -t),
Vv (-ed). Other class marks are occasionally defined for particular examples.

^4 Here as throughout these procedures, X and Y are substitutable if
for every utterance which includes X we can find (or gain native
acceptance for) an utterance which is identical except for having Y in the
place of X.

^5 The space between the two morpheme-class marks A and ly indicates
succession in time. We can understand this equation to mean that the
occurrence of D is the logical product of the occurrence of A and the
occurrence of ly (where occurrence means utterances in which the form


in the range of utterance environments M, we find all sequences Y, Z,
etc., which occur in precisely that range, and write Y = X, Z = X, etc.

First, we try all the cases where the morpheme sequence X is just a 
single morpheme.^6 That is, we take each morpheme class resulting
from chapter 15 and seek sequences (Y, Z) substitutable for it.^7 In doing
this, we use as testing frames what seem to us to be representative envi- 
ronments of the class X. We must always be ready, however, to find en- 
vironments for which our testing frames were not representative; if a 
substitution occurred in all our frames but not in the new environment, 
it is no longer valid.^8
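The provisional status of testing frames can be illustrated as follows. This is a sketch under assumptions: ATTESTED is a small hypothetical set of utterances built from the examples in the accompanying footnote, the underscore marks the position of the dash, and agree_in is an illustrative helper, not part of the procedure as stated.

```python
# Hypothetical attested utterances, lower-cased for simple string comparison.
ATTESTED = {
    "we really would like to",
    "we think he would like to",
    "really we would",
    "do you think he did it",
}


def agree_in(frames, x, y, attested=ATTESTED):
    """True if x and y are both attested, or both unattested, in every frame."""
    return all(
        (frame.replace("_", x) in attested) == (frame.replace("_", y) in attested)
        for frame in frames
    )


first_frames = ["we _ would like to"]
more_frames = first_frames + ["_ we would", "do you _ did it"]

print(agree_in(first_frames, "really", "think he"))  # looks substitutable
print(agree_in(more_frames, "really", "think he"))   # fails in the new frames
```

A substitution that holds in the first frame is withdrawn once the further environments are brought in, which is why no equation is written for this pair.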

Only after this is done do we investigate substitutions among
sequences which are not equatable to single morpheme classes.^9 In many
languages we will find that no such cases exist, and that by the time we 
have found all sequences which are equatable to single morpheme classes, 
we have found all sequences which can be equated to any other se- 

^6 For this purpose we can use any of the classes resulting from the
procedure of chapter 15, whether of the type of 15.3 or of the type of 15.4.
We can use the original sub-classes, or the general classes (15.32, 15.41)
which are defined in terms of them. If the more detailed sub-classes of
chapter 15 are used, there will have to be many more detailed equations
in chapter 16, indicating the particular sequences of small morpheme
group and small environmental group which caused us to distinguish
this morpheme group from the others which had partially similar
distribution. Instead of writing N + Nn = N (boy + hood, or engine +
eer, is substitutable for boy in Where is my — gone?), we would have to
write N_1 + Nn_1 = N and N_2 + Nn_2 = N, where N_1 represents boy,
girl, etc., Nn_1 represents -hood, N_2 represents engine, profit, etc., Nn_2
represents -eer (boyhood, girlhood, and engineer, profiteer are all
substitutable for boy).

^7 We do not seek single morphemes substitutable for it, for they
would already have been included in that class in chapter 15. 

^8 For instance D is substitutable for V„ + I (where V„ represents
know, think) in We — would like to: thus, we have both We really would
like to and We think he would like to. But D and V„ + I are not
substitutable for each other in — we would (where only really, etc., occurs)
or in Do you — did it (where only think he, etc., occurs). Hence we do
not write D = V„ + I.

^9 This is comparable to 13.4, where we first grouped segments into
morphemes on the model of morphemes having only one segment, and 
later grouped segments into morphemes in a manner calculated to yield 
the simplest morphemes and the simplest relations of morpheme to its 

^10 An example of a sequence which substitutes for no single class is
Moroccan Arabic R Pv, noted at the end of the list of morpheme classes 
in the Appendix to 16.22. 


In addition to indicating the relation of substitutability among se- 
quences (or between a sequence and a single morpheme class), each of 
these equations also indicates the relation of occurring together in one 
utterance (usually next to each other) on the part of the various morpheme 
classes in each sequence, i.e. on the part of all the morpheme classes on one 
side of the equation. If we say that AN = N (e.g. good boy is substitutable
for fool in Don't be a —.), we are incidentally indicating that A occurs
with N in some utterances.^11

16.21. Non-repeatable Substitutions

Some sequences prove to be substitutable for a given morpheme class 
in particular environments and not in others. This brings up a new rela- 
tion of non-repeated substitutability, which can be indicated in these 
equations by a modification of the class symbols. 

For example, if we indicate morphemes like boy, king, by N and
morphemes like -hood, -dom, by Nn, we may write N Nn = N (boyhood,
kingdom replace life in His — was obsessed with many fears).^12 Since the
equation means that N Nn is replaceable everywhere by N, and N by N Nn,
we might be led to think that we could also replace N by N Nn in the
equation itself and obtain N Nn Nn = N. However, this is not in
general the case, since boyhood does not occur before hood. In contrast, A N =
N is repeatable, so that we can derive A AN = N, and so on: old man,
or old, lonely man are both substitutable for man.

The difference between repeatable and non-repeatable substitutions

^11 In many cases substitution occurs only in the environment of some
particular class or sequence. E.g. one member of A is replaceable by two,
but only if a member of N follows: fine is replaceable by fine young in
They are fine men, but not in They are fine. Instead of saying that A A =
A but only before N, we avoid the extra comment outside the equation
by writing AAN = AN (or more simply AN = N, from which this can
be derived). This equation provides only for the substitution which
occurs, and leaves no basis for replacing A by AA elsewhere. The technique
here is to include the limiting environment in the equation itself, and on
both sides of the equation since it is not itself part of the substitution and
does not vary during the process of substitution. It goes without saying 
that the environment is defined not only in terms of the neighboring 
morpheme classes (and the position of our given element in respect to 
these neighbors), but also in terms of the intonation, stress, or other 
contour under which our given element occurs. 

^12 Since boy and king are in different classes, say N_a and N_b because
they don't replace each other before -hood, -dom, we really have two
equations here: N_a -hood = N, N_b -dom = N. When we find classes which
are mutually substitutable in some positions and not in others, we may
indicate them by one letter with various subscripts: N_a, N_b.


can be indicated by the use of raised numbers. We can write N^1 Nn = N^2
to indicate that N^1 Nn (which equals N^2, not N^1) cannot be substituted for
the N^1 of N^1 Nn itself, so that we cannot derive N^1 Nn Nn = N^2; i.e.
boyhood is N^2 and therefore cannot be substituted for the N^1 of N^1 + -hood
to yield boyhood-hood.^13 In contrast, AN = N, with the same raised number
on both sides, states that wherever we see N we can write AN in its place,
and this permits us to replace even the N of AN itself by AN, thus
yielding A AN = N.
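The blocking effect of raised numbers can be sketched as a small reduction routine. The representation is an assumption made for illustration: a classed item is a pair (class, raised number), affixes and unnumbered classes are plain strings, and a numbered item on the left of a rule accepts that number or any lower one. The raised number given to the adjective rule here is likewise an assumption; only its being the same on both sides matters for repeatability.

```python
# Each rule is (left-hand pattern, result).
RULES = [
    ((("N", 1), "Nn"), ("N", 2)),  # N^1 Nn = N^2, e.g. boy + hood
    (("A", ("N", 1)), ("N", 1)),   # A N = N, same number on both sides (assumed 1)
]


def matches(pattern_item, item):
    """A numbered pattern item accepts the same class at that number or lower."""
    if isinstance(pattern_item, tuple):
        return (isinstance(item, tuple)
                and item[0] == pattern_item[0]
                and item[1] <= pattern_item[1])
    return item == pattern_item


def reduce_once(seq, rules):
    """Apply the first rule that fits anywhere in seq; None if nothing fits."""
    for pattern, result in rules:
        k = len(pattern)
        for i in range(len(seq) - k + 1):
            if all(matches(p, s) for p, s in zip(pattern, seq[i:i + k])):
                return seq[:i] + [result] + seq[i + k:]
    return None


def reduce_fully(seq, rules=RULES):
    """Keep reducing until a single item is left or no rule applies."""
    while len(seq) > 1:
        reduced = reduce_once(seq, rules)
        if reduced is None:
            break
        seq = reduced
    return seq


print(reduce_fully([("N", 1), "Nn"]))        # boyhood
print(reduce_fully([("N", 1), "Nn", "Nn"]))  # boyhood-hood: one Nn is left over
print(reduce_fully(["A", "A", ("N", 1)]))    # old, lonely man
```

Because the result of the first rule carries a higher number than its own left side accepts, the rule cannot feed itself, whereas the adjective rule can be applied again to its own output.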

The general method of assigning these raised numbers is as follows:
We assign raised 1 to each class symbol, say N, when it first occurs in an
equation. Next time the class N occurs, in a new equation, we check to
see if the equivalents of N as stated in this new equation are substitutable
for the previous N^1. If they are substitutable, we mark the new N as
N^1; if they are not we mark the new N as N^2. This checking is carried
out for the N of each new equation. Each time we test to see if the
equivalent of the new N is substitutable for all preceding N^1, or for all
preceding N^2. If it is substitutable only for the N^2, we mark it as such.
If in some equation (including the new equation itself, if it contains
more than one N), the new N is not substitutable for either N^1 or N^2,
we mark it as N^3; and so on.^14

In this way N^2 -s = N^3: boys or boyhoods replace boy in Such is the story
of their —. Note that we cannot write N^2 -s = N^2 since that would
permit boys to be equal to N^2 and so to replace N^2 before -s, yielding boys
+ -s.^15

We can now consider the sequence TN, which is substitutable for N;
e.g. a cheese, some cheese, for cheese, cheeses, in We can use — in place
of meat. However, TN cannot replace the N in some of the preceding

^13 The variously numbered N^1, N^2, etc. here and below are all one class
(differently from the Q and R of the Appendix to 16.4); and all contain
the same single-morpheme members. The numberings indicate the
distribution (range of substitutability) of the new morpheme-sequence
members which are added to the class by the stating of the equations.
Thus N^1 represents boy, king, etc. N^2 represents boy, king, boyhood,
kingdom, etc. N^3 represents boy, king, boyhood, boys, boyhoods, etc. N^4
represents boy, boyhood, boys, boyhoods, a boy, a boyhood, some boys, some
boyhoods, it, etc. When we say N^1 Nn = N^2 we mean that boyhood (which is
N^1 Nn and so N^2) can occur wherever N^2 occurs, e.g. before -s (N^2 -s =
N^3), but not wherever N^1 occurs, e.g. before hood again.

^14 If some class symbols never go above 1, we can dispense with the
raised number for them. Thus it is sufficient to write D without numbers.

^15 On the left-hand side of the equation, each raised number will be
understood to include all lower numbers (unless otherwise noted). Hence
we do not have to write N^1,2 -s = N^3: the N^2 will represent both boy (N^1)
and boyhood (N^2).


equations: we cannot substitute TN for N in A N = N, for we would
derive a non-extant A TN = N (Swiss some cheeses).^16 Therefore the
resultant N must have a new sub-class numbering which will preclude
its substitution in the preceding equations: TN^3 = N^4. We can now say
that the morpheme class I equals this N^4: thus it is substitutable for
freedom, the long grind in — will be re-established.^17 Among the later
equations we will have ones like N^4 N^4 V = N^4: the books my various friends
borrowed, or men I have known replacing fish in — will be discussed later.

Each higher numbered symbol represents all lower numbered identical 
symbols, but not vice versa. Therefore, the higher numbered symbols 
have a more inclusive representation, and are of greater importance in
any compact classification of the morpheme sequences of a corpus.^18

^16 This equation would be correct if we state the relative order of A
and T: that whether a formula has AT or TA it always indicates the 
sequence TA in speech. However, one purpose which the sub-class num- 
bering serves is to preclude the necessity for such additional statements, 
and to let the sequence from left to right indicate succession through 

^17 The equation I = TN^3 = N^4 indicates that T (e.g. a, some) never
occurs before a member of I, since T does not occur before N^4 but is
included in it.

^18 It is also possible to set up a somewhat different system of
sive numbering, which would more closely accord with successive mor- 
phological levels (cf. ch. 18 fn. 11). Instead of assigning raised numbers 
for morpheme classes, we assign a number for each boundary between 
morpheme classes: .4^ -An = 'A'' instead of ^4' -An = A'' (darkness sub- 
stitutable for light). These numbers are considered as part of the en- 
vironment of the morpheme classes in question, on a par with the other 
morpheme classes which constitute the neighbors of the class in question, 
and the position of our class relative to these others, and the intonation
or stress or other contour under which our morpheme class occurs. 
Whenever we find that assigning a previously-used, lower, number in a 
new equation would make possible substitutions that do not occur, we 
use the next higher number. Thus if /iW^ = W^ (old fellow
substitutable for Senator), we cannot write T-N^ = "^X"^ (the war substitutable
for butter) since this would permit us to construct the non-occurring
AT^X'^ = ^V^ (as though we could substitute old the war for Senator).
Hence we write TW = 'A'^. (The raised-number forms for the first and
last equations here are: .4A'' = X and T'A'^ = A'^) In this way,
successively higher numbers are assigned to various inter-morpheme-class
boundaries. The boundary numbers are related to the raised numbers,
but not identical with them. One of the advantages of the boundary
numbers is that they indicate on which side the sequence is reaching a higher
construction level (as in 'A'^, which shows that the noun phrase is closed
on the left, since no part of the noun phrase can precede the article T, but
not on the right, where an adjectival phrase such as from Washington can
still be added).


16.22. Analysis of the Complete Corpus 

As in the case of all the other procedures, the operation of chapter 16
can be worked out most conveniently in any particular corpus only when
it is worked out for the whole corpus. True, the stating of equations can
be done for any substitutable sequences, without regard to the other se- 
quences of the corpus. But the determining of the smallest number of 
different raised numerals (16.21) necessary for each class can only be 
done by taking all the substitutions of the corpus into consideration. 
The use of this method for the analysis of 16.3 will also usually require 
consideration of the whole corpus of material in the language in ques- 

16.3. Sequence Substitution as a Morphologic Tool 

The operation of chapter 16 expresses many of the most wide-spread, 
and, from the point of view of a systemic description, most structural, 
relations among morpheme classes. Therefore this procedure makes it 
possible to treat some of the more complicated and apparently aberrant 
morpheme relations. 

16.31. Exceptionally Limited Morphemes 

In some cases we find a class of morphemes which occurs only with 
another class, which in turn occurs only with the first class. For example, 
the wh- of why, when, where, which, is clearly a separable morpheme (cf.
then, there), and occurs in a fairly large number of positions (at the be- 
ginning of certain questions and of X, before or after T', etc.) But when- 
ever it occurs it always has one of a very few other morphemes (-en, -ere, 
-ich, etc.) after it. These morphemes, in turn, occur only after wh- and 
after a few other morphemes (chiefly th-, which is not in the same class 
as wh- because its utterance position is different), so that they too form 
a small special class occurring in very limited sequences. 

While the procedure of chapter 15 made it possible only to state the 
membership and distribution of these restricted classes, the equations of 
chapter 16 serve as crutches on which to support an analysis of the re- 
stricted classes in terms of the other equations of the language. By 
analyzing the sequences in which these morphemes occur, we are able to 

^19 For a sketchy outline of substitutions in a whole corpus see Z. S.
Harris, From Morpheme to Utterance, Lang. 22.161-183 (1946) (for
English and Hidatsa); review of Emeneau's Kota Texts, Lang. 21.283-9
(1945) (for Kota); Structural Restatements I, Int. Jour. Am. Ling. 13.47-58
(1947) (for Eskimo, Yawelmani); and the Appendix to 16.22 (for
Moroccan Arabic).


show the equivalence of their sequences to other sequences (composed 
of previously classified morphemes), and so the equivalence of their con- 
stituent morphemes to the previously recognized morpheme classes. 

The general technique used here is as follows: given a morpheme se- 
quence X (e.g. this) the parts of which have not been assigned to any 
otherwise known classes, we find what other morphemes or morpheme 
sequences YZ (e.g. TA) can replace it in the utterances in which it
occurs: X = YZ. We may then take the component morphemes a, b of the
sequence X and say that a is a member of the class Y, and b of the class Z.
As a result X is no longer unique: it is a sequence of members a, b of
known classes Y, Z, occurring in a sequence in which these classes are 
known to occur. 

The disadvantage in the latter part of this technique lies in the fact 
that the a, b, analyzed out of X usually occur only in this one sequence or 
in a very few more ; so that whereas in general we might expect any mem- 
ber of Y to occur before almost any member of Z, we find a restricted to 
b and b to a. E.g. if that is divided into th- in T and -at in N, we find
almost any T occurring before any N: a house, some houses, some streams,
etc. but th- only before -at and a few others, and -at only after th-. 
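The bookkeeping involved in this technique can be sketched briefly. The sketch assumes the equation X = YZ has already been established by substitution; analyze_unique and LEXICON are hypothetical names, and the listed class members are only the examples mentioned above.

```python
def analyze_unique(components, class_sequence, lexicon):
    """Assign each component of a unique form to the matching class of X = Y Z."""
    if len(components) != len(class_sequence):
        raise ValueError("one component is needed per class in the sequence")
    # Copy the lexicon so the original class lists are left untouched.
    updated = {cls: set(members) for cls, members in lexicon.items()}
    for morpheme, cls in zip(components, class_sequence):
        updated.setdefault(cls, set()).add(morpheme)
    # The new members occur only with each other; record that restriction.
    restriction = tuple(components)
    return updated, restriction


LEXICON = {"T": {"a", "some"}, "N": {"house", "stream"}}
lexicon, restriction = analyze_unique(("th-", "-at"), ("T", "N"), LEXICON)
print(sorted(lexicon["T"]), sorted(lexicon["N"]), restriction)
```

The recorded restriction is the cost noted above: the new members join major classes, but a separate statement must still say that th- and -at are limited to each other.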

16.32. Morphemic Resegmentation 

The method of 16.31 can be applied to a reconsideration of the mor- 
phemic segmentation of chapter 12, if we permit X to be not only a mor- 
pheme sequence but also a single morpheme. In some cases, the opera- 
tion of chapter 12 leaves us with some particular morpheme which can 
be assigned to no class, or with a class containing just a few morphemes 
which differ distributionally from all other classes. If now the operation 
of chapter 16 shows that this morpheme or small class is substitutable 
for some sequence of other, more general, morpheme classes, there may 
be some advantage in dividing the unique morpheme (or each mor- 
pheme of the small class) into several new morphemes each of which 
will be considered a member of the corresponding class in the class se- 
quence which equals the unique morpheme.^20 

Thus in Moroccan Arabic, dial 'belonging to' occurs after N^ and 
before S: lktab dialu 'his book' (cf. Appendix to 16.22). No other single 

^20 This constitutes a reconsideration of the segmentation of chapter 12, 
assigning some of the phonemes of the dependent sequence (which con- 
stituted the unique morpheme) to one new morpheme which is a member 
of one class, and other phonemes of the sequence to another new mor- 
pheme which is a member of another class. These new morphemes will 
not represent independent phonemic sequences. 


morpheme occurs in this position. We now ask what sequence of mor- 
phemes occurs in this position, and find lli P, as in lktab lli ʕndu 'the book 
that (is) with him'. We then divide dial into two morphemes, a relative 
which enters into one morpheme class with lli (which may be marked 
as class D) and a preposition which becomes a new member of P by the 
side of ʕnd, etc.^21 

There are, of course, obvious disadvantages to this resegmentation. In 
chapters 12, 17 we take special limitations of concurrence among ele- 
ments, and try to express them by including the phonemic sequences 
which occur together in one morpheme segment. Here we would be taking 
a single morphemic segment and breaking it up into two phonemic se- 
quences, two morphemic segments, which only occur together. We are 
replacing here a single morpheme which constitutes a unique class by 
two morphemes which are members of major classes, but which have a 
special limitation of occurrence between them. Much the same advan- 
tages and disadvantages are involved in the comparable work of 16.31, 
except that there the morphemic segmentation had already been made, 
so that we were in any case faced with the need for a statement of re- 
stricted occurrence among the segments. 

Whether X is a single morpheme (16.32) or a sequence (16.31), we 
have, then, the explicit choice of merely stating that X = YZ (e.g. 
this = TA; dial = DP), or of identifying the parts of X with the equated 
sequence and putting th- in sub-class Ta and -is in Aa, and then saying 
that TaAa = TA (or dia with its alternant d is Da and l is Pa, and 
DaPa is substitutable for DP).^22 The latter is the more useful, the more 
members we have in the sub-classes which are involved in the special rela- 
tion of occurring together (e.g. TaAa), or the greater the number of simi- 

^21 When we try to decide how to divide dial into these two parts, we 
notice that the second part can be taken so as to be identical in form with 
a known preposition l- 'to', thus leaving dia- as a relative 'which is' identi- 
fiable as an alternant of another member of D, the morpheme d- 'that of' 
which occurs in a very few environments. We thus combine {d} 'that of' 
+ {l} 'to' + {u} 'he' to obtain dialu 'which is to him, his'. It happens that 
earlier periods of Arabic, and other dialects of Arabic today, have cog- 
nates of this relative, which has here been isolated on purely descriptive 
systemic grounds. 

^22 The classes Ta and Aa are useful only for this one equation, to make 
it clear that th- (in Ta) does not occur before any morpheme (any A) 
which is not in Aa. Once the sequence of th- and -is has been stated by 
this equation, the Ta and Aa can be disregarded, for the sequence of 
th- and -is is not restricted in the way the component parts had been; the 
sequence occurs wherever TA occurs. 


lar equations which we would have to deal with if we do not break them 
up (this = TA, what = TA, etc.).^23 

16.33. Indicating Differences among Utterances 

Any two utterances which are not descriptively equivalent differ from 
each other in morphemic content. This difference can be readily recog- 
nized in terms of the morpheme index. Some utterances, however, also 
differ in a less easily recognizable respect: e.g. when a morpheme or 
sequence in one utterance is a member of a different class than when 
the same morpheme or sequence occurs in the other utterance.^24 

We take, for example, the utterance She made him a good husband be- 
cause she made him a good wife. We know that there is a difference in 
meaning between the two occurrences of made; and since we know this 
without any outside information beyond hearing the sentence, it follows 
that indication of the difference in meaning and in construction can be 
derived from the structure of the utterance. The difference is not in the 
morpheme made, since the two occurrences are identical in form, and 
must therefore be in the class membership of made in the two cases. But 

^23 If we do not break the morphemes up, and obtain a number of simi- 
lar equations such as that = TA, what = TA, then the procedure of the 
Appendix to 16.4 would suggest that we put that, what, etc. into a single 
class for the purposes of the TA position, even if in other positions they 
do not replace each other and enter into different equations: (e.g. what 
cannot be substituted for that in the plan that our group proposes). 

^24 This is possible because the classes of 15.4, and any morpheme classi- 
fication used for chapter 16, are classes of morphemes-in-environments, 
so that a single morpheme or morphemic sequence (or in any case a 
single phonemic sequence) may in one environment be a member of one 
class and in another environment a member of another class. This is only 
partial overlapping of morpheme classes, and given two different ut- 
terances containing the same morpheme we can tell to which class the 
morpheme belongs in each utterance by noting the different environ- 
ments. In the case of I can tell my horse's (or: horse is) running (Appendix 
to 15.32), there is a segment /əz/, member of a morpheme {'s}, member of 
class Na, and a segment /əz/, member of a morpheme {be}, member of 
class V. However, here we have complete overlapping, and we cannot tell 
which morpheme, of which class, occurs because the environment is iden- 
tical. We could, of course, carry out substitutions in the manner of 16.33, 
and if we are satisfied that the utterance has not changed except for our 
substitutions we may find that our informant will accept either I can tell 
the running of my horse as an equivalent, or else I can tell that my horse is 
running. But we could never distinguish, except in such terms as infor- 
mant to equivalents, which morpheme occurred in the original 


the class membership must be recognizable from the different class se- 
quences and their substitution in the two utterances. 

We therefore proceed to analyze the utterance, going backward along 
the equations as far as will be necessary to reveal the difference.^25 First, 
we know that the utterance is a case of N^4 V^4 & N^4 V^4 = N^4 V^4. At this 
stage the two halves of the sentence are still identical. Each V^4 has the 
structure V^ (make) N^4 (him) N^4 (a good husband / wife) + Vv (-ed). The 
English analysis as a whole contains two cases of this sequence: Vf N^4 N^4 
= Ve^ (make Harding President), and Vd N^4 N^4 = Ve^ (make my husband 
a party). We cannot tell which each of our V^4 is, and whether both go back 
to the same one, because make is equally a member of Vd and Vf. We 
find, however, that Vd N^4 N_1^4 = Vd N_1^4 Pc N^4 (where the subscript num- 
bers are used merely to identify the N which has different positions in the 
two sequences). We try now to see whether either V^4 in our utterance has 
the Vd N^4 N^4 structure by applying to each the substitution which is pos- 
sible for Vd N^4 N^4. To do this we interchange the two N^4 and insert a Pc 
between them. In the first V^4 we get a meaningless utterance which does 
not occur in our corpus: she made a good husband (N_1) for (Pc) him (N) 
instead of she made him (N) a good husband (N_1). In the second, however, 
the substitution merely gives us an equivalent utterance which we have 
in our data: she made a good wife (N_1) for (Pc) him (N) instead of she 
made him (N) a good wife (N_1). Clearly, then, the second V^4 = Vd^ N^4 N^4 
+ Vv = Ve^ + Vv. Since the first V^4 does not equal this, it can only 
equal the remaining V N N construction, namely Vf^ N^4 N^4 + Vv = Ve^ + Vv.^26 

^25 Morpheme classes used here, other than those listed in fn. 3, 32, and 
59 are: Vf verbs which occur before N^4 N^4 (two independent noun 
phrases): make, consider, want (but not buy, go) as in I'll — this book a 
best seller. Ve^ is equivalent to Vf^ N^4 N^4 (e.g. make them members substitut- 
able for join in We're going to — for this calendar year.) Pc indicates those 
prepositions (P) between two N's which sometimes alternate with zero 
when the N which follows the P and the N which precedes the P exchange 
positions. I.e., when we have N_1 P N varying freely with N N_1 we say that 
the P in question is a member of the sub-class Pc: e.g. to, for in They're 
giving a present to the boss are replaced by zero in They're giving the boss a 
present. The sections following They in these two utterances are 
Vd N_1^4 Pc N^4 and Vd N^4 N_1^4 respectively, and are substitutable for Ve^. 

^26 We can check this by noting that if in the first V^4 we substitute a 
verb which is not a member of Vf we get a sequence which hardly ever 
occurs, and whose meaning is not changed by the N_1 Pc N substitu- 
tion: She bought him a good husband would not differ in meaning 
from She bought a good husband for him. But if we try another member of 
Vf, e.g. consider, we find again that the substitution gives a meaningless 


We have thus found that the two halves of the original utterance are 
formally different in the substitutions which can be performed upon 
them (note, also, the alternative analysis in chapter 17, fn. 12).^27 The 
method of working was to discover the class membership of the mor- 
pheme in question in each environment by expressing the environments 
in terms of their classes, and then seeing which substitution equations of 
chapter 16 applied in each case. 
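
The substitution test just described can be sketched as follows. This is an illustration only: the set ACCEPTED is a hypothetical stand-in for the corpus and the informant's judgments of equivalence, the function names are invented for the sketch, and only the two candidate classes Vd and Vf of the text are considered.

```python
# Utterances accepted as equivalents of the original (hypothetical stand-in
# for what the corpus and the informant would supply).
ACCEPTED = {
    "she made a good wife for him",
}


def pc_variant(subject, verb, first_n, second_n, pc="for"):
    """Interchange the two N after the verb and insert a Pc between them."""
    return f"{subject} {verb} {second_n} {pc} {first_n}"


def verb_subclass(subject, verb, first_n, second_n, accepted=ACCEPTED):
    """Return 'Vd' if the N Pc N variant is an accepted equivalent, else 'Vf'."""
    variant = pc_variant(subject, verb, first_n, second_n)
    return "Vd" if variant in accepted else "Vf"


first_half = verb_subclass("she", "made", "him", "a good husband")
second_half = verb_subclass("she", "made", "him", "a good wife")
```

Here first_half comes out as Vf and second_half as Vd, which is the difference in class membership that the text derives for the two occurrences of made.
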

16.4. Result: Classes of Substitutable Morpheme Sequences 

We now have new morphological elements, each a class of sequences 
of morpheme classes (including single classes) which can substitute for 
each other in any environment whatsoever. The most inclusive elements, 
those to which the greatest number of different morpheme-class se- 
quences can be equated, are represented by the highest-numbered sym- 
bols of each class. E.g., as between N^ and N^, the latter represents more 
sequences (all those of N^ and others besides) and would therefore be 
taken here as the new morphological element, replacing the N ( = N^1) 
of chapter 15.^28 Classes like English A, all the occurrences of which are 

utterance, or in any case one of highly altered meaning: She considered 
him a good husband as against she considered a good husband for him. Verbs 
in Vf are therefore verbs which involve obvious change in meaning when 
the N_1 Pc N substitution is imposed upon them; verbs not in Vf do not 
involve any reportable change in meaning under that substitution. 
Therefore the made in made him a good husband functioned as a member 
of Vf. 

^27 Objection might be made at this point that the potentialities of sub- 
stitution cannot be used to distinguish portions of speech; for these 
should be distinguished by their internal structure, independently of 
what substitutions occur in partially different utterances. However, ex- 
perimental work in the psychology of perception, especially that due to 
Gestalt psychologists, leaves little doubt that an utterance is perceived 
not as an independent structure but in its relation to other utterances. 
Therefore any differences in substitution potential which can be recog- 
nized from the structure of an utterance are relevant even to that ut- 
terance alone (and are certainly relevant to the whole language). 

^28 R. S. Wells terms N (up to its highest raised number) the expansion 
of N^1; i.e. the expansion of a morpheme class is the class of all sequences 
which occur in its environments. (See his Immediate Constituents, Lang. 
23.81-117 [1947].) The classes of chapter 15 were definable extensionally 
by a list of morphemes and intensionally by environments (any environ- 
ment in which all the stated morphemes occurred) : the classes of chap- 
ter 16 are definable extensionally by a list of environments and inten- 
sionally by morpheme sequences (any morpheme or sequence which 
occurs in the stated environments). 


included in equations for some other symbol (N), are no longer counted 
among the morphological elements. 

The procedure of 16.2 has indicated the equivalence of many sequences 
to single morpheme classes, written on the right side of the equations. It 
will in general be found that very few morpheme classes remain on the 
right side of the equations, without being included in some sequence 
which is equated to some other morpheme class. We thus come out with 
a few classes (each having its highest raised number, by 16.21), e.g. 
N^4, V^4, D, and several contours, to some one or another of which every 
morpheme sequence is equivalent. Any utterance can be described as a 
sequence of these few remaining classes, since any sequence in the ut- 
terance can be equated to one or another of these: These hopeful people 
want freedom is NV because these is TA, hopeful is N Na = A, freedom 
is A An = N, and TAAN = TAN = TN = N, and VN = V (see it 
for see in I — now). 
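
The reduction of These hopeful people want freedom to NV can be followed step by step in a short sketch. The assumptions are several: the raised numerals are ignored, the rule list contains only the equations cited in this paragraph, and the order in which rules are tried is a choice made for the sketch rather than part of the procedure.

```python
# Equations cited in the paragraph, written as (sequence, resultant).
# Raised numerals are omitted; the ordering is an assumption of this sketch.
RULES = [
    (("N", "Na"), "A"),   # hopeful
    (("A", "An"), "N"),   # freedom
    (("A", "A"), "A"),    # TAAN = TAN
    (("A", "N"), "N"),    # TAN = TN
    (("T", "N"), "N"),    # TN = N
    (("V", "N"), "V"),    # see it for see
]


def reduce_classes(seq, rules=RULES):
    """Repeatedly apply the first rule that matches, at its leftmost match."""
    seq = list(seq)
    changed = True
    while changed:
        changed = False
        for left, resultant in rules:
            n = len(left)
            for i in range(len(seq) - n + 1):
                if tuple(seq[i:i + n]) == left:
                    seq[i:i + n] = [resultant]
                    changed = True
                    break
            if changed:
                break
    return seq


# these (T A), hopeful (N Na), people (N), want (V), freedom (A An)
utterance = ["T", "A", "N", "Na", "N", "V", "A", "An"]
reduced = reduce_classes(utterance)
```

Each rule shortens the sequence by one class, so the loop always terminates; for this utterance it stops at the two classes N, V.
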

These few classes are classes of morpheme sequences (including single 
morphemes as a special case) rather than of morphemes: N now repre- 
sents not only these hopeful people but also freedom and the industrial 
workers. They therefore represent a segmentation of the utterance into 
larger parts, each of these parts containing an integral number of mor- 
phemic segments.^29 There are as few or fewer of these morpheme-class- 
sequence parts than morphemes in an utterance, and fewer distinct class- 
sequence elements in the corpus than distinct morpheme classes. In addi- 
tion to the segmentations of utterances into phonemic and into mor- 
phemic segments (as immediate constituents, see 16.54), we thus have a 
derived segmentation into these major morpheme-sequence elements, 
which can in turn be segmented into the morphemes included in the se- 

In defining these morpheme-sequence elements, the equations of chap- 
ter 16 have indicated a great number of the special relations of selection 
among morpheme classes. The fact that free (in Aa) occurs before -dom 
while true (in Ab) occurs before -th is indicated by Aa -dom = N, Ab -th = 
N; the equations do not recognize any sequence of Ab and -dom or Aa 

^29 I.e. in general no utterance has a morphemic segment which belongs 
partly to one of these major segmentations of the utterance and partly 
to another, as if in the last utterance example there were some morpheme 
which was included partly in the N and partly in the V. The only pos- 
sibility of this would come under 16.32 above. We could also say that 
Moroccan Arabic S = I -{- A'^ (Appendix to 16.22), since we have 
S = N\ and IN' = A^l 


and -th. The N's which result from the two equations are, however, iden- 
tical. No distinction is made between the positions of freedom and of 
truth: wherever one occurs the other can be substituted, even though 
true cannot be as freely substituted for free (e.g. before -dom). The equa- 
tions therefore express the restrictions of concurrence among morpheme 
classes, and are limited by them; but having expressed them, the equa- 
tions make it unnecessary for us to consider these restrictions in any of 
our further work.^30 The resultant classes no longer have the restrictions 
of the classes of whose sequences they are composed. 

16.5. Relation of Class to Sequences Containing It 

Generalizations useful for the constructions of 17.5, and for other 
purposes, can be obtained from the equations of chapter 16 if we consider 
the occurrence of each class relative to the sequences in which it is con- 

16.51. Resultant Class Differing from Sequence Classes 

In XY = Z, where a sequence of two (or more) classes is equivalent 
to some other class, we may say that Y changes the utterance position^31 
of X into that of Z. This way of talking is useful when Y is in some sense 
secondary to X, e.g. when Y never occurs except in this equation, where- 
as X occurs in various other equations, too. Thus in A An = N (darkness 
substitutable for dawn), or N Na = A (boyish substitutable for large), 
it is convenient to say that the addition of An permits A to occur in N 
position, or that Na changes N into A as regards utterance position.^32 

Various generalizations may be possible here. For example, in English 
there are bound morphemes that transfer each of N, V, A into each of 
the others: N -al = A (industrial), en- N = V (enshrine), N -ize = V 
(ionize), V -t = N (portrait), V -able = A (agreeable), A -ness = N (sly- 
ness), A -en = V (lighten); but for D we find that only A can be trans- 
ferred into that class (by the morpheme -ly): A -ly = D (really). 

^30 These restrictions can also be indicated graphically by the method of 
the Appendix to 19.31. 

^31 Or 'syntactic function', or status in respect to the utterance struc- 
ture. Sequences of the type XY = Z are called exocentric constructions. 

^32 In XY = X, we could call X primary and Y secondary. Similarly, 
we can say that in Vd^ N^4 = Ve^ (lay it substitutable for lie) it is N^4 that 
changes Vd^ into Ve^. In this case, it is not that N^4 does not occur other- 
wise, but that Vd^ and Ve^ are both sub-classes of V, and that the ut- 
terance position in which Vd^ N^4 and Ve^ occur is a position occupied only 
by sub-classes of V with or without additional classes like N^ or D. For 
considerations of primacy, cf. J. Kuryłowicz, Dérivation lexicale et déri- 
vation syntaxique in Bulletin de la Société de linguistique de Paris 37.79- 
92 (1936). 
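
The class-changing affixes listed in 16.51 can be tabulated to check the generalization about D. The following is only a sketch: the table repeats the equations and examples of the text, while the tuple layout and the function name sources_by_target are choices made for the illustration.

```python
# (source class, bound morpheme, resultant class, example), as cited in 16.51.
TRANSFERS = [
    ("N", "-al", "A", "industrial"),
    ("N", "en-", "V", "enshrine"),
    ("N", "-ize", "V", "ionize"),
    ("V", "-t", "N", "portrait"),
    ("V", "-able", "A", "agreeable"),
    ("A", "-ness", "N", "slyness"),
    ("A", "-en", "V", "lighten"),
    ("A", "-ly", "D", "really"),
]


def sources_by_target(transfers):
    """Map each resultant class to the set of classes transferred into it."""
    table = {}
    for source, _affix, target, _example in transfers:
        table.setdefault(target, set()).add(source)
    return table


sources = sources_by_target(TRANSFERS)
```

In the resulting table each of N, V, A is reached from the other two, whereas D is reached from A alone.
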

16.52. Resultant Class Identical with One of the Sequence Classes 


In XY = X we may say that Y has zero status in respect to the ut- 
terance structure. Such is the status of A in AN = N (fine piano substitut- 
able for piano), but not in TA = N (the blue for dresses in I prefer —) or 
in A An = N. Such also is the status of Aa in A Aa = A (youngish sub- 
stitutable for young), or of Pb as in V^ Pb = V^ (walk off substitutable for 
walk), or of not as in R not = R (will not substitutable for will). The class 
D occurs in many environments and has zero status in all of them: any 
utterance or part of utterance containing D can be matched by an other- 
wise identical sequence not containing D (I want it for I want it badly). 

In some cases there is a sequence of morphemes that has zero status. 
E.g. in N^ P N^4 = N^ (piece of junk for auto) and V^4 P N^4 = V^ (travel in 
this place for travel) we may say that P N^4 has zero status.^33 Similarly 
to V^ has zero status in V^ to V^ = V^ (tried to escape replaceable by 
tried or escaped). 
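
The distinction between the equations of 16.51 and those of 16.52 can be put as a small sketch. Raised numerals are ignored here, and the function names are invented for the illustration; the terms endocentric, exocentric, and head are those given in the footnotes to these sections.

```python
def construction_type(left, resultant):
    """'endocentric' for XY = X, 'exocentric' for XY = Z."""
    return "endocentric" if resultant in left else "exocentric"


def zero_status(left, resultant):
    """Members left over once one occurrence of the head is removed.

    Returns an empty list for an exocentric sequence, which has no head.
    """
    if resultant not in left:
        return []
    rest = list(left)
    rest.remove(resultant)  # drops the first occurrence of the head class
    return rest


examples = {
    "A N = N": construction_type(("A", "N"), "N"),
    "A An = N": construction_type(("A", "An"), "N"),
    "T A = N": construction_type(("T", "A"), "N"),
    "R not = R": construction_type(("R", "not"), "R"),
}
```

For N P N = N the function zero_status returns P, N, and for V to V = V it returns to, V, matching the sequences which the text describes as having zero status.
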

16.53. All Sequences Containing a Class 

The comparison of all the sequences containing a particular class per- 
mits various generalizations concerning that class. The class in question 
may turn out never to occur by itself on the right hand side of the equa- 
tions of chapter 16 (i.e. to be replaceable by no sequence, e.g. the classes 
An or &). It may occur last in all its sequences, or in all of a certain 
group of sequences. It may occur in only one sequence (e.g. An), or be 
secondary in all the sequences in which it occurs (e.g. P). The class may 
be such that each time it occurs in a sequence it is also the resultant of 
that sequence (i.e. it may always occur as the X of XY = X). 

When we match together particular sequences in which a given class 
occurs, we may be able to derive additional information concerning the 
status of the class in one or both of the sequences. Thus we have N^4 Vd^ N^4 
= an NV utterance (He fixed it or He fixed the clock) and N^4 N^4 Vd^ = 
N^ (the clock he fixed). Since there is no N^4 N^4 Vd^ N^4 (the clock he fixed it, 
without comma, does not occur), we may say that the first N^4 of 

^33 Or even that the P annihilates the status of the N^4 which it would 
otherwise have had. Sequences of the type XY = X are called endocen- 
tric constructions, and the X in XY is then called the head of the con- 
struction. 

N^4 N^4 Vd^ has the same status as the last N^4 of N^4 Vd^ N^4 in respect to 
the rest of these sequences (i.e. to the N Vd),^34 even though the sequences 
as a whole have different statuses in respect to the utterance. 

We may find that in one language there are certain large morpheme 
classes each of which occurs only in sequences equated to a corresponding 
position class (e.g. a language with different noun stems occurring in 
noun position and verb stems in verb position). In another language 
there may be one large morpheme class which is equated to various posi- 
tion classes by occurring in sequences with various small morpheme 
classes of bound forms (e.g. Hidatsa, where almost any stem occurs in 
noun position if s is added to it, and in verb position if c is added to it). 
In the latter case we may say that the utterance status is borne by affix 
classes which themselves never equal a position class but operate on other 
classes (stems) which are by themselves positionally neutral. There is no 
noun class in Hidatsa, only a stem class (neither noun nor verb), a 
class of nominalizing suffixes, and a class of verbalizing suffixes. 

Results of value for a compact grammatical description may be ob- 
tained from a consideration of the relations of classes and sequences to 
the raised-numeral inclusive sequence symbols N^1, N^2, N^3, and the like 
of 16.21. 

Some classes may turn out to be included only in a low-numbered se- 
quence symbol: e.g. Vn occurs only in V^ Vn = N^ (abolition substitut- 
able for bread); others are included only in a high-numbered symbol: 
e.g. Vv in V^ Vv = V^ (walked, or tried to escape substitutable for walk). 
We may say that the V^ domain of -ed is greater (in number of possible 
substitutes and often in number of successive morpheme places) than 
the N^ domain of Vn. Similarly, P occurs only with N^ and N^, and may 
therefore be said to apply to a complete noun-phrase.^35 Some classes oc- 
cur frequently as the resultants of sequences (i.e. on the right hand side 
of the equation) and rise to fairly high numbers (e.g. N^); others occur 

^34 We may say that both indicate the object (clock) of the N Vd (he 
fixed). The meaning of parts of an utterance (or of an utterance section) 
relative to the rest can thus be gauged by this comparison technique. 
The method used here is similar to the usual distributional investigations 
in linguistics. If we find a restriction on the occurrence of clock and it, 
such that one or the other occurs within an utterance but not both, we 
can define them as alternants of one element, in this case 'object' of the 
verb. If clock occurs at the beginning or the end, but it only at the end, 
we would place clock into a sub-class which has a broader range of posi- 
tions (but not necessarily a wider distribution) than the sub-class of it. 

^35 I.e. to have a complete noun phrase as its domain. 


rarely as resultants, and require no numbering (e.g. D, all of whose 
equivalents may be substituted for each other.) 

A study of the inclusion numbers at which various new classes enter 
the sequences equated to any particular resultant, e.g. the fact that A 
enters into the sequence equated to N^ while T enters into the sequence 
equated to N^4, will yield a picture of what has been called the incre- 
mental growth of constructions.^36 

16.54. Immediate Constituents 

It is further possible to take all the sequences which equal, say, V^ and 
compare them with all those which equal V^, and so on. In this way we 
can generalize as to what is added to each numbered symbol to obtain 
the next higher number for that symbol. In many cases we would find 
that it is not a unique class, but any one of several classes or sequences, 
that may be added to a symbol in order to obtain its next higher number. 
Hence if we are building up a sequence of classes and stop at any point, 
say when our sequence equals T'^, and then ask what we might add to 
obtain V^, we will often find a great number of possibilities. 

The reverse of this procedure, when we take a given utterance in order 
to equate its successive included sequences to various resultants of chap- 
ter 16, is called the determination, in successive stages, of the immediate 
constituents of the utterance.^37 The operation is generally similar to that 
employed in 16.33. We take the utterance and see what equation most 
simply fits it, i.e. what is the simplest equation^38 such that we can con- 
sider our utterance to be a case of it: e.g. My most recent plays closed down 
is a case of N^4 V^4 = utterance. We then take each member of the se- 
quence (on the left hand side of the equation), and ask for each of these 
what is the simplest sequence which represents the relevant part of our 
utterance and for which the member in question is a resultant (on the 
right hand side): e.g. T N^ = N^4 (T for my) would serve for the first part 
of the utterance, and V^ Vv = V^4 (Vv for -ed) for the second. This opera- 

^36 Leonard Bloomfield, Language 221-2. Such comparisons will also 
show the relative ranks of various classes toward closure of the construc- 
tion: cf. Otto Jespersen, Philosophy of Grammar. 

^37 Leonard Bloomfield, Language 161, 209. 

^38 Simplest is used here to mean not containing material which could 
be equated, on the basis of other equations, to the portions of the simplest 
equation. E.g. the utterance given here could be considered a case not 
only of N^4 V^4 = utterance, but also of T N^ V^4 = utterance. However, 
the latter is not necessary, since we have an equation for English which 
states T N^ = N^4. 


tion is repeated until the members of the sequences represent the indi- 
vidual morphemes of the utterance.^39 Thus V^ (close down) would be 
analyzed by the equation V^ Pb = V^, and N^ (most recent plays) would 
be analyzed first into N^ -s = N^, then into A N^ = N^; finally A (most 
recent) would be analyzed into DA = A. In this analysis, the constituents 
of our N^4 V^4 utterance were: 

at the first stage, N^4 and V^4 
at the second stage, T and N^; V^ and Vv 
at the third stage, T, N^, and -s; Vv, V^ and Pb 
at the fourth stage, T, A and N^, -s; Vv, V^ and Pb 
at the fifth stage, T, D, A and N^, -s; Vv, V and Pb.^40 
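
The five stages listed above can be regenerated from a constituent tree by expanding every constituent once per stage. The sketch below makes several assumptions: the tree is written out by hand from the equations cited for My most recent plays closed down, the raised numerals are omitted, and constituents are listed in constituent order rather than in the order of the text (which names Vv first).

```python
# Each node is (class label, list of child nodes); a leaf has no children.
TREE = ("utterance", [
    ("N", [                                   # my most recent plays
        ("T", []),                            # my
        ("N", [                               # most recent plays
            ("N", [                           # most recent play
                ("A", [("D", []), ("A", [])]),    # most recent
                ("N", []),                        # play
            ]),
            ("-s", []),
        ]),
    ]),
    ("V", [                                   # closed down
        ("V", [("V", []), ("Pb", [])]),       # close down
        ("Vv", []),                           # -ed
    ]),
])


def stages(root):
    """Yield the class labels of the constituents at each successive stage."""
    frontier = root[1]
    while True:
        yield [label for label, _children in frontier]
        if all(not children for _label, children in frontier):
            return
        expanded = []
        for node in frontier:
            # A node with children is replaced by them; a leaf is carried over.
            expanded.extend(node[1] or [node])
        frontier = expanded


all_stages = list(stages(TREE))
```

The first stage is N, V and the fifth is T, D, A, N, -s, V, Pb, Vv, with classes that cannot be divided further simply carried over from one stage to the next.
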
The basic operation in analyzing any stretch of speech into its imme- 
diate constituents is similar to that of 16.31. First, we determine by 
means of substitution what is the status of the given stretch in respect to 
the utterance (or to the succession of utterances in the speech) : e.g. given 
the stretch gentlemanly, we determine that it is a case of A from the fact 
that it is replaceable by fine, narrow-minded, etc. in He's a — fellow, etc. 
Then we inspect the equations of chapter 16, obtained for the language 
in question, in order to see what sequences have A as their resultant 
(i.e. what sequences equal A). We shall not be able to choose among the 

^39 In selecting the appropriate equations from the description of the 
whole corpus (for these English equations, see fn. 3 above), we adjust the 
inclusion numbers to satisfy the chain of equations. This is based on the 
definition of the numbers, which indicates that numbers on the left hand 
side of the equation represent themselves or any lower number. There- 
fore, V^3 Vv = V^4, which is one of the equations of our English analysis, 
represents V^1 Vv = V^4 and V^2 Vv = V^4 as well as itself. Since our next 
equation, analyzing close down will have as its resultant not V^3 but V^ 
(i.e. close down does not equal V^3), we select V^ in our present citing of 
V^3 Vv = V^4, and therefore cite it as V^ Vv = V^4. 

^40 If we use dots between class markers, with a greater number of 
inter-class dots to represent immediate constituent divisions at an earlier 
stage, we can indicate all five stages of successive subdivision of this 
utterance as follows: 

T :: D . A : N^ :. -s :.: V^ . Pb : Vv 

For such use of the varying numbers of dots, see W. V. Quine, Mathe- 
matical Logic. For a general discussion of the methods of analysis into 
immediate constituents, see R. S. Wells, Immediate Constituents, Lang. 
23.81-117 (1947). Note that it is possible in most cases to arrange the 
constituents in the order in which the morphemes they represent occur 
in the utterance. However, this is not always possible: e.g. the Vv mor- 
pheme in the utterance occurs between the V^ and the Pb; one might 
say that it is in this case an infix rather than a suffix of the verb phrase. 


various sequences unless we know the class of each morpheme in the 
given stretch (since by the preceding paragraph, we select a sequence 
which represents the succession of morpheme classes in this stretch). 
Therefore, it is necessary to consider the individual morpheme classes: in 
the case of gentlemanly, these are A (gentle), N (man), and either Ad or 
Na (ly occurs in both of these classes).^41 Division into gently and man as 
immediate constituents is excluded because D N ≠ A. Taking gentle and 
manly as the constituents is not satisfactory because although A A = A, 
that equation is valid only for certain stress patterns over the two A, 
e.g. ' — 1 — (He's a polite young fellow), but not when the first A is loud 
stressed and the second zero stressed (as would have to be the case in 
gentlemanly fellow). Taking gentleman and ly as the constituents is satis- 
factory because N Na = A, and the loud-zero stress pattern occurs for 
this sequence.^42 
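
The choice among the three candidate divisions of gentlemanly can be sketched as a lookup against the relevant equations. The stress labels "loud-zero" and "other", the table layout, and the function name are hypothetical conveniences for the sketch; only the equations A A = A and N Na = A, and the absence of any equation D N = A, come from the text.

```python
# (left class, right class): (resultant, stress patterns under which the
# equation is taken to hold). The stress labels are invented for this sketch.
EQUATIONS = {
    ("A", "A"): ("A", {"other"}),        # valid, but not with loud-zero stress
    ("N", "Na"): ("A", {"loud-zero"}),   # loud-zero stress occurs here
}

# Candidate immediate constituents with the classes of their two members.
CANDIDATES = {
    "gently + man": ("D", "N"),
    "gentle + manly": ("A", "A"),
    "gentleman + ly": ("N", "Na"),
}


def is_satisfactory(classes, target, stress):
    """True if some equation gives the target class under the observed stress."""
    entry = EQUATIONS.get(classes)
    if entry is None:
        return False  # no such equation, e.g. D N does not equal A
    resultant, allowed_stress = entry
    return resultant == target and stress in allowed_stress


satisfactory = [
    name for name, classes in CANDIDATES.items()
    if is_satisfactory(classes, "A", "loud-zero")
]
```

Only gentleman + ly survives, the first candidate failing for lack of an equation and the second on the stress condition.
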

Appendix to 16.1: Why Begin with Morpheme Classes? 

The procedure of chapter 16 utilizes certain relations among the mor- 
pheme classes of chapter 15. In 15 we were able to express, by classifica- 
tion, only one relation among morphemes: substitutability in certain ut- 
terances in which the morphemes occurred. Only on that basis were both 
large and small included in the same class, A. We were unable to indicate 
if one morpheme occurs next to a particular other one; and if two mor- 
phemes substituted for each other in only some of their utterances, we 
had no scale in terms of which to describe and analyze the utterances to 
which the substitution was restricted. Relations of this type are involved 
in the sequences of morpheme classes which are recognized by chapter 16 ; 
every statement of 16.2, such as that A + ly is substitutable for D, in- 
dicates a relation among morpheme classes (between any morpheme of 
one class and any one of the other) : e.g. that A occurs at least sometimes 
next to ly; and that ly occurs in environments which are identical with 
those of D except that A precedes it. Other relations too can be stated 
among morpheme classes, some of which will be indicated in 17-8. But 
the statements of chapter 16 are by themselves, without the addition of 
17-8, sufficient to indicate where each morpheme class occurs in every ut- 
terance. From the statements of chapter 16 alone we shall be able to learn 

^41 Any two of these occur together as an utterance: gently A Ad = D; 
manly N Na = A; gentleman AN = N. For simplicity, we omit here the 
occurrence of man in V (man the ships). 

^42 Had all the possibilities been ruled out, we would have had to accept 
the three morphemes as equal immediate constituents. 


what sequences of morphemes, and what utterances, do and do not occur 
in the language. 

The question might be raised why our procedure begins with mor- 
phemes, rather than with some larger sections of utterances which might 
in turn be composed of morphemes. E.g. rather than discuss the position 
of ly in utterances, why not discuss the position of words like newly, 
completely, etc. The answer is that the statements of chapter 16 in many 
cases recognize these larger sections, such as words, but, instead of taking 
them ready-made, lead up to them in the course of considering sequences 
of morphemes: e.g. the sequence A (new) + ly. However, for the pur- 
poses of chapter 16 it is pointless to distinguish between the utterance 
positions of morphemes and of words in cases where a morpheme and a 
word have identical utterance positions. Thus Moroccan Arabic xuia 
'my brother' and ixu diali 'my brother' are substitutable for each other 
in any utterance in which either occurs. There is no reason to distinguish 
distributionally between them. There are, of course, differences between 
the two sequences, in the morphemes or morpheme classes of which they 
are composed, in the stress, in the bound or free occurrence of their parts 
(e.g. ia is never stressed and never constitutes an utterance by itself, 
which is not the case for diali). But such differences are apparent from 
the morpheme classes involved in each sequence and from the facts stated 
in 17-8 about each morpheme class; it is not necessary to repeat these
differences in chapter 16. 
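
The distributional test used in this paragraph can be sketched in a few lines of Python. The sketch assumes a corpus given as a list of utterances, each a list of morphemes, and treats two forms as substitutable when the sets of environments in which they occur are identical. The function names and the toy corpus (with placeholder frame morphemes f1, f2, f3) are illustrative inventions, not attested material.

```python
def environments(corpus, form):
    """Collect every (left context, right context) pair in which `form` occurs."""
    form = tuple(form)
    n = len(form)
    found = set()
    for utterance in corpus:
        utterance = tuple(utterance)
        for i in range(len(utterance) - n + 1):
            if utterance[i:i + n] == form:
                found.add((utterance[:i], utterance[i + n:]))
    return found


def substitutable(corpus, a, b):
    """True if `a` and `b` both occur, and occur in exactly the same environments."""
    env_a = environments(corpus, a)
    env_b = environments(corpus, b)
    return bool(env_a) and env_a == env_b


# Invented toy corpus: f1, f2, f3 are placeholder morphemes, not real forms.
corpus = [
    ["f1", "xuia", "f2"],
    ["f1", "ixu", "diali", "f2"],
    ["xuia", "f3"],
    ["ixu", "diali", "f3"],
    ["f1", "f3"],
]

print(substitutable(corpus, ["xuia"], ["ixu", "diali"]))  # True
print(substitutable(corpus, ["xuia"], ["f3"]))            # False
```

A single morpheme and a two-morpheme sequence are compared by the same function, which is the point of the paragraph: nothing in the test depends on whether the form is a morpheme or a word.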

Appendix to 16.2: Morphemic Contours in the Substitutions 

In the consideration of morpheme sequences it is necessary to include 
all suprasegmental morphemes such as intonation and stress contours. 
Thus the {,} morpheme, which consists of a levelled preceding pitch plus
an intermittently present pause, is always present in the equation
NV, & NV = NVD = NV: We asked him, but he wouldn't do it can be
replaced by We asked him unsuccessfully or We asked him. In contrast,
the {,} morpheme is sometimes present and sometimes absent in the
sequence NVBN. (i.e. NV, BN. = NVBN.).*^

The command morpheme {!} is the only contour in whose environment

*' As in I'll kill him if he comes, or I'll kill him, if he comes. The presence
of the comma adds, of course, a note of afterthought, or other meaning
difference, to the BN; but the addition of any morpheme would add
something to the meaning. AN = N does not indicate that the meaning
of good boy is the same as that of boy, but only that when we find one of
these we can substitute the other for it and still have an English utterance.


a member of V (e.g. hurry) is not replaceable by V Vv (e.g. hurried):
e.g. Hurry! Since V plus this morphemic contour also does not occur with
a preceding N,** we may say that TV = A'T'T't'. = XVVv.'' 

Similarly, intonation contours such as ' — ,, — must be noted when they 
occur.** This morpheme, with its rare form ,, — ' — , means that the mor- 
pheme or sequence bearing the reduced stress /,,/ refers to, or in some 
way modifies the meaning of, that bearing the main stress / '/. Many 
equations which contain this modifier morpheme would not hold if that
morpheme were omitted. 'NV,,NV = NVN is exemplified in the substitution
I see you're leaving for I see you. But 'NV 'NV would only
occur with a sentence contour such as {.} or {?} after each half.

Morphemes like the modifier contour may affect the substitutability 
of a morpheme sequence more than do most other morpheme classes. 
E.g. in The stage struck Barrymore we have NVN. The V is replaceable 
by other sequences which have been equated to V, e.g. collapsed under. 
The contour {.} is replaceable by BNV or by ,&NV (or their equivalents):
The stage struck Barrymore as it revolved. The stage struck Barrymore,
and he collapsed. In the stage-struck Barrymore we have 'N,,A.^^
The 'N,,A is replaceable by A: The young Barrymore. The whole sequence
the stage-struck Barrymore may occur with the contour {.} (e.g. in
an announcement, or in an answer to a question), and more frequently 
before or after V: The stage-struck Barrymore kept up the family tradition. 
I saw the stage-struck Barrymore. This sequence cannot replace The stage
struck Barrymore in the utterances given above. In general, sequences 
containing ' — ,, — are replaceable by one morpheme class, usually that 
of the second member*' bearing a single /'/ stress. 

In indicating suprasegmental morphemes in equations, it is useful to 
mark them by some sign which is inserted in the succession of mor- 
pheme-class signs, usually at the point where the domain of the contour 

** N! V! and N, V! occur as in Fred! Hurry! or You, get a move on!
But NV! does not occur.

** Where ' indicates loud stress and ,, reduced loud stress (the * of
G. L. Trager and B. Bloch, The syllabic phonemes of English, Lang.
17.223-46 (1941)).

■•* V-en = A (the broken promises). In stage-struck we have {strike} +
{-en}. In The stage struck we have {strike} + {-ed}.

*' Except in statable cases, e.g. when the second is P: put-up is
supplantable by A (put-up job), push-over by N (It's a push-over.).


ends. For the purposes of the equations, these morphemes can then be 
treated just like the segmental morphemes. 
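
Treating a contour sign like any other class sign makes the equations a matter of plain sequence rewriting. The sketch below assumes an utterance is written as a list of class signs, with the contour signs "," and "." placed where their domains end, and applies the equation NV, & NV = NV from the beginning of this appendix. The helper `rewrite` is a hypothetical name introduced only for this illustration.

```python
def rewrite(seq, lhs, rhs):
    """Replace the first occurrence of the sub-sequence `lhs` in `seq` by `rhs`."""
    seq, lhs, rhs = list(seq), list(lhs), list(rhs)
    for i in range(len(seq) - len(lhs) + 1):
        if seq[i:i + len(lhs)] == lhs:
            return seq[:i] + rhs + seq[i + len(lhs):]
    return seq  # equation not applicable: sequence returned unchanged


# "We asked him, but he wouldn't do it." written, as in the equation,
# with class signs and contour signs in one succession.
utterance = ["N", "V", ",", "&", "N", "V", "."]

# NV, & NV = NV: the comma contour is matched like a segmental morpheme.
reduced = rewrite(utterance, ["N", "V", ",", "&", "N", "V"], ["N", "V"])
print(reduced)  # ['N', 'V', '.']
```

The sentence contour "." is left in place because it is not part of the left-hand side of the equation.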

Appendix to 16.21: Alternative Methods for Non-repeatable Substitutions

The restriction on substitutions discussed in 16.21 can be symbolized 
also by writing that N Nn is implied by N, or includes N, rather than
equals N, if N Nn cannot be substituted for the N of N Nn itself and for
every other N. In that case, N Nn = N does not imply N Nn Nn = N.
If we then come upon a substitution which is indeed repeatable, e.g. A 
in the AN = N above (where the A can be replaced by a sequence of 
several A: good clean for good), we indicate this by an equation instead 
of an implication, or else we use implications throughout but add addi- 
tional ones*^ such as AAN = AN:*^ good clean boy replaces good boy. 

Alternatively, we can set the equations up in a descriptive order. Thus 
we may begin with the equation N-s = N: boys replaces boy in Where 
did the — come from. Here we can substitute the result of the previous 
equation, since it follows from N Nn = N and N -s = N that N Nn -s =
N: boyhoods replaces boy in In time they'll forget their — . However, we
cannot substitute the result of the second equation in the first. We cannot
replace N by N -s in N Nn = N and derive N -s Nn = N (e.g.
boys-hood). ^° This restriction can be indicated by defining a descriptive
order in which the equations may be carried out. We may say that the 
results of each equation may be substituted in any later equation but not 
in any previous one (or in itself). If the results of a number of equations 
may be freely substituted among themselves but not in some other group 
of equations, we may arrange the equations in groups, such that no group 
is substitutable in any preceding group. 
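
The ordering restriction can be made concrete with a small sketch. It assumes that each equation has the form "some sequence containing exactly one N = N", and it lets the N slot of an equation be filled only by the bare N or by resultants of earlier equations. The name `derive` and the tuple representation are assumptions made for the illustration.

```python
def derive(equations, base="N"):
    """List the sequences equal to `base`, respecting the descriptive order.

    `equations` is an ordered list of left-hand sides, each a tuple of signs
    containing exactly one occurrence of `base`.
    """
    forms = [(base,)]  # sequences already known to equal `base`
    for lhs in equations:
        new_forms = []
        for filler in forms:  # only earlier results may fill the slot
            expanded = []
            for sign in lhs:
                if sign == base:
                    expanded.extend(filler)
                else:
                    expanded.append(sign)
            new_forms.append(tuple(expanded))
        forms.extend(new_forms)  # results become available to later equations
    return forms


# Ordered equations: first N Nn = N (boyhood), then N -s = N (boys).
forms = derive([("N", "Nn"), ("N", "-s")])
print(forms)
# [('N',), ('N', 'Nn'), ('N', '-s'), ('N', 'Nn', '-s')]
```

The sequence N Nn -s (boyhoods) is produced, while N -s Nn (boys-hood) and N Nn Nn are not, because no equation receives its own result or the result of a later equation.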

Instead of maintaining a descriptive order of equations, we may adopt 
yet another method of indicating the restriction on substitution of resultants.
We may say that the resultant N (e.g. boys, boyhoods) of

*^ From the equations A N = N and A A N = A N we can derive
sequences of any number of A's before an N, if on the basis of the first
equation we substitute A N for N in the second.

" In cases where many repetitions of a class occur in a sequence, a 
different form of equation may be simpler. In some languages the re- 
striction on the definition of = may in general not be desirable. 

^° There are some morphemes, members of a class other than -Nn, which
do occur after N -s: e.g. ful in hands-ful, etc.


N -s = N is not the same class as the first N (boy) of N Nn = N, since
boys cannot replace boy before -hood. We may indicate this resultant
boys by Q instead of N (where Q indicates both boys and boy, which replace
each other except before -hood, etc.), and write N -s = Q. There
would then be no question of boys occurring before -hood, since what we
have before Nn (-hood) is N, not Q. Only one change in the definition of
our classes is necessitated by adding the new class Q to our list: In the 
classes resulting from 15.3 each morpheme, in general, belonged to only 
one class or other;*' here (somewhat along the lines of 15.4) we must say 
that many members of Q are also members of N, while none of the mor- 
pheme sequences included in Q are. Such an extension of definition of our 
classes (now no longer pure single-morpheme classes) does not restrict 
in any way the use to which they can be put. 

As we proceed, we obtain various other resultant classes which like Q 
contain the members of N as well as various sequences." Thus the sequences
truth, freedom, may be indicated by the equation A An = R.
The new class R includes the whole of N, since truth, freedom, replace life
in Every man seeks — . But the sequence members of R do not occur before
Nn (-hood, -dom), while N does. The whole class R does occur before
-s: truths replaces boys. Therefore, by the side of N -s = Q we also
have R -s = Q, there being no reason why the membership of Q should 
not be expanded to include the results of the latter equation. 

When we now consider AN = N (good boy for boy), we find that we
must also say AR = R (pure truth for truth) and AQ = Q (good boys for
boys). Clearly, the classes N, R, Q, all of which include the original
single-morpheme class N, are going to have identical occurrences in some
environments, and as long as we list them as separate classes we shall
have to make special statements for each of them. On the other hand,
we cannot combine them into one class because there are some environments
in which one and not another of them occurs. This situation, which
arises from the partial substitutability of these classes, can be simply
indicated by marking N, R, Q, and all other classes which include the
membership of N, as sub-classes of an over-all N class. We write N¹ for

*' Only morphemes whose distribution differed from that of others 
and equalled the sum of two or more other classes were placed simul- 
taneously (as 'different' morphemes, if we will) into two or more other 

■'"^ This is equivalent to the statement that these sequences replace N in
some utterance environments but not in others. 


the original N, N^ for both R and Q." and so on. This reduces to the 

method of 16.21. Our equations are now: 

N¹ -Nn = N² (boyhood for boy)

A -An = N² (freedom for boy)"

N¹,² -s = N³ (boys, freedoms for boy)

A N¹,²,³ = N¹,²,³ (good boy, good boys for boy).
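
The effect of the raised numbers can be checked mechanically. The sketch below assumes the equations are read as N¹ -Nn = N², N¹,² -s = N³ and A N¹,²,³ = N¹,²,³, with A -An = N² supplying a noun that starts at number 2. The table `RULES` and the function `build` are illustrative names, and each step either yields the raised number of the resultant or is rejected.

```python
# For each added morpheme class: the raised numbers it may attach to,
# and a function giving the raised number of the resultant.
RULES = {
    "-Nn": ({1}, lambda n: 2),        # N1 -Nn = N2
    "-s": ({1, 2}, lambda n: 3),      # N1,2 -s = N3
    "A": ({1, 2, 3}, lambda n: n),    # A N1,2,3 = N1,2,3 (number unchanged)
}


def build(steps, start=1):
    """Apply the steps in order to a noun of raised number `start`.

    Returns the raised number of the result, or None if some step is not
    licensed by the equations.
    """
    level = start
    for step in steps:
        allowed, result = RULES[step]
        if level not in allowed:
            return None
        level = result(level)
    return level


print(build(["-Nn"]))          # 2     boyhood
print(build(["-Nn", "-s"]))    # 3     boyhoods
print(build(["-s", "-Nn"]))    # None  boys-hood is excluded
print(build(["-s", "A"]))      # 3     good boys
print(build(["-s"], start=2))  # 3     freedoms, from A -An = N2
```

Because -Nn accepts only number 1 and yields number 2, the equation cannot be applied to its own result, which is the non-repeatability that the separate classes Q and R were introduced to express.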

Appendix to 16.22: Morpheme-Sequence Substitutions for Mo- 
roccan Arabic 

Before the operation of chapter 16 can be carried out for Moroccan 
Arabic it is necessary to state the morpheme classes of the language.^* 
Almost all morphemes of Moroccan Arabic can be put into the following 
general classes on the basis of gross similarity of environment: 
R: roots, most of them consisting of three successive but not necessarily 
adjoining consonants, defined by the fact that their phonemes occur 
intercalated with those of Pv or Pn: d'rb 'hit' in dd'arbu 'they fought'. 
Pv: verb-patterns, most of them consisting of successive but not adjoin- 
ing vowels or consonants which are staggered with the root consonants; 
defined by their occurring next to certain few affixes (Va): t-a — 'do 
in common' in dd'arbu 'they fought'. 
Pn: noun-patterns, constructed like Pv, defined by their occurring with
previously-recognized roots; — i- 'adjectival' in h'mis 'fifth'.
N: nouns independently stressed, occurring after prepositions (P), l-
'the', or before plural suffixes: t'dmubil 'auto', bu 'father'.
Va: verb subject affixes, forming one main-stress domain with R Pv: n-
'I am' in nkt^b 'I write', -t 'I did' in ktbt 'I wrote'.
Na: noun affixes, in one main stress with R Pn or N or M R Pv: -a 'fe- 
male', -in 'plural' in lah'ur 'the other', l^h'ra 'the other (f.)', "h'rin 
'others'. These two have a repeated form, i.e. when they occur with a 
noun in what will be described below as the noun phrase, they will also 

" It will be seen in the Appendix to 16.4 below that N"^ will be ade- 
quate for both R and Q. 

" In a complete statement there would be several additional sub- 
classes of .V. E.g. we would distinguish Ai -Ani = N* (truth, freedoyn) 
from Ai-An-i = N'^ (lateness), because N* will occur before -Na (e.g. 
-ful, -less), while A'^ will not. We have .V'^ -Na = A (truthful, freedom- 
less, as well as hopeless, replace great), but we do not have A'^ (lateness, 
or boys) before -Na. This limitation requires us therefore to set up N*, 
and write A'''^''' instead of N^-^ in the equations above. 

" I am indebted to Charles A. Ferguson for checking the Moroccan 
forms. For the phonetic values of the symbols, see Z. S. Harris, The pho- 
nemes of Moroccan Arabic, Jour. Am. Or. Soc. 62.309-18 (1942). 


occur with every other noun in that phrase (including the subject 
prefix of the associated verb). The plural suffix has many variant forms. 

l-: 'the'. Distribution as for Na; but l- does not occur with the subject
prefix of the verb.

M: prefix m- 'place of, instrument of, etc.', occurring before R Pv with
Na, but not Va, under a single main-stress contour: bnhrr'ds 'the broken
one' (hrs is R 'break', its repeated middle consonant is Pv 'inten-

S: objective-possessive suffixes occurring after R Pv, M R Pv, R Pn,
N, P: d'rbak 'he hit you', buk 'your father'.

P: prepositions, some of them prefixes and some independently stressed,
occurring before N, R Pn, M R Pv (with their affixes Na), S, and in
fixed combinations before other P and before certain I: fi 'in' in
flmdina 'in the city', mn 'from' in mnhum 'from them'.

A: adverbs, occurring with none of the affixes, usually at the end of an
utterance or after R Pv (with its affixes) or S: daba 'soon', iams
'yesterday', abadan 'ever', daimn 'always'.

lli: 'which, who (relative)', occurring with no affixes, often unstressed,
almost always after N, R Pn, M R Pv; never at end of utterance.

Pr: pronominal nouns, occurring with no affixes, before noun and verb
phrases: ana 'I', hua 'he'.

I: introducers, occurring with no affixes except P, at the beginning of an
utterance or before any morpheme class including another I: kif 'how?',
as 'what?', kif as 'in what way?'. Ia indicates those which occur sometimes
before a verb and two noun phrases, e.g. kif ktbti lktab? 'How did
you write the book?' Ib indicates those that never do, e.g. as ktbti
'What did you write?' or as kt3b rr'azl 'what did the man write?'

u 'and' almost always unstressed, before any morpheme class. 

We now ask what sequences can be treated as single classes in relation 
to the other classes in the utterance. 

There is one fixed sequence which is not equivalent to any single morpheme
but can be considered as a new single element in the utterance
structure: R Pv, which we will call V¹. Roots occur also with patterns
other than Pv, but Pv never occurs without R:^ ntkatb 'I correspond',
ktbt 'I wrote', tkllmu 'they conversed' are all cases of R Pv, with Va.

^^ Since one Pv is zero, an alternative description is possible: that Va R
also occurs, with verb meaning, without Pv (rather than: with zero Pv), 
as in nkl9b 'I write' as compared with ntkatb 'I correspond'. In that case 
we would say R = R Pv = V, with ktb 'he wrote' as the single-mor- 
pheme substitute. 


In the remaining sequences, we will always find some single morpheme 
which can be substituted for them. The next two equations, like the pre- 
ceding one, yield word classes. 

M R Pv = R Pn = N¹: mf^lbm 'servant' (R: '^Im), s'bba^' 'painter'
(Pn: /doubling middle consonant/ plus /a/) each substitutable for
kabt'an 'captain' in^n — 'Where is — ?'

Na N¹ = N¹: mt^lldm 'servant', mf^llma 'servant girl' in the sentence

l- N¹ = N²: Imt^lldm 'the servant', Imt^lbna 'the servant girl'.

The following equations yield noun and verb phrases. Each resultant 
(i.e. right-hand side) V and N can be substituted only for a left-hand side 
V and iV having the same or higher raised number. 

Vb^ Va T'2 = V^: This holds only when the first V is one of Vb, a small 
sub-class of roots: Wi 'want', kan 'was'; the subject affixes (Va) of 
the two V are either the same or else members of the pairs which indi- 
cate identical person {n- 'I am', -t 'I did' ; etc.) :" 7ib^'i nktab 'I will want 
to write', knt nmsi 'I was walking' . The second V^ always has the prefixed 
rather than the suffixed members of Va. This could be expressed by set- 
ting Va- V^ = V^ and V^ -Va = V^. Then n- and -t would contain the 
same Va morpheme 'I', except that n- = Va + prefixation, while -t =
Va + suffixation. The equation at the head of this paragraph would 
then become Vb^V^ = V\ and all V^ in the following equations would 
be replaced by V^ since that would include all lower numbers, and so 
represent verb with prefixation (non-past) or verb with suffixation 
(past); V^ in these equations would then become V*. 

N¹N² = N²: d'ar ssult'an '(the) house of the sultan', d'ar sult'an l^'rb
'(the) house of the sultan of the West (Morocco)' substitutable for 
dd'ar '(the) house' in hadi — 'this is — '. For the A'^ we can substitute 
the lower-numbered A'2 (which in turn = A" A'^), obtaining A^W A' =* = 
A'^^ etc. Hence this formula indicates that any number of A'' without l- 
may precede the final one with 1-: since A''^ = l-N\ Usually there is 
only one A'' before the l-NK In meaning, each noun modifies the pre- 
ceding one, and there is no agreement in feminine and plural suffixes 
among the nouns. 

N³N³ = N³: r'azl marikani kbir 'a big American man', liiil'-dliua lkbira
'the big servant-girl', substitutable for s'bba'^' 'painter' in ^ft — 'I saw'. 

" These pairs can be expressed by a single morphemic component, in 
the manner of the Appendix to 17.33. 


Each noun modifies the first one; l-, -a, and the plural suffix occur either
with all or with none of these N. Here too there are great restrictions
on selection. Note that the form of the equation covers long series of
N, not merely two: in r'azl marikani kbir we may consider the first two
N to equal one N by this equation, and then that new N plus the last
N equal one N again by this equation. An N l-N = l-N phrase from
the preceding equation may occur as the first of l-N l-N = N³: sult'an
l^'rb hml (N l-N l-N = l-N l-N = N³) '(the) first sultan of the West'.

.S = A'^: The possessive objective suffixes can be replaced by a noun 
phrase: sritu 'I bought it' for srit lktab 'I bought the book'; ktabu 'his
book' for ktab rr'azl lkbir 'the big man's book'; fih 'in him' for fss'bah
'in the morning'.

Xmi PX' = XHli XW^ = XHli X'V^PX^ = X\ I.e. Hi changes a fol- 
lowing sentence construction into a modifying noun, which then con- 
stitutes the last noun of the noun phrase: no A^' satisfying the two 
previous equations follows the A''* resulting from Hi. ktab Hi '^ndu 'a 
book that (is) with him', Iktab Hi sriti 'the book which you bought', 
rr'azl Ikbir Hi za m^ak 'the big man who came {za) with you' all sub- 
stitutable for r'azl in sft — 'I saw — '. If Hi is included in a class D 
(16.32), we would wTite the formulae X^DPX^, etc. 

A'M'a = Pr Va = Va = X". Since verbs (R Pv, without .1/) always oc- 
cur with subject affixes Va (including zero), and Va always with RPv, 
we cannot substitute a noun for Va. However Va agree in person and 
number with the preceding noun phrase; and if we wish to describe 
concord simply as a morpheme repeated throughout an interval 
(12.323 above) we must say that if a noun phrase occurs before the 
verb, then the verb's Va is part of that noun phrase: in nta tktab 'you 
will write', nta 'you' and t- 'you will' form a noun phrase together, as
subject of the verb. We can also say that any noun phrase plus Va can
be replaced by Va: N⁴Va = Va: Imf^llma tkllmdt 'the servant-girl
spoke' replaceable by tkllmdt 'she spoke' (-9t 'she did'). This equation
indicates that Va is always the last part of a subject noun phrase. 

P N⁴ = P Ib = Ia = PA = A: mnd'ari 'from my house' or mn hna
'from here' for hna 'here' in ziti — 'you came — '; mn in 'from where',
kif 'how' in — ziti. The latter two have a different sentence position.
All of these have different frequencies of occurrence in different positions:
e.g. A may be more common at the beginning of utterances, and
P N⁴ after lli.

A'=T'^A'^ = IaX'=V' = A'=T'^; rr'azl iktab ktab 'the man wrote a book' or 
as rr'azl iktab 'what the man will write' for rr'azl iktdb 'the man will 


write.' The la replaces the N* (which can never become A'^*): both in- 
dicate the object of the V^. 

Any morpheme class or sequence plus u plus an equivalent morpheme 
class or sequence equals the morpheme class or sequence itself. In any 
environment in which we find NhtN^ we also find A^'*, and so on: rr'azl 
umr'tu zau 'the man and his wife came' {N^uN^N'' = N'uN'^ = N^) ; 
this + Va {-u 'they') for rrzala zau 'the men came'. When two or more 
A''' occur with Va, the Va contains the plural morpheme, as here. 
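
The coordination statement at the head of this paragraph amounts to a reduction rule, X u X = X, that can be applied until nothing more changes. The sketch below is a simplified illustration: it writes class signs without raised numbers and counts two sequences as equivalent only when they consist of the same signs, which is an assumption narrower than the equivalence intended in the text. The function name `reduce_coordination` is hypothetical.

```python
def reduce_coordination(seq, conj="u"):
    """Repeatedly replace X u X by X, where X is a run of one or more signs."""
    seq = list(seq)
    changed = True
    while changed:
        changed = False
        # Try the longest possible X first, then shorter ones.
        for size in range(len(seq) // 2, 0, -1):
            for i in range(len(seq) - 2 * size):
                left = seq[i:i + size]
                right = seq[i + size + 1:i + 2 * size + 1]
                if seq[i + size] == conj and left == right:
                    # Drop the conjunction and the second copy of X.
                    seq = seq[:i + size] + seq[i + 2 * size + 1:]
                    changed = True
                    break
            if changed:
                break
    return seq


print(reduce_coordination(["N", "u", "N", "V"]))       # ['N', 'V']
print(reduce_coordination(["N", "V", "u", "N", "V"]))  # ['N', 'V']
print(reduce_coordination(["A", "u", "A", "u", "A"]))  # ['A']
```

The first call corresponds to 'the man and his wife came' reducing to a single noun phrase plus verb; the agreement facts (plural Va with two coordinated nouns) are outside what this sketch models.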

Moroccan Arabic morpheme sequences have now been shown to be 
equivalent to sequences of V^, N^, A, and Pr. We can write almost all 
utterances in the language as N^V^, N^N*, N* PrN^ (and just A'', yl, or 
]'.' alone) with A occurring at any point, and with any of several intona- 
tions, chiefly /./,/?/,/!/. Since the V^ replaces the second A^'* of the 
last two types, we may consider both as indicating a predicate, the first 
N'" or A^'* always representing the subject. A^^F^ does not always mean 
that the A^^ precedes, since the A'^^ may be a subject suffix {-Va} as in 
ktbt 'I wrote'. Agreement correlates with this phrase division: -a and
plural extend over all A^ places recognized here (subject and predicate) 
(but not over the eliminated object N'* of V'^N* = V^, or the A^* of PN* = 
A; these have internal agreement); I- extends over each A'' singly. 

A further reduction is possible if we write N^Pr = N^, changing the 
present N^ (= N'^Va) to A'*. Then buh mf^lbni 'his father is a servant' 
would be A'''A'^ {huh = A^'jV^ = A''^, which is included in A^''); and 
b}i.h hua mf^lbm 'his father is a servant' would be A^* A"^ (from A^'' Pr N*). 
The raised numbers express the fact that Pr is the last member of the
sequence in the first noun phrase, although it can also be viewed as merely 
a connection, selected in various utterances in the place of zero, between 
the two A'^ of a nominal sentence. We would then state the two-part ut- 
terances as N'^'N'* and A'^T^, the A"'' and V^ replacing each other as predi- 

Appendix to 16.31: Sequence Analysis of Words Containing 
wh- and th- 

If we wish to treat wh of what and th of this as independent morphemes 
(Appendix to 12.22), we must consider a group of restricted morphemes: 
the components of what, which, who, why, where, when, how, that, this, the,
then, there. We would list wh- as a separate morpheme, occurring in these 
and other combinations, and always with either of two meanings: intro- 
ducing a question, or a subordinate (relative) clause. The morpheme th- 


can be similarly extracted, with a more or less demonstrative meaning. 
This leaves, in turn, the morphemes -at, -ich, -o, -y, etc/""* 

No one of these morphemes occurs in exactly the same environment 
as any one of the other English morpheme classes, since the immediate 
environment of wh- or th- always contains -en, -ere, or the like; while 
-en, -ere, etc., always occur immediately after wh- or th-, which no member
of the other classes does. It is therefore impossible to assign any of
these morphemes directly to the other morpheme classes. Instead of at- 
tempting to do this, our method will be to analyze the sequences which 
contain these morphemes. We will see if the sequence wh- + -at, or the 
phrase in which what is included, is substitutable for any morpheme class,
or for any sequence of morpheme classes. ^^ Then we will work backward 
to see what the syntactic position of wh- by itself is. 

We begin with the positions in which th- appears and wh- does not:
the good man, I like this. Since the, this, that are substitutable for a in
— very good fish, we say that each of them equals T. But these are sequences,
not single morphemes, and each sequence consists of th- plus

^* We may consider the /h/ of who, how as a positional variant of the
/hw/ or /w/ of what, why. The two similarly-spelled second morphemes
of what and that might also be profitably considered alternants of one morpheme,
as might the -is and -ich of this, which. If one does not wish to
divide these words into two morphemes each, the whole analysis of this
section can be replaced by including that, what, etc. as single morphemes 
equated to TA, AV^ and other sequences. They would then be eliminated 
syntactically in the equations of chapter 16, so that the final picture of
the utterance would be substantially the same as we will obtain in our
present method, after dividing these words into two morphemes each.
There are certain advantages in dividing these words: the similarity of
meaning among them, the question-and-answer pairs like where-there,
and the fact that the intonation i will turn out to be automatic with
respect to the one morpheme wh-.

'"^ The general English analysis on which the following treatment rests
is referred to in fn. 18 above. Morpheme classes from this analysis which
are used here (other than those listed in fn. 3, 25 above) are: Vb: be,
appear, get, keep, stay, (but not have), etc., occurring between N and adjectives
other than V-ing: The stuff will — fresh. Vd: the transitive verbs
which occur before N: make, buy, want (but not go, sleep), as in I'll —
butter. Ve: intransitive verbs which do not occur before N: go, sleep. V
represents Vb, Vd, Ve, and so on. X^ is A" -s (boys; although A'^ is used 
for this in 10.21); A'^* is TN" (the boy, the boys); N* is TAN^'^Va (the 
best drinks available), or V^-ing (thinking), or 'A'^ X'^uVa* (the clock 
he fixed), or 'A'^A'S,IVP (the house he slept in), or I (he, it). AX^ = X' 
(good boy for boy). V^ is have V^-en (have eaten), or V^ Pb (walk off, have 
gone over); IV is IV X* (take it); V* is I"' Vv (ivalked, icent). 


one of the mutually substitutable -e, -is, -at.^° We must therefore ask in 
what class to put the two parts of the sequence. Either both of them are 
members of T,^' or else one is T and the other is syntactically zero.^^ 

The choice between these two statements can be decided with the help 
of the other position in which only th- appears: this, that substitutable 
for it in I like — . — is good. The sequences this, that equal the class I in
distribution, and hence equal a whole noun phrase A' "*. We might say 
that -is, -at = N^ while th- = T, so that this = T -\- N^ = N^ (noun 
phrase). However, this is unsatisfactory because we do not otherwise 
have a sequence T N^ in which some adjective A could not be inserted 
between the article and the noun: We can insert good between a and 
man, but not between th- and -is. We can therefore best satisfy both
positions by saying: -is, -at = T when N^ follows, and A'^''- otherwise; 
whereas th- = T. Then the good man = T + A + N^ = N*; This is good. 
= T + A'' + Vb + yli = N'V; this man = T + T + m = N^; these 
men = T + T + N^ = N\ In / like this, we have this = T +N^ = A' ^ 
In I like these, we have these = T + A'^ = A'''. 

We proceed to the positions in which only wh- occurs and th- does not:
Whose books came through? Whose came through? Which books do you want?
What do you want? Why did you do this? On what day did he disappear?

In the first sentence the word containing wh- can be replaced, aside
from the intonation, by the, my: My books came through. Hence the wh-
words which occur in that position, namely whose, what, which = T.

In the second sentence, we can substitute it, the books: The books came 
through. Hence the wh- sequences which occur in this position equal a 
whole noun phrase: who, what, which = N⁴.

*° Since th never occurs alone, and -e differs in distribution from -is,
-at, we may say that there is no -e morpheme, the is the variant of th
when -is, etc., do not follow it. The fact that the does not replace this in
I like this is expressed by saying that in I like — there must occur an N;
hence the (which = T) is not sufficient, while this = T + N.

" This would involve an equation T'T^ = T^ for these members of T. 
A partial analog for this is to be found in the equations all + T = T
(all my for my in We lost — books) and T + cardinal number = T (some
three for some in It happened — years ago.). The relevance of these equations
here lies in the fact that when by themselves all and cardinal
numbers usually occur in the position of T: all, three, my, some all occur
in I want — books back.

^^ This is not a contrastive zero like that used in morphology, but 
merely indicates that, aside from selection, it can be replaced by zero in
the syntactic equations. R. S. Wells points out that assigning zero value 
to some morphemes constitutes the setting up of a new (zero) class, no 
less than if the new class had any other new value. 


In the third and fourth sentences, we can substitute this book for which 
books or what, if we change the order (for the justification, see chapter 16, 
fn. 34): Do you want this book? Hence the wh- phrases which occur here 
equal a noun phrase in object (post-]') position: whose N^, what A'', 
which A'^, who, what, which = A'''. 

In the fifth and sixth sentences we can substitute for a good reason: Did
you do this for a good reason? Hence why, when, where, how, P whose N^, 
P what A'^ P which N\ all equal PA'^ 

If we summarize all these conclusions we find that they agree on the 
following: who, what, which = A'^ if no A^^ follows, but = T if A''^ follows 
(in the latter case who has an added -s morpheme); why, when, where, 
how = PA'*. Now, the morpheme wh- is the first in each of these words, 
and we would like to find one value for it in all these positions. Since there 
is no morpheme class which can be the first member of sequences equal- 
ling T, and A'*, and PA'*, we may take wh- as T before -o, -at, -ich, and 
as P before -y, -en, -ere, -ow. Then -o, -at, -ich = T before A'^, and = iV^ 
otherwise; and -y, -en, -ere, -ow = N*. 
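
The values assigned in this paragraph form a small lookup that can be written out directly. The sketch below is a simplification: it drops the raised numbers and writes plain N for the noun value, and the names `FIRST` and `wh_classes` are introduced only for the illustration.

```python
# Value of wh- itself, depending on the second morpheme of the word.
FIRST = {
    "-o": "T", "-at": "T", "-ich": "T",            # who, what, which
    "-y": "P", "-en": "P", "-ere": "P", "-ow": "P",  # why, when, where, how
}


def wh_classes(second, noun_follows=False):
    """Return the class sequence [value of wh-, value of the second morpheme]."""
    first = FIRST[second]
    if first == "T":
        # -o, -at, -ich = T before a following noun, and N otherwise.
        value = "T" if noun_follows else "N"
    else:
        # -y, -en, -ere, -ow always have the noun value.
        value = "N"
    return [first, value]


print(wh_classes("-at"))                      # ['T', 'N']  What do you want?
print(wh_classes("-ich", noun_follows=True))  # ['T', 'T']  Which books ...?
print(wh_classes("-y"))                       # ['P', 'N']  Why ...?
```

With these values, what standing alone comes out as T + N (a noun phrase), which before a noun as T + T followed by that noun, and why as P + N, matching the three summary statements of the paragraph.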

The l intonation of the sentences given above occurs only with sen- 
tences beginning with the wh- morpheme (or having wh- after P or & or 
/,/: But why did you doitf Now, why did you do it?). When the wh- phrase 
is followed by a simple verb phrase, it represents the subject; i.e. wh- 
NW*i = A'*!'* + i (What fell i = It fell + l). When the wh- phrase is 
followed by R N*V\ it represents the object, i.e. wh- N^{P)R N* V^ I = 
^M yi (P)AM + ^" (What does he want I = He wants it + i). Since we 
never have A'*F^ after wh- words (unless A'^'^ precedes them as above), 
we can define R A'*F^ i as the positional variant or value of A'*!'^ i in the 
position : wh- word N*V^ i . 

Finally, we consider the positions in which both wh- and th- occur: e.g. 
which or that in The family — I met lived here. The family — bought it lived
here. I know — it was. 

In the family which I met lived here, the wh- word can be replaced 
by whom, that, whose sons', etc. We can also substitute the family I 
met or the family (in The family lived here.) for the whole phrase the 
family which I met. It follows that the family which I met, and each of its 
substitutes, constitutes a noun phrase N⁴. From the sequence substitutions
of the corpus as a whole (cf. fn. 58) we know that the family I met is
N^NWd* = N*. Now some of the substitutes for the family I met are the 

*^ The parentheses indicate that P may be included or excluded from 
both sides of the equation. 


family whose very beautiful daughters I met or the family whose daughters
the new tenant met or the bus the new tenant takes. Since the new tenant is a 
complete N* in itself, I or the new tenant must represent the middle N* of 
N^N*Vd*. This leaves the family which or the family whose beautiful daugh- 
ters to represent the initial A'' of that equation, since it occupies the place 
of the family or the bus. Faced with the situation of taking a noun phrase 
(the family) and adding something to it (which or that or whose very beau- 
tiful daughters) which nevertheless leaves the whole sequence still a noun 
phrase, we turn again to the analysis of the whole corpus, and find as an 
analog N^PN^ = N^ (a piece of junk for a book in It's just — .) which does 
precisely this. We can say that just as the PN* here is an appendage to 
the N^ so is that, or the phrase introduced by wh-. Of course, PN* can be 
added in this way to almost any A'^^, while that or the wh- phrase can only 
be added when N'^Vd* (/ met, etc.) follows; but that will merely have to 
be indicated in the present analysis. We may therefore say that A^' + 
that + N*Vd* = N^ -{- wh- phrase + N*Vd* = N*. Since very beautiful 
daughters = A''-, the phrase whose very beautiful daughters = T (ivh-) + 
T (-ose) + A^^ = A^^ Similarly, which and that in this position would be 
analyzed as T + N^ = A^'. The family which I met would therefore be 
N^N^N^Vd'^ and would equal N^N'^Vd* = N*, as in the family I met. The 
appearance of N^N^ instead of one A'^' would occur only when the second 
A'^' is that or a wh- phrase. The analogy of this A^' A^' (which equals the 
single A'^') to A'^^ PN*^, which also equals a single A^', is heightened by 
the fact that we can also substitute the family with which I boarded or the 
family from whose beautiful daughters I learned German, which are cases 
of A''' PN^ (the second A'^^ beginning only with wh-, not that) equalling 
the N' of N^ N* F/.^"* 

When we substitute whichever I met for the family I met, we ana- 
lyze whichever as T + T + A'^ = A''^ constituting the first .V of 
ATS Ar4 Vd* = N\ 

In sentences like The family which bought the house lived here, we have 

" In the case of A'^A^'A^M'^'' (the family ivhose sons I met), the verb is 
never followed immediately by a noun phrase. We may therefore say 
that the A^W replace the object of the verb, exactly as does the A^' in 
N^N'Vd' {the faynily I met). In the case of A'^ PiV' A^^ V'/ {the family 
with whose sons I played bridge) the verb is occasionally followed by a 
noun phrase indicating object (played bridge is Vd*N*). This parallels 
the formula NW^Ve^P (the family I played bridge with, cf. fn. 59), since 
y/A'4 = I'/. The P of N'PN^ here thus replaced the P of \\*P, and 
the .VIV' (which remain from the X^PI\''') replaces the A'' exactly as it 
did in the first case. 


the same substitutions (that, who, whose sons) as in the preceding case, 
except that the wh- word cannot be replaced by zero.*^ We may replace 
the family which by whichever, whatever people, etc. And we may replace 
the family which bought it by the family. In the last analysis, therefore, 
the sentence is N⁴V⁴. The family which bought it = N⁴, consisting of N⁴ 
plus a wh- phrase (or that) plus V⁴, i.e. N⁴ = N⁴ + wh- word + V⁴.^^ 
Comparing the A'' wh- N^ N* V/ = N* (the family which I met) of the 
preceding case, we can equate the wh- word here with the wh- N^ es- 
tablished above. To do so, we make the morpheme after wh- or th- equal 
N³, while wh- and th- may be included in the class T. Then which = 
T + N³ = N⁴, and the family which bought it = N⁴ + N⁴ + V⁴ = N⁴. 
And the family whose sons bought it = N⁴ + T + T + N³ + V⁴ = 
N⁴ N⁴ V⁴ = N⁴. On the basis of substitutability, we say that whoever 
bought it or whatever people bought it also equals N⁴ N⁴ V⁴ = N⁴, with wh- 
as a preliminary T, -at as N³, and -ever or -ever people as the second N⁴. 

The formula here, N^NW*, differs in two respects from the 
N^N^N*Vd* of the preceding case (the family which I met) and the 
A^ N* V/ above {the family I met). First, it lacks the A'^ the one not 
including the ivh- word. Secondly, whereas in the preceding type and in 
N^N*Va* the sub-class of T' Avas Vd, with W or FA' occurring only when 
associated with P, here we have no restriction on the V. That is to say 
that whereas in the other cases Ave had met {Vd) or played bridge with 
{V N P = VeP), here we have bought it (F A^ = F^). We can therefore 
say that the final A'^ which is included in the T^''* of the present case re- 
places the initial A' which is lacking in the present case. This is the formal 
feature which corresponds to the fact that the A' included in the F^, and 
the initial A'^ of the Vd* formulas, both indicate the object of verb: the 
family which bought the house (N⁴ N⁴ V⁴); the house which the family bought 
(N³ N⁴ N⁴ Vd⁴); the house the family bought (N³ N⁴ Vd⁴). All three 
sequences equal N⁴, and when this N⁴ occurs in the sequence N⁴ V⁴ /./ (The 
family which bought the house is pretty quiet. The house which the family 
bought is pretty quiet.) it is semantically the subject phrase of the verb. 

^^ One difference is that who usually occurs with following -m when it 
is after P or in object position, but occurs without the -m in this case. 
The treatment of -m in whom, him, etc., has been omitted from these 
equations in the interest of simplicity. However, the techniques used here 
can be used to identify the object position and the morpheme -m which 
occurs in it. Some indication of the distributional basis is given in fn. 67. 

^^ The family is marked N⁴ here because we can substitute the girl I 
loved for it, and say The girl I loved (N⁴) who jilted me (wh N V⁴) lived 
here once. 


The first N of each sequence is then the subject-noun for that verb V. 
The different positions which house and family occupy in the three 
sequences, then, permit either one of them to occur as subject of the new 
verb (is pretty quiet), while each retains a fixed semantic relation to the 
verb of the original N⁴ (bought). 

Lastly, consider I know that it was. We can substitute what, which, who, 
whose, whose book, from what place for that. We can also substitute zero, in 
which case we have V/NH'* = Vd^N* = Y ^ (from the general English 
analysis) : I know he was. = I know that. Since that it was equals a sub- 
ordinate (secondarily stressed) it was, which equals an object noun 
phrase, we can say that that it was is A^W^F^ = A"^.*^ The various sub- 
stitutable wh- phrases equal that, i.e. the first N^:from what place is P wh 
T N' = P T T N^ = PN^ = A^^ and is thus the first A^^ of from what 
place it came (N^N*V* = A"'*) in I know from what place it came, = 

;VM y^2 ^4 = ^4 Y^A 

All the occurrences of wh- and th- words have here admitted of the 
same analysis: wh- and th- are included in T: the bound morphemes that 
follow them are included in T if a noun or noun phrase without article 
(N³) follows, and are included in N³ otherwise. An exact statement can 
be given as to which bound morphemes occur in which positions. But 
how shall we state the distribution of wh- and th-? In the case of the other 
morphemes which have been included in various morpheme classes, we 
know where they occur: dog or the dog may occur, with minor limitations, 
wherever we have an A'''. In the case of wh- and th-, the occurrence is 
highly restricted. They occur not in the full range of T positions, but 
only whenever a post-wh- or post-th- morpheme occurs. 

When wh- occurs with i intonation, its meaning is interrogative; other- 
wise its meaning is relative, i.e. it puts the morpheme following it in ap- 
position with the preceding A^' or V^. And th- has relative meaning in the 
positions in which it can be replaced by wh-, and demonstrative meaning 

Appendix to 16.4: From Classes of Morphemes to Classes of Positions 

The process of equating sequences of morphemes to our morpheme 
classes has wrought certain changes in the character of the original 

'^Here again we have object nouns following the subordinate verb 
in some cases {that it killed him), not in others {whom it killed). Again 
we say that wh- words which never have object nouns after their verbs 
themselves indicate the object. We can further substitute what for that it: 
I know what was. Here what was is the subordinate uNV. 


classes. Many of these classes are now substitutable not only for members 
of their own class, but also for sequences containing members of other 
classes. E.g. TA = N⁴ (which includes TA = TN³) means that N is 
substitutable for A if T precedes (and if no N follows).*' True, the A 
morphemes differ from the N morphemes in that they substitute for N 
only in stated environments, whereas N morphemes substitute for each 
other everywhere. Correspondingly, by the time we have completed 
our equational statements, the classes on the right hand side of the 
equations no longer overlap in environment, as did the classes of chap- 
ter 15." The morpheme classes A and N overlapped after T, but now 
the whole sequence TA is equated to N⁴, and the fact that the A of this 
equation contains the same morphemes as the A of the sequence AN is 
irrelevant to chapter 16. Overlapping in environment is thus eliminated 
by letting one class, which in some position is substitutable for another 
class, be equated in that environment to that other class, and be there- 
after disregarded except in the definition of that other class. 

If the operation of chapter 16 begins with the classes of 15.3 it changes 
them from morpheme classes to position classes.'" If it begins with the 

** Since TA = TN = N, we have utterances in which A replaces a 
member of N in exactly the same way that another member of N would 
have replaced it: large, as well as beer, replace records in I'll take the 
records. A and N would appear to be members of the same class here, 
although they differ when N alone replaces TA (I'll take beer.), as also in 
positions where they do not replace each other at all (no N for large in 
the large dry beer). 

*3 The morpheme class environments of the resultant classes of 
chapter 15, say N and V, are necessarily mutually exclusive. For let us 
suppose that in a particular environment .4 a sequence XY had equalled 
both A' and V. Then .4A'l' = A X = A V, and only one of these, either 
A X or A V would have been the resultant (by 15.22). However, there 
may be cases of morphemes or segment variants of morphemic units in 
two distinct classes having identical phonemic forms, so that in par- 
ticular utterances the phonemic environment of A' and that of V may 
be identical. E.g. yur in./ is / -Xa = .4 plus A' as an answer to Where 
shall we go to? {Your inn.), but / = A' plus V plus P as an answer to 
How did I make out? (You're in.). 

'" The similarity of, say, class G (book, take) to classes A' (life, house) 
and V (grow, wither) (Appendix to 15.32) comes out in the course of stat- 
ing the equations. We would have A'^ -s = A'' (houses for house) and also 
G^ -s = A'^ (books for book) ; V^P = V^ (grow up for grow) and also G P = 
V (take up, book up for take). In general, since G occurs whenever A' or V 
occurs, we will have an equation with G paralleling every one containing 
either A^ or V. If the new position classes tend to contain all the mor- 
phemes or sequences which occur in a particular range of environments, 


classes of 15.4, it groups them on the basis of environment until the 
final resultant classes of chapter 16 are as nearly complementary in en- 
vironment as possible. 

Consider the treatment of morpheme classes having overlapping en- 
vironments. The morphemes man, prince, etc. (say, class Q) occur in 
environments like The — disappeared, and also in It is a — ly art (but not 
in It is a — al art). In contrast, duke, form, etc. (class R) also occur in 
The — disappeared, and in It is a — al art but not in It is a — ly art. Q and 
R thus overlap in environments. This overlapping comes out in the 
equations of chapter 16. There is Q-ly = A (manly for great), and R -al = 
A (formal for great), but for all other equations involving Q or R there 
will be a similar equation containing the other : Q -s = N^ (men for books, 
book), R -s = N^ (forms for books), AQ = N (old prince for book), AR = 

N (new forms for book). Environments like s = N^ and A — = N 

thus define a position class A'^': Q, R, and other morpheme classes sub- 
stitute for each other, in these positions. For the purposes of this posi- 
tion, the members of Q, R, etc., can all be lumped together into one posi- 
tion class, with no relevant difference among them: A"' -s = A'^'. 
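
The lumping of Q and R described above can be pictured with a small sketch. It is only an illustration, not a procedure from the text: the environment labels and the `position_class` helper are hypothetical names, and the two-row table merely stands in for the relevant equations of chapter 16.

```python
# Hypothetical environment table for the two morpheme classes discussed above.
# "_ -s" means "occurs before -s"; "A _" means "occurs after an adjective".
ENVIRONMENTS = {
    "Q": {"_ -s", "A _", "_ -ly"},   # man, prince
    "R": {"_ -s", "A _", "_ -al"},   # duke, form
}


def position_class(environments):
    """Split environments into those shared by every class and the residue."""
    shared = set.intersection(*environments.values())
    residue = {name: envs - shared for name, envs in environments.items()}
    return shared, residue


shared, residue = position_class(ENVIRONMENTS)
print(sorted(shared))   # environments in which Q and R are lumped together
print(residue)          # environments in which Q and R remain distinct
```

The shared environments correspond to the single position class; the residue is what the following paragraphs treat as the remaining position in which Q and R are still distinct.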

The remaining position, in which Q and R are distinct classes, can be 
treated in either of two ways. We can recognize small position classes 
Q and R, which occur only before -ly, -al, and all of whose members are 
also members of A''. The morpheme man is now a member both of A'' 
and of Q, and form both of A"' and of R. This would satisfy the criteria 
of morpheme-overlapping and complementary environments for position 

However, we may also wish to note, for other purposes, the fact that 
the total membership of Q, R, and all the other small classes occurring in 
the group of positions before -ly, -al, etc., is entirely included in the mem- 
bership of A''. This can be indicated by saying that Q, R, etc., are included 
in A'^' even when they occur before -ly, -al, etc., the only difference being 

then all the morphemes of G can be contained in N (although they also 
occur in non-N positions), and all of them can also be contained in V 
(although they also occur in non-V positions). As position classes, there- 
fore, N contains life, book, take, etc., while V contains grow, 
wither, book, take, etc. The membership of N and V now overlap, but 
are environmentally differentiated (like the classes of 15.4): e.g. if we 
find book in N position (in an environment in which it is replaceable by 
life) we know it is there a member of N. We may now eliminate the equa- 
tions containing G, since the N and V equations include all its members. 
Cf. chapter 15, fn. 27. 


that whereas in all other positions every member of N can occur in every 
position of N (e.g. both prince and duke before -s), in these new positions, 
into which N has been extended, only certain sub-classes (i.e. only some 
members) of N occur (before -ly only Q, which we may mark now as N_a; 
before -al only R, which we may mark now as N_b). The usefulness of in- 
cluding N_a and N_b in N is increased if the environments which differen- 
tiate N_a from N_b, i.e. which force us to recognize sub-classes in N, can 
themselves be considered in the same way as sub-classes of some more 
general position class. Thus we have a class Na (-ish, -like) which occurs 
in N + Na = A, i.e. after any sub-class of N such as N_a, N_b, etc. If we 
now consider -ly, -al, etc. (each of which occurs in the place of Na but after 
only one sub-class of N each) as sub-classes of Na, say Na_a, Na_b, etc., 
we have the equations N_a + Na_a = A, N_b + Na_b = A, etc., all of which 
can be summarized in the position-class equation N + Na = A. It is un- 
derstood that this equation, unlike our previous ones, holds not for every 
member of the classes involved but only for certain members (or sub- 

Since summary equations like N + Na = A do not show the special 
selections of which sub-class of one occurs with which sub-class of the 
other, it is impossible to eliminate from our records the explicit sub-class 

"' This alternative method is mentioned here because it leads to the 
summary equation above and to an extended definition of the position 
class. In its form, however, it is a case of relations among classes Q, R 
(or, A'a, Nb) and A'*, and belongs under 17.32. 


17.0. Introductory 

This section considers the relations of selection (government, etc.) 
among morpheme classes. It leads to the recognition of paradigmatic pat- 
terns, and of components which express the distributional relations 
among morphemes.^ 

17.1. Purpose: Relations among Morpheme Classes 

We seek to express compactly the remaining relations among mor- 
pheme classes, other than those which are explicitly indicated in 13-6. 

We consider the distribution of morphemes relative to each other in 
utterances of our corpus. The relation of complementary distribution 
was expressed in chapter 13, and generalized in chapter 14. The relation 
of substitutability was expressed in chapter 15 for single morphemes and 
in chapter 16 for morpheme sequences. 

Various relations, however, were disregarded or not explicitly brought 
out in these sections. For example, there was no explicit discussion of the 
relative distribution of the morpheme classes which were grouped to- 
gether into the general classes of 15.32, or of the overlapping in mor- 
pheme membership among the position classes of 15.4 and of the Ap- 
pendix to 16.4. In the classes of chapter 15 no notice was taken of such 
relations as the relative distribution of segment members within each 
morpheme, and no indication was given as to identities in such relations 
throughout the morphemes of a class, or between the morphemes of one 
class and the morphemes of another. 

Similarly, no study was made in chapter 16 of the relation of one 
class A to some other class B in all the sequences in which A and B occur. 

All these correlations are not necessary for the construction of utter- 
ances in the corpus. The procedures of 13-6 suffice to show how every 
utterance in the corpus is constructed out of the morphological elements 
established in chapter 12. However, the treatment of chapter 17 offers 

' Whereas chapter 16 covered primarily what is called syntax, chap- 
ters 17 and 18 parallel most of what is usually considered morphology 
proper. This order of treatment was most convenient for the methods de- 
veloped here. It is also possible, however, to treat the morphemic rela- 
tions within whole-utterance environment (syntax) after the relations 
within smaller domains (morphology proper). 



additional general statements about the morphemes and their occur- 
rence, and makes the detailed description of utterances for the whole 
corpus more compact. 

17.2. Preliminaries to the Procedure: Disregarding the Rest of 
the Utterance 

The chief relations among morphemes which will be treated here are 
their relative limitations of distribution, and their correlations with 
various other features such as junctures. In 16.5 we considered the oc- 
currence of one class relative to all sequences containing it; here we in- 
vestigate the occurrence of one class relative to another. 

The occurrence of one class A relative to another B is limited if A and 
B occur together, while A and C do not, in some utterances. We also say 
that the occurrence of class A relative to D is limited, if the sequences 
AB, DB, DC occur, but not AC or EB.² Distributionally, A has some- 
thing in common with its neighbor B, which it does not have with C; and A 
has something in common with D which replaces it (even though D also 
occurs in positions where A does not), while it has nothing in common 
with E. 
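
The condition just stated can be checked mechanically once the occurring sequences are listed. The following is a minimal sketch under stated assumptions: the set `OBSERVED` is a made-up record of two-class sequences (EC is assumed to occur, which the text does not say at this point), and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical record of which two-class sequences occur in a corpus.
# AB, DB, DC occur; AC and EB do not. EC is an added assumption.
OBSERVED = {("A", "B"), ("D", "B"), ("D", "C"), ("E", "C")}


def following_classes(observed):
    """Map each class to the set of classes found immediately after it."""
    table = defaultdict(set)
    for first, second in observed:
        table[first].add(second)
    return dict(table)


def is_limited_relative_to(a, d, observed):
    """True if a shares an environment with d, while d has one that a lacks."""
    table = following_classes(observed)
    env_a = table.get(a, set())
    env_d = table.get(d, set())
    return bool(env_a & env_d) and bool(env_d - env_a)


print(is_limited_relative_to("A", "D", OBSERVED))  # A and D share the environment before B
print(is_limited_relative_to("A", "E", OBSERVED))  # A and E share no environment
```

Only the right-hand neighbor is inspected here, which matches the point made next: the rest of the utterance need not be held constant.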

In all these cases we are dealing with the co-occurrences of A and B, or 
with the substitution of A and D, in all utterances, no matter what the 
rest of the utterance may be. The considerations of chapter 17 therefore 
do not require, as did chapter 16, that the rest of the utterance be held 
constant: in many cases we may even disregard the rest of the utterance, 
and deal only with the parts of the utterance containing the classes under 
All these cases may be considered partial distributional identities 
among the classes in question: A and B have identical distribution in that 
each occurs in the other's neighborhood in certain total utterance en- 
vironments; A and D have identical distribution in that each occurs in 
the environment — B, even if the rest of the utterance environment may 
be different for AB than it is for DB. The identity is only in some of the 

^ If AC also occurred, then (as far as this data goes) A and D would be 
put into the same class (since both would occur in the same environ- 
ments — B, — C) ; but we are assuming that A and D are not in the same 
class, i.e. that there are environments in which one occurs and the other 
does not. The non-occurrence of EB shows that not all the morphemes of 
the corpus occur in — B, i.e. A and D occur in — B while other classes 
do not, and D (as well as other classes) occurs in — C while A (as well as 
other classes) does not. 


occurrences: there must be environments in which A, B, or D differ, or 
else they would all have constituted the same morpheme class. 

17.3. Procedure: Morphemes Occurring Together Share a Component 

We express the partial distributional identity of any two morpheme 
classes by saying that these two classes have a morphemic component in 
common. If classes A and B occur together in certain utterances while 
A and C do not, we may say that A and B each contain or represent a 
morphemic component which is not represented by C. If A and D each 
occur in the environment — B, whereas E and G do not occur there, we 
may say that A and D, as well as the environment B, each represent a 
morphemic component which is not represented by E and G (or by C).³ 
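
As a rough illustration of this step, the component can be treated as a mark carried by the environment B and by every class that occurs before it. This is a sketch only: the observations are invented, the simplified pattern assumes that neither A nor D occurs before C, and `component_bearers` and `predicted` are hypothetical helper names.

```python
# Hypothetical observations: A and D occur in the environment "_ B",
# E and G occur in the environment "_ C".
OBSERVED = {("A", "B"), ("D", "B"), ("E", "C"), ("G", "C")}


def component_bearers(environment, observed):
    """Classes occurring before `environment`, plus the environment itself."""
    bearers = {first for first, second in observed if second == environment}
    bearers.add(environment)
    return bearers


def predicted(first, second, bearers):
    """A sequence is expected only if both classes bear the component or neither does."""
    return (first in bearers) == (second in bearers)


X = component_bearers("B", OBSERVED)
print(sorted(X))                 # the classes said to represent the component
print(predicted("A", "B", X))    # both bear the component
print(predicted("E", "B", X))    # only B bears it
```

Stating the component once replaces a separate statement of limitation for each of the classes involved, which is the compactness the procedure aims at.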

Many different cases of partial distributional identity among classes, 
i.e. many different conditions of limited co-occurrence or substitution, 
occur in various languages. The exact manner of following out the pro- 
cedure of 17.3 in each case varies with the particular conditions and with 
the relation to the rest of the corpus. 

17.31. Classes Which Accompany Each Other 

The simplest case is that of classes which always occur together. Thus 
English morphemes -ceive, -cur, -mit, etc. never occur without a preceding 
morpheme such as re-, con-, per-, etc. Particular morphemes of the first 
group occur only with particular ones of the second : perceive, permit, but 
not percur. Nevertheless, we can briefly sum up the facts about each 
morpheme as follows: we define the first group {-ceive, etc.) as consti- 

^ In brief: 

AB occurs, EB does not occur, EC occurs, AC does not occur, 
DB occurs, GB does not occur, GC occurs. 
We say that A, D, and B all contain a morphemic component X, which 
is not contained in E, G, or C. The residue of A, D, and B after extrac- 
tion of X may, for convenience, be identified with E, G, and C respec- 
tively: E + X = A, G + X = D, C + X = B. The X may be extract- 
able as a specific phonemic sequence or morpheme, as in the case of -ess: 
author + ess = authoress; or it may be definable only as a symbol of a 
relation among morphemes, as in the case of cow: bull + F = cow (see 
17.31). The identity of this operation with that of chapter 10 is obvious. 
Morphemes which extend over several morphemic lengths, or are spread 
out among them, have been noted in 12.32 and 12.34 (and the Appen- 
dices to 6.1, 6.6). Cf. the analysis of contrasted morphemes and mor- 
pheme sequences into 'merkmalhaftig' and 'merkmallos' (based on the 
parallel analysis in phonology, cf. ch. 10, fn. 51) in Roman Jakobson, 
Zur Struktur des russischen Verbums, in Charisteria Gulielmo Mathesio 
74 (1932). 


tuting a general class S and the second group as a general class E; we 
then state that no member of S occurs without some members of E, and 
that most members of E do not occur without some member of S after 
them. The statements as to which members of S occur with which mem- 
bers of E will be made in the detailed equations of chapter 16. However, 
the fact that no member of one of these classes occurs without some 
member of the other (except for statable cases) could be indicated here 
by extracting a long component v which extends over both S and E. 
The individual members of E and S are then indicated by differentiations 
in the first and second position of the v domain. The v has in general the 
same positions as the V class of chapter 16 (lose, come, etc.). The com- 
ponent v thus indicates a single element (even though it is two units 
long) parallel to the single-morpheme, and also longer, V with which it 
is positionally identical. 

In Semitic, members of the class v (Modern Hebrew -a-a- indicating 
action, -i-e- indicating transitive action, etc.) never occur without mem- 
bers of the class R (spr 'tell', lmd 'learn', etc.), nor do members of n (-e-e- 
'object', -i-u- 'object of -i-e- action', etc.) occur without R. R never occurs 
without either v or n: safar 'he counted', siper 'he told', sefer 'book', 
sipur 'story', lamad 'he studied', limed 'he taught', limud 'a subject of 
study'. The sequence R + n occurs in the positions of N (ben 'son', 
báyit 'house', etc.): haben sel axi 'my brother's son (lit. the son of my 
brother)', hasefer sel axi 'my brother's book'. R + v, however, occurs in 
positions in which no single morpheme occurs: there is no single mor- 
pheme which occurs in the position of lamad (lmd + -a-a-) in hu lamad 
hetev 'he studied well'. We may extract from R + n a two-unit com- 
ponent N, which has much the same distribution as the class N; and from 
R + v a two-unit component v, which has a distribution different from that 
of any other class. These two components would be useful in our descrip- 
tion, because of the syntactic equivalence of n and N, and because of 
the importance of the n and v positions for our description. With each 
N or V there would then occur two differentiations: one from the original 
R class (usually 2 or 3 consonants), and one from either the n or v classes 
(usually 1 or 2 vowels). These differentiating elements could be mor- 
phemically identified with any elements which do not occur with N or v 
and which are therefore complementary to these.⁴ 
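
The two differentiations within each two-unit component (consonantal root in one, vowel pattern in the other) can be illustrated with the forms of lmd cited above. This is a sketch under simplifying assumptions: `interdigitate` is a hypothetical helper, patterns are written as hyphen-separated vowels, and it ignores alternations such as the p/f difference between the root spr and the form safar.

```python
def interdigitate(root, vowels):
    """Interleave root consonants with pattern vowels: C1 v1 C2 v2 C3."""
    if len(root) != len(vowels) + 1:
        raise ValueError("expected one more consonant than vowels")
    pieces = []
    for consonant, vowel in zip(root, vowels):
        pieces.append(consonant + vowel)
    pieces.append(root[-1])
    return "".join(pieces)


# Pattern -> (class of the pattern, gloss), as given in the text above.
PATTERNS = {
    "a-a": ("v", "action"),
    "i-e": ("v", "transitive action"),
    "i-u": ("n", "object of -i-e- action"),
}

for pattern, (pattern_class, gloss) in PATTERNS.items():
    print(interdigitate("lmd", pattern.split("-")), pattern_class, gloss)
```

Each output form is thus a single two-unit element whose first differentiation comes from R and whose second comes from the v or n class.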

* In particular, the differentiating elements of the second n position 
can be morphemically identified with those of the second v position. I.e. 
each of the original v (say, -i-e-) can be paired with some n (say, -i-u-), 
into a single morpheme (say, -i-x-). The difference between the paired 


In all cases where chapter 16 shows XY = Z we can say that the re- 
sultant Z is a long component whose first position is differentiated by 
various members of X and the second by members of Y. Alternatively, 
we can say that Z is composed of two parts Z₁ and Z₂, the former being 
a member of X and the latter of Y. The second method is the more con- 
venient when the Z is a class of only one or a few morphemes. Thus in 
Moroccan Arabic dial = DP (16.32), dia was assigned as a new member 
of D, and l as a new member of P; this was especially advantageous since 
l could be identified with a known member l of P, and dia be considered 
an alternant of a rare member d of D. This second method may not be 
desirable in the case of large classes, e.g. English proper names = TN, 
or I = TN (Clarkson or he substitutable for a young fellow in — can't 
make good here).⁵ If we sought to assign one part of each proper name, or 
of each member of I, to T, and the remainder to N, we would have to 
make a great many arbitrary divisions into T and N elements which 
would occur only with each other.⁶ We therefore leave each proper name 
or member of I as a whole morpheme, and say that it equals TN, i.e. 
N⁴ (16.21). 

The use of the higher numbered, more inclusive, symbols of 16.21 thus 
parallels the first method of the preceding paragraph: Given I = TN³ = 
N⁴, we can say that N⁴ is a long component extending over the sequence 
(of one or more morphemes) TN³, the residue in whose first position are 
members of T and the residues in whose second position are members 
of N³ (e.g. any AN, ANs, etc.).⁷ 

17.32. Restrictions among Sub-classes 

A frequent type of limitation is that in which members of one sub- 
class of a general class occur with each other, but do not occur with 

elements (-i-e- and -i-u-) is now attributed to the occurrence of the newly 
unified member -i-x- with v in one case and with N in the other. This is 
particularly useful in certain Semitic languages, e.g. Arabic, since par- 
ticular n may be similar to particular v in the selection of particular R 
with which both occur. One of the n and one of the v can also be con- 
sidered zero for the second positions of N and v respectively. 

^ Similarly, Semitic morphemes for 'he', 'his', 'him', and the like are 
substitutable for article plus noun. Cf. chapter 16, fn. 29. 

⁶ E.g. we might have to divide he into /h/, as a morpheme member of 
T, and /iy/, as a morpheme member of N. 

' Just as we derived general statements concerning all the sequences 
in which a particular class appears, or all the resultants to which these 
sequences are equated (16.5), so we can here derive general statements 
concerning all the sequences which are equated to a particular resultant. 


members of another sub-class of the same general class. Thus in the Eng- 
lish general class N we have book, artist, author, cow, bull, king, queen, 
etc. In the sequence class N⁴ (16.21) we have this old-fashioned artist, 
our cow, he, she, I, etc. Of all these, certain members occur together in 
the two N positions⁸ of N⁴VbN⁴. One group of members which occur 
together in these N positions may be called Nf and contains she, cow, 
queen, etc.: She's a good cow, She will remain as queen, The cow is the queen 
of farm animals. Another group of members which occur together in 
these N positions may be called Nm and contains he, bull, king, etc.: 
He's a fine bull, He'll remain as king, The bull is the king of farm animals. 
Members of Nf hardly ever occur with members of Nm in the environ- 
ment NVbN. If the first N of NVbN is he or the bull, the second N will not 
be she or queen: our corpus will not contain What breed of bull is she? 

Utilizing the operation of 17.3, we may define an element F which is 
common to all members of Nf, and which extends over both N positions 
in NVbN. Then each member of Nf is differentiated from each member 
of Nm by the fact that the former contain the morphemic component F. 
Since no member of Nm contains this component, each Nm is comple- 
mentary to each Nf. We may therefore identify each Nf with some one 
Nm, on the basis of their occurrence in identical environments except for 
the F component (13.43). On this basis we would associate cow with bull, 
queen with king, she with he.⁹ She would then be he plus the F component, 
queen would be king plus F, etc.¹⁰ 
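
The pairing of each Nf with one Nm, and the extension of F over both N positions of NVbN, can be sketched as follows. The pairing table contains only the three pairs named above; the function names are hypothetical, and the agreement check is an idealization of "hardly ever occur together", not a claim about every utterance.

```python
# Pairs named in the text: each Nf member is its Nm partner plus the F component.
PAIRS = {"he": "she", "bull": "cow", "king": "queen"}
N_F = set(PAIRS.values())


def with_f(nm_member):
    """Return the Nf member identified with the given Nm member plus F."""
    return PAIRS[nm_member]


def has_f(member):
    """True if the member contains the F component."""
    return member in N_F


def agree(first_n, second_n):
    """In N Vb N, F extends over both N positions or over neither."""
    return has_f(first_n) == has_f(second_n)


print(with_f("king"))          # the Nf member paired with king
print(agree("she", "cow"))     # both positions carry F
print(agree("bull", "she"))    # F would cover only one position
```

Because F is stated once for the whole NVbN domain, the sketch needs no separate rule for each pair of co-occurring members.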

⁸ Vb indicates the sub-class of V which contains be, remain, and in 
general such others as occur in N — A: Your share will remain large. 

⁹ E.g. king and queen are among the few members of N which would 
appear in The present — of England has reigned for fifteen years. In some 
cases the member of Nf and the member of Nm which we would naturally 
pair together on the basis of meaning turn out to occur mostly in differ- 
ent environments: cow and bull do not substitute for each other in: That 
cow's a good milker. We've got twenty cows and one bull on our farm, bull- 
fight, cock-and-bull story. Even here, however, it will usually be possible 
to show that it is distributionally simpler to associate cow with bull 
than with any other member of Nm. 

¹⁰ This is similar to what was done in the case of other morphemic 
elements which always occurred together in particular environments 
(12.323). In the environment fili — bon — the two positions are both 
filled by a, or both filled by us (just as in NVbN the two N positions are 
both Nf or both Nm). In that case we set up a single long morpheme 
. . . a . . . a extending over both positions; similarly we set up here a 
single long component F extending over both N positions. Two major 
differences distinguish the present case from that of 12.323. First, the 
extended . . . a . . . a by itself filled the two positions of fili — bon — , 


In considering the particular case of the Nf sub-class, we find further
that there are several morphemes which may not themselves be members
of Nf, but which have the following property: the sequence consisting
of some member of Nm plus one of these morphemes is a member of Nf.
Thus -ess and -ix are not members of Nf, but authoress, princess, aviatrix
are (while author, prince, aviator are members of Nm). She, woman, lady,
madam are themselves members of Nf; but their combination with
members of Nm yields new members of Nf: she-elephant, woman writer,
lady dog, Madam Secretary.¹¹ It is only particular members of Nm that
combine with particular ones of these Nn or Nf morphemes to yield the
new Nf members. The morphemes, such as -ess and woman-, which
transpose Nm into Nf can be considered (in these environments) members of
the f component. Given such members of Nf as cow or queen, we cannot
say what part of them represents the f component which they contain.
Given such members of Nf as authoress or woman writer, we say that -ess
and woman- are the respective morphemic members in these
environments of the morphemic component f.

Finally, there are many members of N which occupy one of the N

so that it could be regarded simply as a morpheme occupying both places.
The two N places, however, are filled by various particular members of
Nf, so that it is not enough to set up the extended f element over both
positions; we must also indicate which member of N occurs in each
position. Second, the phonemic differences between each Nm and its paired
Nf are highly variegated (e.g. between bull and cow, boy and girl). It
would therefore be inefficient to assign some phonemic part of cow or
girl as member of an f morpheme (as, one might say, a phonemic part
of filia is assigned to the a morpheme), while leaving the remainder of
cow and girl as members of the bull and boy morphemes respectively.
The f is thus a morphemic component, not a morpheme. The general
problem of the grammatical concord that is involved here as well as in
12.323 has been widely discussed. Cf. for example, Edward Sapir,
Language 100; V. Mathesius, Double negation and grammatical concord, in
Mélanges J. van Ginneken 79-83 (1937).

¹¹ Note that these morphemes which transpose Nm into Nf are not
members of one position class in terms of chapter 16. She, woman, etc.
are members of N, and their combination with writer, etc., is a case of
'NhN = N (or uN'N = N); in particular Nf' Nm' = Nf (where
Nf' and Nm' represent the particular members of Nf and Nm
respectively which enter into these sequences). In contrast, -ess and -ix are
members of Nn, together with the -eer of engineer and -hood of boyhood
which do not yield Nf; and their combination with prince, aviator, etc.,
is a case of N Nn = N, or in particular Nm" Nnf = Nf (where Nm" and
Nnf represent the particular members of Nm and Nn respectively which
enter into these sequences).


positions in N Vb N whether the other N is Nm or Nf: She is an artist,
He is an artist. We may say that the f component fails to extend to the
second N when that N is one of these members. Or, if we wish, we can
say that these morphemes are members of Nm and also members of Nf;
then the f is contained in she and in artist in the utterance She is an
artist, but is not contained in he or in artist in the utterance He is an artist.
The relative limitation of distribution among the members of Nf can
thus be expressed by extracting a long morphemic component f which
extends over the positions in which the limitation applies; the residue of
each Nf can then be identified, if convenient, with some Nm, i.e. with
some member of the class which is excluded from the limitation of
distribution in question.¹²

17.33. Sub-classes Representable by Several Components 

We often find a number of morphemes or sub-classes each of which
occurs in a different utterance environment, but all of which occur always
with some one other class. This is seen most generally in what are called
noun case-endings, or tense and person conjugations of verbs. Thus if
we compare Latin hortus bonus est 'It is a good garden', and campus
bonus est 'It is a good field' with ego in hortō fui 'I have been in the
garden' and ego in campō fui 'I have been in the field', we see that -us occurs
in certain utterance environments and -ō in certain other ones, but that
both -us and -ō occur always with one or another member of the class
containing hort-, camp-. Following 17.3, we would say that there is a
morphemic component common to hort-, camp-, -us, and -ō, and that there

¹² Following 17.2, the substitutions involved in extracting the
component are limited to a stated domain. If we identify cow (when it occurs
without another Nf) as bull + f, and so on, the f will in this case operate
only on the morpheme (bull) with which it is associated. The component
f is long, of course, only in the environments such as N Vb N for which
it is defined as long.

On the basis of this component, the analysis of 16.33 can be made
in a somewhat different way, more related to the immediate constituents
of 16.54. If we consider the domains over which f can extend, we find
that there are two domains available to f in She made him a good husband:
in one domain (she) f is present, in the other domain (him a good
husband) f is absent. Similarly, there are two domains available in She made
him a good wife: in one domain (she . . . a good wife) f is present, and in
the other (him) f is absent. The two clauses differ (somewhat in the sense
of chapter 16, fn. 34) because in one case the noun following good is in
one domain with (and so refers to) him (in respect to f), while in the
other case it is in one domain with (and refers to) she. We can consider
the first clause as consisting of she + made + him a good husband, and
the second of she . . . a good wife + made + him.


is a component (which we may mark n) common to -us and its utterance 
environment, and another component (a) common to -o and its utter- 
ance environment. 

The analysis becomes more complicated, however, when we compare
hortī bonī erant 'they were good gardens' and vīgintī hortī erant 'there
were twenty gardens there' with ego in hortīs fui 'I have been in the
gardens' and ego in vīgintī hortīs fui 'I have been in twenty gardens'. We
have here additional members of the class which occurs with hort-. But
while many of the utterance environments of -ī and -īs are identical with
those of -us and -ō respectively, we find certain partial environments,
such as vīgintī, in which -ī and -īs occur while -us and -ō do not. Hence,
-us and -ī have certain features of environment in common as against -ō
and -īs; and -us and -ō have other features of environment in common as
against -ī and -īs. We say that -ō and -īs each contain the component a
as against the component n of -us and -ī; but also that -ī and -īs each
contain a component p as compared with the component s contained in
-us and -ō.¹³
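The four endings and their intersecting components can be laid out as bundles. The following is a minimal sketch, assuming the component letters n, a, s, p exactly as introduced above; the names ENDINGS, endings_with, and ending_for are hypothetical conveniences, not terms of the analysis.

```python
# Hypothetical sketch: each ending of the hort- class is a bundle of two
# components. n/a are the components first extracted for -us and -ō;
# s/p separate -us, -ō from -ī, -īs.
ENDINGS = {
    "-us": frozenset({"n", "s"}),
    "-ō": frozenset({"a", "s"}),
    "-ī": frozenset({"n", "p"}),
    "-īs": frozenset({"a", "p"}),
}


def endings_with(component):
    """Return the set of endings whose bundle contains the given component."""
    return {ending for ending, bundle in ENDINGS.items() if component in bundle}


def ending_for(bundle):
    """Return the single ending identified by a bundle of components."""
    wanted = frozenset(bundle)
    matches = [ending for ending, b in ENDINGS.items() if b == wanted]
    if len(matches) != 1:
        raise KeyError(f"no unique ending for {sorted(wanted)}")
    return matches[0]


if __name__ == "__main__":
    for comp in "nasp":
        print(comp, sorted(endings_with(comp)))
    print(ending_for({"a", "p"}))
```

Each component picks out two endings and each pair of components picks out exactly one, which is the sense in which the limitations intersect.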

Additional complications appear when these morphemes occur with
only a sub-class of some general class, while parallel sets of morphemes
occur with other sub-classes. This is the case in what is called noun
gender or different verb conjugations. Thus by the side of hortus bonus est
we have mensa bona est 'It is a good table', and by the side of hortī bonī
erant we have mensae bonae erant 'They were good tables'. This can be
treated after the manner of the Appendix to 12.323-4. Since hort- never
occurs without one of the morphemes -us 'nominative', -ō 'ablative', -um
'accusative', etc., we can say that one of these is automatic (dependent)
in respect to hort-; i.e. it occurs whenever hort- occurs, and is therefore,
despite its apparent independence, not a distinct morpheme. If we select
-us,¹⁴ our morphemes are now hortus 'garden', -us → -ō 'ablative', -m

¹³ Such interrelations, as that of -us with -ī on the one hand and with -ō
on the other (not to mention with camp-), are discussed in Edward
Sapir, Language, ch. 5, especially p. 101.

¹⁴ In some cases of classification it is not essential to select one of the
members as primary in respect to the other members classified with it.
E.g. in grouping complementary segments into one morpheme, we may
regard one member as representing the morpheme, and call the other
members positional variants of that member in stated positions.
Alternatively, we can regard the morpheme as a class of members, all equally
limited to particular positions. However, in selecting a member of the -us,
-ō class to be considered part of hort-, we cannot avoid deciding for one
member as against the others. We can select that member which occurs
in the most general environments. E.g. if o occurs only in the neighbor-


'accusative', etc.¹⁵ Similarly, since mens- never occurs without one of the
parallel set of morphemes -a 'nominative', -ā 'ablative', -am 'accusative',
etc., we take one of these as automatic in respect to mens- and set up the
morphemes mensa (and mensa . . . a, etc.) 'table', -a → -ā (and . . . a . . . a
→ . . . ā . . . ā) 'ablative', -m (and . . . m . . . m) 'accusative', etc. We
further note that -us → -ō occurs in the same total utterance
environments as -a → -ā, except that the former follows morphemes ending in
us while the latter follows morphemes ending in a. Therefore -a → -ā
can be morphemically identified with -us → -ō, being complementary
to it in the phonemically definable preceding environment.¹⁶

The unified -us → -ō and -a → -ā 'ablative singular', and also the
unified -us → -īs and -a → -īs 'ablative plural', contain, together with their
utterance environment, a component a. The component extends over
this morpheme position immediately following the N and also over part
of the utterance environment (that part of the environment which
occurs with these case morphemes but not with the other morphemes of
the case class). Similarly -m 'accusative singular' and -ī → -ōs, -ae → -ās
'accusative plural' contain, together with their differentiating utterance
environment, a component c. Again, -us → -ī and -a → -ae 'nominative
plural', -us → -īs and -a → -īs 'ablative plural', -us → -ōs and -a → -ās
'accusative plural', all contain, together with the environments which
differentiate them from the other morphemes, a component p. The
morpheme -us → -ō, -a → -ā is thus represented by a; the morpheme -us → -īs
and -a → -īs by ap; the morpheme -us → -ī and -a → -ae by p; and so on.
The components n and s may be eliminated, since -us/-a 'nominative
singular' is no longer a morpheme, but a phonemic part of the N mor-

hood of certain morphemes, and um in the neighborhood of others, while 
us occurs in a great variety of environments, it is clearly convenient to 
select us as the member to be included with hort-. The criteria for se- 
lecting a basic alternant are not meaning or tradition, but descriptive 
order, i.e. resultant simplicity of description in deriving the other forms 
from the base. 

¹⁵ In some environments, e.g. before an adjective, the first morpheme
will be hortus . . . us, which with bon yields hortus bonus. The forms
-us → -ō, etc., are necessary because once the N morpheme is no longer
hort- but hortus (and hortus . . . us, etc.), the addition which is made to
it in the a environment (the environment of the ablative) consists in
dropping of -us and adding of -ō (or dropping of . . . us . . . us and adding
of . . . ō . . . ō).

¹⁶ In other cases, the difference in preceding N environment between
the various gender forms of a case-ending (-us → -ō, -a → -ā, etc.) is not
so simply stated.


phemes. Every case morpheme is thus identifiable as a unique
combination of the presence and absence of a few components. And the N
morphemes no longer are restricted to occur with the case morphemes, since
hortus is taken as the N morpheme itself. Furthermore, the same
components which, when they occur immediately after N, identify the case
morphemes may now be used to identify those other features of the
utterance environment which do not come immediately after the N but
are diagnostic for the particular case components, i.e. occur only when
these components occur. In all such cases, we say that the component
extends not only over the position next to the N but also elsewhere in
the utterance.
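The reduced system, in which hortus and mensa are themselves the N morphemes and each case morpheme is a combination of a, c, and p, can be sketched as follows. This is an illustration only: the names REWRITES, BASE_ENDINGS, and inflect are hypothetical, only the replacements quoted in the text are listed, and the accusative singular is left out because the text gives it simply as -m.

```python
# Hypothetical sketch. Each case morpheme is keyed by the components it
# contains; the value gives the ending that replaces -us and the ending
# that replaces -a. Only the forms quoted in the text are listed.
REWRITES = {
    frozenset({"a"}): ("ō", "ā"),         # ablative singular
    frozenset({"a", "p"}): ("īs", "īs"),  # ablative plural
    frozenset({"p"}): ("ī", "ae"),        # nominative plural
    frozenset({"c", "p"}): ("ōs", "ās"),  # accusative plural
}
BASE_ENDINGS = ("us", "a")  # phonemic endings of the N morphemes themselves


def inflect(noun, components=()):
    """Apply the case morpheme identified by `components` to an N morpheme."""
    comps = frozenset(components)
    if not comps:
        return noun  # no component present: hortus, mensa as they stand
    if comps not in REWRITES:
        raise KeyError(f"no case morpheme listed for {sorted(comps)}")
    for index, base in enumerate(BASE_ENDINGS):
        if noun.endswith(base):
            return noun[: -len(base)] + REWRITES[comps][index]
    raise ValueError(f"{noun!r} ends in neither -us nor -a")


if __name__ == "__main__":
    print(inflect("hortus", {"a"}), inflect("mensa", {"a"}))
    print(inflect("hortus", {"a", "p"}), inflect("mensa", {"p"}))
```

The choice between the two columns depends only on whether the preceding morpheme ends in us or in a, which mirrors the statement that the two sets of forms are complementary in a phonemically definable environment.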

17.4. Result: Components Indicating Patterned Concurrences of 

We now have sets of morphemic components (and residues), so set up 
that as nearly as possible all sequences and combinations of them occur, 
each sequence or combination identifying one or another of our mor- 
phemes or morpheme sequences. 

For purposes of morphological description, these components are a pre- 
ferred set of basic elements. We can state the morphology in terms of 
them, and then add a dictionary-like itemization of what morphemes are 
represented by each particular combination of our components. In most 
languages many morphemes will remain without being reduced to com- 
binations of components; this will include morphemes and sub-classes 
which have unique limitations of occurrence of a type that does not lend 
itself to component representation (17.5). All non-componentally repre- 
sented morpheme classes will, of course, be included together with the 
components as elements of the morphology. The particular limitations of 
the classes and sub-classes (and of the components) which have not been 
expressed in the definition of each of these will have to be included as 
minor relations among elements, e.g. in such equations as those of 17.5.¹⁷

Each component represents not only any morpheme which occurs in 
a particular environment, but also the features which differentiate that 

¹⁷ In view of the similarity between these components and the higher-
numbered inclusive symbols of 16.21 (as noted at the end of 17.31), we 
can state the morphology in terms of these inclusive symbols and the 
components, both of which are our basic long elements, extending over 
any number of morpheme positions. The relation between such a general 
morphological description and the individual utterances of our corpus 
is given by statements of the morphemic differentiators in each position 
of these long elements. These positional differentiators can be indicated 


environment from the environment of other morphemes.¹⁸ Therefore,
each component is long, though in some environments it may involve 
only the morpheme class in question (in positions where that morpheme 
sub-class does not differ environmentally from other sub-classes of its 
general class); in the latter environments the component has one-mor- 
pheme length. It is therefore necessary to state, in the definition of each 
component, not only what morpheme (or phonemic sequences) it repre- 
sents in each environment, but also what its domain is (i.e. over which 
morpheme classes, residues, or positions it operates) in each environment. 
Each morpheme can now be identified by a combination of components 
(plus its own particular residue, if any), each component indicating some 
of the special limitations of occurrence which this morpheme (or its 
class) has, but which other morphemes or classes do not. However, it is 
not in general convenient to identify particular phonemic parts of each 
morpheme with particular components, because no regularity can be 
obtained in the phonemic sequences that would be associated with each 
component. This results from the fact that the components, like the 
morphemic segments of chapter 12, are elements which are independent 
of each other. The method used here in setting up the components is 
comparable to that used to establish the morphemic segments: all inter- 
dependent features are included in one element, and each element is 
therefore independent (in as many environments as possible) of the other 
elements.¹⁹ Since the independent phonemic sequences were already
represented by the morphemic segments of chapter 12, and since the
simpler grouping of phonemic sequences into independent morphemes
was carried out in chapter 13, it follows that any search for yet more
fully independent elements, such as is attempted in 17.3, would lead to 
little phonemic regularity for the new elements. By the same token that 
the new components are far less restricted in distribution than the origi- 

by means of the equations of chapter 16 (e.g. in TX^ = X*, T and N^ 
cover the two positions over which X* extends), or by residues which 
are left after the components are extracted (e.g. in authoress, author 
is the residue after f is extracted in 17.32), and so on. 

¹⁸ In particular, those other morphemes which have least
environmental difference as against the morphemes in question, e.g. the other
sub-classes of the same general class.

¹⁹ In the case of the morphemic segments, all interdependent phonemes
in our utterance were included in one segment. In the case of the
morphemic components, all interdependent morphemic choices in an
utterance are included in one component.


nal morphemic segments, they are far less regular in their phonemic con- 
tent. Therefore, rather than define the components in terms of their pho- 
nemic content in each environment, and thus supersede the morphemes 
entirely, we define the components in terms of morphemes and mor- 
pheme classes, leaving these, as before, to be ultimately defined in terms 
of phonemic sequences in particular environments. 

The fact that in many positions one or another of the components 
may extend over more than one morpheme (over the morpheme in ques- 
tion and over at least one morpheme of the diagnostic environment) 
makes it all the more undesirable to identify the components phonemical- 
ly. The length variability of the components, which raises them above 
the restrictions of the single morphemes, is the feature in which they 
differ fundamentally from a mere noting of the relations among mor- 
pheme classes.²⁰ Because of their length, these components express not
only the relation among morphemes which substitute for each other in
a particular environment, but also the relation between these morphemes
and the differentiating feature of that environment.²¹

17.5. Restrictions Not Represented by Components 

It may not be convenient to represent by means of components such 
limitations of occurrence among morphemes as do not intersect with 
other limitations involving the same morphemes (as in 17.33 and its Ap- 

²⁰ Aside from this, the components can be considered as indicating
relations among morphemes or classes. They thus closely parallel, though
in the form of elements rather than of relations or classes, the
grammatical constructs known as categories: cf. E. Sapir, Language ch. 5;
L. Bloomfield, Language 270-3; B. L. Whorf, Grammatical categories,
Lang. 21.1-11 (1945).

²¹ While the components continue the search for independent elements
which was begun in chapter 12 and advanced in chapter 13, they do so 
with a method essentially identical to that used in chapter 10 for pho- 
nologic elements. The similarity between the relations among morpho- 
logic elements and the relations among phonologic elements has been 
recognized by several writers, e.g. L. Hjelmslev, Proceedings of the Third 
Congress of the Phonetic Sciences 268 (1938). As in the case of many 
of the procedures discussed previously, the method of this chapter en- 
ables us to state on distributional bases results, such as paradigms, which 
are often (and much more easily) obtained by considerations of meaning. 
However, again as in the case of the other procedures, the method en- 
ables us to check the distributional relevance of the meaning differentia, 
and enables us to find patternings over and beyond those whose meanings
we consider 'grammatical'. The fact that distributional methods are able 
to bring out the major grammatical meaning categories is merely an in- 
dication that the old results are not lost in the new methods. 


pendix), or as do not lead to the division of a class into sub-classes clearly 
differentiated on that basis (as in 17.32). In this respect the criteria as to 
what restrictions on the freedom of morphemes are to be represented by 
components correspond to the criteria as to what restrictions on the 
freedom of phonemes are to be represented by morphemes (12.233). 

This is frequently the case for morpheme classes which are grouped
together into a general class on the basis of major similarities, but which
have small and unpatterned differences in distribution. E.g. the morphemes
close, erase occur with -ure (in closure, etc.) but not with -ion or -ment;
relate, protect occur with -ion (as in relation) but not with -ure or -ment;
curtail, retire, appoint occur with -ment but not with -ion or -ure; and so
on. Although these forms differ in some of their distribution,²² they have
many environments in common (e.g. They'll — it soon) and are all
included in the general class V. Similarly, -ure, -ion, and -ment are included
in a general class Vn. Rather than extract a component common to each
set of co-occurring morphemes (e.g. to -ment and curtail, retire, appoint),
it is more convenient merely to state co-occurring sets. In chapter 15, we
would recognize sub-classes Vure (including close, erase), Vion, Vment, etc.
In chapter 16 we would state: Vure + -ure = N, Vion + -ion = N,
Vment + -ment = N, etc. This can be summed up by writing:
(Vure / Vion / Vment) + (-ure / -ion / -ment)²³ = N, where some technique,
such as the matching of sub-class members in the two classes, indicates
which sub-class of V occurs with which (single-morpheme) sub-class
of Vn.²⁴
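The statement of co-occurring sets amounts to a pair of matched tables. The sketch below is only one way of picturing it, assuming the verbs and suffixes listed in the paragraph above; the names SUBCLASS_MEMBERS, SUBCLASS_SUFFIX, subclass_of, occurs, and noun_sequence are hypothetical.

```python
# Hypothetical sketch of (Vure / Vion / Vment) + (-ure / -ion / -ment) = N.
# Sub-class membership and the matching suffix are stated as plain tables.
SUBCLASS_MEMBERS = {
    "Vure": {"close", "erase"},
    "Vion": {"relate", "protect"},
    "Vment": {"curtail", "retire", "appoint"},
}
SUBCLASS_SUFFIX = {"Vure": "-ure", "Vion": "-ion", "Vment": "-ment"}


def subclass_of(verb):
    """Return the sub-class of V to which the verb is assigned."""
    for name, members in SUBCLASS_MEMBERS.items():
        if verb in members:
            return name
    raise KeyError(f"{verb!r} is not listed in any sub-class")


def occurs(verb, suffix):
    """True if the verb and the Vn suffix form one of the stated sequences."""
    return SUBCLASS_SUFFIX[subclass_of(verb)] == suffix


def noun_sequence(verb):
    """Return the morpheme sequence V + Vn that equals N for this verb."""
    return (verb, SUBCLASS_SUFFIX[subclass_of(verb)])


if __name__ == "__main__":
    print(noun_sequence("close"), occurs("close", "-ment"))
```

The sequences are returned as pairs of morphemes rather than spelled words, since the phonemic shape of closure or relation is a separate matter from the co-occurrence stated here.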

Appendix to 17.32: Sub-classes Consisting of Single Morphemes 

It frequently happens that a morpheme has unique restrictions upon 
its occurrence relative to certain other morphemes, and would thus 
properly constitute a class, or sub-class, by itself. 

In some cases it is most convenient, in terms of the present methods, 
to consider the morpheme in question as a specially restricted member of 
the recognized morpheme class (within whose range of distribution its 

²² For example, close occurs in The — ure is complete, but curtail occurs
in The — ment is complete. 

²³ Or (Vnure / Vnion / Vnment).

²⁴ Statements about sub-classes would also be most convenient for the
scattered limitations of distribution of single members or small groups 


own distribution falls). This was done for the boysen of boysenberry
(Appendix to 12.22). We assign boysen to the class Nb (straw, goose, etc.)
which occurs before — berry. The uniqueness of boysen appears in the fact
that most of the other members of Nb occur also in other N positions
while boysen does not.

In other cases, it is more convenient to analyze the partially depend- 
ent sequence as due to independent but special components whose defini- 
tion contains peculiar applicability. This was done for the -s of he thinks 
(12.324). We could say that -s is the morpheme meaning 'third person' 
and that he, she, Fred, my uncle, etc. (in — thinks) are morphemes (or 
morphemic components) of individual differentiation within the 'third 
person'. We could then try to associate each of the differentiating
components (he, she, etc.) with other morphemic components which never
occur with -s and are therefore complementary to our he, she. For
example, we might associate he with I and say that these are
complementary members of a single morpheme. Then 'I' would be indicated by I,
and 'he' by I + -s, the -s sufficing to indicate that the person in question
is 'third'. However, such analysis is of no use in this case. First, because
there is more than one complementary to he: had we carried out this
analysis we would have obtained a wider distribution for I, which would
now occur with -s (in I + -s 'he') as well as without it (in I 'I'); but you
remains restricted, as the old I was, since it never occurs with -s. Second,
we have only two morphemes which do not occur with -s (I and you, not
counting the plural), to match against the extremely great number of 
morphemes or morpheme sequences which occur with -s (he, she, Fred, my 
uncle). There are thus no sets of morphemes or morphemic components 
comparable to the individual differentiators within the 'third person' 
and complementary to them. Finally, the fact that he, Fred, etc. occur 
with -s only in a highly restricted environment (in Fred walks, but not in 
Fred will walk, Fred walked, I'll ask Fred) makes it less convenient to 
reduce these morphemes to the status of differentiators (morphemic 
components) within the -s morpheme. 

of members of general classes. This includes groups of morphemes whose 
special limitation of distribution cannot be correlated with any other 
distributional or phonemic feature, but at best with some feature of 
meaning. As an example of such sub-classes, consider school, bed, jail,
pokey, etc. which occur both in He was in —, and He was in the —, while
house, prime of life, city occur mostly in He was in the —, and trouble,
good form occur mostly in He was in —.


Appendix to 17.33: Morphemic Components for Intersecting Limitations

For a larger number of intersecting limitations than those of Latin
-us, -ō as compared with -us, -ī, we consider the morphemes for 'I',
'you', etc., in Modern Hebrew.²⁵

If we consider the following 17 utterances, and many sets of utterances
of the same type,²⁶ we would set up a class (C) of 17 morphemes -ti 'I
did', a- 'I will', y . . . u 'they will', etc.:²⁷

lo limadti    oto davar    I didn't teach him a thing.
 " limadta     "    "      you (m.)      "
 " limadt      "    "      you (f.)      "
 " limed       "    "      he            "
 " limda       "    "      she           "
 " limadnu     "    "      we            "
 " limadtem    "    "      you (m. pl.)  "
 " limadten    "    "      you (f. pl.)  "
 " limdu       "    "      they          "
 " alamed      "    "      I won't teach him a thing.
 " tlamed      "    "      you (m.) or she       "
 " tlamdi      "    "      you (f.)              "
 " ylamed      "    "      he                    "
 " nlamed      "    "      we                    "
 " tlamdu      "    "      you (m. pl.)          "
 " tlamedna    "    "      you (f. pl.) or they (f. pl.)  "
 " ylamdu      "    "      they (m.)             "


²⁵ For a comparable set of intersecting components in Eskimo, cf. Z. S.
Harris, Structural Restatements I, Int. Jour. Am. Ling. 13.47-58 (1947).
Note also the smaller Bengali set in the Appendix to 13.43 above.

²⁶ Such as ma limádti otxa 'What did I teach you', kvar katávti lo 'I
wrote him already', matay báta hena 'When did you come here'.

²⁷ The differences in the vowels of limed etc. would all be expressed by
the operations of 13-4. If a vowel adjoins limed with no intervening
juncture (i.e. within the same word) the preceding vowel is replaced by zero
(limdu); aside from that, if any phonemes (except the unstressed na)
adjoin limed with no intervening juncture, the vowel of limed which is
nearest them is replaced by a. The forms are cited here in phonemic
transcription, so that such segments as the [ə] between two initial consonants
are not shown. The last vowel of a word is stressed, unless otherwise
indicated; x is the voiceless velar spirant, and c a post-dental [ts].


Every member of the class V (katav 'write', ba 'come') occurs with
every one of these C morphemes. At this stage of the analysis, the 17
would constitute a class of separate morphemes restricted to occur only
with V. We could say that one long morphemic component extends over
the two positions of V and these morphemes, but it would still be
necessary to indicate which V and which member of C occurs in the two parts
of that long component in any given utterance.

However, we find additional environments in which some members of
C occur while others do not. The first 9 occur in lo limad — oto davar
etmol '— didn't teach him a thing yesterday,' but do not occur in
lo — lamed — oto davar maxar '— won't teach him a thing tomorrow';
the last 8 occur in the latter but not in the former.²⁸ We therefore
extract a component T common to the first 9 and to their differentiating
environments, and another component I common to the last 8 and to their
differentiating environments. The residues of the 9 T morphemes may
be identified with the residues of the 8 I morphemes if we find a
convenient way of matching these residues pair-wise.

This pairing may be carried out on the basis of the particular
members of the N class²⁹ with which each member of C occurs (since not
every member of N occurs with every C).


                       + T                        + I

ani 'I'                -ti 'I did',               a- 'I will'
ata 'you (m.)'         -ta 'you (m.) did',        t- 'you (m.) will'
at 'you (f.)'          -t 'you (f.) did',         t . . . i 'you (f.) will'
hu 'he'                zero 'he did',             y- 'he will'
hi 'she'               -a 'she did',              t- 'she will'
anáxnu 'we'            -nu 'we did',              n- 'we will'
atem 'you (m. pl.)'    -tem 'you (m. pl.) did',   t . . . u 'you (m. pl.) will'
hem 'they (m.)'        -u 'they (m.) did',        y . . . u 'they (m.) will'
hen 'they (f.)'        -u 'they (f.) did',        t . . . na 'they (f.) will'
aten 'you (f. pl.)'    -ten 'you (f. pl.) did',   t . . . na 'you (f. pl.) will'

We therefore identify the residue (X) of -ti with the residue (X) of a-;
similarly, we pair -ta with t- and say that they each leave the same
residue Y, and so on: X + T = -ti, X + I = a-, Y + T = -ta, Y + I = t-,

²⁸ E.g. lo limádnu oto davar etmol 'We didn't teach him a thing
yesterday', lo alamed oto davar maxar 'I won't teach him a thing tomorrow.'

²⁹ Where N indicates a class of morphemes containing ani 'I', hu 'he',
hamore haxadaš 'the new teacher', etc.


etc.³⁰ By 17.3, we consider X to be also contained in the differentiating
environment ani (which occurs only with the X-bearing -ti or a-), Y
contained also in ata (which occurs only with Y + T or Y + I), and so on.
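The identification of residues across the two tense components can be checked mechanically. The sketch below is hedged in two ways: the residue labels are English glosses chosen here for readability (the text itself names only X and Y), and the affixes for 'he will' and 'she will' are read off from the cited forms ylamed and tlamed. The names AFFIXES, affix, and distinct_forms are hypothetical.

```python
# Hypothetical sketch. Each residue is listed with the affix it yields
# with component T (first position) and with component I (second).
AFFIXES = {
    "I": ("-ti", "a-"),                    # residue X of the text
    "you (m.)": ("-ta", "t-"),             # residue Y of the text
    "you (f.)": ("-t", "t...i"),
    "he": ("zero", "y-"),
    "she": ("-a", "t-"),
    "we": ("-nu", "n-"),
    "you (m. pl.)": ("-tem", "t...u"),
    "they (m.)": ("-u", "y...u"),
    "they (f.)": ("-u", "t...na"),
    "you (f. pl.)": ("-ten", "t...na"),
}
TENSE_INDEX = {"T": 0, "I": 1}


def affix(residue, tense):
    """Return the member of C identified by residue + tense component."""
    return AFFIXES[residue][TENSE_INDEX[tense]]


def distinct_forms(tense):
    """Return the set of phonemically distinct affixes for one tense component."""
    return {affix(residue, tense) for residue in AFFIXES}


if __name__ == "__main__":
    print(len(AFFIXES), len(distinct_forms("T")), len(distinct_forms("I")))
```

Counting distinct shapes in each column gives 9 with T and 8 with I, which agrees with the 17 members of C noted earlier, while the residues themselves number 10.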

We now have 10 joint residues. These may be divided into two smaller
sub-classes on the basis of the fact that they have different restrictions
in respect to particular environments which have not as yet been
considered.

In the environment ani vəhu — oto bəyáxad 'I and he will — him
together', the only members of C which occur are -nu (limádnu) and n-
(nlamed).³¹ In ata vəhem — oto bəyáxad 'you (m.) and they (m.) — him
together' only limadtem and tlamdu occur, and in at vəhen — only
limadten and tlamedna. In hu vəhi — oto bəyáxad 'He and she — him
together' only limdu and ylamdu occur, and in hi vəišti — 'she and my
wife —' only limdu and tlamedna occur. If we consider only the presence
of və 'and' in N və N, we find that only the last 5 of the 10 morphemic
residues occur in N və N —. We may therefore extract a p component
from these 5 and from their environment N və N.

Five of our residues contain p and 5 do not. We therefore seek a basis 
for identifying the secondary residues of these 5 (what is left after their p 
component is extracted), each one of them with one of the remaining 5 
(from which no p was extracted). 

The basis for pairing the residues of these two new sub-classes, of
those morphemes which contain p and those which do not, may be found
in a more detailed consideration of the restrictions of occurrence of our
10 residues with respect to particular members of N. The residue of
-nu/n- 'we' occurs not only with anáxnu 'we' but also with any N və N
where one of the two N is ani 'I' or anáxnu 'we' and the other N is any
other member of the N class: ani vəhi limádnu oto 'I and she taught him',
anáxnu vəhamore haxadaš nlamed otxa 'We and the new teacher will
teach you'. No other one of our 10 morphemes occurs in these
environments. Analogously, the residue of -tem/t . . . u 'you (m. pl.)' is the

³⁰ If we wish to assign some particular feature of these morphemes to 
the t component and another to the i, we may say that position purely 
after V is represented by the t component and position before V is 
represented by i. Then the phonemic sequences ti and a are positionally 
determined members of a morpheme (morphemic residue) X which occurs 
with t and i. As examples of utterances for the list above: ani limádti oto 
'I taught him', ani alamed oto 'I will teach him', ata limádta oto 'You 
taught him'. 

³¹ Literally, 'I and he, we taught (or: will teach) him together'. There 
is no /,/ juncture or intonation in this utterance in Hebrew. 


only one that occurs with any N və N where one N is ata or atem, and 
the other N is any member of N (including these two) except ani and 
anáxnu: e.g. ata vəhu tlamdu oto 'you (m.) and he will teach him'. 
Similarly, only -ten/t . . . na 'you (f. pl.)' occurs with N və N where one 
N is at or aten and the other is at, aten, hi, hen, or any member of N 
containing the f component defined below: e.g. at vəaxoti tavona 'You 
and my sister will come'. Again, the residue of -u/y . . . u 'they (m.)' is 
the only one that occurs with any N və N where neither N is ani, anáxnu, 
ata, at, or atem and where not more than one N includes f: hu vəhi 
ydabru ito 'He and she will talk with him', habanai vəozro sidru et ze 
'The builder and his helper arranged it'. Similarly, only -u/t . . . na 
'they (f.)' occurs with N və N where each N is either hi or hen or an N 
including f: hi vəhabaxura tdaberna 'She and the girl will talk'. 
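The conditions just stated for N və N can be restated as a small decision procedure. The following is only an illustrative sketch in Python: the names Noun and conjoined_residue are ours, each N is reduced to a person number and a flag for the f component, and the schwa of və is left out of the code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Noun:
    """A member of N, reduced to the two properties the pairing depends on."""
    form: str
    person: int             # 1, 2, or 3
    feminine: bool = False  # True if this N contains the f component


def conjoined_residue(left, right):
    """Return the past/future residue that occurs with `left ve right`."""
    members = (left, right)
    # ani or anaxnu in either position selects -nu/n-.
    if any(n.person == 1 for n in members):
        return "-nu/n-"
    all_feminine = all(n.feminine for n in members)
    # Otherwise a second-person N in either position selects -tem or -ten.
    if any(n.person == 2 for n in members):
        return "-ten/t...na" if all_feminine else "-tem/t...u"
    # Only third-person N remain.
    return "-u/t...na" if all_feminine else "-u/y...u"


ani = Noun("ani", 1)
ata = Noun("ata", 2)
at = Noun("at", 2, feminine=True)
hu = Noun("hu", 3)
hi = Noun("hi", 3, feminine=True)

print(conjoined_residue(ani, hi))
print(conjoined_residue(at, hu))
```

The order of the tests mirrors the order of the statements above: the residue of 'we' is checked first, then the two second-person residues, and the third-person residues are what is left.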

Of the five residues containing p, then, only the first (-nu/n-) occurs 
with ani in either N position of N və N; we therefore pair it with the 
-ti/a- morpheme which also occurs with ani. Only the second ever 
occurs with ata or atem in both N positions together; we therefore pair it 
with the -ta/t- morphemes which occur with ata. An analogous 
restriction to at leads to the pairing of -ten/t . . . na with -t/t . . . i. The 
third morphemic residue occurs only with hu, hi, hem, hen or the members 
of N not listed here (iš 'man', etc.), in either N position:³² we pair it with 
the zero/y- morphemes, which occur with hu. Analogously, we pair 
-u/t . . . na with -a/t- on the basis of hi. We can express the matching 
of these 5 pairs of residues by means of 5 residual morphemic 
components: 1 contained in -ti/a- and -nu/n-, 2 contained in -ta/t- and 
-tem/t . . . u, A contained in -t/t . . . i and -ten/t . . . na, 3 contained in 
zero/y- and in -u/y . . . u, B contained in -a/t- and -u/t . . . na. These 
components, of course, occur not only in these members of C but also in 
the particular members of N in respect to which these members of C 
were differentiated. Hence the component 1 is also contained in any N 
(including N və N) which includes ani or anáxnu; 2 is contained in any 
N (or N və N) which includes ata, at, atem or aten but not ani or anáxnu; 
3 is contained in any N other than these.³³ In ani limádti 'I taught' we 

³² But only one of the two N positions can be occupied by any one of 
hi, hen or N plus -a 'feminine'. Before -u/t . . . na both N positions are 
occupied by morphemes of this group. 

³³ It may be noted that some phonemic features are common to 
several of the morphemic segments which contain a particular component. 
Thus all segments containing the component 2 have the phoneme /t/, 
but some segments which do not contain 2 also have this phoneme. Only 


have a long component 1 extending over ani . . . ti, in ata vəhu tlamdu 
'you and he will teach' a long 2 over ata vəhu t . . . u, and so on. 

If we consider the limitations of occurrence of these morphemes or 
their segments in respect to the -a 'feminine' morpheme, we find that N 
occurring with A or B always has the -a morpheme, whereas N occurring 
with 2 or 3 does not.³⁴ The restriction upon B as against 3 is clear: 
habaxura sidra et ze 'The girl arranged it', habaxura vəhaxavera šela 
tsaderna et ze 'The girl and her friend (f.) will arrange it' as against 
habaxur sider et ze 'The fellow arranged it', habaxur vəhaxavera šelo 
ysadru et ze 'The fellow and his friend (f.) will arrange it'. No N with the 
-a 'feminine' morpheme substitutes for habaxur in the last two 
utterances, nor can baxur substitute for baxura or xavera in the first 
two.³⁵ We may therefore say that the -a/t- and -u/t . . . na residues, 
hi 'she' and hen 'they (f.)', and -a 'feminine' all contain a component f 
which is absent in zero/y-, -u/y . . . u, hu 'he', and hem 'they (m.)'.³⁶ 

The same component f can be extracted from A as against 2. Just as hi 
contains f, so does at 'you (f.)': hi baxura haguna 'She's a decent girl', 
at baxura haguna 'You (f.) are a decent girl' as against hu baxur hagun 
'He's a decent fellow', ata baxur hagun 'You (m.) are a decent fellow'. 
Since A occurs with at but not with ata, we extract the f component from 
A also. 

segments which contain 3, but not all of these, have the phoneme /y/; 
and only segments which contain p, but not all of these, have the 
phoneme /u/. 

³⁴ And N occurring with 1 sometimes has the -a and sometimes does not. 

³⁵ N + -a may substitute for N without -a, e.g. habaxur in such 
environments as N və N (habaxur vəaxi sidru et ze 'The fellow and my 
brother arranged it', habaxura vəaxi sidru et ze 'The girl and my brother 
arranged it'); or in the N of VN = V (limádti et habaxur 'I taught the 
fellow', limádti et habaxura 'I taught the girl'); or in the second N of 
N še PN = N (ze hamakom šel habaxur 'That's the fellow's place', 
ze hamakom šel habaxura 'That's the girl's place'); etc. 

³⁶ We say that this component is present in hi not only because of 
hi sidra 'she arranged' as against hu sider 'he arranged', but also because 
of hi baxura haguna 'She's a decent girl' as against hu baxur hagun 'He's 
a decent fellow.' 

In baxura haguna we have a single repeated morpheme . . . a . . . a 
(12.323). Since hi occurs with baxura as against baxur, we extract from 
hi an f component, identical with the . . . a . . . a morpheme, and say 
that it extends over the whole utterance hi baxura haguna. In the first 
morpheme, this component yields hi instead of hu; in the remaining 
morphemes, this component adds parts of the repeated . . . a . . . a. 


Further consideration shows a limitation of occurrence of 2 and 3 in 
respect to at containing A and hi containing B, as well as to ata 
containing 2 and hu containing 3, respectively. Before 3, hi or hen 
sometimes constitute one member of N və N (see fn. 32) whereas at does 
not: hi vəaxi 'she and my brother' occurs before 3; at vəaxi 'you (f.) and 
my brother' occurs before 2. Similarly, ata vəat 'You (m.) and you (f.)' 
occurs before 2, whereas ata vəani 'You and I' occurs before 1. Hence the 
component 2 may be extracted from at, aten, and from the A morphemes 
which occur with these, while 3 may be extracted from hi, hen, and from 
the B morphemes which occur with them. 

Component A is thus replaceable by the combination of components 2 
and f; and B by the combination 3 and f. 

We now have a set of components in terms of which each member of C 
may be identified and differentiated from each other one, without residue. 



-ti 'I did'                 1 (t)          a- 'I will'                     1 i
-ta 'you (m.) did'          2 (t)          t- 'you (m.) will'              2 i
-t 'you (f.) did'           2 f (t)        t . . . i 'you (f.) will'       2 f i
zero 'he did'               (3) (t)        y- 'he will'                    (3) i
-a 'she did'                (3) f (t)      t- 'she will'                   (3) f i
-nu 'we did'                1 p (t)        n- 'we will'                    1 p i
-tem 'you (m. pl.) did'     2 p (t)        t . . . u 'you (m. pl.) will'   2 p i
-ten 'you (f. pl.) did'     2 f p (t)      t . . . na 'you (f. pl.) will'  2 f p i
-u 'they (m.) did'          (3) p (t)      y . . . u 'they (m.) will'      (3) p i
-u 'they (f.) did'          (3) f p (t)    t . . . na 'they (f.) will'     (3) f p i
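The claim that each member of C is identified without residue can be checked mechanically. The sketch below is ours and is offered only as an illustration: it starts from the five residual components 1, 2, A, 3, B, replaces A by 2 f and B by 3 f as stated above, adds t or i, and then tests that no two of the 20 members share a combination. The names RESIDUES, EXPANSION, expand, and table are hypothetical.

```python
# Residual components before A and B are broken down (past/future pairs).
RESIDUES = {
    "-ti/a-": "1",
    "-ta/t-": "2",
    "-t/t...i": "A",
    "zero/y-": "3",
    "-a/t-": "B",
    "-nu/n-": "1 p",
    "-tem/t...u": "2 p",
    "-ten/t...na": "A p",
    "-u/y...u": "3 p",
    "-u/t...na": "B p",
}

# A is replaceable by 2 and f; B by 3 and f.
EXPANSION = {"A": ("2", "f"), "B": ("3", "f")}


def expand(components):
    """Rewrite a space-separated component string without A or B."""
    out = set()
    for c in components.split():
        out.update(EXPANSION.get(c, (c,)))
    return frozenset(out)


def table():
    """Map (residue, tense) to (phonemic form, full component set)."""
    rows = {}
    for residue, comps in RESIDUES.items():
        past, future = residue.split("/")
        base = expand(comps)
        rows[(residue, "did")] = (past, base | {"t"})
        rows[(residue, "will")] = (future, base | {"i"})
    return rows


rows = table()
combos = [comps for _, comps in rows.values()]
print(len(rows), len(set(combos)))
```

Several members are phonemically identical (-u, t-, t . . . na each occur twice), which is why the rows are keyed by residue and tense rather than by phonemic form.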

It would have been possible to extract a component s from those 
morphemes which do not contain p, and from their differentiating 
environments: habaxur ba 'The fellow came' (both parts containing s) as 
against habaxurim báu 'The fellows came' (both containing p). Similarly, 
it would have been possible to extract a component m from those 
morphemes which do not contain f, and from their differentiating 
environments: habaxur ba 'The fellow came' (both parts containing m) as 
against habaxura baa 'The girl came' (both containing f). However, since 
every occurrence of V is associated with the occurrence of some one of 
these morphemes,³⁷ so that we can always tell by its position (next to V) 
if a morpheme is a member of C, and since all non-f morphemes contain 
m and 

³⁷ Or with the occurrence of a few other morphemes such as lə 'to' 
(ləlamed 'to teach') or command intonation (lamed oto! 'Teach him!'). 


all non-p morphemes contain s, we can neglect m and s and consider 
them automatic for this class C (and for N). If the position of one 
member of C (or of N) is not occupied by f or p we know that it has the 
m or s characteristics: if we write V + 2 i, we know it is V + t- 'you (m.) 
will' (not t . . . i, which would be 2 f i). By the same token, we can omit 
indication of the components t and 3 (given above in parentheses); we 
will still be able to differentiate each member of C from the other, so 
long as we know from its position that the morpheme indicated by the 
components is a member of C (as it must be if it follows V, since after 
every V there is a C). The morpheme 'he did' which is phonemically zero 
would thus be represented by an absence of all components: limed 'he 
taught' is now represented by V alone, but ylamed 'he will teach' by V + i 
and limádnu 'we taught' by V + 1 p. 

Each of the C morphemes which substitute for each other (in some 
environments) after V is now a unique combination of the presence or 
absence, after V, of the components 1, 2, p, f, and i. 
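That the omission of 3 and t leaves every member of C still distinct can be verified by a short computation. This is a sketch under our own naming (PERSONS, TENSES, AUTOMATIC, reduced, formula are not part of the analysis); it drops the two automatic components from each full combination and writes out what remains after V.

```python
# Person, gender, and number components for each subject gloss.
PERSONS = {
    "I": "1", "you (m.)": "2", "you (f.)": "2 f", "he": "3", "she": "3 f",
    "we": "1 p", "you (m. pl.)": "2 p", "you (f. pl.)": "2 f p",
    "they (m.)": "3 p", "they (f.)": "3 f p",
}
TENSES = {"did": "t", "will": "i"}

# Components treated as automatic next to V, hence left unwritten.
AUTOMATIC = {"3", "t"}


def reduced(subject, tense):
    """Component set of a member of C once 3 and t are omitted."""
    full = set(PERSONS[subject].split()) | {TENSES[tense]}
    return frozenset(full - AUTOMATIC)


def formula(subject, tense):
    """Write the reduced representation in the form 'V + ...'."""
    comps = reduced(subject, tense)
    ordered = [c for c in ("1", "2", "f", "p", "i") if c in comps]
    return "V" if not ordered else "V + " + " ".join(ordered)


ALL = {(s, t): reduced(s, t) for s in PERSONS for t in TENSES}
print(len(ALL), len(set(ALL.values())))
print(formula("he", "did"), "|", formula("we", "did"))
```

The empty combination is itself one of the 20 values: it is the representation of 'he did', written V alone.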

These components can be used further to identify other morphemes 
which constitute the differentiating environments of the 17 
one-morpheme combinations of these components. Thus anáxnu, the only 
single morpheme which differentiates -nu (1 p) and n- (1 p i) from the 
other members of C, can be identified as 1 p. We may analyze anáxnu 
katávnu 'we wrote' as 1 p + V + 1 p. Or we can say that anáxnu . . . nu 
is identified by one long component 1 p which extends on either side of 
the V. Then both anáxnu katávnu and the equivalent katávnu 'we wrote' 
are V + 1 p; the difference between the two may be considered free or 
stylistic, or may be indicated by a component representing emphasis or 
the like in the case of anáxnu.³⁸ Similarly, ani V-ti and V-ti 'I —ed' 
would both be V + 1, while ani a-V and a-V 'I will —' would both be 
V i + 1; both hen t-V-na and t-V-na alone 'they (f.) will —' would be 
V i + f p (in the former case with an emphatic component); hen V-u 
would be V + f p (hen katvu 'they (f.) wrote'); hem y-V-u and y-V-u 
'they (m.) will —' would be V i + p; hem V-u would be V + p; and V-u 
by itself (which is the same after hem or hen) would be V + p or f p 
(katvu 'they (m. or f.) wrote'). Finally, hu V and V by itself would both 
be just V (hu katav or just katav 'he wrote'). 

When these components occur without V, the part affixed to V (and 

³⁸ In terms of the constructions of chapter 18 there is, of course, a 
difference between the two utterances: katávnu consists of one word, and 
anáxnu katávnu consists of two. 


which had been included in the C) is, of course, absent. We can still 
identify ani 'I', hu 'he', etc. by the same components, but the 
components are not long in this case. Thus anáxnu po 'we (are) here' may 
be analyzed as 1 p + po. The component combination 1 p thus indicates 
-nu and anáxnu . . . nu next to V, but anáxnu elsewhere.³⁹ 

In another class L, containing 10 bound morphemes which occur after 
N and P, the members can be differentiated by means of these 
components.⁴⁰ 

The components f and p may also be used to identify certain 

³⁹ In the case of the third-person pronouns (-u, hem . . . u, -zero, 
hu . . . zero, etc.) a special difficulty arises. In the neighborhood of V, 
the component 3 which had distinguished these morphemes had been 
considered equivalent to the absence of 1 and 2, and therefore was not 
written. This was unambiguous in that environment, since V never 
occurred except with the accompaniment of 1 or 2 or 3. In other 
environments, however, the absence of both 1 and 2 does not necessarily 
indicate the presence of 3, because all three may be absent. Thus the 
utterance po 'Here' (e.g. in response to eyfo ata? 'Where are you?') is not 
identical in distribution or meaning with the utterance hu po 'He is here'. 
Hence, whereas anáxnu in anáxnu po can be represented by the same 
mark as in anáxnu V-nu, namely 1 p, hu in hu po cannot be represented 
by the zero which indicated it in hu V. We can therefore represent the 
third-person pronouns (when not adjoining V) by the component 3, or 
else by the class-mark N (as distinct from any particular member of N: 
see the Appendix to 18.2): hu po would be 3 + po or N + po; hem po 
'they (m.) (are) here' would be 3 p + po or N p + po; and so on. 

⁴⁰ The 10 morphemes of class L are: 

-i 'my, me'                     -enu 'our, us' 
-xa 'your, you (m. sing.)'      -xem 'your, you (m. pl.)' 
-ex 'your, you (f. sing.)'      -xen 'your, you (f. pl.)' 
-o 'his, him'                   -am 'their, them (m.)' 
-a 'her'                        -an 'their, them (f.)' 

They substitute for each other, and for any N, in the following 
environments: P — (as in li 'to me' PL, labaxur 'to the fellow' PN), ,N'— 
(beti 'my house' NL, bet sefer 'school house' ,N'N), rarely VC — 
(bikaštixa 'I asked you' VCL, bikášti tova 'I asked a favor' VCN). 
However, only -i occurs in ani acm — 'I —self', only -xa in ata acm — 
'you —self', and so on. We therefore indicate -i by 1, -xa by 2, -ex by 
2 f, and so on. Then beti is ,Na'1, bet sefer is ,Na'Nb, and so on (the 
subscripts a, b indicate different particular members of the class N). In 
the case of the third-person pronouns we again have difficulty in 
indicating them by zero, since absence of 1 or 2 does not necessarily 
mean presence of 3 (except usually after P): N and VC often occur 
without any member of L after them. It is therefore necessary, as before, 
to represent these either by 3 or by the undifferentiated class symbol N 
(see Appendix to 18.2): lo 'to him' is Pa + 3 or Pa + N, rošo 'his head' is 
Na + 3 or Na + N, roš haxevra 'the head of the company' is Na + Nb, 
roš 'head' is Na. 


morphemes (class K) which occur not in N position but immediately after N. 
In baxur 'fellow', baxura 'girl', baxurim 'fellows', baxurot 'girls', we have 
three such morphemes: -a 'feminine (singular)', -im 'masculine plural', 
-ot 'feminine plural' (and zero 'masculine singular'). When N + -a 
occurs before V, the V is always V + f or V + f i: habaxura sidra 'The 
girl arranged', habaxura tsader 'The girl will arrange'. When N + -im 
occurs before V, the V is always V + p or V + p i: habaxurim sidru 'The 
fellows arranged', habaxurim ysadru 'The fellows will arrange'. In N -ot V, 
the V is always V + f p or V + f p i: habaxurot sidru 'The girls 
arranged', habaxurot tsaderna 'The girls will arrange'. When N followed 
by none of these three (i.e. N + zero) occurs alone before V, the V is 
always just V or V + i: habaxur sider 'The fellow arranged', habaxur 
ysader 'The fellow will arrange'. Hence we represent -a by the 
component f, -im by p, -ot by f p (and the zero by our previous zero 'he'), 
all after N. 
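The correspondence between the morpheme after N and the components after V may be set out as a lookup. The following Python sketch is illustrative only; K_COMPONENTS, verb_components, and formula are our own names, and the zero morpheme is written as an empty string.

```python
# Components assigned to each morpheme of class K after N.
K_COMPONENTS = {
    "": frozenset(),               # zero 'masculine singular'
    "-a": frozenset({"f"}),        # 'feminine (singular)'
    "-im": frozenset({"p"}),       # 'masculine plural'
    "-ot": frozenset({"f", "p"}),  # 'feminine plural'
}


def verb_components(noun_suffix, future=False):
    """Components found after V when N + noun_suffix occurs before V."""
    comps = set(K_COMPONENTS[noun_suffix])
    if future:
        comps.add("i")
    return frozenset(comps)


def formula(noun_suffix, future=False):
    """Write the V that follows N + noun_suffix in the form 'V + ...'."""
    comps = verb_components(noun_suffix, future)
    ordered = [c for c in ("f", "p", "i") if c in comps]
    return "V" if not ordered else "V + " + " ".join(ordered)


for suffix in ("", "-a", "-im", "-ot"):
    print(repr(suffix), formula(suffix), "|", formula(suffix, future=True))
```

Because the same components appear after N and after V, the lookup needs no separate statement of agreement: the components of the K morpheme are simply carried over to V.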

A large number of morphemes have now been identified by various 
combinations of 5 components, in various utterance positions. 

One component, i, occurs only after V and has no relation to any other 
morpheme class.⁴¹ 

The components 1 and 2 occur in N position. As has been seen, they 
are substitutable for any particular N in the environments P —, ,N'—, 
VC —; in these environments they represent the L morphemes. They also 
occur in other N positions, e.g. '— 'N (yosef nagar 'Joseph (is a) 
carpenter' 'Na 'Nb; ani nagar 'I (am a) carpenter' '1 'Nb), where they 
represent the morphemes ani, ata, etc. In all these positions the original 
component 3 (which when next to V has been replaced by zero) may be 
replaced by the undifferentiated class-mark N (as distinct from particular 
members (Ni) of the class N, which are individually marked Na, Nb, etc.): 
rošo 'his head' ,Na'N; hu nagar 'he (is a) carpenter' 'N 'Nb. When 1 or 2 or 
zero occur with V we have two forms, e.g. anáxnu katávnu and katávnu 
'we wrote', hu katav and katav 'he wrote'. With this we can match only 

⁴¹ If we find a component which is complementary in position, i.e. never 
occurs after V, we may group it with i in one component having two (or 
more) positionally determined members. Note that V is not restricted to 
occurring before i, since we can also have katvu 'they wrote' which does 
not contain i. This is so because we eliminated t by writing V for V + t, 
so that katvu is not V + t p but just V + p, and katav 'he wrote' is 
just V. The phonemic form and the meaning of i are correspondingly 
changed. In nsader 'we will arrange' V + 1 p i as compared with sidárnu 
'we arranged' V + 1 p, the i component does not consist in the adding 
of n-, but in the replacing of a suffix by a prefix; and the meaning is the 
change from 'did' to 'will'. 


one Ni V form: haiš katav 'the man wrote'. The -nu of katávnu and the 
zero of katav may therefore be considered as replacing the haiš (the 
particular Ni) of haiš katav. If we represent anáxnu katávnu by 1 p V 1 p, 
we will have no N position comparable to the position of the second 1: 
compare haiš katav lo mixtav 'The man wrote him a letter' Na Va PN Nb, 
and anáxnu katávnu lo mixtav 'We wrote him a letter' 1 p Va 1 p PN Nb. 
There is no member of N which can occur after the Va in the way that the 
second 1 does. It is therefore convenient to consider the anáxnu . . . nu as 
represented by a single 1 p extending on either side of the V; it is thus 
merely a long form of -nu, which is also 1 p but on only one side of the V. 
Similarly, hu katav would be merely a long form of katav 'he wrote', both 
represented by V alone (i.e. V plus absence of 1 or 2). The occurrences 
of 1 and 2 next to V can now be considered to be occurrences of 
particular Ni, since 1 and 2 can now be substituted (in their long or short 
form) by any particular Ni: around katav 'wrote' we find -ti 'I' (1), 
ani . . . ti 'I' (1 + emphatic), hu 'he' (zero + emphatic), zero 'he' (zero), 
haiš 'the man' (Na), etc. We consider 1, 2, zero, Na as particular members 
(Ni) of the class N. The original component 3 in these N positions may 
now be indicated by absence of 1 or 2, and does not have to be indicated 
by the undifferentiated class-mark N.⁴² The zero may be considered 
either as a member of N, or as absence of N in the positions for which N 
is defined. In the former case, both haiš katav and katav are NV; in the 
latter, katav is just V, so that some utterances would then consist of V 
alone, without N.⁴³ 

⁴² The use of 1 and 2 both for the L morphemes and the C morphemes 
makes it necessary to consider possible confusion arising in the case 
where C is zero. For example, hiršeti 'I permitted' is V + 1, hirša 'he 
permitted' is V, hiršáni 'he permitted me' would be V (+ zero) + 1. In 
order to avoid the confusion of two different V + 1, we define the 1 and 
2 of C as occurring before the V, and the 1 and 2 of L as occurring after V: 
hiršeti is 1 + V, hiršáni is V + 1. Note that the N which will be seen 
below to be substitutable for the 1 and 2 of C also occurs usually before 
the V rather than after it; haiš hirša 'the man permitted' Na + V has 
the same utterance status as hiršeti 'I permitted' 1 + V. Similarly, the 
N which is substitutable for the 1 and 2 of L occurs after the V: hu hiršá li 
'he permitted me' V + P 1, hu katav mixtav 'he wrote a letter' V + Nb, 
have the same utterance status as hiršáni 'he permitted me' V + 1. Note 
also that the 1 and 2 of C themselves occur before the V rather than after 
it whenever i is present: arše 'I will permit' 1 + V i, narše 'we will 
permit' 1 p + V i. 

⁴³ If the component 3 is replaced by zero (instead of by the 
undifferentiated class mark N) after P as well as after V, we would have 
la 'to her' as P + zero + f (i.e. P f), and so on. 


The components f and p occur, singly or together, only after N, 
including the 1, 2, and zero next to V. However, if the zero is considered 
not to be an element (member of N), we would have to say that f and p 
also occur next to V or i (ysadru 'they will arrange' V + i + p), and next 
to the ,N, ,N f, or ,N p of ,N'N (bnam 'their son' N + p; bnotehen 'their 
daughters' N + f + p + f + p). 

The result of this whole analysis is thus the representation of the 
highly restricted morphemes of classes C, L, and K by 5 components: i, 
restricted to V —; f and p, restricted to N — and V — (or, if zero is an 
element, to N — alone); 1 and 2 (or, if zero is an element, 1, 2, and zero) 
as new members of N, which have both a short and long form when 
next to V. The elimination of 3, t, m, and s, as being automatic in 
stated environments, removes from V and N any dependence upon 
the various 'plural', 'feminine', 'person', 'tense' morphemes: e.g. V 
occurs with i, but also without it. 


18.0. Introductory 

This section considers the relation between a morpheme class in one 
position and the same class in other positions. It leads to the recognition 
of constructions such as word and phrase. 

18.1. Purpose: Recurrent Arrangements of Morpheme Classes 

We note recurrent sets of similar morpheme classes, independently of 
how these classes or arrangements fit into the utterance. 

The considerations of 16.5 covered the relations of a morpheme class 
to the sequences which contained it and to the inclusion numbers which 
marked the resultants of these sequences. The procedure of chapter 17 
expressed the relations between one class and the other classes which 
accompanied it. It remains to survey all the sequences, of any length, in 
which a morpheme class A enters, and to see what similarities there are 
among all these sequences, and what sequences of other classes are analo- 
gous in various respects to all or certain ones of the sequences contain- 
ing A. 

To a large extent this attempt to summarize the recurrent arrange- 
ments of classes combines, or may conveniently begin by combining, the 
results of 16.5 and 17. The considerations of both of those sections lead 
to recognizing various larger-than-one-morpheme-length portions of ut- 
terances: in 16.5, these portions are the immediate constituents (at suc- 
cessive stages of analysis) of an utterance or stretch of speech; in chap- 
ter 17, the domains of the components. Here we will go beyond these 
combined results, in seeking identities and similarities in other features 
as well as in those previously considered. For example, we may note 
similarities among classes holding corresponding positions in various 
sequences, or junctures and contours involving particular sequences and 
not others. 

18.2. Procedure: Substitution in Short Environments 

We classify into one construction all sequences which are similar in 
respect to stated features. 

Given the Semitic morpheme-class sequences R + v + C (Hebrew 
katávti 'I wrote') and R + n + K (baxurim 'fellows': 17.31 and Appendix 
to 17.33), we note a number of connections between the two sequences: 
R occurs in both; v and n are complementary to each other, and both 
occur only with R (and in the same position: staggered in respect to it; 
phonemically, both classes together represent the only morphemes which 
consist of broken sequences of vowels, rarely with a consonant added); 
C and K are complementary classes in respect to v and n.¹ 

These two sequences have entirely different statuses in respect to the 
utterance, since Rv C = V and Rn K = N, so that the similarity 
between them could not be treated in chapter 16. Here, however, this 
similarity can be expressed by setting up a sequence R + p + H, where p 
is a class of vowel morphemes including v and n, while H includes C and 
K. A fixed sequence such as this may be called a construction. 

The setting up of such constructions without reference to utterance 
status causes us to miss here some of the results obtained in chapter 16, 
so that this procedure cannot replace that of 16. Thus in 16 we would 
have for Semitic Rn = N (baxur 'fellow' substitutable for av 'father'). 
Here, however, we can state no such connection between R + p + H 
and N (or N + K). 

However, it is possible to find other similarities between the RpH 
construction and other sections of utterances. RpH and NK have this in 
common, that only in these two does the class K appear, and only in 
these two is there a sequence of free form with or without bound form. 
Rp and N occur by themselves (katav 'he wrote', baxur 'fellow', ben 'son') 
or with C or K (katávti 'I wrote', baxurim 'fellows', banim 'sons'); C and K 
never occur unaccompanied by Rp or N. We may therefore recognize 
this as a free form-bound form (stem + zero or more affixes) construction 
FB even when no member of B occurs (as in ben 'son' where the free 
form occurs alone). 

The FB construction occurs also with the accompaniment of L and 
of P: laben 'to the son' PN, basipur 'in the story' PRn, bəsipuri 'in my 
story' PRnL, sipuray 'my stories' RnKL. Since both L and P are bound 
forms, we can include the sequences containing them with Rp or N (e.g. 
PNKL) in the FB construction.² 

¹ In the Appendix to 17.33, the class C is broken down and parts of it 
become identical with parts of K; however, in chapter 18 we consider not 
the morphemes or components of C and K but their class domain. Even 
without this breaking down, C and K can be said to have one morpheme 
in common: -a in katva 'she wrote', yalda 'she gave birth' (RvC), and in 
baxura 'girl', yalda 'female child' (RnK). However, the classes C and K 
remain distinct, because in katv —, yald — we can replace -a by -u 'they 
did', etc., while in baxur —, yald — we can replace it by -im 'plural', etc. 
C never occurs without Rv, but K occurs with N as well as with Rn. 

² We may also recognize a compound ,FB'FB construction in which 
the F represents only Rp or N, and in which the ,FB part may be 
repeated: bet hasefer 'school (lit. the house of the book)', roš bet hasefer 
'the head of the school.' L (and the morpheme ha- 'the') occur, if at all, 
only with the last ('FB) of the series: bet sifri 'my school.' 


However, the sequence PL also occurs without any free form: li 'to 
me'. It is the only sequence which occurs (in all but exceptional 
circumstances) occasionally as a complete utterance, and none of whose 
members is a free form.³ 

Each occurrence of FB (including ,FB'FB) and PL has one main 
stress, and some of the occurrences constitute utterances by themselves. 
The only other construction which has these two features is the class U 
of unchanging morphemes: ma 'what', ze 'this', etc. The members of this 
class never occur with any bound forms except those mentioned below. 

All of these constructions occur occasionally after unstressed bound 
morphemes of the class Q: və- 'and', še- 'which', etc. These morphemes, 
which differ from each other in utterance status (i.e. in the considerations 
of chapter 16), never occur except with FB, PL, or U following them. 

We may now say that every utterance can be divided into successive 
portions in such a way that each portion is occupied by either FB, or PL, 
or U, each with or without preceding Q. Each such portion may be called 
a word. FB, PL, and U are then the three constructional types of a 
Semitic word, and Q may occur at the beginning of a word of any 
construction.⁴ Every word has precisely one main stress, and occurs 
occasionally by itself as a complete utterance. No word is divisible into 
smaller sections each of which occurs by itself (except in special 
circumstances) as a complete utterance.⁵ 

The construction ,FB'FB (or ,FB,FB'FB, etc.) differs from a word in 
that each of its two (or more) parts also occurs as a word (except that 
the stress of each part is secondary /,/ instead of primary /'/ when it 
occurs in other than last position in this joint construction). On the other 
hand, the two parts differ from the FB word construction in that Q 
occurs only with the first ,FB, while L and ha- 'the' occur only with the 

³ PL can be considered a special case of PNL, with zero N. In that 
case it would be included under the FB construction. 

⁴ There are various limitations as to which particular Q occurs with 
which particular word. 

⁵ Using this property, Bloomfield defined the word in general as a 
minimum utterance: L. Bloomfield, A set of postulates for the science of 
language, Lang. 2.150 (1926). 


last 'FB.⁶ We may call this a compound word, extending over two (or 
more) word-length portions.⁷ 
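The three constructional types of the word, each with optional preceding Q, can be written as a pattern over class symbols. The sketch below is a deliberate simplification and the names are ours: F stands for either Rp or N, the order P, F, C or K, L inside FB is taken from the examples PNKL, PRnL, RnKL, and VCL cited above rather than from any general statement, and the compound word is not covered.

```python
import re

# One word: optional Q, then FB (with optional P, C or K, L), or PL, or U.
# Class symbols are joined by single spaces before matching.
WORD = re.compile(r"^(Q )?((P )?F( [CK])?( L)?|P L|U)$")


def is_word(classes):
    """True if the sequence of class symbols fits one word construction."""
    return WORD.match(" ".join(classes)) is not None


def word_type(classes):
    """Return 'FB', 'PL', or 'U' for a word, or None otherwise."""
    if not is_word(classes):
        return None
    core = [c for c in classes if c != "Q"]
    if core == ["U"]:
        return "U"
    return "FB" if "F" in core else "PL"


print(word_type(["P", "F", "K", "L"]))
print(word_type(["Q", "P", "L"]))
print(word_type(["C"]))
```

A sequence consisting of Q alone, or of a bound form such as C alone, is rejected, in keeping with the statement that these never occur except with FB, PL, or U.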

18.21. Features of Construction 

Whether we take a construction such as RpH, or a more inclusive one 
like FB, or the domain of various constructions such as the word, we 
always do it on the basis of certain features of relation among the mor- 
pheme classes (and sequences) involved. We take all instances of the 
construction or domain in question as being identical in respect to these 
features. Such features would be stated for the construction as a whole, 
i.e. for all instances of it. 

These features will often be the types of classes, sequences, or com- 
ponents involved; their order (including such unusual orders as the 
staggered R and p: katav from k-t-v plus -a-a-); stress and intonation; 
which classes are occasionally free and which are always bound; what is 
the smallest, largest, or usual number of classes that occur in instances 
of the construction; etc. A feature of a construction may also be the 
primacy of one of its classes over the others. For example, X could be 
considered primary and Y secondary in a construction if X occurred in 
every instance of the construction while Y appeared only in some instances. 
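The test of primacy just described amounts to asking which classes are present in every instance. A minimal sketch, with function names and sample instances of our own choosing:

```python
def primary_classes(instances):
    """Classes that occur in every instance of a construction."""
    sets = [set(instance) for instance in instances]
    if not sets:
        return set()
    return set.intersection(*sets)


def secondary_classes(instances):
    """Classes that occur in some instances but not in all."""
    sets = [set(instance) for instance in instances]
    if not sets:
        return set()
    return set.union(*sets) - primary_classes(instances)


# Hypothetical instances of a free form-bound form construction.
fb_instances = [["F"], ["F", "K"], ["P", "F", "L"]]
print(primary_classes(fb_instances), secondary_classes(fb_instances))
```

On these sample instances F comes out as primary and P, K, L as secondary, which corresponds to treating the free form as the stem and the bound forms as affixes.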

Various different constructions, including construction types grouped 
together in some one domain such as a word, may be similar to each 
other in some of these features, or in particular aspects of them; for ex- 
ample, certain constructions may all have their primary class first in the 

A particularly frequent feature of constructions, and of all 
constructions having the same domain, is their relation to contours and 
junctures (Appendix to 18.3). Thus in the Semitic example above, all 
word constructions, and only word constructions, had precisely one main 
stress; and compound words always had a secondary stress on each 
sub-construction (FB) before the last. 

⁶ I.e. Q, L, and ha- may be said to apply (or refer in meaning) to the 
whole joint construction: bet sefer 'a school (lit. house of books)', 
bet hasefer 'the school (house of the book)', šebet sifri . . . 'that my 
school . . .' K occurs with each FB independently: bet axi 'house of my 
brother', batey axi 'houses of my brother', bet axay 'house of my 
brothers', batey axay 'houses of my brothers'. 

⁷ A compound word functions as a long component, determining 
certain restrictions (as to stress, Q, L, etc.) in the several word lengths 
over which it extends. The particular FB constructions in it may be 
considered residues in each of these word lengths. 


18.22. Successively Enclosing Constructions 

It is possible to investigate the relation of each construction type to 
longer constructions which enclose it, and to the whole utterance in 
which it is contained. 

A step in this direction is taken when we state whether a construction 
contains free or bound forms; for this means that members of the con- 
struction sometimes or never constitute by themselves the whole utter- 
ance in which they are contained. Or we can say that almost all English 
utterances contain at least one of the free classes (A, N\ V\ D, etc.) or 
the bound class S of 17.31, with zero or more morphemes of the other 
bound classes (Na, several T and P, etc.) grouped around each of these.
If each of these free classes, and the sequence of bound classes S + E, 
each with or without any of its accompanying bound classes, is not divis- 
ible into smaller sections which occur as complete utterances, then each 
of these constructions satisfies the two conditions for being a minimum 
utterance of the language.* 
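The two conditions can be checked mechanically on a small corpus. What follows is only a minimal sketch under stated assumptions: each utterance is taken to be already segmented into a tuple of morphemes, and the toy corpus `ATTESTED` and both function names are hypothetical illustrations, not part of the procedure as stated in the text.

```python
def divisible(seq, attested):
    """True if seq splits into two or more contiguous parts,
    each of which is attested as a complete utterance."""
    n = len(seq)
    # covered[i] is True when seq[:i] can be cut into attested parts
    covered = [True] + [False] * n
    for end in range(1, n + 1):
        for start in range(end):
            if start == 0 and end == n:
                continue  # the undivided whole does not count as a part
            if covered[start] and tuple(seq[start:end]) in attested:
                covered[end] = True
                break
    return covered[n]


def is_minimum_utterance(seq, attested):
    """Condition 1: seq occurs as a complete utterance by itself.
    Condition 2: seq is not completely divisible into smaller parts
    each of which meets condition 1."""
    seq = tuple(seq)
    return seq in attested and not divisible(seq, attested)


# Hypothetical toy corpus of complete utterances (morpheme tuples).
ATTESTED = {("book",), ("worm",), ("book", "s"), ("book", "worm")}
```

In this toy corpus a free morpheme with its bound suffix passes both conditions, because the suffix never occurs alone, whereas a sequence of two free morphemes fails the second condition.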

Noting whether a given utterance contains various minimum utter- 
ances or other constructions is a departure from the methods of chap- 
ter 16. Those methods enabled us to equate sequences on the basis of 
their elements without regard to the type of constructions that these 
elements comprise. Moroccan Arabic xuia 'my brother' is A''W^ = A^''
and as such has the same resultant as xu diali 'my brother (brother of 
me)' which is A'Z) P N^ = N^. The fact that diali has a main stress in- 
dependent of the stresses in the environment, and that it occasionally 
occurs by itself as an answer, whereas neither of these facts is the case 
for -ia, is not noted in chapter 16. 

If we speak in terms of words, Moroccan ana šftu 'I saw him' is pro-
noun plus verb, and šft rražl 'I saw the man' is verb plus noun; but in
substitutable morpheme sequences each of these is NVN, with the
meaning of subject-action-object. The relevance of the NVN analysis
to utterance structure is seen in the fact that that sequence, or rather the 
N^V^ of which it is a special case, is true of almost all utterances contain- 
ing the /./ intonation, whereas the particular word sequences pronoun- 
verb, verb-noun, etc., are far less general and each occurs in a smaller 
number of utterances. On the other hand, the analysis of utterances into 

* The two conditions being: first, that the construction (class or se- 
quence) occur occasionally as a complete utterance by itself; second, that 
it not be completely divisible into smaller parts each of which meets the 
first condition. 


word or minimum utterance domains is again very general, being true
of all utterances. It follows that the most general methods for dividing
utterances into sections are in terms of the utterance-status elements of
chapter 16, or in terms of the construction domains of chapter 18, but 
not in terms of particular sequences, constructions, or morpheme 

If we arrange the various constructions and their domains in such a 
way that the domain of one set of constructions encloses that of another, 
we will often obtain results parallel in part to the sets of increasingly 
higher inclusion numbers (for N, V, etc.) of 16.21.

It is possible to establish first the lowest construction domain longer 
than that of a single morpheme, by inspecting all the constructions of a 
language, and taking all those which include, or can be described as in-
cluding, no smaller construction.¹⁰ This group of constructions may or
may not have a common contour which is lacking in larger constructions, 
e.g. precisely one loud stress, or the domain of vowel harmony. The por- 
tions of the utterance occupied by each of these constructions may now 
be considered to constitute a unit length for constructions — the minimal 
construction length; and the boundaries of each of these constructions 
are boundaries of this unit length. 

We now proceed to group together all those constructions which can 
be described as containing more than one occurrence of the construc- 
tional unit length (i.e. of any of the smallest constructions) but of no 
other. For example, the English construction consisting of two free
morphemes, each with zero or more bound morphemes, plus the ' — ˌ —
contour (e.g. bookworms, get-up) can be described as containing the
' — ˌ — contour with two constructions, each of which is one free mor-
pheme plus zero or more bound morphemes plus loud stress. Then book-

' Languages differ, of course, in the degree of correlation between 
minimum-utterance construction and substitution class sequence. In 
Arabic, single-word sentences have sequences identical with those of sen-
tences of several words; ktbtu and ana ktbt lih both mean 'I wrote him'
(NVN). In English this is rare. When an English minimum utterance
occurs as a whole utterance it usually does not have a sequence structure 
comparable to that of longer utterances. We have one-word sentences 
like This. (N), Going? (A), No! (Indep.), as compared with several-word
utterances like We need some rain. (NVN). Such differences between
minimum-utterance and long-utterance constructions give different ut- 
terance-status to the various morpheme classes of the language (cf. 
Edward Sapir, Language 116). 

'" Following Bloomfield's terminology, this would be a minimum con- 


worms is 'book + 'worms + the ' — ˌ — contour, and get-up is 'get +
'up + the ' — ˌ — contour. The ' — ˌ — contour thus covers the domain
of a construction which encloses two (or more) unit-length constructions,
but which encloses no other, longer, construction than the unit-length.¹¹
All constructions which enclose more than one one-unit-length construc- 
tion, but no others, may be said to have the next higher (second order) 
constructional domain. In many languages, this may be the domain of 
compound words.¹²

We may now proceed to those constructions which occasionally en-
close constructions of the second-order domain. For simplicity, we might
take certain sequences¹³ comprising English N⁴ as covering a third-order
domain, since some of the constructions enclosed in such a sequence may
be of the second order: that old bookworm. Note that not every construc-
tion enclosed within N⁴ is of the second order: that is not, nor is old (al-
though old can be replaced by one which is, as in that sour-faced bookworm).
None of the constructions in a particular N⁴ (that old fellow) need be of
the second order; but the fact that several of them can be replaced by
second-order constructions (that old bookworm) makes it desirable to con-
sider N⁴ even then as being of the third order.

Similarly, sequences comprising V⁴ may also be shown to cover a
third-order domain, since the constituent members of these sequences 
are V constructions of the first or second order. 

As we establish the constructions of some particular order, we define 
them in each case as possible sequences of constructions of lower orders. 
Thus, for English, the first-order word construction was defined as con-
taining one member of N or V or A, etc., with or without certain ac-
companying bound classes. The second-order compound word was de-
fined as containing two or more words plus the ' — ˌ — contour. The
third-order phrase constructions could be very roughly defined in terms 
of words and compound words: e.g. the noun phrase would usually con- 

" Similarly, the iFB'FB compound-word construction of 18.21 covers 
the domain next larger than unit length; we may call this the second- 
order domain. The successively higher orders are often called successively 
higher morphological levels. 

¹² Compound nouns may be N¹ or N² in terms of chapter 16 (bookworm
or bookworms). Thus constructions of the second order do not necessarily
have inclusion number 2 (N² or V²) in 16.

¹³ This applies to sequences like TDAN-s = N⁴ (some very old book-
worms), but not to I = N⁴ (you) which is a simple word (first-order con-


tain T D A N , where each class and each partial sequence of classes can 
be repeated with a member of & before the second occurrence;¹⁴ any
D, A, or N could be a compound word. 

This procedure may be repeated until we find no larger construction 
or domain, in any utterance no matter how long, which we can describe 
as a regular combination of the last previously established domain. 
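The ordering of domains described above can be sketched as a small computation. This is an illustrative assumption rather than the author's own formalization: each construction type is listed with the construction types it may enclose, a construction enclosing nothing smaller gets order 1, and any other construction gets one more than the highest order it can enclose. The inventory `ENGLISH` and the function name are hypothetical.

```python
def construction_orders(encloses):
    """Map each construction to its order: 1 if it encloses no smaller
    construction, otherwise one more than the highest order it can enclose."""
    orders = {}

    def order(name, seen=()):
        if name in orders:
            return orders[name]
        if name in seen:
            raise ValueError(f"circular enclosure involving {name!r}")
        inner = encloses.get(name, ())
        value = 1 + max((order(i, seen + (name,)) for i in inner), default=0)
        orders[name] = value
        return value

    for name in encloses:
        order(name)
    return orders


# Hypothetical inventory, loosely modelled on the English discussion:
# what each construction type *may* enclose, not what it must enclose.
ENGLISH = {
    "word": [],
    "compound word": ["word"],
    "noun phrase": ["word", "compound word"],
}
```

Recording what a construction may enclose, rather than what a particular instance does enclose, mirrors the remark that a noun phrase is treated as third order even when none of its members happens to be a compound word.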

18.3. Result: Constructions Included in the Next Larger Constructions

We now have a new set of elements, constructions and their members. 
These do not replace all the morpheme classes, nor do they express all 
the relations among the classes. Nevertheless, they furnish our most 
compact way of describing many of the facts about the occurrence of 
the morpheme classes. These constructions are not in general the most 
convenient elements for the utterance analysis of chapter 16; but they 
satisfy other types of utterance description: e.g. we can say that the 
minimum utterance of a language is a word construction, and that al-
most every utterance of the language is a succession of one or more whole
word constructions.

Constructions are particularly useful for various purposes because they 
can be defined, for each language, in a series such that all constructions 
of one order enclose constructions of lower order. We can thus identify 
any morpheme class, group of classes, or construction, in terms of the 
next higher construction in which it participates and the position it oc- 
cupies in it.'^ 

Furthermore, since each construction is a regular combination of next 
lower constructions, we can conversely take any utterance and describe 
it as containing such and such longest constructions, each containing 
such and such next lower constructions, and so on down to the mor- 
pheme classes of the utterance.'* This is similar to the process of deter- 
mining immediate constituents (16.54), except that the present procedure 
is more general. The constructions and domains defined in chapter 18 
make it possible to state a single procedure for analyzing the correspond- 
ingly ordered immediate constituents of various utterances, even though 

¹⁴ Yielding TDA&AN, TDAN&DAN, etc.

'^ Much along the lines of identification in terms of genus and spe- 
cific difference. 

" Cf. the description for classical Hebrew in Z. S. Harris, Linguistic 
structure of Hebrew, Chap. 6. Jour. Am. Or. Soc. 61.164 (1941). 


the morpheme classes of the utterances and the utterance statuses of the 
constructions may be quite different.^'' 

18.4. Reconsideration of Previous Results 

Various operations in the course of chapters 16-8 may effect changes 
in the elements set up by means of the earlier procedures. Any such 
changes constitute a further approximation beyond the results originally 
obtained. Thus the setting up of zero segments (Appendix to 18.2) in- 
volves the addition of new elements (morphemic segments, morphemes, 
etc.) to our stock, or the extension of the distribution of our old elements 
(by adding a new zero member, in a new environment, to a morpheme 
or morpheme class, etc.). The definition of our stock of elements and of 
the environments in which each occurs must now be corrected on this 
basis. Similarly, the voiding of elements (Appendix to 18.2) eliminates 
from our stock some elements which had previously been set up to repre- 
sent particular segments, and changes the distribution of classes, mem- 
bers or residues of which have been voided. 

Other cases may also arise. For example, the new division of mor- 
phemes resulting from 16.32 (e.g. Moroccan dial into dia and I) corrects 
our morphemic segmentation and stock of morphemes, and also the mem- 
bership of some of the morpheme classes. 

Aside from these required corrections, the detailed consideration of 
relations among elements which led to the equations of chapter 16, the 
components of chapter 17, and the constructions of chapter 18, may en- 
able us to reconsider our previous work and to carry out the earlier pro- 
cedures to a closer approximation. This may be done so as to yield some- 
what different elements, in terms of which the equations, components, 
and constructions would be more simply stated.'* Thus if a word-con- 

¹⁷ Cf. the discussion on constructions in R. S. Wells, Immediate Con-
stituents, Lang. 23.81-117 (1947).

"* Neither this nor the considerations of fn. 33 below imply that the 
previous work is inadequate for 16-7 or has to be reorganized. The mor-
pheme classes of 15 satisfy 16 by definition, since the operations of both
sections depend upon substitutability within the utterance. These 
classes also satisfy 17, since the requirements of 17 (substitutability 
within any stated part of an utterance) are included in those of 15-6. 
The classes of 15 can therefore enter as wholes into the groupings of 17, 
though the sequence resultants of 16 of course may not. In none of those 
cases is it necessary to take into consideration the individual members 
of these classes, except insofar as there may be distributionally different
sub-classes which were disregarded tentatively in 15. The work of 16-7 
may nevertheless lead to changes in the membership of the classes of 15, 
and may require a comparison of the distribution of one member of a 


struction is set up for a particular corpus, and if morphemes are found to
have different members at the boundary of this construction and else- 
where, in a way that had not been previously expressed, we can now set 
this difference up as marking a construction juncture. 

Appendix to 18.2: Zero Segments and Voided Elements 
1. Zero Segments Represented by Elements 

In setting up the constructions of chapter 18, as elsewhere in the carry- 
ing out of the linguistic description, we may have occasion to accord the 
status of linguistic elements to zero stretches of speech. True, this pro- 
cedure is never unavoidable: in any corpus of material, it is possible to 
identify every linguistic element solely in terms of non-zero stretches of 
speech. For that matter, any corpus could be described in terms of ele- 
ments each of which represents only the addition (to the utterance in 
which it occurs) of some stretch of speech. In some cases, however, it is 
convenient to identify an element as representing an interchange of seg- 
ments (i.e. the omission of one stretch of speech and the addition of an- 
other in its place, e.g. in 12.331), rather than a simple addition. In other 
cases, we may recognize just the omission of a stretch of speech as indi- 
cating a linguistic element (e.g. 12.333). 

Even in the simpler situations, when we say that two stretches of 
speech contain an identical additive element, we may not wish to state 
what part of each stretch represents that element (e.g. the components 
of chapters 10 and 17) ; in such cases the linguistic element represents the 
difference between two segments (as F represents the difference between
Hebrew hu 'he' and hi 'she'). Finally, we may wish to set up a linguistic 
element to indicate the non-addition and non-omission of anything in a 
particular environment, i.e. to indicate a segment consisting of zero.¹⁹
All such elements are possible, in terms of our present methods, because 
they are all definable in terms of segments : they are all relations among 

class with that of the other members (e.g. in the Appendix to 18.2 and in 
fn. 33 there). We reconsider our work here because our later results 
show us what choices it would have been most convenient to make at 
various earlier stages. 

¹⁹ Cf. e.g. R. Jakobson, Signe zéro, in Mélanges de linguistique offerts
à Charles Bally 143-52 (1939); also Das Nullzeichen, in Bulletin du cercle
linguistique de Copenhague 5 (1938-9) 12-4 (1940).

²⁰ All such elements may be looked upon as extracted from simple
representations of segment sequences by element sequences (20.22, cf. 
chapter 20, fn. 12, 13). 


The basic condition for setting up a linguistic element representing a 
zero segment is: Given a class X containing stated members in stated 
environments A, the class may also be defined as always occurring in 
certain other environments B where its other members do not occur. 
Then zero segment, as a member of that class, occurs in each of these 
other environments B.²¹

Giving the status of a linguistic element to zero segments can be car- 
ried out in a great many situations. It can be used in such a way as to 
blur the differences between two sets of morpheme-class relations. Note 
must therefore be taken of the descriptive effect of each zero segment 
that is recognized in the course of an analysis. In keeping with the pres-
ent methods, it would be required that the setting up of zero segments
should not destroy the one-one correspondence between morphological 
description and speech. Hence a zero segment in a given environment can 
only be a member of one class. Defining a zero segment may be useful in 
a case such as the following: Suppose that the sequences AXaYa, AXbYb,
and AXc occur (where Xa and Xb are either the same or else different
morphemes of class X), and that Xa and Xb have no descriptively rele-
vant difference as against Xc except for the relation to Y stated here.
Then we recognize a zero segment after AXc as a member (Yz) of Y, and
thus obtain the element sequence AXcYz. We can now say that each oc-
currence of the environment AX — contains some member or other of Y.
Techniques of this type are especially useful when we wish to set up 
AXY as a construction, in terms of chapter 18, and do not wish to ex- 
clude therefrom the AXc sequence. 
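The regularizing effect of the zero member can be illustrated with a short sketch. It is an assumption-laden toy, not a general procedure: sequences are written as lists of (class, member) pairs, the label `Yz` stands for the zero member of Y, and the function name is hypothetical.

```python
ZERO = "Yz"  # hypothetical label for the zero member of class Y


def add_zero_member(sequence, after_class="X", zero_class="Y"):
    """Return the element sequence with a zero member of zero_class
    supplied wherever after_class is not followed by that class."""
    out = []
    for i, (cls, member) in enumerate(sequence):
        out.append((cls, member))
        nxt = sequence[i + 1][0] if i + 1 < len(sequence) else None
        if cls == after_class and nxt != zero_class:
            out.append((zero_class, ZERO))
    return out


# The three sequences of the text, as (class, member) pairs.
observed = [
    [("A", "A"), ("X", "Xa"), ("Y", "Ya")],
    [("A", "A"), ("X", "Xb"), ("Y", "Yb")],
    [("A", "A"), ("X", "Xc")],
]
regularized = [add_zero_member(s) for s in observed]

# After regularization every sequence has the same class composition.
class_patterns = {tuple(cls for cls, _ in s) for s in regularized}
```

Once the zero member is supplied, all three sequences share the single class pattern A X Y, which is what allows AXY to be stated as one construction.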

Examples of such zero segments in phonology were the phonemic junc- 
tures (chapter 8), which set up a new phonemic element indicating a zero 
segment in a unique environment of other segments. The juncture ele- 
ments do not, of course, represent zero; they represent particular fea- 
tures of neighboring segments. But the position they occupy is that of a 
segment of zero length in the utterance. 

Examples of zero morphemic segments grouped into a morpheme: the
zero after cut in I have cut can be included as a member of the {-en} mor-
pheme of I have taken, I have walked (Appendix to 13.42); the zero after
sheep in The sheep are being shorn can be included as member of the {-s}
morpheme of The boys are returning, The children are here. In both cases,

²¹ I.e., if AX occurs, and B occurs (without X), while BX does not oc-
cur, we may define B as representing the elements BX. The segment
represented by X in the environment B — is zero.


the morpheme can be considered to occur (including in its zero member)
in every I have V — or The N — are . . . On the other hand, matching
I cut it with I picked it and I saw it does not enable us to define after cut
a zero segment member of {-ed}, because I cut it can also be matched
environmentally with I pick it, I see it. We cannot set up a zero member
of {-ed} in the first I cut it and not in the second, because the one-one
correspondence would be lost thereby: upon hearing I cut it without
further environment, we would not know if the morpheme {-ed} occurs.
Similarly, we cannot set up a zero {-s} in I see the sheep which can be
matched both with I see the dogs and I see the dog. In view of this situa-
tion, we might not wish to recognize even the zero {-en} after cut, since
cut would anyhow have to remain an exception in respect to the {-ed}
morpheme. Similarly, we might not wish to recognize a zero {-s} for
sheep in The — are . . . since sheep remains an exception in respect to
the same {-s} in I see the —. However, there is some reason for setting
up the zero segment in the cases where it is possible to do so. The recog-
nition of the zero shows that in these positions one can distinguish the
grammatical category (and meaning) marked by the morpheme in ques-
tion from that marked by its absence: I have cut is distinct from I cut just
as I have taken is from I take.
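The one-one requirement just discussed can be stated as a simple test. The sketch below is a simplification under explicit assumptions: the corpus is reduced to hypothetical (frame, stem, suffix) triples, with `None` for a bare stem, and a zero member is admitted for a stem in a frame only when every other stem attested in that frame always carries the morpheme there. The data and the function name are illustrative only.

```python
def zero_member_allowed(frame, morpheme, stem, attested):
    """A zero member of `morpheme` may be set up after `stem` in `frame`
    only if every other stem attested in that frame always carries the
    morpheme there; otherwise the bare form would be ambiguous."""
    others = [(s, m) for f, s, m in attested if f == frame and s != stem]
    return bool(others) and all(m == morpheme for _, m in others)


# Hypothetical observations: (frame, stem, suffix or None for a bare stem).
ATTESTED = [
    ("I have V —", "take", "-en"),
    ("I have V —", "walk", "-en"),
    ("I have V —", "cut", None),
    ("I V — it", "pick", "-ed"),
    ("I V — it", "pick", None),
    ("I V — it", "see", "-ed"),
    ("I V — it", "see", None),
    ("I V — it", "cut", None),
]
```

With these observations the zero {-en} after cut is admissible in the first frame, where no other verb occurs bare, but a zero {-ed} is not admissible in the second frame, where bare pick and see also occur.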

Examples of zero morphemes in a morpheme class: If we consider the
N and V general classes in English, we find that they have a great many
morphemes in common. We find further that some of these morphemes
occur with greater frequency in N positions than in V (e.g. book) and
others occur with greater frequency in V positions than in N (e.g. take).
We also note that there is a class Nv of morphemes such that N Nv = V
(e.g. lionize), and a class Vn such that V Vn = N (e.g. punishment). It is
therefore possible, though not necessarily convenient for most purposes,
to define a zero morpheme of the Nv class which occurs after the primarily
N morphemes when they are in V position, and a zero morpheme of the
Vn class which occurs after the primarily V morphemes when they are
in N position. Then book could be N in a fine book, and N + Nv = V
in Better book this fellow; take could be V in Better take this fellow and
V + Vn = N in a fine take.²²

A somewhat different case of zero morphemes may be seen in such ut-
terances as the clock he fixed N⁴N⁴Vd⁴ and I know he is N⁴Vd⁴N⁴V⁴.
These vary freely (except for stylistic differences) with the clock that he

^^ Cf. also the zero stress stem in Leonard Bloomfield, Menomini mor- 
phophonemics, Travaux du Cercle Linguistique de Prague 8.108 (1939). 


fixed N⁴ + Na⁴ + N⁴Vd⁴ and with I know that he is N⁴Vd⁴ + Na⁴ + N⁴V⁴
respectively (Appendix to 16.31). In terms of chapter 16 we would merely
state that N⁴N⁴Vd⁴ = N⁴ that N⁴Vd⁴ = N⁴. Here, however, we can say
that the two equated sequences may be considered identical as to their
class composition, if we define a zero member of the class Na⁴ which
includes that, who, etc. Then the clock he fixed is also N⁴Na⁴N⁴Vd⁴, and
I know he is is N⁴Vd⁴Na⁴N⁴V⁴; and Na⁴ is defined as having members
that, who, zero, etc.²³ any of which may occur in this position.²⁴

2. Voided Elements: Non-zero Segments Represented by Absence 
of Element 

In contrast with zero segments we may consider a technique which is 
in some ways its converse: the voiding of elements (i.e. replacing an ele- 
ment by zero). The setting up of an element for a zero segment had regu- 
larized the class involved, i.e. had made its distribution similar to that 
of some other class.²⁵ To effect this, a portion of speech containing no seg-
ment or no change of segments, is represented by a linguistic element.²⁶
Conversely, there may be situations in which we wish to say that a por- 
tion of speech which contains an observed segment is represented by ab- 
sence of a linguistic element. That is, we may take a non-empty stretch 
of speech and say that it has no independent descriptive status and is in 
itself represented by no linguistic element, i.e. by a voided, or zero, ele- 
ment. All the cases in which this can be done are cases of partial inde- 
pendence, and the effect of this technique is to change these to cases of 
complete independence. 

²³ Many details are omitted here. For example, the classes in the clock
— he fixed and I know — he is are not identical (as the Na⁴ used in both
would suggest): zero or that occur in both, but what occurs only in the

²⁴ In contrast, the use of zero would not be desirable in such a case as
I'll make him a party N⁴Vd⁴Na⁴Nb⁴ matched with I'll make a party for
him N⁴Vd⁴Nb⁴PcNa⁴. It would seem that we could set up a zero member
of Pc between the two N of the first utterance. But it would then be
necessary to say that Na⁴PcNb⁴ when Pc has zero member is equal to
Nb⁴PcNa⁴ when Pc is not zero. We could not simply write N⁴PcN⁴ since
unless we know whether the Pc is zero or not we do not know which of the
two N⁴ is the direct object of the verb.

²⁵ E.g. if Y occurs in all the positions of X (or in corresponding ones)
except for certain of these positions, we define a zero member of Y in
those positions and thus remove the exception. The distribution of Y now
corresponds fully to the otherwise recognized distribution of X.

^* The setting up of zero segments may be useful also in larger con- 
structions, e.g. in some of the cases of what is known as ellipsis. 


The basic condition for representing an added segment by no added
linguistic element at all is: Given a class X which never occurs (in stated
environments) without some member (A1, A2, etc.) of another class A ac-
companying it, we can say that X occurs sometimes without accompany-
ing A (i.e. is independent of it), by choosing some member Az of A and
voiding its linguistic status. That is, we say that the segment sequences
XA1, XA2, etc. are cases of the element sequence XA, but that the seg-
ment sequence XAz is represented by the single linguistic element X.
The segment Az in the environment X — is void so far as descriptive
elements are concerned.²⁷

This technique is useful if we wish to free the distribution of X. It is 
basically an extension of the methods of chapter 12. For if every time X 
occurs it is accompanied by some member of A , then X is not independ- 
ent of A and the two should not constitute two separate linguistic ele- 
ments. If X and A were individual morphemic segments, this would be 
a case of partial or complete dependence (according as whether A for its 
part ever occurred without X), and the two might be set up as constitut- 
ing together one linguistic element (Appendix to 12.22). However, when 
X and A represent classes (of segments, morphemes, etc.) it is impossible 
to consider their sequence as one element, because though X is dependent 
upon A (i.e. always occurs with it), it is independent of any particular 
member of A (since X sometimes occurs with A1, sometimes with A2,
etc.). Our methods seem inadequate for the expression of this relation
between the classes X and A.²⁸ This crux is eliminated by the present
technique of voiding one of the members of A. In effect, this procedure
replaces the class-dependence of X on A by a complete dependence of X
upon a particular member Az of A. This complete dependence is expressed

^^ The absence of the class symbol A after X (when we find X by itself) 
represents the occurrence of the segment Az. In selecting which member
of A should be the voided Az, we consider how we may obtain greatest
simplicity of description, or which member has special restrictions or 
least statable specificity of environment: cf. the component 3 below, 
and chapter 16, fn. 32, and chapter 17, fn. 14. 

²⁸ If we say that X and A are interdependent, and constitute one ele-
ment, we would leave unexpressed the difference between XA1 and XA2;
and if we say that X and A are two independent elements, we would
leave unexpressed the fact that X never occurs without A. Since the ele-
ments of descriptive linguistics are the distributional independencies we
have here a case which is satisfied neither by setting X and A as separate
elements, nor by setting them up as a single element. It will be seen that
the desired elements are the new class A' and the new member Xz of X,


in the manner of chapter 12 by saying that the segments XAz together
constitute one element X (or rather, its new member Xz). A new class A'
may now be defined, containing all the members of the old class A, ex-
cept for Az. The element X has various members before various members
of A'; when no A' follows, the new member of X is Xz. X is now no longer
dependent upon the class A': it often occurs with A', but it also occurs
without A'.²⁹
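The voiding technique can be pictured as a pair of mappings between segment sequences and element sequences. The following is a minimal sketch under stated assumptions: segments and elements are plain strings, `Az` is the member chosen for voiding, the new class A' is passed in as a set, and segment X is assumed never to occur without a following member of A. All names are hypothetical.

```python
VOIDED = "Az"  # the member of A chosen for voiding (hypothetical label)


def to_elements(segments):
    """Segments -> elements: X followed by the voided member is written X alone."""
    out, i = [], 0
    while i < len(segments):
        out.append(segments[i])
        if (segments[i] == "X" and i + 1 < len(segments)
                and segments[i + 1] == VOIDED):
            i += 1  # skip the voided segment: it gets no element of its own
        i += 1
    return out


def to_segments(elements, new_class_a):
    """Elements -> segments: an X with no following member of the new
    class A' stands for the segment sequence X + Az."""
    out = []
    for i, el in enumerate(elements):
        out.append(el)
        nxt = elements[i + 1] if i + 1 < len(elements) else None
        if el == "X" and nxt not in new_class_a:
            out.append(VOIDED)
    return out


A_PRIME = {"A1", "A2"}  # the old class A without the voided member
```

Because every element sequence maps back to exactly one segment sequence, the one-one correspondence between description and speech is kept, provided the stated assumption about X holds in the environments concerned.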

In some cases the environment in which an element is void (considered 
zero) is simply statable, and the segment which is no longer independent- 
ly represented is indicated by the absence of a stated number or class of 
other elements. Thus in the Appendix to 17.33, instead of saying that V 
always occurs with either i or t, we identify Vt as a single element V,
and say that V is free: sometimes it occurs with i, and sometimes it does
not.³⁰ The segments which had been previously represented by t are now
represented by the absence of i after V. 

In other cases the environment in which an element is void may be 
variegated or unstatable, but is always identical with the environment of 
a particular morpheme class. For example, the Hebrew component 3 
'third person' (Appendix to 17.33) occurs in N —, P —, V — (e.g. -o 'his,
him': rošo 'his head', lo 'to him', etc.); in all these positions it is replace-
able by N (roš haxevra 'head of the company', laxevra 'to the company',
yisádti xevra 'I established a company'). We cannot, however, state a diag-
nostic environment for 3, in such a way that every time the environment
occurs we would know that 3 occurs too, even if we don't explicitly indi-
cate the occurrence of 3. For N and V also occur without following N or
3: roš 'head', yisádti 'I established'.³¹ Hence, 3 cannot be indicated merely

²⁹ In voiding elements, one-one correspondence of our representation
is preserved by defining the segments XAz as the element X only in those
environments in which segment X without following segment A does not
occur. When, in these stated environments, we see the element X, we
know it indicates only the segments XAz. In spite of this limitation, the
indiscriminate use of zero segments and void elements can make many 
different language structures seem sterilely similar. Caution in their use 
is therefore necessary. 

³⁰ In the latter case, the element V indicates the segments Vt. Similar-
ly, N or V when not followed by F indicate the segments Nm and Vm
respectively, and when not followed by P indicate the segments Ns and
Vs respectively.

³¹ P does not occur without following N or 3. We could therefore con-
sider 3 as void N (i.e. as represented by mere absence of N), and write
P for lo 'to him', PN1 for li 'to me', PN1P for lánu 'to us', PNa for
laxevra 'to the company', PNb for laiš 'to the man', and so on.


by the absence of N in N — or V —. But it can be indicated by absence
of any particular member of N:³² NrN1 would be roši 'my head', NrN2
rošxa 'your head', NrNa roš haxevra 'head of the company', but NrN
(with no subscript to indicate any particular member of N) would be
rošo 'his head', and Nr with no N (either particular member or general
class mark) following would be roš 'head'.

The case of t and the case of 3 may be considered to be similar, if we 
say that in each case there is a diagnostic environment plus a class of 
elements which occur in it: the occurrence of each member of that class 
indicates some corresponding segment, and the occurrence of no member 
of that class indicates the one remaining (voided) segment. In the case 
of t, the environment was V —, the class occurring in it was the com-
ponent i, and the segment indicated by absence of that class was t. In
the case of 3, the environment was NN — or VN — (or PN —), the class
occurring in it included the differentiators of the particular members of N
(subscripts 1, 2, a, b, etc.), and the segment indicated by absence of that
class of differentiating subscripts was 3.³³

3. Relation between Zero Segments and Voided Elements 

There is no necessary relation between zero segments and voided ele- 
ments. A zero segment may be represented by a non-void element: e.g. 
the zero member of {-en} in have cut —. Or it may be represented by a

³² Or any member of N other than 3, if 3 is taken as being a member of
N (which it would be on grounds of substitutability).

" A certain extension in the use of our symbols is involved here. Pre- 
viously, (in chapter 13) we had considered the individual segments such 
as roš 'head', ani 'I' to be our elements. Later (chapter 15), we took all
those elements which substituted for each other and considered them
members of one larger element: roš and ani became merely members of N,
and the difference between them no longer mattered; all that mattered
was their distributional similarity, i.e. their occurrence in N position.
Now we recognize both the element N and the elements which are dis-
tinguished among its members (roš, ani, etc.): N is an undifferentiated
class element, and the differences among roš, ani, etc. are residual dif-
ferentiating elements r, 1, etc. These residual differentiating elements
occur only with N, and are therefore complementary to the residual dif-
ferentiating subscripts of other class elements such as V; hence, we can
pair them, if convenient, and consider the a subscript of N to be the same
residual element as the a subscript of V. The inefficiency of considering
both the class N and also all its members (roš, ani, etc.) as independent
elements is avoided by taking one of the old members of N (in this
case 3) and considering it to be identical with, and indicated by, the un-
differentiated class-mark N, so that when no subscript follows, N indi-
cates that which we would have otherwise written N3.


void element. The void element may be the absence of a class: e.g. zero
'masculine' and 'singular' after Hebrew N are indicated by absence of
F or p (baxur 'fellow', baxura 'girl', baxurim 'fellows'); zero 'third
person' after V is indicated by absence of 1 or 2 (katav 'he wrote', katavti
'I wrote', katávta 'you wrote'). Or the void element may be the absence
of any member differentiator of a given class: e.g. in Moroccan Arabic,
absence of vowels in R may be regarded as a zero segment, meaning
action, which constitutes the void member of the class v of vowel
morphemes (with action-type meanings: ktb 'he wrote' Rv, katb 'he
corresponded' Rva).

Analogously, non-zero segments or classes of segments may be represented
by non-void elements, as is usually the case: ani 'I' represented
by the component 1 (or the residual differentiator N1 of the class N).
Or they may be represented by a void element. Here again, the void
element may be the absence of a class: e.g. hu 'he' before V is
represented by absence of N (N1Va is ani katavti or katavti 'I wrote', NaVa is
haiš katav 'the man wrote', Va is hu katav or katav 'he wrote'). Or the
void element may be the absence of any member differentiator of a given
class: e.g. hu more 'he (is) a teacher' is N Na as against ani more 'I (am)
a teacher' N1Na.

Nevertheless, there are frequently special relations between zero
segments and void elements. Thus the voiding of elements is especially
convenient when the segment represented by the void element is at least
occasionally zero. For example, void N in the environment — V represents
the component 3 'third person' which appears in several morphemes,
one of them zero: katav 'he wrote' V, hu katav 'he wrote' V,
yixtov 'he will write' Vi. Elimination of 3 is particularly desirable because
when the N preceding V is any member other than 1 or 2, the affix of V
is always 3: ani katavti 'I wrote', hu katav 'he wrote', haiš katav 'the man
wrote', haiš yixtov 'the man will write'. If 3 were recognized as a member
of N, on a par with 1 and 2, we would have to say that when Na occurs
before V, 3 occurs affixed to the V: katavti and ani katavti are both N1Va,
katav and hu katav both N3Va, but haiš katav would be Na + 3Va,
haiš yixtov Na + 3Vai. There would thus be a special limitation upon
1 and 2 as compared with 3 in the environment Na — Va. However, if 3
is represented by no element at all in the environment — V, then haiš
yixtov is merely Na + Vai, just as ani extov 'I will write' is N1 + Vai and
hu yixtov 'he will write' is Vai. Instead of 1, 2, and 3 constituting a
different class than the other N, and having one of their number, 3, occur with
these other N (as in Na + 3Va), we now have 1, 2, and all other N members
in one class. The non-occurrence of other N with 1 or 2 is due not to
a restriction of distribution but to the fact that in general the members
of N (including 1 and 2) replace each other rather than occur together.
The only difference between 1 or 2 and the other N is one of phonemic
constitution rather than of distribution: the form of the verb-prefix i,
which is yi after other members of N (and after absence of N), is
different after 1 and 2.¹⁴
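
The reading-off of a voided element can be illustrated with a minimal sketch. It assumes a flattened notation (N1, Na, Va) for the subscripted class marks and a toy lexicon limited to the forms cited above; the names SUBJECTS, VERB_FORMS, and read_off are hypothetical, and the differentiator 2 is left out for brevity.

```python
# Hypothetical toy lexicon built only from the Hebrew forms cited in the text.
SUBJECTS = {"N1": "ani", "Na": "haiš"}        # 'I', 'the man'
VOIDED_SUBJECT = "hu"                          # 'he': indicated by absence of N before V
VERB_FORMS = {"1": "katavti", "3": "katav"}    # 'I wrote', 'he wrote'


def read_off(elements):
    """Return the segments represented by a sequence such as ["N1", "Va"]."""
    if elements == ["Va"]:
        # No N precedes V: the void element stands for the third-person segment.
        return f"{VOIDED_SUBJECT} {VERB_FORMS['3']}"
    if len(elements) == 2 and elements[0] in SUBJECTS and elements[1] == "Va":
        # In this toy lexicon only the differentiator 1 selects its own affix;
        # every other member of N goes with the third-person verb form.
        person = "1" if elements[0] == "N1" else "3"
        return f"{SUBJECTS[elements[0]]} {VERB_FORMS[person]}"
    raise ValueError(f"sequence not covered by this sketch: {elements}")


print(read_off(["Va"]))        # hu katav
print(read_off(["N1", "Va"]))  # ani katavti
print(read_off(["Na", "Va"]))  # haiš katav
```

The point of the sketch is only that no separate element for 3 is needed: the absence of any N before V is itself the indication.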

The similarity and difference among zero segments, void elements, and
other elements of our representation of a corpus may be summed up as
follows: in the case of other elements, segment X is represented by element
X; in setting up a zero, segment X is represented (in given environments)
by elements XY; and in a voided element, segments XY are represented
by element X.¹⁵

Appendix to 18.4: Correlation with Previous Results 
1. With Phonemic Features 

The various new elements and classifications of chapters 16-8, the 
sequence resultants and successive inclusion numbers, the components, 
and the constructions, are often found to correlate with phonemic se- 
quences and the like. This may happen if all the members or every do- 
main of some sequence, component, or construction (or of some position 
within these) have some feature in common. For example, if we have a 
number of morpheme classes which regularly have zero stress,¹⁶ and if
these classes occurred in a particular construction position, we could say 
that zero stress is a feature of this construction position: this position 
always has the feature in question. Furthermore, we no longer need say 
that these morpheme classes have zero stress; it is enough to say that 

¹⁴ A partially similar special relation of one member of a class to the
other members is seen in the substitutability of one for almost any
member of English N: a long experiment, and a short one. Cf. Leonard
Bloomfield, Language 251.

¹⁵ The method of the Appendix to 18.2 is thus essentially comparable
to that of chapter 12; and the operation of 12 may be considered a first
approximation to it. In both cases the operation is one of according
element status to segments. For example, in the Appendix to 12.233 it would
have been possible to segment ran into r — n + /æ/, run into r — n + /ə/,
walked into walk + ed, walk into walk + zero. It would then be possible
to void the /ə/ of run and the zero of walk.

¹⁶ Which would have been noted in the Appendix to 15.5.


that construction position has zero stress, and make no stress correlation 
for the individual morpheme classes. 

This correlation becomes of greater interest if the phonemic feature in 
question does not occur outside of that construction position. We can 
then recognize the occurrence of the construction, with its morpheme 
classes, from the presence of the phonemic feature.¹⁷

The very fact that a sequence, as it appears within an utterance, also 
occurs by itself, or does not, may be a characteristic feature. 

2. With Boundaries 

Frequently, the phonemic features characteristic of a construction 
occur at the boundaries of its domain.¹⁸ Certain phoneme sequences may
occur only or never if part of the sequence is the end of one construction 
while the rest is the beginning of the next. Thus in Hidatsa of North 
Dakota, kk occurs across the boundary of two constructions, but never 
otherwise: ha^ruk kaWak 'then running' (two constructions in that there 
are two loud stresses; the morphemes are he 'say, do', ru subject changer, 
ak indicator of immediately preceding action), but a'ah^kic 'he brought 
him to it' (one construction with one loud stress; the morphemes are 
e-'e 'have', ak indicator of immediately preceding action, with k replaced 
regularly by h before close juncture + k, kv 'get back', c indicator of ac- 
tion). From the presence of kk we know not only that there are two con- 
structions present, but also where their boundary lies. Such distributions 
limited to boundaries are frequent in many languages, and are among the 
justifications for the operations of chapter 18; for if a number of
sequences which differed in chapter 16 (e.g. sixths is N and glimpsed is V)
are structurally similar in chapter 18 (both stem plus suffix), and if
some phonemic feature (clusters like /ksθs/, /mpst/) is limited precisely
to these structurally similar sequences (not to all N or V, but only to
any construction containing suffixes after the stem), then it is most
convenient to describe all these structurally similar sequences as
constituting one construction, and the phonemic feature as a
characteristic of that construction.¹⁹

¹⁷ E.g. the number of times the construction occurs can be gauged from
the number of times the phonemic feature occurs.

¹⁸ And are often included in what are called sandhi features.

¹⁹ The term construction type will sometimes be used for construction
to make it clear that the reference is not to a particular combination of
morpheme classes but to a combination of groups of classes as defined in
chapter 18.


In many languages such special phonemic features occur at the bound- 
aries of various sequences, components, or constructions, including con- 
structions one of which occurs within the other. The special phonemic 
features at these boundaries in some cases are the same as those at other 
boundaries, and in other cases are different. In some cases, a special 
phonemic feature occurs both at the boundary between morphemes and 
at some or all boundaries between the larger stretches. 
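
How a boundary-limited phoneme sequence locates construction boundaries, as in the Hidatsa kk case, can be sketched as follows. This is only an illustration under the stated premise that the cluster never occurs inside a construction; the function name split_at_boundary_cluster is hypothetical, and the strings are invented toy forms, not Hidatsa data.

```python
def split_at_boundary_cluster(phonemes, cluster="kk"):
    """Split a phoneme string at every occurrence of a boundary-only cluster.

    The cut is placed after the first phoneme of the cluster, so that one
    half of the cluster ends a construction and the other half begins the next.
    """
    parts = []
    start = 0
    hit = phonemes.find(cluster)
    while hit != -1:
        cut = hit + 1                      # boundary falls inside the cluster
        parts.append(phonemes[start:cut])
        start = cut
        hit = phonemes.find(cluster, cut)  # look for the next boundary cluster
    parts.append(phonemes[start:])
    return parts


# Invented toy strings, used only to exercise the function.
print(split_at_boundary_cluster("patakkima"))  # ['patak', 'kima']
print(split_at_boundary_cluster("patakima"))   # ['patakima']
```

The number of pieces returned also gives the number of constructions present, which is the kind of inference the text draws from the presence of kk.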

For each special phonemic distribution, it is possible to reconsider the
grouping of phonemic segments into phonemes (chapters 7-9) and see
if it is possible to find a grouping which will obviate the phonemic
difference at these boundaries without sacrificing any of the advantages of
the existing grouping. If phonemic segment or sequence X occurs near a
particular morphological boundary, and phonemic segment or sequence
Y never does, we may make X and Y phonemically identical, while
recognizing the boundary as a phonemic juncture.²⁰ This is exactly analogous
to chapter 8, and means merely that here, as in chapter 14, we may find
new candidates for the status of phonemic juncture.²¹

This regrouping of phonemic segments is of course the more important 
if the boundary and the non-boundary ones, which at present are members
of different phonemes, occur in different members of one morpheme.
For then, if we put the two segments into one phoneme, we have elimi- 
nated the morphophonemic difference between the two (phonemically) 
variant members of the morpheme and left one member (phonemically) 
in their place. 

In addition to the phonemic features, constructions in some cases have 
phonetic boundary features which though positionally limited are not 
phonemic because they appear only occasionally, i.e. their occurrence is a 
free variant of their non-occurrence. Thus pauses (for breath, for hesita- 
tion, for emphasis, for interruption, etc.) rarely occur within a mor- 
pheme, but in many languages will occur with increasing frequency at 
successively larger morphological boundaries. These are the intermittently
present distinctions of the Appendix to 4.3, and one of the conveniences
of setting up the domains of sequence resultants, components, and
constructions is that it is these domains which usually correlate with
the point of occurrence of these intermittently present elements.

²⁰ If X also occurs otherwise than at the boundary, it could be assigned
in those other positions to some other phoneme to which it would be
complementary in those positions.

²¹ If some occurrences of a construction boundary can be made into a
phonemic juncture (say, those where a particular phoneme precedes the
boundary), while other occurrences of it cannot, we say that next to the
phoneme in question the new juncture indicates the construction boundary,
but that next to other phonemes this construction boundary has no
phonemic mark.

We can group together all sequence resultants, components, and con- 
structions whose domains involve the same boundary junctures, inter- 
mittently present phonologic elements, contours, or other features. The 
juncture, contour, or other feature involved would then be said to apply 
to the domain common to all these, or to the class of morphological elements
(and sequences) which includes all these.²² In many languages
we will find several such sets of features, marking several types of do- 
mains (usually one enclosing the other). 

3. With Contours 

In many languages there are phonemic or phonetic features which ex- 
tend over the length of various morphological domains. E.g. in English 
and in other languages, the constructions which sometimes occur by 
themselves (FB: a free morpheme with zero or more bound forms, oc- 
casionally doubled or trebled with jj stress) have exactly one loud stress; 
and in general no stretch of speech contains precisely one loud stress ex- 
cept one of these. A domain of this type is often called a word. There may 
be several contrasting construction types all having the same contour 
(and boundary) features, and hence constituting the same domain.²³

This, or a somewhat similar domain may also be the domain of other 
and less frequently noticed features. E.g. in some languages, the duration 
of phonemes is longer if the word in which they are contained is shorter. 
The phonemes /taeb/ are longer in The number on this tab has to be registered 
than in The number on this tabulating-machine has to be registered. If we
find, for example, that the domain of vowel harmony in a language is 
somewhat larger than the domain of word-tone contours, or that its 

²² If the various features all occur whenever the domain occurs (i.e.
are automatic in respect to it), they are not phonemic but are included in 
the definition of the juncture or contour marker of that domain. If vari- 
ous grades of them occur in various occurrences of the constructions of 
that interval, then these grade contour differences are phonemic and are 
marked as in chapter 6. 

²³ Constant checking on the morphemic contours may show that certain
of them correlate in a general way with other features of sequences.
E.g. any English sequence which is covered by a ' — „ — morpheme is 
equatable to some one morpheme class. 


effect carries for one morpheme class in the sequence longer than do 
the morphophonemic changes of tone, we merely recognize two domains, 
one being that of tone contours and the other, including or overlapping it, 
being that domain of vowel harmony. Both would roughly be the length 
and character of what is often called the word.²⁴

The contours indicate in general the boundaries of the domains over 
which they extend, although in some cases, e.g. the duration of pho- 
nemes, the boundary points are much harder to discover, and perhaps 
less fixed, than the character of the contour. 

4. With Morpheme Classes 

Constructions and other domains may correlate with particular morpheme
classes in a sufficiently general manner to merit special note.²⁵ For
example, a number of morpheme classes may appear together in various 
combinations, but never in constructions with other morpheme classes. 
The constructions in which the former participate may differ in many 
respects from the other constructions. This is often the case in languages 
which have a large stock of morphemes borrowed from foreign sources 
with some of the grammar of their original language, e.g. English words 
of Latin and Greek derivation. Many of these constructions and classes 
may occur primarily in special styles or social dialects of the language, 
e.g. the use of the above English vocabulary in learned speech and in 

In extreme cases such situations may be best handled by independent 
grammatical systems for the distinct sets of material, below a certain 
level (e.g. in domains of word or smaller); the constructions resulting 

²⁴ A language may thus have two different domains, one enclosing the
other but only slightly larger than it, and both of which are close to 
what would usually be called the word. Rather than make one of these 
domains basic and say that the other is based upon it but with some 
change, we can simply speak of two different related domains. Such is the 
case for the domains of vowel harmony and of word in Turkish. 

²⁵ Thus the special Semitic classes of non-contiguous-consonant
morphemes (R) and non-contiguous-vowel morphemes (n and v) had to be
treated separately in chapter 16: R + n = N, R + v = V. In 18.2 however,
both sequences, i.e. all cases of these discontinuous morphemes, are
grouped together into one construction. Most Semitic words are of the
structure Rp (where p indicates either n or v indifferently).

²⁶ Cf. Leonard Bloomfield, The structure of learned words, in A
commemorative volume issued by the Institute for Research in English
Teaching (Tokyo 1933).


from each system at the highest level at which they are distinguished 
would be used in identical manner (though perhaps with differences in 
stylistic environment) in the larger domains of the utterance.²⁷

5. With Meaning 

The meaning of any domain, whether morpheme or larger, may be de- 
fined as the common feature in the social, cultural, and interpersonal 
situations in which that interval occurs. It is often impossible to state 
such a common feature of meaning; we can then say that the meaning 
of an element in each linguistic environment is the difference between the 
meaning of its linguistic environment and the meaning of the whole ut- 
terance (i.e. the whole social situation). Thus the meaning of blue in 
blueberry might be said to be the meaning of blueberry minus the meaning 
of berry and of the ' — ,i — morpheme: blue here therefore does not mean 
simply a color, but the observable differentia of blueberries as against 
other berries. 

However, in some languages, including English, the easily observable 
variation of meaning is very great. The correlation with meaning can then 
be made directly with the sequence of morpheme plus its environment, 
using as much of the environment as is necessary to differentiate the spe- 
cial meaning: we use blueberry but not blueberries, since the meaning of 
blueberries is simply the meaning of blueberry plus the meaning of -s. 
Thus the dictionary which would ordinarily list only the morphemes and 
their meanings and individual special selections, would also list these 
constructional sequences of morphemes, instead of discussing the par- 
ticipating morphemes separately. 

In some cases it would be possible to show that an aspect of the mean- 
ing is common to all the occurrences of a particular construction, no 
matter what the individual morphemes involved. That much can then 
be taken out as the meaning of the domain arrangement, and need not be 
given as due to any of the morphemes involved. Of course, if there is a 
morpheme characteristic of the construction, like the ' — ^ii — morpheme 
of the FBFB construction, the constructional meaning can be assigned 
to that morpheme. If there is no such common morpheme, the meaning 
is assigned merely to the arrangement of morpheme classes which
constitutes that construction.²⁸

²⁷ Compare the occurrences of foreign phonemic sub-systems within a
person's speech, as in 2.31 and chapter 2 fn. 9. 

²⁸ Cf. tagmemes in Leonard Bloomfield, Language IGO, 27(), and the
Appendix to 12.3 4 above. 


All the classes which occupy a particular position in a sequence or 
construction or in a group of partially similar construction types may 
be said to have a feature of meaning in common: Hebrew H (which in- 
cludes C and A': 18.2) in RpH may be said to have the meaning of 
'word inflection.' 

In some cases the participants in a particular position of a given
construction have such common meanings as 'plural,' 'object (of
transitive verb).'²⁹

²⁹ These would be examples of what may be called semantic categories
of the grammar. Cf. John Lotz, The semantic analysis of the nominal 
bases in Hungarian, Travaux du Cercle Linguistique de Copenhague 
5.185-96 (1949). 


19.1. Purpose: Stating What Utterances Occur in the Corpus 

Up to this point the morphological procedures, and all the procedures 
with the exception of chapter 11, have only stated various relations 
among parts of utterances — methods of segmenting utterances and classi- 
fying the segments. We have no check on what a whole utterance, or all 
utterances, consists of. We can now show how the previous procedures 
serve to identify the utterances. 

19.2. Procedure: Sequences of Resultants or Constructions 

We state which sequences of the resultant position classes of chapter 16 
or the constructions of chapter 18 occur as utterances in the corpus. 

This procedure, like that of chapter 11, consists of making an assertion
of occurrence rather than a relational statement: not that X occurs next
to or is substitutable for Y, but that utterances consisting of XY occur.
In order to make these assertions as condensed and as general as possible,
they are put in the most general terms: i.e. they state the occurrence of
the most general classes or constructions. E.g. for English it is possible
to say that utterances consisting of NVX¹ occur. This would represent
the great bulk of English utterances, each of which will contain N and V
in that order, although of course they need not each contain any
particular member of N or V. To obtain from the NVX formula any of the
utterances which it represents, we substitute for N or V the various
morphemes by which they are defined. This may be done by substituting
first an equivalence in terms of other variables, e.g. TN for N, then AN
for the N, and DA for the A, and so on until we finally substitute a
particular morpheme for each variable, i.e. each class mark, in our final
form of the formula. The expansion of the formula, along the lines of the
restrictions upon concurrence among classes, is carried out by applying
the relations of chapters 16 and 18, equations and constructions, to the
formulae of chapter 19. In this manner we can derive from the formula
any utterance of the class which the formula represents. Correspondingly,
given an utterance we can say by what formula it is identified, by
applying to that utterance the equations and construction results of
chapters 16, 18.
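
The step-by-step expansion just described (TN for N, then AN for the N, then DA for the A, then one morpheme per class mark) may be sketched as below. The three equivalences come from the paragraph above; everything else is an assumption for illustration: the names EQUIVALENCES, LEXICON, substitute, and realize, the one-member lexicon, and the use of "." to stand in for an utterance contour X.

```python
# Equivalences named in the text: TN for N, AN for N, DA for A.
EQUIVALENCES = {"N": [["T", "N"], ["A", "N"]], "A": [["D", "A"]]}

# Hypothetical lexicon with a single member per class; "." stands in for X.
LEXICON = {"T": "the", "D": "very", "A": "old", "N": "man", "V": "left", "X": "."}


def substitute(formula, position, replacement):
    """Replace the class mark at `position` by an equivalent sequence."""
    symbol = formula[position]
    if replacement not in EQUIVALENCES.get(symbol, []):
        raise ValueError(f"{replacement} is not listed as equivalent to {symbol}")
    return formula[:position] + replacement + formula[position + 1:]


def realize(formula):
    """Substitute a particular morpheme for each class mark."""
    return " ".join(LEXICON[symbol] for symbol in formula)


formula = ["N", "V", "X"]
formula = substitute(formula, 0, ["T", "N"])  # T N V X
formula = substitute(formula, 1, ["A", "N"])  # T A N V X
formula = substitute(formula, 1, ["D", "A"])  # T D A N V X
print(formula)           # ['T', 'D', 'A', 'N', 'V', 'X']
print(realize(formula))  # the very old man left .
```

A fuller treatment would list every member of each class and every equation of chapters 16 and 18; the sketch shows only the mechanics of one derivation.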

¹ Where X indicates the class of utterance contours.




19.3. The Selective Substitution Diagram 

The formulae of 19.2 have the form of a horizontal sequence of variables
(class or construction marks), the succession from left to right being
used to indicate succession in time² (in the utterance which the formula
represents), and the choice of what variables fill the successive places
being the indication of the restrictions upon concurrence among the major
resultant classes or constructions. They thus leave unexploited all
geometric dimensions above one. These further dimensions could be used
to indicate other relations among morphemes, classes, and constructions
than those of selective succession. The second dimension, that of the
vertical line, can be used to indicate substitutability of these elements;
i.e. it can be used to show various equivalences which the variable has
in given conditions. Thus instead of saying NVX we could say

meaning that both NV and NVN occur, i.e. that the variable V can also
be replaced by the variable VN which is equivalent to V in terms of
chapter 16. The condition under which we can have VN is indicated by
what goes with it horizontally, since horizontal sequence represents
concurrence. In this case, the condition (or environment) is a preceding N:
i.e. both V and VN have the same condition, N —, and after N we may
get either one of them.

19.31. Different Conditions for Different Substitutions 

The use of this vertical dimension is more important when we indicate 
different conditions for different substitutable equivalences of a variable. 
These different conditions are always indicated by the sequential vari- 
ables which appear on the same horizontal level as the substitution in 
question. E.g. instead of NV we could say, if we wish to detail only what
occurs after the verb-phrase:³






² Except for marks indicating simultaneous morphemes, where the
position of the mark indicates a boundary of the domain of the simul- 
taneous morpheme, or some other point related in a stated way to that 

³ E.g. the object of the sentence. The substitutions recognized in this
example are only a selection of the most general classes or sequences which
occur after the V.


This diagram, like the comparable one in chapter 11, represents the oc- 
currence of all sequences indicated by any line which proceeds from left 
to right without crossing a horizontal bar (and without turning back to 
the left). Thus it indicates the occurrence of: 

NV     (Our best books have disappeared.)
NVP    (The Martian came in.)
NVPN   (They finally went on strike.)
NVN    (We'll take it.)
NVb    (He is.)⁴
NVbP   (I can't look up.)
NVbPN  (The mechanic looked at my engine.)
NVbN   (He's a fool. I looked daggers.)
NVbA   (He's slightly liberal. They look old.)

All the information about the substitutability indicated by the vertical
relation (zero above N above A, etc.) is of course indicated in the
procedures of chapters 16, 18. The substitutions deriving from those
procedures may be shown in these diagrams merely for convenience of
inspection, and in order to utilize the second dimension permitted by the
two-dimensional face of the paper and not exploited by the formulae
of 19.2.
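
Reading sequences off such a diagram can be sketched in a few lines. The diagram itself is not reproduced in this text, so the encoding below is an assumption reconstructed from the nine sequences listed above: each band is a stretch between horizontal bars, each column in a band lists the vertically stacked alternatives, and the empty string stands for zero. The names BANDS and sequences are hypothetical.

```python
from itertools import product

# Assumed encoding of the diagram: A is available only after Vb, so it sits
# in a band of its own, separated from the first band by a horizontal bar.
BANDS = [
    [["N"], ["V", "Vb"], ["", "P", "PN", "N"]],
    [["N"], ["Vb"], ["A"]],
]


def sequences(bands):
    """List every left-to-right path that stays within a single band."""
    result = []
    for band in bands:
        for choice in product(*band):   # one alternative from each column
            result.append("".join(choice))
    return result


print(sequences(BANDS))
# ['NV', 'NVP', 'NVPN', 'NVN', 'NVb', 'NVbP', 'NVbPN', 'NVbN', 'NVbA']
```

Keeping each band separate is what enforces the rule that a line may not cross a horizontal bar; taking exactly one item per column, in order, enforces the rule against turning back to the left.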

19.4. Result: Sentence Types 

We now have a way of stating, in as much or as little detail as we 
please, what utterances occur. The most detailed diagram or model may 
state the occurrence of each actual morpheme sequence. The most simple 
formulae, such as the NVX of 19.2, are couched not in terms of mor- 
phemes but in terms of the broadest position classes resulting from chap- 
ter 16, or in terms of the components of chapter 17 and the constructions 
of chapter 18. The formulae do not state that, say, NV occurs, but that
if N and V each represent any of the sequences equated to them
respectively in chapter 16, then NV occurs. I.e. utterances occur consisting
of any sequence which can, on the basis of chapter 16, be equated to N
followed by any sequence which can, on that basis, be equated to V.
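
The converse operation, identifying the formula to which a given class sequence can be equated, may be sketched as a repeated reduction. The three equations are the ones mentioned in 19.2 (TN = N, AN = N, DA = A); the names EQUATIONS and reduce_classes and the leftmost-first strategy are assumptions made for the illustration, and a real grammar would need the full set of equations from chapter 16.

```python
# Equations read right to left: a pair of classes is equated to one class.
EQUATIONS = {("T", "N"): "N", ("A", "N"): "N", ("D", "A"): "A"}


def reduce_classes(classes):
    """Replace equated pairs by their resultant until nothing more applies."""
    sequence = list(classes)
    changed = True
    while changed:
        changed = False
        for i in range(len(sequence) - 1):
            pair = (sequence[i], sequence[i + 1])
            if pair in EQUATIONS:
                sequence[i:i + 2] = [EQUATIONS[pair]]  # shortens the sequence by one
                changed = True
                break                                   # rescan from the left
    return sequence


print(reduce_classes(["T", "D", "A", "N", "V"]))  # ['N', 'V']
print(reduce_classes(["N", "V"]))                 # ['N', 'V']
```

Each replacement shortens the sequence, so the loop always terminates; a sequence that reduces to N followed by V is one of those covered by the NV formula.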

Many languages will have more than one basic utterance formula. E.g. 
in English not only does NVX occur, but also any sequence containing 
at least one free morpheme⁵ occurs as an utterance with utterance contour.
If all these different utterance types contain the same contour

⁴ Vb indicates a class of morphemes like be, seem whose distribution
is similar to that of V except that they also occur before A.

⁵ In a manner described by the procedure of chapter 18 as constituting
a word or minimum utterance.


class, say X, we may omit X from the formulae or diagrams and say that
it is an automatic feature of utterance structure.⁶

The utterances, or the sections of larger stretches of speech, which 
satisfy these formulae may be called sentences. Any stretch of speech in 
the language, no matter how long, can then be identified as a sequence of 
sentence domains (no matter how long or short, or of what formulaic 
type). It may be that regularities can be found in the consecution of 
sentences in a stretch of speech or in a conversation, showing that sen- 
tences of one type are usually followed by others of the same type, or 
otherwise. Such regularities may perhaps be shown for one style of speak- 
ing in the language, and not for another. 

Appendix to 19.31: Detailed Diagrams 

Diagrams, or comparable geometric and physical models, can be con- 
structed so as to represent all the substitutions or equivalences, condi- 
tioned by particular concurrent environments, recognized via chapters 
16, 18. However, such diagrams would in most cases be extremely com- 
plicated, and the advantage of easy inspection would be lost. For par- 
ticular constructions or parts of utterances the diagram may provide a 
convenient summary of the relations of chapters 16 or 18. Thus the 
minimum utterance or word in Moroccan Arabic, i.e. those sequences 
which occur by themselves (with complete utterance intonations), but 
also as parts of longer utterances (in which case they have only some 
section of the utterance intonation), can be described by the diagram⁷
on the opposite page:

⁶ More generally, we do so if it is possible to determine from the
structure (i.e. the sequence of classes) of the utterance what contour
class occurs with it. In English X also occurs by itself, without other morphemes,
e.g. {! with automatic [m rp ] (written M?n.) and {?!j with automatic 
[ha] or [n] (written huh/ and Hmm!). 

⁷ The function of these diagrams is to present inter-element relations
which would take up far more space, and be far less inspectable, if they 
were stated in English sentences. Each diagram indicates a large number 
of combinations. Hence, far too much space would be required to give an 
example of every sequence permitted by this diagram even if we were to 
use throughout but one member of each of the large classes (S, .S", R, 
P", P"), i.e. even if we were only to indicate the various combinations 
of the morphemes and morpheme classes explicitly listed in the dia- 
gram. A few examples of these combinations are afforded by the various 
Moroccan Arabic words cited in this and previous chapters. This dia- 
gram is not stated in terms of the reduction to components mapped out 
in the Appendix to 17.33, because those components do not differentiate 
between elements which occur within a single word and elements which 
occur over several words. 



As in the diagram of chapter 11, if we draw any line from the left end
of the diagram to the right end, without crossing any horizontal bars
(e.g. the line cannot go from lli to ma), and without going to the left
(e.g. the line cannot go from fi to ma or mn), the line will pass through
a sequence of morphemes or morpheme classes which occurs as a word
(minimum utterance) of the language. Column 1 indicates that every
word (minimum utterance) may begin with u- 'and' or without it.
Column 2 indicates that every word, whether or not it began with u-, may
[Diagram of the Moroccan Arabic word (minimum utterance), with columns numbered 1 to 12; the figure did not survive text extraction.]



then contain any member of S,⁸ or else one of the relatives lli- 'that
which,' ma- 'which', or none of these. If the word contains S it will
contain nothing further. If it does not, it will then have some one element
included in column 3: either some one of the prepositions fi- 'in', etc.,
or else zero. Column 4 shows that following the sum of the three pre- 
ceding columns (one of whose possible sums is zero) we will find either 
any member of the S" class or else a member of R; if a morpheme (not
zero) was selected in column 3, it may also be followed by zero in
column 4. No word will have zero for all of columns 2-4 inclusive. Column 5
indicates that members P" and P'' occur with all occurrences of R.
Column 6 shows that m- 'nominalizer' sometimes occurs next to P" and P",
and that any sequence containing P" but not m- will contain either the
suffixation morpheme 'past' or the prefixation morpheme 'imperfective';
and column 7 shows that every occurrence of the 'past' or 'imperfective'
morphemes is followed by one of the personal affixes, either {t} 'you,'
or {n} 'I', or {i} 'third person'. Column 8 indicates that all the
sequences not containing {-/} 'I', have either the feminine suffix {-a} or
the masculine suffix zero. Column 9 shows that all sequences containing
R or S" have either the plural suffix or the zero masculine suffix.
Column 10 shows that all sequences containing R, S", or one of the
morphemes (prepositions) of column 3, have one of the personal
possessive-objective suffixes {-i} 'me, my', {-k} 'you, your', {-u} 'that
one, that one's', and that those with R or S" may have none of these; and
further that any sequence containing P" + m-, or P", or S" (i.e. the same
sequences which contain m- or zero in column 6) will sometimes have l-
'the' instead of the personal possessive-objective suffixes. Column 11
indicates that any sequence containing {-k} or {-u} from column 10 will
have either the feminine or the zero masculine suffix following it; and
column 12 shows that any sequence containing one of the three personal
suffixes of column 10 will have either the plural or the zero singular suffix

⁸ S is the class of morphemes which occur in one word with nothing
but u- or zero: iams 'yesterday', hua 'here', daha 'soon', etc. S" is the
class of morphemes equivalent to the sequence /^P"; tamubil 'automobile',
fm 'father'; R is the class of morphemes which occur only and always
with P" and P": k-t-b 'write' in ktab 'book', ktb 'he wrote'. P* is the
class which occurs with R and with -t 'I', -ti 'you', etc.: e.g. zero in ktbt
'I wrote'. P" is the class of the remaining morphemes which occur with
/E;' — o- in ktab 'book'. Cf. the forms in the Appendix to 16.22.

A word begins therefore with any morpheme from columns 1-4 inclusive, 
and ends either with a morpheme of column 9 or with a morpheme 
of column 12 (or with S of column 2).¹⁰ Every word contains 
either some member of some class out of columns 3 or 4, or else some 
member of S of column 2. 
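
The reading rule for such a diagram (move from left to right, take one entry or zero from each column, never cross a horizontal bar) can be sketched as a small enumeration. This is only an illustrative toy and not the Moroccan Arabic diagram itself: the three columns, the selection of entries (u- 'and', fi 'in', mn 'from', dar 'house'), and the single blocked pair standing in for a horizontal bar are assumptions chosen for brevity.

```python
from itertools import product

# Hypothetical toy diagram. Each column lists mutually substitutable
# entries; the empty string stands for zero. Not the full diagram.
COLUMNS = [
    ["u-", ""],        # 'and' or nothing
    ["fi", "mn", ""],  # a preposition or zero
    ["dar", ""],       # a stem or zero
]

# Each blocked pair models one horizontal bar: the line may not pass
# through both (column, entry) choices. Here zero preposition together
# with zero stem is excluded, so no word is empty in both columns.
BLOCKED = [((1, ""), (2, ""))]


def permitted(path):
    """True if the path crosses none of the modelled horizontal bars."""
    return not any(path[i] == a and path[j] == b for (i, a), (j, b) in BLOCKED)


def words(columns=COLUMNS):
    """List every permitted left-to-right line as a tuple of non-zero entries."""
    result = []
    for path in product(*columns):
        if permitted(path):
            result.append(tuple(entry for entry in path if entry))
    return result
```

In this toy, a preposition alone and a stem alone both come out as permitted sequences, while the all-zero line does not, which mirrors the statement that every word contains some member of the later columns.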

Limitations of the diagram. Diagrams of this type are adequate for 
representing most of the relevant facts brought out in chapters 16, 18. 
Morphemes or morpheme classes which occur one above the other are 
those that substitute one for the other (e.g. -k for -u). Morphemes or mor- 
pheme classes, between which a line permitted by the diagram can be 
drawn, have between them the relation of concurrence: i.e. there are 
words in which both of them occur (e.g. mn with -i in mni 'from me'; lli 

⁹ When the morpheme of column 6 is 'past', these three morphemes 
of column 7 have alternants (suffixed): -ti 'you', -t 'I', zero 'he'. Zero is 
marked by an asterisk. 

¹⁰ Each morpheme is included under the first column in which it 
appears: e.g. S is included in column 2. 


with -t 'I' only in presence of RP^v, in lli ktbt 'which I wrote'). Morphemes 
or classes which are never connected by a line permitted in the diagram 
are mutually exclusive; i.e. they never occur together within the domain 
covered by the diagram (e.g. fi and -t 'I', or the mutual substitutes). If 
the line cannot reach a certain morpheme or class P without going 
through some other one Q, then P never occurs without the occurrence 
of Q (e.g. P^v never occurs without R; though R occurs without P^v). If 
the line cannot reach from one morpheme or class A to another B without 
going through a third C, then AB never occurs without C (e.g. P^v 
never occurs with -ti 'you' without the accompaniment of the feminine or 
masculine morpheme; though P^v occurs with -t 'I' without the feminine 
or masculine morphemes, and P^v occurs with the feminine or masculine 
morphemes and without -ti 'you' or zero '3rd person' if m- precedes 
the P^v). 

The time order of the morphemes or classes in the utterance can usual- 
ly be indicated by their order from left to right in the diagram. Situa- 
tions may arise, however, in which that order cannot be represented. 

Such is the case when the relative order is not simply sequential, e.g. 
in the staggering of R and P^v or P^n: k-t-b and -a- in ktab 'book'; here we 
can only indicate R before or after P^v and P^n. 

Such also is the case when there are special restrictions upon concurrence 
among morphemes which are not contiguous. Since special restrictions 
involve special horizontal bars in the chart, it is usually convenient 
to have the two classes involved placed right next to each other. Thus 
column 7 should be next to column 8 because one of its members (-t 'I') 
does not occur with column 8 whereas the others do. Column 6, in turn, 
should be next to column 7 because only the two bottom members of 
column 6 occur always and only with the members of column 7. We might 
further want to put column 6 next to column 3, because the prepositions 
of column 3 occur before m + P^v but not before P^v alone. However, column 
6 has to come next to column 5, because the two bottom members of 
column 6 occur only with P^v while m- occurs with both P^v and P^n. And 
column 5 has to come next to column 4 because R occurs always and only 
with column 5. Since we have reason to put the 6-7-8 sequence both next 
to 3 and next to 5, we place it after 5 and indicate the special restrictions 
between column 6 and column 3 by projecting the bar at the bottom of 
m- until it reaches column 3. Other arrangements of columns would 
require more horizontal bars. 

Departures in the diagram from the order of morphemes in the utterance 
occur also when morphemes which are substitutable for each other 
occur in different relative positions within the utterance. Thus two of the 
members of column 6 occur before columns 4-5, while the third occurs 
after them. Similarly the prefix l- is substitutable for the suffixes of columns 
10-12. Some indication of order is given in the diagram above by 
placing a hyphen after each (prefix) morpheme which occurs before R 
and S^n, and before each (suffix) morpheme which occurs after them. 

The vertical substitution columns in the diagram generally indicate the 
grammatical categories¹¹ noticed as components or construction types in 
chapters 17-18: e.g. for the Moroccan word, the categories of tense (past 
and imperfective, column 6),¹² person (subjective and possessive-objective, 
columns 7, 10), gender (columns 8, 11), number (columns 9, 12), 
definiteness (column 10, including the l-). 

The domain of each category is indicated by the horizontal bars in 
its neighboring columns: tense occurs only with P^v; gender with S^n, P^n, 
P^v except when n 'first person' adjoins. 

Generalizations about the categories can be made from the diagram: 
e.g. those members of categories which are usually zero in phonemic form 
are third person, masculine among the genders, singular among the 

Categories relevant to position in the whole utterance, i.e. the position 
classes of chapter 16, may also be correlated with the vertical columns 

¹¹ This result is obtained because the utterances of the languages have 
been subdivided into their smallest parts necessary for substitutability 
and restrictions on occurrence. For instance, Moroccan n- 'I will' and -t 
'I did', and t- 'you will' and -ti 'you did' are divided into {n} 'I as 
subject', {t} 'you as subject', {prefixation} 'imperfective', {suffixation} 
'past', with the statement that the members of {n} are n next to 
{prefixation} and t next to {suffixation}; and so for the other morphemes. 

¹² In order to exclude the m- of column 6 we say that the tense category 
consists of the members of column 6 which occur only with P^v. In 
some Semitic languages the m- of column 6 when next to P^v indicates 
present tense as well as nominalizer. 

¹³ The diagram also shows, for example, that one element (or more, if 
concurrences with neighbors are used for differentiation) can be voided 
(in the manner of the Appendix to 18.2) for each of columns 4-9, but not 
from all together (since absence of elements over 4-9 occurs near the top), 
and not from columns 1-3 and 10-2 (where absence of element is one of 
the choices). It also shows that one element can be defined for zero seg- 
ment in each column, although not all these zeros would be useful in the 
general statement. Columns 11 and 12 have zeros in addition to absence 
of segment or element: when column 10 is filled we take zeros following 
it as indicating the masculine and singular components; when no form 
from column 10 occurs, then zeros following are taken as absence of any 
element (marked by the empty corner of the diagram). 


if we keep track of such horizontal concurrences as are necessary for 
these correlations. E.g. columns 4-6 together with S of column 2, yield 
the position classes N, V, A of Moroccan Arabic.¹⁴ All sequences containing 
S are in a position class A; all those containing P^v without m- from 
columns 5-6¹⁵ are in position class V; all the remaining sequences, 
containing m + P^v, or else P^n, or S^n, or zero from columns 5-6 are in 
position class N.¹⁶ Finally, a member (not zero) of column 3, plus anything 
that follows it (which will always be N), adds up to A. 

Since the diagram uses only two dimensions, vertical and horizontal, 
a third dimension in depth could be utilized to indicate restrictions 
among variant members of the morphemes recognized in the diagram. 
The various members of a morpheme listed in the diagram would be 
placed one beneath the other in depth, all occupying the spot of their 
morpheme in the two-dimensional diagram. If the morpheme {n} 'I' in 
column 7 has the member n when the morpheme {prefixation} of column 
6 occurs, and the member t when {suffixation} occurs, we can define 
directions in depth along which our utterance-making line can go, in 
such a way that when our line goes from {prefixation} in column 6 to {n} 
in column 7 it will reach the n- member of {n}, and when the line comes 
from {suffixation} to {n} it will reach the -t member of {n}. 
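
The selection of a variant member according to the neighboring morpheme amounts to a lookup keyed on the pair of morphemes. The following is a minimal sketch of that idea, using only the four forms cited in the text (n- 'I will', -t 'I did', t- 'you will', -ti 'you did'); the table layout and the function name `member` are assumptions made for illustration.

```python
# Variant members of the personal morphemes, keyed by
# (personal morpheme, neighboring morpheme of column 6).
# The four forms are those cited in the text; the table is a sketch.
MEMBERS = {
    ("n", "prefixation"): "n-",   # 'I', imperfective
    ("n", "suffixation"): "-t",   # 'I', past
    ("t", "prefixation"): "t-",   # 'you', imperfective
    ("t", "suffixation"): "-ti",  # 'you', past
}


def member(person, tense):
    """Return the member of {person} that occurs next to {tense}."""
    return MEMBERS[(person, tense)]
```

Here the "direction in depth" of the text corresponds to the second component of the key: arriving at {n} from {prefixation} yields n-, arriving from {suffixation} yields -t.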

However, it may be desirable to make more limited use of the third 
dimension to indicate those concurrences which cannot be expressed by 
the rules of these diagrams (as will be seen below).¹⁷ 

In some cases the intersections of restrictions are such as cannot be 
expressed by the rules for the two-dimensional diagram. E.g. l- 'the' is 
substitutable for columns 10-12 when S^n, P^n, or mP^v occur in the 
sequence, but not otherwise. It is impossible to include l- in column 10 
and yet indicate that it occurs only with these sequences. If we placed l- 
directly next to the sequences with which it occurs, we would be unable 
to show that when it occurs, columns 10-12 do not. Since l- is mutually 

¹⁴ Cf. the Appendix to 16.22. 

¹⁵ Equivalently: all sequences containing a morpheme of column 7. 

¹⁶ Sequences like fia 'in me' may thus be described as consisting of fi 
from column 3, zero from column 5, -i from column 10, and are mutually 
substitutable (in the terms of chapter 16) with fi dari 'in my house' which 
consists of fi, dar from column 4-5, and -i. 

¹⁷ No gain in representation can be obtained by manipulating the 
external boundaries of the diagram. Since the area of the diagram 
represents the universe of discourse for this representation, no differences can 
be derived from varying the shape of the area as a whole, but only from 
varying the deployment of symbols and lines within the area. 


exclusive both with column 7 and with columns 10-12, we would place 
it across all these columns, if that were possible. The relations involved 
here can be expressed if we put l- in column 10 but connect it with 
S^n, P^n, mP^v by a direction (in the third dimension, or otherwise outside 
the previous diagram lines) which carries our utterance-making line 
across the otherwise unpermitted horizontal bars. (In the diagram above, 
the special direction for l- must, of course, lie within the scope of 
two-dimensional representation; this specially permitted direction for l- is the 
dotted line.) Alternatively, we can simply repeat the members of columns 
10-12, so that they will occur both at the top and the bottom of the 
columns while l- occurs in the middle. This would satisfy the actual 
relations, but at the cost of putting some morphemes twice within a column. 
We seek to avoid this, since the whole point of the diagram is to state 
geometrically the interrelations of each morpheme with each other one, 
rather than to have a morpheme appear in various sequences.¹⁸ 

This difficulty will occur in general whenever we have three morphemes 
or classes, each pair of which has some privilege of occurrence in 
common. In this case, P^v and P^n have in common, as against prepositions 
(the morphemes of column 3), the occurrence with columns 4, 8, and 9; 
P^v and prepositions have in common, as against P^n, their non-occurrence 
with l-; and P^n and prepositions have in common, as against P^v, their 
occurrence with these prepositions (since prepositions occur alone or 
with P^n, mP^v, but not with P^v). If we try to place, on contiguous 
horizontal areas, those pairs which have a privilege of occurrence in 
common, so that both should occur on one side of the horizontal bar, we 
cannot satisfy (on plane surfaces) all three pairs simultaneously. 

There are also other concurrences which cannot easily be indicated in 
these diagrams. Thus the m- of column 6 occurs with every member of 
P" but with only certain members of P". This fact could be indicated if 
the members of P" and P" had been listed individually in the diagram. 
Diagrams which deal entirely with the individual morphemes are pos- 
sible, especially when they are restricted to particular small parts of ut- 

¹⁸ The fact that column 11 in effect repeats column 8, and 12 repeats 
column 9, is a different matter: the morphemes in these columns may 
actually occur twice in a word, in the different concurrences and orders 
indicated by these columns. Nevertheless, we may seek to indicate even 
such recurrences with only one occurrence of the morpheme in the dia- 

¹⁹ A diagram of the word in Delaware is given in Z. S. Harris, Structural 
Restatements II, International Journal of American Linguistics 


In some constructions or types of utterance, part of the sequence may 
be repeated once or several times. Such repetition may be indicated by 
some additional mark in the diagram. In the diagram above, the upper 
part of columns 4-6, with its necessary selection from columns 8-9, may 
be repeated, with secondary stress on all except its last occurrence. I.e. 
every sequence with S^n, P^n, or mP^v, contains between its column 3 and 
column 4 zero or more secondary-stressed sequences consisting of 
S^n, RP^n, mRP^v, or mRP^n plus feminine or masculine and plural or 
singular. In the chart this is marked by the double vertical line: the material 
between double vertical lines is repeatable, with secondary stress on all 
but the last occurrence (which has primary stress). 

It is possible to say that column 10, and for that matter column 7, 
contain members of S^n,²⁰ and that the occurrence of columns 10-12 
after columns 4-6 plus 8-9 is simply a special case of the repetition of 4-6 
plus 8-9 recognized above. A word containing P^v plus column 10 would 
then parallel a P^v word plus a new S^n word indicating the object; and a 
word containing P^v plus column 7 would parallel a P^v word plus a new 
(preceding) S^n word indicating the subject. 

However, eliminations of columns 10-12 and of column 7 would leave 
us with a large number of special features involving these new members 
of S^n and not indicated in the diagram. E.g. these new members of S^n 
would never occur without an accompanying S^n, P^v, or P^n and would 
have no main or secondary stress beyond that of their neighbor. They 
would, in short, not constitute minimum utterances, and the diagram 
would thus cease to represent all minimum utterances and only these. 
Furthermore, there would be such special restrictions as the fact that 
the morphemes indicating 'I' in columns 7, 10, do not have columns 8 or 
11 following them; and the morphemic variants included in columns 10-12 
are not identical with those of columns 7-9. 

Certain contours and long components will often be found to cover all 
sequences indicated by a diagram, or a particular number of columns in 

13.175-86 (1947). For an application to Bengali verb suffixes, see C. A. 
Ferguson, Chart of the Bengali verb, Jour. Am. Or. Soc. 65.54-5 (1945). 
Cf. also the chart for Japanese inflection in M. Yokoyama, The Inflection 
of 8th Century Japanese (Language Dissertation No. 45) 46-7, 
with the reformulation in H. M. Hoenigswald, Studies in Linguistics 
8.79-81 (1950) and 9.23 (1951). Floyd Lounsbury has also prepared a 
chart of Iroquoian. 

²⁰ And that the morphemes of column 7 are variant members of the 
morphemes of column 10. 


it. These can be described as automatic in respect to the diagram, and 
may be stated as its phonologic definition or characteristic. In the case 
of the Moroccan Arabic word, every sequence extending from column 1 
to column 12 (with any number of zeros and any number of repetitions) 
is the domain of one main stress contour; and every repeated sequence 
of columns 4-6 and 8-9 is the domain of a secondary stress contour. 
Every sequence containing within it exactly one stretch from columns 4 
to 9²¹ is the domain of the ə contour.²² 

These contours can be indicated by adding columns which contain 
them (and through which every utterance-making line must pass), e.g. a 
column containing ' for the stress contour, and a column 4a containing 
ə for the shwa contour. This parallels the inclusion of the contour class X 
in formulae such as those of 19.2. 

²¹ I.e. every sequence from columns 1 to 12 excluding repetitions, or 
every repetition of columns 4-6, 8-9. 

²² Whereby ə occurs before every CC (CCV or CCə or CC#). 


20.1. Summary of the Results 

The preceding chapters have indicated a number of operations which 
can be carried out successively on the crude data of the flow of speech, 
yielding results which lead up to a compact statement of what utterances 
occur in the corpus. 

20.11. Phonology 

The flow of sounds recognized by the ear, or the succession of vibra- 
tions recorded on some instrument, is represented by a succession of seg- 
ments (3), which may be divided into simultaneous components (6, 10). 
This is done in such a way as to obtain successive (segmental) and simul- 
taneous (suprasegmental) parts each one of which is independent of the 
others (4, 5) in its occurrence within utterances (over a relatively short 
stretch of speech). Utterances or parts of them are considered equivalent 
to each other if they are repetitions of each other; they are distinct from 
each other if they are explicitly not repetitions (4).¹ Parts which are not 
distinct from each other are then grouped into classes in such a way that all 
the members of a particular class either substitute freely for each other in 
stated environments (4) or are complementary in environment to each 
other (7-9). When the grouping is such that the distinctions between classes 
are in one-one correspondence with the distinctions between contrasting 
(i.e. distinct) segments, the classes are called phonemes (7). When each 
member of a phoneme is broken up into simultaneous portions some of 
which extend, at least in some environments, over more than one pho- 
neme length (10), the classes may be called components; each phoneme 
is then definable as a unique combination of components. Cases may arise 
in which two non-contrasting segment sequences (i.e., in a given environ- 
ment, two phonemically identical sequences) are represented by two dif- 
ferent component-combination sequences; we then say that these two 
component-combination sequences are (phonemically) equivalent. In 

¹ Utterances and parts of utterances which do not occur in the same 
environment cannot be directly tested in order to see if they are or are not 
repetitions of each other (cf. 4.31). Even where the test is possible we 
may have an ambiguous result, in the case of features which appear in 
some repetitions of an utterance and not in others; these are the inter- 
mittently present distinctions of the Appendix to 4.3. 



terms of these phonemes and components we can identify what sound 
sequences occur in the corpus and, to a large extent, what sound se- 
quences do not occur (11). 
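
The grouping of non-contrasting segments by complementary environment can be illustrated with a short sketch. It assumes that each segment has already been paired with the set of environments in which it was observed; the segment labels, the environment notation, and the greedy left-to-right grouping are all assumptions made for illustration, not the procedure of chapters 7-9.

```python
def group_complementary(environments):
    """Greedily group segments whose environment sets never overlap.

    `environments` maps a segment label to the set of environments in
    which it occurs. A segment joins the first existing group all of
    whose members are complementary to it; otherwise it starts a new one.
    """
    groups = []
    for segment, envs in environments.items():
        for group in groups:
            if all(envs.isdisjoint(environments[m]) for m in group):
                group.append(segment)
                break
        else:
            groups.append([segment])
    return groups


# Hypothetical observations: two p-like segments never share an
# environment, while "b" occurs where one of them also occurs.
OBSERVED = {
    "p_aspirated": {"#_"},
    "p_plain": {"s_"},
    "b": {"#_", "V_V"},
}
```

In this toy data "b" is also complementary to "p_plain", so complementarity alone does not fix the grouping; the order of inspection decides here, whereas a real analysis would need further criteria.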

20.12. Morphology 

The sequences (not necessarily contiguous) of phonemes or of com- 
ponents which represent the flow of speech are now divided into new seg- 
ments (12) each of which is uniquely identifiable in terms of phonemes (or 
components).² This is done in such a way that each of these parts is 
independent of the others in its occurrence over a stretch of any length 
(covering the whole utterance). The criteria for determining independ- 
ence are selected in such a way as to yield a number of parts having iden- 
tical or analogous distributions. These parts (morphemic segments or 
alternants), or rather the occurrences of such parts in stated environ- 
ments, are then grouped into classes (called morphemes) in such a way 
that all the members of a particular morpheme either substitute freely 
for each other or are complementary in environment (13). The inter- 
change of phonemes or components in corresponding sections of the 
variant members of each morpheme can then constitute a class called a 
morphophoneme (14).³ 

We may therefore say that each morpheme is composed directly of a 
sequence of morphophonemes, each of which in turn is a class consisting 
of one or more complementary phonemes or components. Each morpheme 
has only one morphophonemic constituency but the distinctions 

² I.e. the addition of any one of these segments in an utterance can in 
the last analysis be described as the addition or subtraction (or 
arrangement) of a sequence of phonemes or components. 

³ Thus, in the morpheme {nayF} consisting of knife, knive-, we may 
speak of four morphophonemes: /n/ whose definition is always the 
phoneme /n/, /a/ whose definition is always the phoneme /a/, /y/ whose 
definition is always the phoneme /y/, /F/ whose definition is the 
phoneme /v/ before {-s} 'plural', and the phoneme /f/ otherwise. 
Alternatively, we may say that the phonemes which replace each other in 
variant members of a morpheme are grouped into a class; e.g. the /f/ and 
/v/ of knife, knive-, are grouped into a class /F/ whose members are /v/ 
before {-s} 'plural', /f/ otherwise. 

Phonemes, intermittently present distinctions, and morphophonemes 
are thus all defined as classes of corresponding segments, but under 
different conditions: phonemes are classes of corresponding segments in 
stretches of speech which are equivalent by the test of chapter 4; 
intermittently present features are classes of substitutable segments in many 
repetitions of an utterance; and morphophonemes are classes of 
corresponding segments in stretches of speech which are equivalent in their 
morphemic composition. 


between sounds are in general only in one-many correspondence with the 
distinctions between morphophonemes: two distinct morphophonemic 
sequences may represent identical segment (or phoneme) sequences; such 
different morphophonemic sequences are phonemically equivalent. 
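
The definition of a morphophoneme as a class of phonemes selected by environment can be sketched with the knife, knive- example of the footnote above. The capital `F` as the symbol for the alternating unit, the list representation, and the single boolean standing for the environment "before plural" are assumptions made for this sketch.

```python
# Morphophonemic constituency of the morpheme for knife / knive-.
# "F" is a hypothetical symbol for the unit that alternates between
# /f/ and /v/; the other three units always have one phonemic value.
KNIFE = ["n", "a", "y", "F"]


def realize(morphophonemes, before_plural):
    """Spell out a morphophonemic sequence as a phoneme string."""
    phonemes = []
    for unit in morphophonemes:
        if unit == "F":
            # /v/ before the plural morpheme, /f/ otherwise.
            phonemes.append("v" if before_plural else "f")
        else:
            phonemes.append(unit)
    return "".join(phonemes)
```

The morpheme keeps one morphophonemic spelling, and the two phonemic shapes follow from the environment alone.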

It may be noted here that the morphemes are not distinguished direct- 
ly on the basis of their meanings or meaning differences, but by the result 
of distributional operations upon the data of linguistics (this data includ- 
ing the meaning-like distinction between utterances which are not repeti- 
tions of each other). In this sense, the morphemes may be regarded either 
as expressions of the limitations of distribution of phonemes, or (what 
ultimately amounts to the same thing) as elements selected in such a 
way that when utterances are described in terms of them, many utter- 
ances are seen to have similar structure. 

The morphemes are grouped into morpheme classes, or classes of mor- 
phemes-in-environments, such that the distribution of one member of a 
class is similar to the distribution of any other member of that class 
(15). These morpheme classes and any sequences of morpheme classes 
which are substitutable for them within the utterance (16), are now 
grouped into larger classes (called position or resultant classes) in such a 
way that all the morpheme sequences (including sequences of one mor- 
pheme) in a position class substitute freely for each other in those posi- 
tions in the utterance within which that class occurs. All subsidiary re- 
strictions upon occurrence, by virtue of which particular members of one 
class or sub-class occur only with particular members out of another, are 
stated in a series of equations. The final resultant classes for the corpus, 
i.e. the most inclusive position classes, serve as the elements for a com- 
pact statement of the structure of utterances. 
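
As a rough illustration of grouping morphemes by distribution, the sketch below places two morphemes in one class when they were observed in exactly the same set of environments. This is a deliberate simplification (the text asks only for similar, not identical, distributions), and the English frames and morphemes are hypothetical data chosen for readability.

```python
from collections import defaultdict


def classes_by_distribution(occurrences):
    """Group morphemes that share exactly the same set of environments.

    `occurrences` maps a morpheme to the set of frames in which it was
    observed. Identity of frame sets is a simplifying assumption.
    """
    by_frames = defaultdict(list)
    for morpheme, frames in occurrences.items():
        by_frames[frozenset(frames)].append(morpheme)
    return sorted(sorted(members) for members in by_frames.values())


# Hypothetical toy corpus: "_" marks the position of the morpheme.
CORPUS = {
    "book": {"the _", "my _"},
    "house": {"the _", "my _"},
    "write": {"I _", "you _"},
    "read": {"I _", "you _"},
}
```

A fuller treatment would replace the identity test with a measure of similarity between frame sets and would state the residual restrictions separately, as the text describes.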

It is possible, however, to study other relations among the morpheme 
classes than those of substitution within the utterance. The investigation 
of the relations between a class and sequences which contain it leads 
to a hierarchy of inclusion levels and to the analysis of immediate 
constituents (16.5). The relations between one class and any other class 
stituents (16.5). The relations between one class and any other class 
which accompanies it in an utterance may be expressed by long com- 
ponents of morphemes or of morpheme classes (17). And the investi- 
gation of substitution within stretches shorter than a whole utterance 
leads to morphological constructions and hierarchies of increasingly 
enclosing constructions (18). 

While these investigations yield many of the results traditionally 
sought in morphology and syntax, there are other results of this nature 
which are not explicitly presented here (e.g. determination of the various 


forms and positions of the 'object' of the verb). Such further results can 
be obtained by more detailed application and extension of the above 
methods (e.g. after the manner of chapter 16, fn. 34). 

Compact statements as to what utterances occur in the corpus can now 
be made either in terms of the final resultants of 16 or in terms of the 
class relations of 16-8 (19). 

20.13. General 

The various operations, then, yield various sets of linguistic elements, 
at various levels of analysis: phonologic segments, regularly and inter- 
mittently present phonologic distinctions, phonemes, contours and pho- 
nemic long components, morphophonemes, morphemic segments, mor- 
phemes, morpheme-occurrence and position (morpheme-sequence) 
classes, morphemic long components and constructions.⁴ An element at 
any of these levels may be defined as consisting of an arrangement of ele- 
ments of some other level, or as constituting together with other elements 
of its level some element of another level. 

Given the elements of a corpus at a particular level, we state what 
limitations there are on the random distribution (within utterances of 
the corpus) of each element relative to each other element at that level. 
For phonemic elements, the limitations are stated over a short range 
of a few elements before and after it and those simultaneous with it; for 
morphemic elements, the limitations are stated over the whole utterance 
or (as in 17-8) over any given part of it. The procedures of the preceding 
chapters do not attempt to state the limitations of distribution of any 
elements over stretches of speech longer than one utterance (2.32). 

Each stretch of speech in the corpus is now completely and compactly 
identifiable in terms of the elements at any one of the levels. Except 
where the elements at a particular level are stated to be otherwise, a 
one-one correspondence is maintained between spoken or heard speech 
and its representation in terms of the elements at any level.⁵ 

It may be noted that there are not just two descriptive systems — 

⁴ Some of these sets of elements are relatively small, e.g. the list of 
phonemes and their chief members; such sets are listed in grammatical 
descriptions of a language. Other sets are very large, e.g. the list of mor- 
phemes or of particular constructions (such as words); such sets are 
listed in a morpheme class list (15.51) or dictionary. 

⁵ In general, the representation is in one-one correspondence with each 
occurrence of the represented speech. In the case of intermittently pres- 
ent distinctions, however, it is in one-one correspondence only with a set 
of repetitions of the represented speeches. 


phonology and morphology — but a rather indefinite number, some of 
these being phonologic and some morphologic. It is thus possible to ex- 
tend the descriptive methods for the creation of additional systems hav- 
ing other terms of reference. For example, investigations in stylistics and 
in culture-language correlations may be carried out by setting up sys- 
tems parallel to the morphologic ones but based on the distribution of 
elements (morpheme classes, sentence types, etc.) over stretches longer 
than one utterance. 

20.2. Survey of the Operations 

As was seen in 2.1, the only over-all consideration which determines 
the relevance of an operation is that it deal with the occurrence of parts 
of the flow of speech relative to each other. Beyond that, there is freedom 
in the choice of operations.⁶ 

20.21. To State Regularities or To Synthesize Utterances 

There is in general a choice of purposes facing the investigator in lin- 
guistics. He may seek all the regularities which can be found in any 
stretch of speech, so as to show their interdependences (e.g. in order to 
predict successfully features of the language as a whole); or he may seek 
just enough information to enable anyone to construct utterances in the 
language such as those constructed by native speakers (e.g. in order to 
predict the utterances, or to teach a person how to speak the language). 

In the search for all regularities in a language, the investigator would 
seek all correlations between linguistic forms, i.e. between features of 
sound in the flow of speech. Phonemes or components would be set up in 
such a way as to represent all regular phonetic differences, and the 
limitations upon their occurrence would be noted. Morphophonemes would 
be set up so as to represent all relations between morphemes and their 
variant members. Correlations would be made between morphemes, 

⁶ In determining the morphemes of a particular language, linguists 
use, in addition to distributional criteria, also (in varying degrees) 
criteria of meaning difference. In exact descriptive linguistic work, 
however, such considerations of meaning can only be used heuristically, as a 
source of hints, and the determining criteria will always have to be stated 
in distributional terms (Appendix to 12.41). The methods presented in 
the preceding chapters offer distributional investigations as alternatives 
to meaning considerations. The chief means whereby such distributional 
operations can take the place of information about meaning is by taking 
ever larger environments of the element in question into consideration. 
Elements having different meanings (different correlation with social 
situations) apparently have in general different environments of other 
elements, if we go far enough afield and take enough occurrences. 

morpheme classes, morphemic components or constructions and the phonemes 
they contain, or the morphophonemic or other features common 
to them. In addition to the equations leading to resultant position classes, 
morphemic components and constructions would be discovered. 

The investigator might also seek correlations between linguistic ele- 
ments and other features, e.g. various interrelations among acoustic or 
articulatory events, or social situations (meaning). On this basis he would 
obtain classifications and relations of phonemes; he would obtain such 
facts as the meaning similarity among English morphemes beginning 
with /sl/. 

If on the other hand the investigator seeks only enough information 
to enable one to construct utterances in the language, he will set up 
phonemes (phonemic segments) or components only to the extent, and 
in the manner, necessary to distinguish contrasting utterances. He need 
only determine the phonemic distinctions, and would not have to group 
complementary segments together, or to state the distribution of each.' 
Morphemic segments would be determined, and the variant members of 
morphemes would be stated, but morphophonemic symbols to indicate 
them would not be used except if it is desired to shorten the total descrip- 
tion. For that matter, but for the convenience of brevity and clarity, the 
whole description of the language could begin with a list of morphemic 
segments, or with a list of morphemes, each with its varying phonemic- 
segment constitutions in various environments. This would sidestep pho- 
nemics and morphophonemics, and disregard the fact that parts of one 
morpheme are phonetically similar to parts of another. Morphemes with 
very similar distributions, members of the same sub-class, would not have 
to be distinguished, even though they differ slightly or greatly in mean- 
ing (e.g. hu katav and katav 'he wrote' in the Appendix to 17.33). Further- 
more, the classification of morphemes and morpheme sequences would 
have to be carried out only in respect to the whole utterance (as in 15-6):
the components and constructions of 17-8 would not have to be set up. 

20.22. Operations of Analysis 

The over-all purpose of work in descriptive linguistics is to obtain a 
compact one-one representation of the stock of utterances in the corpus. 

' For compactness of statement, the investigator would undoubtedly 
group the more obvious sets of complementary segments into phonemes, 
and determine the more important junctures. But finesse in this work 
would not be required, and the distributional limitations upon each pho- 
neme would not have to be expressed. 


Since the representation of an utterance or its parts is based on a com- 
parison of utterances, it is really a representation of distinctions. It is this 
representation of differences which gives us discrete combinatorial ele- 
ments (each representing a minimal difference). A non-comparative 
study of speech behavior would probably deal with complex continuous 
changes, rather than with discrete elements. 

The basic operations are those of segmentation and classification. 
Segmentation is carried out at limits determined by the independence of 
the resulting segments in terms of some particular criterion. If X has a 
limited distribution in respect to Y, or if the occurrence of X depends
upon (correlates completely with) the occurrence of a particular environment
Z, we may therefore not have to recognize X as an independent
segment at the level under discussion."* Classification is used to group 
together elements which substitute for or are complementary to one another.

* The length of environment over which independence of X in respect 
to Y is examined may vary with our immediate purpose (e.g. shorter for 
determining phonemes, longer for determining morphemes). The handling 
of partial dependence may vary. In one case, when we seek a first ap- 
proximation, we may set up partially independent segments as distinct 
elements. Later, we may return to the same segments and extract a com- 
mon element which expresses the degree of dependence of one upon the 
other, having residual elements which express the degree of independence 
of the segments in respect to each other (e.g. in chapter 17). The criterion 
of independence thus determines not only the segmentation of our repre- 
sentation into successive or simultaneous portions, but also the setting 
up of abstract elements which can not be readily identified in terms of 
acoustic or physiological records but which express particular features of 
the complex relations among the segments or the other elements. 

^ The class of elements then becomes a new element of our description, 
on the next higher level of inclusive representation. It is not necessary for 
the class members to be 'similar', i.e. for the class to be distinguished by 
any feature other than that in respect to which the class was set up. E.g. 
quite distinct segments may be grouped into the phoneme /t/; highly 
dissimilar morphemic segments are grouped into the morpheme {be}; 
and there is no formal similarity among the morpheme sequences which 
are included in the class N. However, it is sometimes convenient to con- 
sider one of the members to be the symbol of the new class; that member 
is then said to be primary (or the base) while the other members are de- 
rived from it by a set of environmentally (or otherwise) conditioned 
'rules' or operations. For example, we may say that the phoneme /t/ is 
the member segment [t] plus various changes in various positions. Or we 
may say that the morphophoneme /F/ is the phoneme /f/ plus the 
change to voicing before {-s} 'plural'. We can even say that the Semitic 
position class N is the (void) morphemic component 8 'third person' plus 
various residues (for 'first person', for 'book', etc.) in various of its oc- 


Both of these operations are performed upon an utterance or upon its 
parts, but always on the basis of some comparison between these and some 
other utterances: e.g. morpheme segmentation is carried out before and 
after s in What books came? but not in What box came? because of com- 
parison with What book came? and so on. If we were analyzing a corpus 
without any interest in its relevance for the whole language, we could list 
all the environments of each tentative segment in all utterances of the 
corpus, and on this basis decide the segmentation in each utterance. 
Usually, however, we are interested in analyzing such a corpus as will 
serve as a sample of the language. For this purpose we bring into our 
corpus controlled material for comparison. Given What books came? we 
do not compare it with arbitrary other utterances, but search for ut- 
terances which are partially similar, like What book came? What maps 
came? What books are you reading? Ideally, we seek a group of minimally 
different utterances for comparison. In eliciting such comparative ut- 
terances from an informant, or from oneself, or from some arranged or 
indexed body of material, we have an experimental situation in which 
the linguist tests variations in the utterance stock in respect to a selected 
utterance; the only danger being that the utterance stock may be artifi- 
cially modified due to the experimentally asked question (as when an 
informant accepts an utterance proposed by the linguist even though it 
is a bit different from anything he would say on his own as a speaker of 
the language). 
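
The search for partially similar utterances described above can be sketched as follows. This is only an illustration, assuming the utterances have already been written as sequences of tentative word-sized segments; the function name `minimally_different` and the toy corpus are hypothetical and are not part of the procedures stated in the text.

```python
def minimally_different(target, corpus):
    """Return (utterance, position) pairs for utterances that have the same
    number of segments as `target` and differ from it in exactly one position."""
    t = target.split()
    hits = []
    for utt in corpus:
        u = utt.split()
        if len(u) != len(t):
            continue  # utterances of other lengths need a looser comparison
        diffs = [i for i, (a, b) in enumerate(zip(t, u)) if a != b]
        if len(diffs) == 1:
            hits.append((utt, diffs[0]))
    return hits


# Toy corpus (assumed data, built from the examples in the text).
corpus = [
    "what book came",
    "what maps came",
    "what books are you reading",
    "what books came",
]
pairs = minimally_different("what books came", corpus)
```

Here only What book came? and What maps came? are returned, each differing from the selected utterance in the second position. What books are you reading? is partially similar in a looser sense and would need a different comparison (e.g. a shared initial portion).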

Once we have a number of comparisons available as bases for setting 
up segmentations or classifications, we select those comparisons which 
apply to large numbers of elements, or to otherwise recognized groups of 
elements, as noted in 12.233. This selection, of course, derives not from the 
nature of the comparisons but from our purposes: if we want compact 
statements about the combination of parts in the language, we prefer to 
set up as elements those segments or classes which enter into the same 

currences. In all these cases, we could consider one member a as primary 
if we can state the conditions in which the other elements b, c, replace it 
(are derived from it). The choice of a is clearer if we can not reversibly 
derive a from b or c; i.e. if we can not state the exact conditions in which 
b is replaced by a. When no member of a class can be set up as primary, 
it may be possible to set up a theoretical base form from which each mem- 
ber can be derived (cf. in morphophonemics). In all these cases, however, 
whether we set up a primary member, or a theoretical base form, or a new 
class of the old members, we have essentially the same relation: a number 
of elements, classified together on some basis, into a new element which 
represents the occurrence of each of them. 


combinations as do other segments or classes. The work is thus naturally 
circular: we see from our controlled set of partially similar utterances that 
certain elements (such as the walk, talk, -ed of 12.233, or the joining of [J] 
and [1] of [plley] play into one segment) could be subject to further clas- 
sification and general statements, whereas other elements (such as the 
/as/ of notice) could not; we then presume that this will be the case for 
the rest of the corpus, and so set up the former as elements. 

As a result of these operations, we not only obtain initial elements, 
but are also able to define new sets of elements as classes or combinations 
(sequences, etc.) of old ones.'" While the successive classifications are 
based on differences in occurrence, these differences are expressed in the 
particular definitions of each class, and the relations among these classes 
can then be investigated without regard to the differences in their definitions.
This is possible because of the stratification of the successive classifications:
the unique properties of one element or another at a particular 
level are neither eliminated nor disregarded; they are merely embodied 
in the definition of the next higher set of elements," and need not be 
taken into account unless we wish to deal with the elements at the par- 
ticular level first mentioned. Each element is defined by the relations 
among elements at the next lower level. 

This leads ultimately to sets of few elements having complex definitions
but as nearly as possible random occurrence in respect to each other, 

'" In the operations of the preceding chapters each new class or com- 
bination of elements is treated as a new set of elements, at a 'higher' 
or more inclusive level than the elements of which it is composed. The 
whole material of our corpus can be re-identified in terms of the new elements.
This method, however, is not essential: we could consider all our
procedures as stating relations among our original phonologic (and morphemic)
segments, and keep those segments as our sole elements through 
to the end. The successive setting up of new elements was used only for 
convenience, since we then express in the definition of each set of elements 
all the relevant relations among all the previously defined elements. A 
frequently useful technique in expressing these relations in the form of 
definitions of a new level of elements, is to indicate what is the minimum 
domain of that level of elements, defined as the domain containing a 
certain property and not containing any smaller domains which them- 
selves have that property. 

" From phonologic segment up to resultant position classes with the 
highest inclusion numbers. An important factor in the compact statement 
of relation among elements is the specification of the domain over which 
the relation occurs. Within the domain, we state not only the occurring 
together or the substitution of elements but also their relative order, and 
any variation in these which depends upon the outer environment. 


replacing the original sets of many elements having simple definitions but 
complexly restricted distribution. We obtain elements having many and 
varied members (e.g. the sequences in a resultant position class, or the 
segments included in a phoneme); these members may consist of zero, 
omission or interchange of segments, or conversely no element (absence 
of element) may be used to represent a particular occurring segment. 
And although unit length is established for both phonemes and mor- 
phemes, there are cases in which elements or their members extend over 
several integral unit lengths,'^ i.e. cases in which sequences of segments 
are represented by a sequence of elements, without an explicit represen- 
tation being determined between each individual segment and each in- 
dividual element.'^ 

We may indeed say that our representations are in theory not of unit 
lengths, but of arbitrarily long portions of the utterance. Rather than 
say that segment [pʰ] is represented by phoneme /p/ in environment 
[# — V], as in park, we can say that the stretch [#pʰa] is represented by 
/#pa/. The correlation of [pʰ] with /p/ may then be derived by comparing
and indexing all these representations of stretches. If now /#pl/ 
is used for [#pl] (play) we need not hesitate because the segment is [p] 
not [pʰ], since the correlation is not between [pʰ] and /p/ but between the 
whole stretch and the phonemic sequence. 
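
The derivation of segment-to-phoneme correlations by comparing and indexing whole stretches can be sketched as follows. This is a minimal illustration under stated assumptions: each stretch and its phonemic sequence are given as equally long tuples, so that they can be lined up position by position; the function name `index_correlations` and the two sample stretches (with aspirated [pʰ] in park and plain [p] in play) are chosen only for the example.

```python
from collections import defaultdict


def index_correlations(representations):
    """Collect, for each segment, the set of phonemic symbols it lines up with
    across all (stretch, phonemic sequence) pairs."""
    index = defaultdict(set)
    for segments, phonemes in representations:
        if len(segments) != len(phonemes):
            continue  # no position-by-position alignment is assumed here
        for seg, pho in zip(segments, phonemes):
            index[seg].add(pho)
    return dict(index)


# Assumed toy data: stretches paired with their phonemic sequences.
representations = [
    (("#", "pʰ", "a"), ("#", "p", "a")),  # park
    (("#", "p", "l"), ("#", "p", "l")),   # play
]
index = index_correlations(representations)
```

In this sketch both [pʰ] and [p] end up indexed under /p/, but only as a by-product of the stretch representations; the primary statement remains the pairing of whole stretches with phonemic sequences.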

Since each element is identified relatively to the other elements at its 
level, and in terms of particular elements at a lower level, our elements 
are merely symbols of particular conjunctions of relations: particular 
privileges of occurrence and particular relations to all other elements. It 
is therefore possible to consider the symbols as representing not the par- 
ticular observable elements which occupy an environment but rather 
the environment itself, and its relation to other environments occupied 

'^ When the segments represented by an element are successive, it is 
conventional to let their position in the stretch of speech determine the 
position and domain of their representation along the line of writing. 
When they are simultaneous, long, or discontinuous, special conventions 
are made in order to set the position and domain of their representation 
relative to the other elements. Such problems are also involved in the 
case of zero segments, void elements, and the like. 

'•' As was seen in the Appendix to 18.2 (esp. fn. 20) both zero segments 
(including junctures) and void elements are representations of sequences 
of segments, as are also phonemic and morphemic components and, if we 
will, the resultant classes of chapter 16. The only status that the symbol 
{-en} has in the representation XVbV-en for I have cut is what can be 
extracted from the difference in the XVbV-en representation of I have cut 
and the XV representation of I cut. 


by the element which occupies it. We may therefore speak of inter- 
environment relations, or of occupyings of positions, as being our funda- 
mental elements. 

Various techniques of discovery may be used in applying these opera- 
tions, and they may be used over and over again. One of the most impor- 
tant is the attempt to find regularities and parallel or intersecting pat- 
terning among our elements, ''' so that, e.g., if an element is similar to a 
class in some characteristic feature, we test to see if it is similar to that 
class in all features and so a member of the class. 

Another method that has yielded new results is the generalization of 
operations from one operand to another. If the classification of comple- 
mentary variants when applied to phonemic segments yields phonemes, 
it can be applied to morphemic segments to yield morphemes. If the 
independence of a sequence of contiguous phonemes establishes it as a 
morpheme, we can also set up any sequence of non-contiguous phonemes 
or any interchange of phonemes as a morpheme so long as it is equally 
independent of the other morphemes. If morphemes which substitute for 
each other are considered equivalent from the point of view of the ut- 
terance, so are morpheme sequences which substitute for them. In all 
these cases the operation is not changed, but the old operand becomes a 
special case of a new and larger class of operands. 
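
The generalization of one operation over different operands can be illustrated with a small sketch. It assumes that occurrences are recorded as (element, environment) pairs and tests only whether two elements ever share an environment; the function names and the toy data are hypothetical, and complementary distribution by itself is not presented here as a sufficient criterion for grouping.

```python
def environments(element, occurrences):
    """Set of environments in which `element` has been recorded."""
    return {env for el, env in occurrences if el == element}


def complementary(a, b, occurrences):
    """True if `a` and `b` are never recorded in the same environment."""
    return environments(a, occurrences).isdisjoint(environments(b, occurrences))


# Operand 1 (assumed data): phonetic segments, environment = (preceding, following).
phonetic = [("pʰ", ("#", "V")), ("p", ("s", "V")), ("t", ("#", "V"))]

# Operand 2 (assumed data): morphemic segments, environment = following morpheme.
morphemic = [("knife", "#"), ("knive", "-s")]

same_phoneme_candidates = complementary("pʰ", "p", phonetic)
contrasting_segments = complementary("pʰ", "t", phonetic)
same_morpheme_candidates = complementary("knife", "knive", morphemic)
```

The function `complementary` is unchanged between the two calls; only the kind of element and the kind of environment differ, which is the sense in which the old operand becomes a special case of a larger class of operands.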

In many cases it is at first impossible to obtain the desired results com- 
pletely. The operation may then be carried out first in a simplified form, 
or on a selected set of operands, and the approximation obtained from 
these first results may then be corrected or extended by repeated carrying
out of the operation.'^ 

The utility of these operations is compromised, however, if any results 
are recognized other than those obtained by means of the stated opera- 
tions. If the operations do not suffice for a decision in a particular mat- 
ter, e.g. how a sequence is to be divided, then either a new operation or 
definition has to be added by application of which the matter will be decided,
or else the results have to be stated in such a way that the alternatives
among which the operation cannot decide are immediately equivalent.
E.g. if there is no basis for assigning the [p] after /s/ to /p/ or to 
/b/, then /p/ and /b/ should be equivalent marks in the environment 
/s — /. (The issue can, however, be decided in terms of components, be- 
cause the voicelessness component extends over all contiguous consonants 
in a morpheme.) Similarly, if we analyze was as {be} + {-ed}, but have 
no basis for placing the boundary, we do not place it arbitrarily, but 
recognize no phonemic correlation for the alternants of {be} and {-ed} 
in each other's environment, but only for the sequence of them together, 
which is /waz/. 

''' E.g. Edward Sapir, Sound patterns in language, Lang. 1.37-51 

'^ Cf. the Appendix to 4.5. In this way unit segments were first established
in phonology and morphology; and then the more detailed application
of the same criteria which had been used in setting up the unit 
segments later enables us to recognize segments of more than unit length. 
Analogously, morphemes consisting of the omission of a mora can only be 
recognized after we have set up the other morphemes and are able to 
compare their distribution, cf. 12.3. 

The considerations of discovery furnish one of the reasons for avoid- 
ing any classification of forms on the basis of meaning. Similarities in 
meaning may or may not serve as useful signposts in the course of 
investigation, and some test of social situation may be unavoidable in 
determining morphemes, but the methods presented here could not make 
use of any classes of, say, morphemes which are not differentiated from 
other morphemes by any common distinction except meaning.'* 

20.3. Description of the Language Structure 

Although our whole investigation has been in a particular corpus of ut- 
terances, we may consider this corpus to be an adequate sample of the 
language from which the corpus was taken. With this assumption, the 
methods of descriptive linguistics enable us to say that certain sequences 
of certain elements occur in the utterances of the language. This does not 
mean that other sequences of these elements, or other elements, do not 
occur; they may have occurred without entering into our records, or they 
may have not yet occurred in any utterance of the language, only to oc- 
cur th