
Studies in 

The Linguistic Sciences 




Department of Linguistics 
University of Illinois



EDITORS: Charles W. Kisseberth, Braj B. Kachru, Jerry L. Morgan 

REVIEW EDITORS: Chin-W. Kim and Ladislav Zgusta 

EDITORIAL BOARD: Eyamba G. Bokamba, Chin-chuan Cheng, Peter 
Cole, Alice Davison, Georgia M. Green, Hans Henrich Hock, Yamuna 
Kachru, Henry Kahane, Michael J. Kenstowicz and Howard Maclay. 

AIM: SLS is intended as a forum for the presentation of the latest original 
research by the faculty and especially students of the Department of 
Linguistics, University of Illinois, Urbana-Champaign. Especially invited 
papers by scholars not associated with the University of Illinois will also be included.

SPECIAL ISSUES: Since its inception SLS has devoted one issue each year to 
restricted, specialized topics. A complete list of such special issues is 
given on the back cover. The following special issues are under preparation: 
Studies in Language Variation: Nonwestern Case Studies, edited by Braj B. 
Kachru; Papers on Diachronic Syntax, edited by Hans Henrich Hock. 

BOOKS FOR REVIEW: Review copies of books (in duplicate) may be sent to 
the Review Editors, Studies in the Linguistic Sciences, Department of 
Linguistics, University of Illinois, Urbana, Illinois, 61801. 

SUBSCRIPTION: There will be two issues during the academic year. Requests 
for subscriptions should be addressed to SLS Subscriptions, Department of 
Linguistics, 4088 Foreign Languages Building, University of Illinois, Urbana, 
Illinois, 61801. 

Price: $5.00 (per issue) 




SPRING, 1982 



Issam M. Abu-Salim: Syllable Structure in Palestinian Arabic
Chin-Chuan Cheng: A Quantification of Chinese Dialect Affinity
Chin-Chuan Cheng: The Esperanto of El Popola Ĉinio
Mohammad Dabir-Moghaddam: Passive in Persian
Hans Henrich Hock: Aux-cliticization as a Motivation for Word Order Change
Michael Kenstowicz: Gemination and Spirantization in Tigrinya
Chin-Chuan Cheng and Charles W. Kisseberth: Tone-bearing Nasals in Makua
Shlomo Lederman: Problems in a Prosodic Analysis of Hebrew Morphology
Bruce Arne Sherwood: Statistical Analysis of Conversational Esperanto, with Discussion of the Accusative
Bruce Arne Sherwood: Variation in Esperanto

Studies in the Linguistic Sciences 
Volume 12, Number 1, Spring 1982 

SYLLABLE STRUCTURE IN PALESTINIAN ARABIC

Issam M. Abu-Salim

In this paper, I argue that the syllabification mechanism in
Palestinian Arabic (PA) involves not only rules to define syllable 
boundaries and assign internal structure to the syllables contained 
in the utterance, but also rules of vowel shortening and vowel 
insertion, which are similar in effect to some phonological rules 
in PA, that have to apply at the time syllable structure is as- 
signed. These rules, as discussed in section 3, explain different 
cases that cannot be accounted for by the phonological rules proper 
which apply after the segmental string is syllabified. Section 1 
deals with the syllable inventory of PA and section 2 deals with 
the question of how the various syllable types are to be repre- 
sented in underlying structures. It is argued that McCarthy's 
(1979a, b) account of superheavy syllables is inadequate, and that 
the CVVCC syllable is better analyzed as part of the underlying
syllable inventory of PA. Finally, in sections 4 and 5, a brief 
account of resyllabification and ambisyllabicity is given. 

1. Syllable Types and Their Structure 

Previous studies of Levantine Arabic, and of Arabic in general, 
(e.g., Al-Ani & May 1973, Broselow 1979, 1980, McCarthy 1979a, b, 1980,
Selkirk 1981) have shown that the following syllable types occur as 
part of the phonological system of Arabic: 

(1) a. CV              'he wrote'
    b. CVV   raa.sal   'he corresponded'
    c. CVC   mad       'school'
    d. CVVC  šaaf      'he saw'
    e. CVCC  ?uxt      'sister'

Most of the studies cited above are based on the surface manifestation 
of the various syllable types which occur phonetically in one dialect or 
another. Thus they exclude other syllable types such as CVVCC from being
considered as part of the syllable inventory of Arabic. It is true that
this syllable type, i.e., CVVCC, does not show up phonetically, I believe,
in all the modern dialects of Arabic. But there is ample evidence to 
suggest that this syllable type is better analyzed as part of the under- 
lying syllable inventory of PA and, possibly, some other Arabic dialects. 
Some phonological processes, particularly shortening of stressed vowels, 
will be satisfactorily explained if reference to this syllable type is 
made at some point in the phonological derivation, as will be discussed 
in detail later. Hence, a distinction will be made throughout this study 
between underlying (or phonological) and surface (or phonetic) syllables. 
The two sets of syllable inventories will be linked by certain rules, 
similar in effect to the phonological rules proper, which will be treated 
as part of the syllabification rules of the language. These rules will be 
discussed in detail in section 3 of this paper. 

Syllable structure has been discussed in various works. Early (e.g. 
Pike & Pike 1947, Hockett 1955, and Fudge 1969) as well as recent studies
(e.g. Newman 1972, Halle 1978, Halle & Vergnaud 1979, McCarthy 1979a, b,
Kiparsky 1979, 1980, Selkirk 1980, and others) have asserted that syllables 
have internal immediate constituent structures of their own, similar to 
syntactic structures, that can be represented in terms of binary-branching 
tree diagrams. The first major division is into onset and rime. The
onset consists of any consonant or consonant cluster preceding the 
syllable nucleus, and the rime includes all remaining elements. The rime 
in turn divides into two parts, the syllable nucleus and the coda (the 
final consonant or consonant cluster). The syllable structure could, 
thus, be represented as in (2): 


(2)        Syllable
          /        \
      Onset        Rime
                  /    \
            Nucleus    Coda

Evidence for this internal structure for the syllable in Arabic is 
not difficult to find. The stress rules of Arabic, for instance, refer 
to one property of the syllables contained in an utterance: their weight. 
A syllable is said to be heavy only if its rime is heavy, and vice versa. 
The onset, regardless of its internal structure, plays no role in deter- 
mining syllable weight for purposes of stress assignment. Moreover, it 
will be shown below that some co-occurrence restrictions exist between the 
nucleus and the coda in terms of their heaviness. No such restrictions 
occur between the nucleus and the onset, which gives further evidence for 
the divisions between syllable constituents in (2) above.

Now, given the syllable template in (2) and the terminology associated 
with it, and disregarding CVVCC syllables for the moment, the syllable types
in (1) will have the following internal structures, where O, R, N, and K
stand for onset, rime, nucleus, and coda, respectively: 

(3) a. CV     [σ [O C] [R [N V]]]
    b. CVV    [σ [O C] [R [N V V]]]
    c. CVC    [σ [O C] [R [N V] [K C]]]
    d. CVVC   [σ [O C] [R [N V V] [K C]]]
    e. CVCC   [σ [O C] [R [N V] [K C C]]]
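The division of each syllable type into onset, nucleus, and coda can be made concrete with a short sketch. The following Python fragment is only an illustration, not part of the analysis itself: the function names `constituents` and `bracketing`, and the ASCII label `Syl` for the syllable node, are assumptions introduced here. It covers only the five shapes listed in (1).

```python
# The five syllable shapes listed in (1); CVVCC is set aside for the moment.
SHAPES = ("CV", "CVV", "CVC", "CVVC", "CVCC")

def constituents(skeleton):
    """Split a CV skeleton into onset (O), nucleus (N) and coda (K)."""
    if skeleton not in SHAPES:
        raise ValueError(f"not one of the five shapes in (1): {skeleton!r}")
    onset = skeleton[0]                    # obligatory single consonant
    nucleus = "V" * skeleton.count("V")    # V (short) or VV (long)
    coda = skeleton[1 + len(nucleus):]     # zero, one or two consonants
    return onset, nucleus, coda

def bracketing(skeleton):
    """Render the constituent structure as a labeled bracketing.

    'Syl' is a hypothetical ASCII label for the syllable node.
    """
    onset, nucleus, coda = constituents(skeleton)
    rime = f"[N {' '.join(nucleus)}]"
    if coda:                               # the coda is the only optional part
        rime += f" [K {' '.join(coda)}]"
    return f"[Syl [O {onset}] [R {rime}]]"

for shape in SHAPES:
    print(f"{shape:5} {bracketing(shape)}")
```

A branching nucleus or coda shows up in the output as two symbols under the same label, which is the property the later discussion of superheavy syllables relies on.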

These structures demonstrate the following characteristics of the syllable 
structure of Arabic. First, the onset, as opposed to the coda, is an obligatory 
constituent of all the syllable structures in (3); each syllable must begin 
with a consonant, which implies that vowel-initial syllables are not permitted 
in Arabic. In all the syllable types above, the onset consists of only one 
consonant. There are cases, however, where the onset consists of two consonants,
which may give rise to other syllable types in some Arabic dialects,
as shown by the following examples: 

(4) a. CCV    sta.lam    'to receive'
    b. CCVV   ktaa.bi    'my book'
    c. CCVC   sta?.bal   'to welcome'
    d. CCVVC  blaad.na   'our countries'
    e. CCVCC  mfakk      'screw driver'

These syllable types are highly restricted in distribution; they occur 
only in phrase-initial position. Historically, words with initial consonant 
clusters in PA, as well as in other Arabic dialects, may be said to have been 
developed from corresponding words with no such clusters in Classical Arabic 
through the application of some diachronic rules deleting vowels or consonants 
or both in some environments. The Classical Arabic words corresponding to 
those in (4) are given in (5) below, where the deleted segments which created 
the initial consonant clusters in (4) are underlined. 

(5) a. ?istalama
    b. kitaabi
    c. ?istaqbala
    d. bilaaduna
    e. mafakk

Synchronically, however, there are no alternations to support deriving 
these clusters from underlying representations without syllable-initial 
clusters. There are no cases where phrase-initial consonant clusters in 
words like those in (4) alternate with, say, CVC sequences in the same 
position. Meanwhile, there are other instances of phrase-initial heavy
onsets which are derived synchronically after the application of a syncope 
rule which deletes unstressed high vowels in nonfinal open syllables, as 
shown by the following examples: 

(6) a. širib 'he drank'
    b. šríbna (</širibna/) 'we drank'

Again, such derived heavy onsets do not occur except initially in 
phrases or after a pause in words. Thus, due to their limited distri- 
bution, the syllable types in (4) can be considered as special instances 
of those in (1), and accounted for by a rule adjoining the initial consonant 
to the following syllable at the point where syllable structure is assigned, 
as discussed further in section 3. 

Second, the syllable nucleus may consist of a short vowel as in (3a,
c, e) or a long vowel as in (3b, d), where length is indicated by gemination
on the segmental level. Structurally, vowel-length contrast is represented 
in terms of branching vs. nonbranching nodes, where short vowels are 
associated with nonbranching nodes and long vowels with branching nodes 
in the syllable tree. 

Finally, the coda is the only optional constituent of the syllable in 
Arabic. It may consist of zero consonants as in (3a, b), one consonant as 
in (3c, d), or two consonants as in (3e). No dialect of Arabic has been
reported to have syllable codas composed of more than two consonants. 

The syllable types in (3d) and (3e) indicate that either the syllable 
nucleus or the coda, but not both, may branch in any particular syllable. 
This indication is consistent with the fact that syllables of the form 
CVVCC are not realized phonetically in PA. Classical Arabic, on the other
hand, was reported to have such syllables, exemplified by words such as
maarr 'passer-by'. This syllable type, as mentioned earlier, is the least
frequent in Classical Arabic, appearing only rarely in its pausal form in
phrase-final position (Al-Ani & May 1973:118). Speakers of PA employ different
strategies to avoid pronouncing words with such syllables. The word maarr,
for instance, is realized as maa.rir by changing it into a pattern that
exists in the dialect, and, for many speakers, the word saamm 'poisonous'
is realized as bi.simm 'it poisons' by altering the grammatical category.
Monosyllabic loan words from foreign languages having the syllable structure
CVVCC are dealt with in a different fashion. The English words bank,
chance, and Ford, for instance, which are heard as having the long vowel
/aa/ in the first two words and /oo/ in the third, are realized in PA,
as well as in other Arabic dialects, as /bank/, /šans/, and /ford/,
respectively, with short vowels. The nucleus in all these words is
reduced to a short one to maintain the constraint mentioned above which 
prohibits both the nucleus and the coda from branching in the same syl- 
lable at the phonetic level. 
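The nucleus reduction just described can be sketched as a small string rewrite. This is a minimal illustration under stated assumptions: long vowels are written as doubled letters, the inputs `baank` and `foord` are hypothetical spellings of the forms as heard, and the function name `shorten_nucleus` is introduced here for convenience.

```python
import re

VOWELS = "aeiou"   # letters treated as vowels in this sketch (an assumption)

# One or more onset consonants, a doubled (long) vowel, exactly two coda consonants.
CVVCC = re.compile(rf"^([^{VOWELS}]+)([{VOWELS}])\2([^{VOWELS}]{{2}})$")

def shorten_nucleus(word):
    """Shorten the long vowel of a CVVCC monosyllable; leave other shapes alone."""
    return CVVCC.sub(r"\1\2\3", word)

print(shorten_nucleus("baank"))   # bank
print(shorten_nucleus("foord"))   # ford
print(shorten_nucleus("saaf"))    # saaf (CVVC, so nothing changes)
```

Applied mechanically, the same rewrite would turn maarr into marr, which is exactly the outcome that speakers are said to avoid for Classical Arabic words.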

The different strategies applied to loan words from Classical Arabic, 
on the one hand, and from foreign languages, on the other, can be explained 
in this way: many foreign words which are borrowed into Arabic are still 
not completely nativized. They continue to remain in the same category 
they have in the original language. One can seldom find derived or related 
words with the same consonantal make-up in different categories. The English 
words above are all treated as nouns in Arabic, but, except for the word
šans, no related verbs, adjectives, or adverbs occur in Arabic. That is,
when the vowels in such monosyllabic words are reduced, the derived 
structures are not confused with other categories and they continue to 
maintain the same category they have in the original language. In other 
words, no ambiguity would result from the nucleus reduction process. This 
is not the case in Classical Arabic words. If the nucleus is reduced in 
words such as maarr or saamm , which are categorized as active participles, 
the derived structures would be marr 'he passed by' and samm 'he, it 
poisoned', which are used as verbs in both Classical and Colloquial Arabic. 
So in order not to confuse the two categories, the pattern- or category-changing
techniques are used to avoid syllables of the form CVVCC in the
dialect phonetically. 

Finally, the syllable structures in (3) can be viewed as expansions 
of an abstract syllable template underlying the syllable structure of 
Arabic. Before trying to formulate that template, it can be seen that the 
syllable structures in (3d) and (3e) can be considered as expanded versions 
of those in (3b) and (3c), respectively, where a consonant is added to 
fill a vacuous coda position in the first case, and to make it heavy in the 
second. By the same token, the syllable structures in (3b) and (3c) can 
be viewed as expanded versions of the simple syllable structure in (3a), 
where the nucleus is made long in the first case, and a consonant is added 
to fill a vacuous coda in the second. No attempt, however, is being made 
here to derive one syllable type from another. Rather, these remarks about 
the syllable structure in PA are intended to show the differences between 
the various syllable types in terms of their constituent structures, which 
are relevant to other issues such as the relationship between syllable-weight 
distinctions and stress assignment (for further details about this relationship
in PA, see Abu-Salim (forthcoming)).

The syllable template underlying all the syllable structures in (3) can
thus be formulated as in (7) below:

(7) [σ [O C] [R [N V (V)] [K (C) (C)]]]
    Condition: if K branches, N does not branch.

The main function of the condition imposed on the template is to
disallow syllables of the form CVVCC from being derived phonetically.
It says that if the coda is heavy, then the nucleus cannot be long. We 
have seen in the discussion above of the way loan words are realized 
in PA that it is the nucleus that is reduced, not the coda, in CVVCC
monosyllabic words. So, it is the structure of the coda that determines 
whether the nucleus may branch or not. 
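As a rough illustration of how the template and its condition act as a well-formedness check, the following sketch tests CV skeletons against them. The name `fits_template` and the encoding of syllables as strings of C and V are assumptions made for this example only.

```python
import re

# One onset consonant, a short or long nucleus, and at most two coda consonants.
TEMPLATE = re.compile(r"C(VV?)(C{0,2})")

def fits_template(skeleton):
    """Return True if the skeleton satisfies the template and its condition."""
    match = TEMPLATE.fullmatch(skeleton)
    if match is None:
        return False
    nucleus, coda = match.groups()
    # Condition: a heavy (branching) coda excludes a long (branching) nucleus.
    return not (len(coda) == 2 and len(nucleus) == 2)

for shape in ("CV", "CVV", "CVC", "CVVC", "CVCC", "CVVCC", "CCV"):
    print(f"{shape:6} {fits_template(shape)}")
```

Only the last two shapes are rejected: CVVCC by the condition, and CCV because the template provides a single onset position.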

The condition on the syllable template in (7) can be stated in a 
rather different way if we allow for syllables to be organized into 
moras according to some, possibly universal, principles whereby a 
light CV syllable counts as one mora, a heavy CVV or CVC as two moras,
and a superheavy CVVC or CVCC as three moras. It can be reformulated
in such a way so that the syllable rime can have no more than three moras, 
given that the mora is defined as being equivalent to a terminal node in 
the syllable tree. This reformulation, however, will not capture the
conclusion arrived at in the preceding paragraph, namely that it is the coda which
determines the length of the nucleus; it will allow for the choice of
reducing either the nucleus or the coda in CVVCC syllables, which is not
the case since it is always the nucleus that is reduced. 
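The weakness of the mora-based formulation can be seen in a short sketch. It counts one mora per terminal position of the rime, as defined above, and then lists every one-position reduction of a CVVCC syllable that satisfies the three-mora limit. The names `rime_moras` and `repairs` are hypothetical, and a single-consonant onset is assumed throughout.

```python
MAX_RIME_MORAS = 3

def rime_moras(skeleton):
    """Count one mora per rime position (everything after the onset C)."""
    return len(skeleton) - 1

def repairs(skeleton):
    """List the skeletons obtained by dropping one rime position
    that stay within the three-mora limit."""
    candidates = {skeleton[:i] + skeleton[i + 1:] for i in range(1, len(skeleton))}
    return sorted(c for c in candidates if rime_moras(c) <= MAX_RIME_MORAS)

print(rime_moras("CVVCC"))   # 4, one more than the limit allows
print(repairs("CVVCC"))      # ['CVCC', 'CVVC']
```

The limit admits both CVCC (nucleus reduced) and CVVC (coda reduced), whereas only the first outcome is attested; the condition stated on the coda singles out that outcome directly.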

However, the syllable template in (7) does not specify any phonotactic 
constraints in the sense of Fudge (1969) or Kiparsky (1979) to account for 
the distribution of segments within syllables, particularly the distribution 
of consonants in terms of their sonority in the onset and coda positions. It 
is intended mainly to specify all the possible syllable types of the language 
in terms of their structures, and to serve as a well-formedness condition on 
the syllable structure of phonological representations as will be seen in 
section 3 below. 

2. Basic vs. Derived Syllables 

One important question which any theory of syllable and syllabification 
must address is the point in the derivation at which the syllable template 
in (7) is defined. That is, at what level of description, phonological or 
phonetic, will segments be syllabified according to the template in (7)? 
As will be seen, various phonological rules refer to the phonological 
structure in one way or another in PA. This implies that underlying 
strings have already been divided into syllables prior to the application 
of the phonological rules proper. Therefore, I would assume, following 
McCarthy (1979a) and Feinstein (1979) among others, that syllable structure 
is assigned on the underlying phonological representation, with a subsequent 
application of a resyllabification rule after each stage in the derivation, 
particularly after rules such as syncope or epenthesis that alter the 
segmental make-up of phonological strings. This rule of resyllabification 
is needed in the grammar so as to account for the lack of one-to-one corre- 
spondence, in some cases, between underlying and surface syllables, and to 

ensure that the segmental string is well-syllabified at any point in the
derivation. The form and manner of application of these rules, however, 
will be discussed in detail in the next section. 

Given the assumption in the preceding paragraph that segments are 
syllabified in the underlying representation, the following question 
arises: Are all the syllable types in (1) possible in the underlying 
representation? and, if so, how are they going to be represented in 
terms of their internal structures? Al-Ani & May (1973) and McCarthy
(1979a) point out that the first three syllable types in (1), i.e., CV,
CVV, and CVC, are unmarked in terms of their distribution in Classical
Arabic words because they occur more often, in terms of frequency, than 
the last two types. Thus, they are considered to be the basic syllable 
inventory of Classical Arabic, and their internal structures are conse- 
quently represented as in (8) below given the assumptions made earlier 
about syllable structure: 

(8) a. [σ [O C] [R [N V]]]
    b. [σ [O C] [R [N V V]]]
    c. [σ [O C] [R [N V] [K C]]]

The syllable types in (8b) and (8c) have basically the same structure: 
a simple nonbranching onset and a branching rime. Although the rimes in 
both cases have the same branching character, they differ in terms of 
their immediate constituency. In (8b) the rime consists of a branching 
nucleus, dominating a long vowel, and no coda, whereas in (8c) the rime 
consists of a short nucleus and a simple nonbranching coda. But these 
differences in the immediate constituency of these structures do not result 
in differences in their behavior with respect to phonological rules such as stress.
Both syllables are assigned stress in a parallel way when they occur
in similar positions. To maintain the parallel between these two syllable 
types, McCarthy proposes that the syllable structure rubrics of Arabic 
are the following: 


(9) a. [σ C V]     b. [σ C V [+seg]]

The last two syllable types, i.e., CVVC and CVCC, on the other hand,
occur primarily, as reported by Al-Ani & May (1973) and McCarthy (1979a, b),
in phrase-final position or before a pause in Classical Arabic due to the
optional loss of inflectional endings in those positions, as shown by
the examples in (10):

(10) a. ki.taab (cf. ki.taa.bun) 'book' 

b. dars (cf. dar.sun) 'lesson' 

According to the syllable rubrics in (9), CVVC and CVCC syllables
cannot be exhaustively parsed into acceptable syllables, given that each 
segment must belong to at least one syllable in underlying representations. 
Hence, McCarthy proposes a rule by which the final consonant in these 
syllables is Chomsky-adjoined to the preceding syllable. This rule can
be represented as in (11) below: 


(11) [σ C V [+seg]] C $  ⇒  [σ [σ C V [+seg]] C] $

A similar treatment is given to the syllable structure of various 
Arabic dialects (McCarthy 1979a, b, 1980, Broselow 1979, Kenstowicz and 
Abdul-Karim 1980, Selkirk 1981, among others). The CVVC and CVCC, or the
superheavy syllables as they have been referred to, have been reported to
occur in many Arabic dialects, such as Cairene (CA) and Damascene (DA),
but with one difference: they are restricted in distribution, according 
to the studies cited above, to word-final position. So McCarthy proposes 
basically the same rule in (11) to account for the syllabification of 
these syllables in CA and DA, with a slight modification: the phrase 
boundary in (11) is replaced by a word boundary, as shown by the rule 
in (12): 


(12) [σ C V [+seg]] C #  ⇒  [σ [σ C V [+seg]] C] #

Several arguments can be presented to cast doubt on the validity of 
the claim that rule (12) is to be part of the syllabification processes 
in Levantine Arabic, particularly in PA and DA. First, the superheavy 
syllables under consideration are not exclusively restricted to word- 
final position in both PA and DA; they occur in other positions as well, 
as is evidenced by the examples in (13-14) below where the nonfinal 
superheavy syllables are underlined:

(13) PA
     a. burd.?aan       'oranges'
     b. ?uxt.hum        'their sister'
     c. baal.to         'coat'
     d. šub.baak.hum    'their window'
     e. komb.yuu.tar    'computer'
     f. mift.xir        'proud'

(14) DA
     a. ?ramf.le        'a flower'
     b. 9and.kon        'you have'
     c. ?ing.lii.zi     'English'
     d. mist            'well-cooked (f.)'
     e. ba.naat.kun     'your (pl.) daughters'
     f. mišk.le         'problem'

Given the syllable rubrics in (9) and the adjunction rule in (12), 
the nonfinal superheavy syllables in (13-14) cannot be properly syllabified. 
The rule in (12) could be revised slightly to account for the syllabifi- 
cation of nonfinal superheavy syllables by deleting the word boundary from 
the conditioning environment. Although this revision would provide for 
superheavy syllables in any position to be syllabified, it would, however, 
raise a question as to the motivation behind permitting syllable structures
of the form created by rule (12), in which one syllable node dominates another,
to be part of the syllable structure of any language while,
at the same time, disallowing superheavy syllables to be represented as in
(3d,e) above, i.e., with a single syllable node. As long as there is no
universal constraint or prohibition against the syllable structures in (3d,e), 
I would consider them to be the correct representation for the internal struc- 
tures of superheavy syllables in Arabic. 

Second, final as well as nonfinal geminate consonant clusters are more 
likely to be dominated by the same node or two sister nodes rather than by 
two separate, but non-sister, nodes on the syllable structure level. Several 
studies (e.g., Kenstowicz & Pyle 1973 and Guerssel 1977) have shown that
geminate consonant clusters not separated by any boundary do not undergo
some phonological rules such as epenthesis that usually affect non-geminate 
consonant clusters. 

This difference in behavior will be accounted for if one allows 
geminate clusters to be treated, and thus, represented differently than 
non-geminate clusters. 

Under McCarthy's analysis, where length is indicated by gemination, 
words like ?imm 'mother' would have the syllable structure in (15), where 
the final geminate cluster is dominated by two separate non-sister nodes:

(15) [σ [σ C V C] C]     ?imm


Recent studies have shown that other representations would be pre- 
ferred to the one in (15) as a possible representation of syllables with 
long consonants. Leben (1980), for instance, points out that long con- 
sonants in languages such as Hausa or Biblical Hebrew are best analyzed 
as single consonants occupying two non-nuclear positions at the metrical 
level, rather than treating them as consonants with the feature [+long] 
or as sequences of identical segments. This treatment of long consonants 
is supported by their behavior with respect to some phonological rules in 
both languages, where they group with single consonants in one case and 
with consonant clusters in another (for further details, see Leben (1980)).
Ingria (1980) takes a similar position for indicating length, which he 
calls the "multiple attachment analysis". According to this analysis, 
long consonants are viewed as single consonants filling two non-nuclear 
positions in a syllable tree. They can occur in onset and coda positions 
in various languages, and when they do they are represented as in (16): 




Accordingly, words like ?imm in Arabic would have the syllabic 
representation in (17) below, where node labeling is not indicated 
because it is not relevant for the present discussion: 


(17) C V C C
     | |  \ /
     ? i   m

It is argued in Abu-Salim (forthcoming) that an analysis along the 
lines of Leben and Ingria is preferred for indicating length and would 
better describe the behavior of long consonants with respect to certain 
phonological rules. 

Third, it has to be noted that some phonological rules in PA create 
superheavy syllables in nonfinal positions at some points in the derivation. 
One of these rules is syncope which deletes unstressed high vowels in non- 
final open syllables, as shown by the examples in (18-19): 

(18) a. naa.jiH 'successful (m.)'
     b. naaj.Ha 'successful (f.)'

(19) a. fus.tu? 'peanut'
     b. fust.?i 'peanut-colored'


Given that the (a) examples are the underlying stems for the (b) 
examples in (18-19), the latter are then derived after the application 
of the syncope rule mentioned above. The various stages in the derivation 
of (18b) and (19b) are illustrated below. The first stage would be the 
assignment of syllable structure to their underlying representations shown 
in (20a) and (20b), respectively:

(20) a. [σ C V V] [σ C V] [σ C V]     naa.ji.Ha
     b. [σ C V C] [σ C V] [σ C V]     fus.tu.?i

These structures (after metrical structure assignment and associated 
labeling) would now be subject to the syncope rule. I assume that if any 
vowel is deleted at any stage in the derivation, the syllable node domi- 
nating that vowel is also deleted by a convention. This would yield the 
representations in (21a) and (21b):

(21) a. [σ C V V] C [σ C V]     naa j Ha
     b. [σ C V C] C [σ C V]     fus t ?i

In each case, a consonant devoid of any syllable affiliation is left 
stranded. At this point, resyllabification is invoked so as to provide 
for that consonant to be syllabified with either of the neighboring syllables. 
This is based on the assumption that at any stage of the derivation phonological
strings should be properly syllabified. Apparently, the stranded consonants
in (21) cannot join the following syllables since that would create
medial heavy onsets in violation of the syllable template in (7) and the 
constraint placed on the structure of the syllable onset. So the only 
other alternative is to adjoin them to the preceding syllable, which can 
be achieved by at least two different methods of adjunction. The first 
one is to Chomsky-adjoin the stranded consonant to the preceding syllable 
by a rule similar to the adjunction rule in (12), yielding the configurations 
in (22a, b):

(22) a. [σ [σ C V V] C] [σ C V]     naaj.Ha
     b. [σ [σ C V C] C] [σ C V]     fust.?i


If this method of adjunction is the correct one for resyllabifying 
stranded consonants, then additional syllable types may arise throughout 
the derivation of some strings. To illustrate this point, consider the 
examples in (23-24) below: 

(23) a. ni.zil   'he went down'        (24) a. ši.rib   'he drank'
     b. niz.lu   'they went down'           b. šir.bu   'they drank'
        (</ni.zi.lu/)                          (</ši.ri.bu/)

The (a) examples in (23-24) are generally taken to be the underlying 
stems for the (b) examples (cf. Brame 1973, 1974). The latter examples 
are then derived after the syncope rule mentioned above has been applied. 
The relevant stages in the derivation of (23b), for instance, are sketched 
below:

(25) a. Underlying structure    [σ C V] [σ C V] [σ C V]     ni.zi.lu
     b. Syncope                 [σ C V] C [σ C V]           ni z lu
     c. Resyllabification       [σ [σ C V] C] [σ C V]       niz.lu

This derivation would raise some questions as to the phonetic status
and interpretation of derived syllables of the form [σ [σ CV] C] and how different
they are, phonetically, from syllables of the form CVC.

The other resyllabification method is to sister-adjoin the stranded 
consonant to the preceding node, i.e., to the coda or the nucleus of the 
preceding syllable. This would yield the representations in (26a, b, c)
for the structures in (21a, b) and (25b), respectively:

(26) a. [σ [O C] [R [N V V] [K C]]] [σ C V]     naaj.Ha
     b. [σ [O C] [R [N V] [K C C]]] [σ C V]     fust.?i
     c. [σ [O C] [R [N V] [K C]]] [σ C V]       niz.lu

In (26a, c), the adjoined consonant fills an empty coda position 
that is available in terms of the template (7) in the preceding syllable,
whereas in (26b) it makes the coda heavy. So, one can say that once the
syllable structure is assigned, each syllable node (σ) will dominate a
template like the one in (7) with all positions structurally present even
if some of the optional ones are empty. Resyllabification would then be
viewed as a process by which a stranded consonant is allowed to fill an
empty coda position, if there is any, in a preceding syllable. 
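The derivations discussed above (syncope followed by resyllabification into the preceding coda) can be traced with a small sketch. It is only an approximation under stated assumptions: syllables are plain strings, the stressed syllable is supplied by hand as an index, the vowel inventory is limited to a, i, u, and the function names are hypothetical. Phrase-initial stranded consonants, which join the following syllable, are not handled.

```python
VOWELS = "aiu"        # assumed vowel inventory for this sketch
HIGH_VOWELS = "iu"

def syncope(syllables, stressed):
    """Delete a short high vowel in an unstressed, nonfinal, open syllable.

    A syllable that loses its vowel is reduced to its stranded consonant.
    """
    last = len(syllables) - 1
    result = []
    for index, syllable in enumerate(syllables):
        ends_in_high = syllable[-1] in HIGH_VOWELS            # open syllable, high vowel
        is_long = len(syllable) > 1 and syllable[-2] in VOWELS
        if index != stressed and index != last and ends_in_high and not is_long:
            result.append(syllable[:-1])                      # vowel and its syllable node are lost
        else:
            result.append(syllable)
    return result

def resyllabify(units):
    """Sister-adjoin each stranded consonant to the coda of the preceding syllable."""
    result = []
    for unit in units:
        has_vowel = any(ch in VOWELS for ch in unit)
        if not has_vowel and result:
            result[-1] += unit                                # fill (or extend) the preceding coda
        else:
            result.append(unit)
    return result

def derive(syllables, stressed):
    return resyllabify(syncope(syllables, stressed))

# Stress is assumed to fall on the first syllable in all three forms.
print(derive(["ni", "zi", "lu"], 0))    # ['niz', 'lu']
print(derive(["naa", "ji", "Ha"], 0))   # ['naaj', 'Ha']
print(derive(["fus", "tu", "?i"], 0))   # ['fust', '?i']
```

In the first two outputs the stranded consonant fills an empty coda; in the third it makes an existing coda heavy, which corresponds to the difference between (26a, c) and (26b).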

This method of resyllabification should be more highly valued than 
the preceding one since it creates syllabic configurations which conform 
to the syllable template of the language. The fact that the stranded 
consonant in the examples above is not adjoined to the following syllable 
is another indication of the conformity to the syllable template principle. 

Evidence for this method of adjunction can be gathered from similar
adjunction rules in various languages. Selkirk (1980), for instance,
argues that the various prosodic categories, syllable, foot, word, ...
etc., must be recognized and identified in the hierarchical representation 
of words and phrases since many phonological processes refer to these 
categories and appeal to information provided by their internal structures.
According to her, the "stress foot (Σ)" is the domain of resyllabification
in English by which the onset consonant of a weak syllable in the foot is 
moved to the coda position of the preceding syllable, as shown by the 
derivation of words like 'total' in (27) below (Selkirk 1980: 577): 


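(27) [Tree diagram: within the stress foot of 'total', the medial t is moved from the onset of the weak syllable to the coda of the preceding syllable.]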

Consequently, I would assume that resyllabification in Arabic is 
achieved by sister-adjoining a stranded consonant to the final node of 
the preceding syllable, or, in other words, by filling an empty coda 
position in the preceding syllable. It follows that even if superheavy 
syllables are not basic in Arabic the rule in (12) is not the appropriate 
way of adjoining consonants to preceding syllables. 

Fourth, it has been mentioned earlier that CVVCC syllables are 
phonetically possible in Classical Arabic and that they can be generated 
in underlying structures in PA. Such heavy syllables will not be 
syllabified properly under McCarthy's analysis since the last consonant will 
be left devoid of any syllable affiliation, as illustrated in (28): 

(28) a. Initial syllabification       b. Rule (11)
        [CVV] C C               =>       [[CVV] C] C
[Tree diagrams not recoverable; each pair of brackets stands for a syllable node.]

The only way, I think, for the last consonant in such cases to be 
syllabified is to reapply rule (11) to its output in (28b), thus yielding 
the structure in (29): 

(29) [[[CVV] C] C]
[Tree diagram not recoverable; each pair of brackets stands for a syllable node.]

As can be seen in (29), McCarthy's analysis, if extended to account 
for the syllabification of CVVCC syllables, would result in syllable 
structures with three syllable nodes in the same syllable. It would be 
hard, I believe, to give a phonetic interpretation to such syllable 
structures, and, consequently, to favor them over syllable structures of 
the form CVVCC where only one syllable node is present. 

The last point that can be presented as counterevidence to the 
representation of superheavy syllables in (12) is derived from the structure 
of the syllable onset in PA, as well as in other Arabic dialects. As 
mentioned in section 1, heavy onsets are restricted exclusively to phrase- 
initial position. So, given McCarthy's analysis of superheavy syllables, 
which is based on distributional, rather than phonetic, factors, heavy 
onsets are prime candidates to be considered as nonbasic in Arabic and to 
be syllabified by a rule similar to the one in (11) because they have 
occurrence restrictions similar to those imposed on the rimes of superheavy 
syllables in Classical Arabic. This rule could be formulated as in (30), 
where $ stands for a phrase boundary: 


(30) $ C [C ...] => $ [C [C ...]]
[Tree diagrams not recoverable; each pair of brackets stands for a syllable node.]

That is, the first consonant in a phrase-initial consonant cluster is 
Chomsky-adjoined to the following syllable. The $ must be part 
of the rule so as to prevent non-initial syllables from acquiring additional 
consonants. 

The syllabification process in Arabic would be rather complex if we 
allowed both rules (12) and (30) to be part of the grammar. They would give 
rise to highly doubtful syllable structures, especially if they both apply 
to the same utterance. Words like ktaab 'book' would then be represented 
by either of the following configurations: 


(31) a. ktaab        b. ktaab
[Tree diagrams not recoverable from the source.]

The choice between (31a) and (31b) would depend on the ordering restrictions 
defined among the adjunction rules in (12) and (30). At any rate, such 
syllables are hard to interpret. In McCarthy (1979a, b), superheavy syl- 
lables are treated as single syllables having two rimes, the first of 
which is branching and the second nonbranching. If that is the case, how 
would the syllable structures in (31) be interpreted? 

Finally, it has to be noted that syllable-initial consonant clusters 
and heavy codas cannot be treated alike by any theory of syllabification in 
Arabic. Syllable-initial consonant clusters, as mentioned above, are more 
restricted in distribution than heavy codas. Moreover, syllables with heavy 
codas are sensitive, in most cases, to sonority restrictions while syllables 
with initial consonant clusters are not (cf. ?a.kil (</?akl/) 'food' vs. 
lka.lam 'the pencil'). I would conclude, therefore, that superheavy syllables 
are basic in Arabic, and that they are to be syllabified in the same way 
other syllable types are. Syllable-initial consonant clusters, on the other 
hand, are not basic for the reasons mentioned above. The first member of 
initial consonant clusters would be syllabified with the following syllable 
by rule (30) which will be considered as part of the syllabification process 
in Arabic. Word-initial consonant clusters in other than phrase-initial 
position would not undergo rule (30), since in that case the initial consonant 
would be syllabified with the preceding syllable. 

3. Syllable Structure Assignment and Syllabification Rules 

It has been claimed in section 2 that syllable structure is defined at 
the underlying level prior to the application of phonological rules. Syl- 
lable structure at this level does not necessarily correspond to the derived 
structure at the phonetic level due to the application of some phonological 
rules such as syncope or epenthesis which alter the segmental make-up of 
some phonological strings. A resyllabification rule is therefore needed 
in the grammar to readjust the syllable structure at any intermediate stage 
in the derivation, so that the derived structure is properly syllabified 
according to the syllabification rules of any particular language. 

Several methods have been proposed to define the notion 'possible syl- 
lable' of a natural language, and various procedures or conventions have been 
established for dividing utterances into syllables. Two of these conventions 
seem to be of interest to phonological theory, and are often referred 
to in various phonological studies. The first is the proposal made in Hooper 
(1972) by which syllable boundaries are inserted between certain sequences 
of segments by a universal rule, and the second is the autosegmental approach 
of Kahn (1976), by which segments in the phonological string are associated 
with syllable nodes on a separate tier by a set of association rules. 18 

Both schemes of syllabification mentioned above are aimed at giving a 
formal definition of what a possible syllable of a natural language is. 
They predict, for instance, that in VCV sequences the intervocalic 
consonant is syllabified with the following, rather than the preceding, 
syllable. This tendency for open syllabicity, claimed to be universal, 
recognizes the simple CV, but not VC, syllable to be the least marked in 
human languages (cf. Kaye and Lowenstamm 1979 and Cairns and Feinstein 

However, both schemes are basically concerned with locating syllable 
boundaries in segmental strings. They do not provide any account for the 
question of the internal structure of the syllable, probably because the 
phenomena discussed in those studies do not require reference to this 
structure. It is possible to come up with a revised version of any of 
the schemes above to suit the case of syllabification in Arabic, but this 
will be done at the expense of ignoring any internal structure for the 
syllable. As mentioned earlier, there is evidence to suggest that an 
intermediate hierarchical structure intervening between the CV-tier and 
syllable nodes must be recognized as part of any theory of the syllable and 
syllabification in Arabic due to the availability of some rules that refer 
to this structure, as will be discussed below. 

My alternative proposal for syllabification in Arabic relies in part 
on the recent development in the theory of syllable structure as presented 
in McCarthy (1979a, b), Selkirk (1979b), Kiparsky (1979, 1980), Kaye and 
Lowenstamm (1979), and Ingria (1980), among others, where syllables are 
represented in terms of binary branching metrical trees, the nodes of 
which are labeled s or w according to their relative sonority, as outlined 
in section 2. Syllable structure assignment rules, in the analysis presented 
below, will involve not only rules that will gather segments into syllables 
and assign internal structures to the created syllables, but will also 
involve other rules, some of which are similar in effect to other phonological 
rules in PA, which apply at the time syllabification is performed. 

It has to be pointed out, first of all, that all the syllable types 
of PA at the phonological representation level are defined in terms of the 
underlying syllable template of the language. The phonetic syllable 
structure is derived after the application of a) the syllabification rules 
which apply at the time syllable structure assignment rules are performed, 
and b) the resyllabification rules which apply after the application of 
some phonological rules that alter the segmental make-up of the underlying 
string. I will have occasion to discuss all of these rules in detail in 
this section. 

The dichotomy between the underlying syllable structure and the phonetic 
one is not arbitrary. We will see below that while CVVCC syllables are not 
permitted phonetically they are possible in underlying structures. Thus, I 
assume that this syllable type is part of the underlying syllable inventory 
in PA. The underlying syllable template of PA in (7) will, accordingly, 
be revised so that the condition imposed on it is no longer there, 19 as 
shown in (32) below: 

(32) PA underlying syllable template (a revised version): 

Before we move to the next section and show how the underlying 
syllable structure is defined in terms of the syllable template in (32), 
it is better to first show briefly how segments are to be represented in 
underlying representations prior to the syllabification process. 

Underlying strings may contain short as well as long segments. It 
seems uncontroversial to define short segments, both consonants and 
vowels, as being associated with single nodes in the syllable tree. This 
is illustrated by the following initial representations for short vowels, 
short consonants, and nongeminate consonant clusters: 

(33) a. a short vowel    b. a short consonant    c. nongeminate consonant cluster

        V                   C                       C C
        |                   |                       | |

Long segments, on the other hand, would have a different representation. 
It is assumed, following Leben (1980) and Ingria (1980), that length is 
interpreted as an association of one segment in the segmental string with 
two nodes in the syllable tree, as shown in (34a, b) below: 


(34) a. a long vowel     b. a long consonant

        V   V               C   C
         \ /                 \ /
          V                   C
[Upper row: nodes of the CV-tier; lower row: the single segment linked to them.]

In other words, a long vowel will be associated with two V-nodes and 
a long consonant with two C-nodes in the CV-tier. This way of representing 
length should be more highly valued than representing length by gemination 
for it guarantees that non-identical vowels will not occupy the nucleus 
position in any syllable and further provides for the immunity of long 
consonants against rules such as epenthesis, which breaks up non-geminate 
consonant clusters under some conditions. 
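
The representation of length just described can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name `Segment`, the helper names, and the encoding of tier positions as strings are hypothetical and not part of the analysis. The sketch shows why a long consonant offers no internal position at which a cluster-breaking rule could apply; the actual conditions on epenthesis are not modeled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    melody: str  # segmental content, e.g. "m"
    slots: str   # CV-tier positions linked to it: "C", "V", "CC" or "VV"

def cv_tier(word):
    """Concatenate the CV-tier slots of every segment."""
    return "".join(seg.slots for seg in word)

def cluster_sites(word):
    """Indices i where segments i and i+1 are two distinct consonants,
    i.e. the only places where a vowel could break up a cluster."""
    return [i for i in range(len(word) - 1)
            if word[i].slots[0] == "C" and word[i + 1].slots[0] == "C"]

# ?immi 'my mother': one long /m/ linked to two C-slots
immi = [Segment("?", "C"), Segment("i", "V"), Segment("m", "CC"), Segment("i", "V")]
# ?aklna 'our food' (underlying): a nongeminate cluster k-l-n
aklna = [Segment(s, "V" if s in "aiu" else "C") for s in "?aklna"]

print(cv_tier(immi), cluster_sites(immi))    # CVCCV []
print(cv_tier(aklna), cluster_sites(aklna))  # CVCCCV [2, 3]
```

Both words contain a CC sequence on the CV-tier, but only the nongeminate cluster contains a boundary between two segments.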

3.1. Syllable Structure Assignment Rules 

As mentioned in the preceding section, the underlying syllable 
structure is defined basically in terms of the underlying syllable tem- 
plate of the language. So, given the syllable template in (32) and the 
representation of length assumed above, syllable structure assignment will 
involve only one rule, and that is the imposition of the syllable template 
on the segmental string. This rule will apply iteratively until all the 
input string is completely, and, in some cases, properly syllabified. To 
illustrate how this rule applies, consider simple strings such as maktab 
'office', ?immhum 'their mother', and ra?iis 'president', where the 
underlying phonological structure is identical to the phonetic structure. 
These words will have the following initial representations: 



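(35) a. CVCCVC        b. CVCCVC        c. CVCVVC
        maktab           ?imhum           ra?is
[Association lines not recoverable; in (c) the long vowel is a single segment linked to two V-nodes.]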

As mentioned in section 1, the syllable onset may consist of no more 
than one consonant. To guarantee that each syllable will have an onset it 
is important to start the syllabification process from right to left. To 
make this point clear, consider an input string of the form CVCVC. If we 
start syllabification from left to right, we will end up with a syllable 
structure of the form *CVC.VC, which is not acceptable, whereas if we 
start from right to left, the correct syllable structure CV.CVC will be 
derived (for further details about directionality in syllable structure 
assignment, see Kaye and Lowenstamm (1979)). 
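
The effect of directionality can be illustrated with a minimal sketch that imposes a template on a CV-tier string in either direction. The sketch assumes a template with at most one onset consonant, a nucleus of one or two V positions, and at most two coda consonants, and it assumes that each application of the template takes as much material as it can. The function names are hypothetical, and strings that do not fit the template (stranded consonants) are rejected rather than repaired.

```python
def syllabify_rtl(cv):
    """Impose the template from right to left: coda, nucleus, then one onset C."""
    syllables, i = [], len(cv)
    while i > 0:
        end, coda, nucleus = i, 0, 0
        while i > 0 and cv[i - 1] == "C" and coda < 2:
            i, coda = i - 1, coda + 1
        while i > 0 and cv[i - 1] == "V" and nucleus < 2:
            i, nucleus = i - 1, nucleus + 1
        if nucleus == 0:
            raise ValueError("stranded consonant: not handled in this sketch")
        if i > 0 and cv[i - 1] == "C":      # at most one onset consonant
            i -= 1
        syllables.append(cv[i:end])
    return ".".join(reversed(syllables))

def syllabify_ltr(cv):
    """Impose the template from left to right: onset, nucleus, then coda."""
    syllables, i = [], 0
    while i < len(cv):
        start, coda, nucleus = i, 0, 0
        if cv[i] == "C":
            i += 1
        while i < len(cv) and cv[i] == "V" and nucleus < 2:
            i, nucleus = i + 1, nucleus + 1
        if nucleus == 0:
            raise ValueError("stranded consonant: not handled in this sketch")
        while i < len(cv) and cv[i] == "C" and coda < 2:
            i, coda = i + 1, coda + 1
        syllables.append(cv[start:i])
    return ".".join(syllables)

print(syllabify_ltr("CVCVC"))  # CVC.VC  (onsetless second syllable)
print(syllabify_rtl("CVCVC"))  # CV.CVC
```

Scanning from the right lets each nucleus claim its onset before the preceding syllable can absorb that consonant into its coda.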

Given this directionality constraint on syllable structure assignment, 
the first iteration of the rule on the input strings in (35) will yield 
the structures in (36): 

(36) a. CVC[CVC]      b. CVC[CVC]      c. CV[CVVC]
        mak[tab]         ?im[hum]         ra[?iis]
[Tree diagrams not recoverable; brackets enclose the syllable built so far.]

In each case, the rightmost syllable is defined. However, syllable 
structure assignment will continue until all the underlying segments in 
the segmental string are syllabified. Thus, the second iteration of the 
syllable structure assignment rule will define the second rightmost 
syllable in the structures in (36), yielding those in (37): 

(37) a. [CVC][CVC]    b. [CVC][CVC]    c. [CV][CVVC]
        [mak][tab]       [?im][hum]       [ra][?iis]
[Tree diagrams not recoverable; brackets enclose syllables.]

All the segments in the examples in (37) are now syllabified, which 
means that the syllable structure assignment process is accomplished. No 
other syllabification rules are needed in these simple examples. The 
resulting structures in (37) will now be subject to metrical structure 
assignment above the syllable level, which is irrelevant to the subject 
matter of this paper. 

3.2. Syllabification Rules 

The examples considered in section 3.1 are very simple in that they 
do not require any additional rules to readjust the resulting syllable 
structure to make it conform to the underlying syllable template of the 
language. There are cases, however, where readjustment rules, which 
I prefer to call syllabification rules, are needed whenever the underlying 
segmental string cannot be completely syllabified according to the 
syllable template in (32). The rules which will be discussed below are: 
Nucleus-Reduction, Vowel-Insertion, and Phrase-Initial Consonant Adjunction. 
These rules, I assume, apply whenever their environment is met at the time 
syllable structure is assigned. 

3.2.1. Nucleus Reduction 

The function of this rule is to guarantee that syllables of the form 
CVVCC are not realized phonetically. This function is achieved by reducing 
the nucleus of CVVCC syllables to nonbranching whenever such syllables arise 
in underlying syllable structures. We have seen in section 2 that loan 
words such as bank, chance, and Ford are realized phonetically with short 
nuclei in PA. This reduction phenomenon can be viewed as the result of the 
Nucleus-Reduction rule under consideration. The underlying syllable 
structures of those words would thus be the following: 



(38) a. CVVCC         b. CVVCC         c. CVVCC
        bank             čans             ford
[Tree diagrams not recoverable; in each form the vowel is linked to two V-nodes.]

These structures would give the opportunity to the Nucleus-Reduction 
rule to apply, yielding the structures in (39), which are identical to the 
phonetic forms: 


(39) a. CVCC          b. CVCC          c. CVCC
        bank             čans             ford
[Tree diagrams not recoverable; in each form the nucleus is nonbranching.]

It has to be noted in this regard that the Nucleus-Reduction rule 
is not restricted to loan words. It applies as well to native forms that 
meet its structural description. Consider, first, the following examples: 

(40) a. jaab 'he brought' 

b. jaab-ha 'he brought her' 

(41) a. jaab-at 'she brought' 

b. jab-at-hum 'she brought them' 
(</jaab-at-hum/) 

The vowel-length alternation observed in (41a, b) is due to a vowel- 
shortening rule in PA that, roughly, shortens long vowels when they occur 
in unstressed open syllables (for further details about this rule, see 
Abu-Salim (forthcoming)). But there are cases where long vowels are 
systematically shortened even when they occur in stressed syllables, as 
in the following examples: 


(42) a. raHlak (</raaH-l-ak/) 'he went for/to you (' 

b. jabli (</jaab-l-i/) 'he brought (sth.) for/to me' 

c. (ma) šafiš (</šaaf-š/) 'he didn't see' 

d. (ma) raHiš (</raaH-š/) 'he didn't go' 

These examples cannot be accounted for by the vowel-shortening rule 
mentioned above for two reasons: first, the long vowels are assigned 
primary stress and, second, they are in closed syllables. Any attempt 
to modify the vowel-shortening rule to enable it to account for such 
cases would result in a number of anomalies. First, we have seen that 
stressed vowels do not undergo the vowel-shortening rule (40a, b, 41a), 
and second, unstressed vowels in closed syllables escape the rule too, 
as in the following examples: 

(43) a. (ma) šaafniiš / *šafniiš 'he didn't see me' 
b. (ma) baashaaš / *bashaaš 'he didn't kiss her' 

The analysis proposed in this paper avoids these anomalies. The 
examples in (42) will be derived in a way similar to the derivation of 
loan words in (38-39). We need only to further assume that syllable 
structure assignment is cyclic (McCarthy 1981). Segments will initially 
be syllabified on the innermost cycle according to the syllabification 
rules of the language. Then, if necessary, segments are resyllabified to 
conform to the syllable structure constraints of the language when outer 
cycles are considered for syllabification. This process is repeated until 
all the segments of the superordinate cycle are properly syllabified. 

Given these assumptions, the derivation of (42a, c), for instance, 
would proceed as follows given that their internal constituent structures 
are bracketed as indicated by the (-) signs in their corresponding under- 
lying structures. First, syllable structure is assigned on the innermost 
cycle according to the template in (32), yielding the representations in 
(44): 

(44) a. CVVC C VC             b. CVVC C
        [[[r a H] l] a k]        [[š a f] š]
[Syllable trees not recoverable; on this cycle only the innermost constituent is syllabified.]

When the second cycle is considered for syllabification, the dative 
suffix /l/ and the negative suffix /š/ are syllabified with the preceding 
syllables, thus yielding the syllable structures in (45): 

(45) a. CVVCC VC              b. CVVCC
        [[r a H l] a k]          [š a f š]
[Syllable trees not recoverable from the source.]

The derived syllables in (45a, b) are of the form CVVCC. This is 
exactly the syllable type which invokes the Nucleus-Reduction rule to 
apply. The rule, therefore, applies at this point in the derivation, 
yielding the structures in (46a, b), where the nucleus in each case is 
reduced to nonbranching: 

(46) a. CVCC VC               b. CVCC
        [[r a H l] a k]          [š a f š]
[Syllable trees not recoverable from the source.]

Finally, syllabification continues on the outer cycles until all 
the segments are properly syllabified. In (46b) this job has been 
achieved. But in (46a) the object suffix /ak/ is still to be syllabi- 
fied. Reapplication of the syllable-structure assignment rules yields 
the representation in (47) : 

(47) CVC CVC
     [r a H l a k]
[Syllable tree not recoverable; the string is syllabified as raH.lak.]

The correct phonetic forms raHlak and (ma) šafiš are then derived 
after syllables are organized into higher metrical units, and after the 
application of phonological rules, particularly vowel epenthesis to (46b). 
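
The cyclic derivation in (44)-(47) can be sketched as follows. This is a simplified illustration, not the tree-based formalism used here: syllables are plain strings, long vowels are written as doubled letters, the vowel inventory is taken to be a, i, u, coda length is not checked against the template, and the function names are hypothetical. Later rules such as epenthesis and Vowel-Shortening are not modeled.

```python
import re

def syllabify(word):
    """Group segments into syllables: one onset consonant per vowel run,
    every other consonant goes to the coda on its left."""
    runs = re.findall(r"[aiu]+|[^aiu]+", word)
    syllables, onset = [], ""
    for idx, run in enumerate(runs):
        if run[0] in "aiu":
            syllables.append(onset + run)
            onset = ""
        else:
            if idx < len(runs) - 1:          # a vowel follows: last C is its onset
                run, onset = run[:-1], run[-1]
            if syllables:
                syllables[-1] += run
            elif run:
                raise ValueError("initial clusters are not handled in this sketch")
    return syllables

def reduce_nucleus(syllable):
    """CVVCC -> CVCC: shorten a doubled vowel before a two-consonant coda."""
    return re.sub(r"([aiu])\1(?=[^aiu]{2}$)", r"\1", syllable)

def derive(morphemes):
    """Add one morpheme per cycle, innermost first, resyllabifying each time."""
    form = ""
    for morpheme in morphemes:
        form += morpheme
        form = "".join(reduce_nucleus(s) for s in syllabify(form))
    return syllabify(form)

print(derive(["raaH", "l", "ak"]))   # ['raH', 'lak']
print(derive(["šaaf", "š"]))         # ['šafš']
print(derive(["jaab", "ha"]))        # ['jaab', 'ha']
```

The last call shows that a CVVC syllable, as in jaab-ha (40b), does not meet the structural description of the rule and keeps its long vowel.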

This analysis, I believe, provides a reasonable account for the 
phenomenon under consideration, i.e., shortening of vowels carrying primary 
stress. If this phenomenon were to be accounted for by the vowel-shortening 
rule that shortens vowels in unstressed open syllables, the rule would be 
more complicated. Thus, I make the distinction between the Nucleus- 
Reduction rule under consideration and the Vowel-Shortening rule mentioned 
above. The latter rule, as mentioned earlier, affects only vowels in 
unstressed open syllables, whereas the former is not sensitive to this 
property since it affects syllables that would be viewed as the most 
prominent after the assignment of metrical structures above the syllable 
level. This is why the two rules are placed in different positions in 
the grammar: the Nucleus-Reduction rule applies at the time syllable 
structure is assigned, and is part of the syllabification rules, whereas 
the Vowel-Shortening rule applies after the metrical structure at the 
word level is assigned and is part of the phonological rules proper. 

Given this account of the nucleus-reduction phenomenon, the rule 
could be formulated as in (48) : 

(48) Nucleus-Reduction: 

C V V C C => C V C C
[Tree diagrams not recoverable; the branching nucleus becomes nonbranching.]

3.2.2. Vowel-Insertion 

The function of this rule is to create new syllables so that 
consonants that cannot be properly syllabified by the syllable-structure 
assignment mechanism can be syllabified. This process can be viewed as a 
consequence of the assumption made earlier that all segments in underlying 
strings must be properly syllabified according to the template in (32). 
Cases with stranded, or unsyllabified, consonants most commonly arise when 
various suffixes are concatenated to verbal stems, as in katabtilha 'I 
wrote to/for her' (</katab-t-l-ha/). The various stages in the 
syllabification of this example would yield the structure in (49): 


(49) CV CVCC C CV
     ka tabt l ha
[Tree diagram not recoverable; /l/ is not attached to any syllable node.]


As is clear in (49), the dative suffix /l/ is left unsyllabified 
because it cannot be syllabified with any of the neighboring syllables: 
the two positions in the coda of the preceding syllable are already 
occupied, and the onset of the following syllable cannot contain more 
than one consonant. A provision must, therefore, be made to allow the 
stranded consonant to be syllabified. The surface form katabtilha 
indicates that a vowel has been inserted to the left of that consonant. 
This is basically what the Vowel-Insertion rule does at this point. It 
inserts the vowel /i/ before any stranded consonant at any point in the 
syllabification process when such a situation arises. Application of 
the Vowel-Insertion rule to the structure in (49) would yield that in (50): 



(50) CV CVCC VC CV
     ka tabt il ha
[Tree diagram not recoverable from the source.]

This structure is still not well-formed. The newly created syllable 
is vowel-initial, which cannot be allowed according to the syllable 
template in (32). This situation is amended by the resyllabification rule, 
which is assumed to be part of the grammar of Arabic. 

Resyllabification would modify the structure in (50) in such a way 
that the resulting structure conforms to the template in (32). It 
will basically provide an onset for the newly created syllable by 
resyllabifying the last member of the coda of the antepenultimate syllable 
with the following syllable. Application of this rule would yield the 
structure in (51): 






(51) CV CVC CVC CV
     ka tab til ha
[Tree diagram not recoverable from the source.]

No additional syllabification rules are needed since the structure 
in (51) is now well-syllabified. 
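
The steps from (49) to (51) can be sketched in a few lines. The sketch makes several simplifying assumptions that are not part of the analysis: syllables are plain strings rather than trees, the vowel inventory is a, i, u, the coda holds at most two consonants, stranded consonants are marked with a hypothetical `*` prefix, and the intermediate stage (50) is skipped by resyllabifying the whole string after insertion. Phrase-initial clusters, which are handled by rule (30) rather than by Vowel-Insertion, are outside its scope.

```python
import re

def parse(word):
    """Return syllables in order; a consonant that fits no syllable is
    returned as a separate unit prefixed with '*' (stranded)."""
    runs = re.findall(r"[aiu]+|[^aiu]+", word)
    units, onset = [], ""
    for idx, run in enumerate(runs):
        if run[0] in "aiu":
            units.append(onset + run)         # onset + nucleus
            onset = ""
            continue
        if idx < len(runs) - 1:               # last C is the next syllable's onset
            run, onset = run[:-1], run[-1]
        if units:                             # fill up to two coda positions
            units[-1] += run[:2]
            run = run[2:]
        units.extend("*" + c for c in run)    # whatever is left is stranded
    return units

def insert_vowels(word):
    """Insert /i/ before each stranded consonant, then resyllabify."""
    repaired = "".join("i" + u[1:] if u.startswith("*") else u
                       for u in parse(word))
    return parse(repaired)

print(parse("katabtlha"))          # ['ka', 'tabt', '*l', 'ha']
print(insert_vowels("katabtlha"))  # ['ka', 'tab', 'til', 'ha']
```

The second result corresponds to (51): once /i/ is present, the second coda consonant of tabt is taken as the onset of the new syllable.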

Evidence for the Vowel-Insertion rule as part of the syllabification 
rules is not difficult to find. One might argue that the vowel /i/ is 
inserted by the epenthesis rule which is needed anyway elsewhere in the 
grammar. Proponents of this analysis would have to explain the placement 
of primary stress on the inserted vowel in the phonetic form katabtilha 
since epenthetic vowels do not receive stress due to their insertion after 
stress has been applied (Brame 1973). One might argue instead that the 
vowel /i/ is underlying in katabtilha and view it as part of the dative 
suffix. But this analysis would fail to explain the absence of primary 
stress on the same vowel in other examples such as katabilha. 

The present analysis provides a natural explanation for both cases 
above. The inserted vowel in katabtilha receives primary stress because, 
as illustrated above, it is inserted prior to the assignment of metrical 
structures above the syllable level. Thus, when the word-level tree is 
erected, the inserted vowel will be present for rules that weigh syllables 
against each other for purposes of stress interpretation. According to 
the stress-assignment mechanism of PA, 22 the syllable that contains the 
inserted vowel in katabtilha will be the strongest element in the strongest 
foot; thus, it is viewed as the most prominent, whereas in katabilha, the 
vowel /i/ is inserted later in the derivation by the epenthesis rule after 
the metrical structure at the word level has been erected and, consequently, 
after prominence relations among syllables have been established. Thus, it 
escapes being assigned any prominence relation at the time such a relation 
is assigned. This is why we end up with the opaque stress on the surface. 
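
The ordering argument can be made concrete with a small sketch. The stress rule used below is a simplified assumption that does not come from this section (final superheavy syllable, else heavy penult, else antepenult); the stress-assignment mechanism actually assumed for PA is metrical and is not reproduced here. Syllables are plain strings, every syllable is assumed to have exactly one onset consonant, and all function names are hypothetical.

```python
import re

def syllabify(word):
    """One onset consonant per vowel run; other consonants join the coda on the left."""
    runs = re.findall(r"[aiu]+|[^aiu]+", word)
    syllables, onset = [], ""
    for idx, run in enumerate(runs):
        if run[0] in "aiu":
            syllables.append(onset + run)
            onset = ""
        else:
            if idx < len(runs) - 1:
                run, onset = run[:-1], run[-1]
            if syllables:
                syllables[-1] += run
    return syllables

def stress_index(syllables):
    """ASSUMED rule: final superheavy, else heavy penult, else antepenult."""
    if len(syllables[-1]) - 1 >= 3:                 # rime VVC or VCC
        return len(syllables) - 1
    if len(syllables) >= 2 and len(syllables[-2]) - 1 >= 2:
        return len(syllables) - 2
    return max(len(syllables) - 3, 0)

def locate(syllables, offset):
    """Index of the syllable containing string position `offset`."""
    end = 0
    for k, syllable in enumerate(syllables):
        end += len(syllable)
        if offset < end:
            return k
    raise ValueError("offset outside the word")

# (a) Vowel-Insertion precedes stress, so the inserted vowel is visible to it.
early = syllabify("katabtilha")
print(early[stress_index(early)])                   # til

# (b) Epenthesis follows stress: stress is fixed on /katablha/ and preserved.
before = syllabify("katablha")                      # ['ka', 'tabl', 'ha']
k = stress_index(before)
offset = sum(len(s) for s in before[:k]) + 1        # position of the stressed vowel
after = syllabify("katabilha")                      # /i/ goes in to the right of it
print(after[locate(after, offset)])                 # ta  (preserved, opaque)
print(after[stress_index(after)])                   # bil (if stress were reassigned)
```

Under these assumptions the same vowel /i/ is stressed when it is present at the time prominence is computed and unstressed when it is added afterwards.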

Given this account of vowel-insertion, the rule could be formulated 
as in (52): 

(52) Vowel-Insertion: 

C C C => C V C C
[Tree diagrams not recoverable; the medial C of the input is unsyllabified, and the inserted V is /i/.]

Again, this rule is placed among the syllabification rules that apply 
at the time syllable structure is assigned, and is viewed as independent 
of the vowel-epenthesis rule which applies at a later point in the derivation 
and which is dealt with as part of the phonological rules proper. 

3.2.3. Phrase-Initial Consonant Adjunction 

It was pointed out in section 1 that syllable-initial consonant 
clusters are possible in PA only in phrase-initial position. Due to this 
restriction on their distribution, no attempt has been made to provide for 
their syllabification by the syllable template in (32). If this were done, 
syllable structure assignment would be more complicated since it would allow 
for heavy onsets in other positions to be derived, which would be followed 
by a rule or rules to modify the resulting structure by taking the lefthand 
member of the heavy onset and resyllabify it with the preceding syllable. 
Consequently, I assume that the first member of a phrase-initial consonant 
cluster is adjoined to the initial syllable in the utterance by rule (30) 
given in section 2. Words like ktaab 'book' would thus be syllabified as 
in (53): 

(53) a. Underlying        b. Syllable Structure     c. Rule (30)
        Structure            Assignment
        CCVVC                C[CVVC]                   [C[CVVC]]
        ktaab                k[taab]                   [k[taab]]
[Tree diagrams not recoverable; each pair of brackets stands for a syllable node.]

This analysis can be supported by the fact that such initial consonant 
clusters are not broken up by epenthesis, whereas heavy clusters in the 
coda position are. In other words, heavy codas obey, in most cases, some 
sonority restrictions on the distribution of consonants in the two coda 
positions whereas phrase-initial consonant clusters do not, which can be 
viewed as resulting from the difference between the structure of the coda 
and that of the phrase-initial consonant cluster. 

4. Resyllabification 

It was pointed out earlier that some phonological rules affect the 
underlying syllable structure in such a way that a resyllabification rule 
is needed in the grammar to ensure that the segmental string is properly 
syllabified at any point in the derivation. Two of these rules are 
syncope and epenthesis, which delete and add segments, respectively, to 
underlying structures. In this section, I will examine briefly how these 
rules affect the underlying structure and how the resyllabification rule 
amends it. 

The general syncope rule of PA deletes short high vowels in unstressed 
nonfinal open syllables, whereas epenthesis has a complementary effect of 
inserting short high vowels into consonant clusters under some conditions, 23 
as illustrated by the following examples: 

(54) a. širib 'he drank' 
b. širb-at 'she drank' 

(55) a. ?akl-i 'my food' 
b. ?akil-na 'our food' 

In (54b) the second stem vowel is deleted by syncope whereas in (55b) 
the second vowel is inserted by epenthesis. The underlying syllable 
structures for (54b) and (55b) are those in (56a, b), respectively: 


(56) a. CVCVCVC        b. CVCCCV
        širibat           ?aklna
[Syllable trees not recoverable from the source.]

When syncope and epenthesis apply to the structures in (56a, b), they 
yield the corresponding structures in (57a, b), where a syllable node is 
deleted in the first case, and another is added in the second. In this 
regard, I assume that once a vowel is deleted from the segmental string, 
the corresponding syllable node is consequently pruned off the syllable 
tree. By the same token, once a vowel is inserted into a segmental string, 
a syllable node is created dominating that vowel and any neighboring 
unsyllabified consonant. 

(57) a. CVCCVC         b. CVCVCCV
        širbat            ?akilna
[Syllable trees not recoverable; (a) contains a stranded consonant and (b) a newly created syllable node.]

Such situations will invoke the resyllabification rule to apply so 
as to allow for the resulting unsyllabified segments to be associated 
properly with syllable nodes. In (57a) the stranded consonant is 
resyllabified with the preceding syllable in such a way that it will come 
to occupy the coda position of that syllable. In (57b), on the other 
hand, the syllable structure of the whole utterance has to be reorganized 
so that the coda of the initial syllable is resyllabified with the newly- 
created syllable as its onset. The resulting structures after the appli- 
cation of resyllabification would be the following: 

(58) a. CVCCVC         b. CVCVCCV
        širbat            ?akilna
[Syllable trees not recoverable from the source.]

The resyllabification process illustrated above can be viewed as a 
reapplication of the syllable-structure assignment rules. First, the 
initial syllable structure is erased whenever segments are deleted from 
or added to the segmental string. And, second, the syllable-structure 
assignment rule will reapply to give a new structure to the segmental 
string. This process of erasing and rebuilding syllable structures will 
be performed after the application of any phonological rule that affects 
the segmental make-up of the underlying string. (Cf. Cairns and Feinstein 
(1982) for a proposal to eliminate the need for resyllabification following 
the application of each phonological rule.) 
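
The erase-and-rebuild view of resyllabification can be sketched as follows. This is a minimal illustration under stated assumptions: syllables are plain strings, the vowel inventory is a, i, u, syncope and epenthesis are reduced to a deletion and an insertion at a given position instead of being stated with their real conditions, and the helper names are hypothetical. Prominence relations, which must be preserved separately, are not represented.

```python
import re

def syllabify(word):
    """One onset consonant per vowel run; other consonants join the coda on the left."""
    runs = re.findall(r"[aiu]+|[^aiu]+", word)
    syllables, onset = [], ""
    for idx, run in enumerate(runs):
        if run[0] in "aiu":
            syllables.append(onset + run)
            onset = ""
        else:
            if idx < len(runs) - 1:
                run, onset = run[:-1], run[-1]
            if syllables:
                syllables[-1] += run
    return syllables

def delete_at(word, i):
    """Stand-in for syncope: remove the segment at position i."""
    return word[:i] + word[i + 1:]

def insert_at(word, i, vowel):
    """Stand-in for epenthesis: add a vowel before position i."""
    return word[:i] + vowel + word[i:]

cases = [("širibat", delete_at("širibat", 3)),       # syncope of the second stem vowel
         ("?aklna", insert_at("?aklna", 3, "i"))]    # epenthesis into the cluster
for underlying, derived in cases:
    # the old structure is discarded and the template is imposed again
    print(syllabify(underlying), "->", syllabify(derived))
# ['ši', 'ri', 'bat'] -> ['šir', 'bat']
# ['?akl', 'na'] -> ['?a', 'kil', 'na']
```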

However, it is important to make the distinction between syllable 
structure assignment and resyllabification and to hold that distinction 
throughout the derivation. It was mentioned earlier that the underlying 
syllable structure defined by the syllable-structure assignment rules 
is the input to other metrical-structure assignment rules that build the 
word tree. It is at this point, i.e., building the metrical word-structure, 
that prominence relations among syllables are defined. Since some phono- 
logical rules are sensitive to the underlying syllable structure and to 
the strength relations established among syllables, it is important to 
preserve these strength relations throughout the derivation. If, for 
instance, the underlying syllable structure in (56b), and ultimately the 
word-structure, is erased after epenthesis, prominence relations among 
syllables would be reassigned in the structure in (58b), thus, yielding 
the incorrect phonetic form *?akílna, whereas preservation of the original 
prominence relations yields the correct, although opaque, phonetic form 
?ákilna. 

In this analysis, then, syllable-structure along with word-structure 
assignment rules will define the syllables contained in any utterance in 
addition to the strength relations among those syllables, whereas resyl- 
labification will ensure that the segmental string is well-syllabified at 
any stage in the derivation according to the syllable template in (32) 
provided that it does not alter the original strength relations. In fact, 
what resyllabification does is to associate stranded consonants, as in 
(57a), with the coda position of the preceding syllable, or to provide an 
onset for a vowel-initial syllable, as in (57b), by resyllabifying the 
rightmost coda position of the preceding syllable with the following onsetless 
syllable. In this regard, it can be assumed that once a syllable node 
is assigned it will dominate a structure similar to the syllable template 
in (32) with all positions structurally present although some may be 
empty. Resyllabification is viewed then as a reorganization rule by 
which some segments are reorganized in neighboring syllables. 

5. Ambisyllabicity 

The syllabification scheme proposed by Kahn (1976) allows for 
consonants in some positions to be dominated by two syllable nodes, 
thus being "ambisyllabic", as shown by the syllabification of words 
like money where the syllable boundary between the two syllables 
contained in that word is not well-defined as it is in words like atlas: 

(59) a. atlas        b. mʌni
[Association diagrams not recoverable; in (b) the medial consonant is linked to both syllable nodes (S).]

Association lines linking pairs of syllables, such as the dashed 
line in (59b), are introduced, according to Kahn, by one or more rules 
after the syllabification rules of the language have been applied. For 
further details about this phenomenon in English, see Kahn (1976). 

Ambisyllabicity is one of the least discussed phenomena in Arabic, if it
has been discussed at all. This is due partly to the way segments are
generally represented in phonological representations, where long segments, consonants
or vowels, are treated as sequences of two identical segments. This 
mode of representation would allow for words like ?immi 'my mother' to 
have the syllable structure in (60) with a well-defined syllable boundary 
between the two members of the geminate consonant cluster:

(60)    S       S
       /|\     / \
      C V C   C V
      | | |   | |
      ? i m   m i

A careful phonetician, upon hearing the pronunciation of words like
?immi, would reject the representation in (60) as the correct one for
intervocalic geminate clusters and would favor, instead, a
representation with no internal syllable boundary associated with such
words, since it is hard to identify the point where one syllable ends
and the other begins. Thus, I assume that such consonants are best
analyzed as long and represented as in (34b) above. To show
how ambisyllabicity is predictable given the syllabification scheme
proposed in section 3 and the way long consonants are represented in
underlying structures, consider the syllabification of words like ?immi,
which would have the initial representation in (61):

(61) C V CC V 

The syllable-structure assignment rules would unambiguously assign the 
following structure to the representation in (61):

(62)    S       S
       /|\     / \
      C V C   C   V
      | |  \ /    |
      ? i   m     i

In (62), the intervocalic long consonant is associated with two C- 
elements in the CV-tier and, consequently, with two branches in the 
syllable structure. It is interpreted as ambisyllabic since the two 
C-elements in the CV-tier are associated with two syllables in the 
syllable structure. No additional association lines, in the sense of
Kahn (1976), are needed to link the two syllables since they are already 
linked. The long consonant serves as both the coda of the first syl- 
lable and the onset of the second. 

It is not to be implied, however, that all long consonants in 
Arabic are ambisyllabic: only long intervocalic consonants are. 
Words like ?imm-na 'our mother' as opposed to ?im-na 'we lifted', for
instance, will have the syllable structure in (63) according to the
syllable-structure assignment rules of PA:

(63)      S         S
        / | \      / \
      C V C C     C   V
      | |  \ /    |   |
      ? i   m     n   a

Here, the long consonant is syllabified with only the first syllable 
and serves as its coda. Any attempt to view it as ambisyllabic would 
result in an unacceptable structure that violates the syllable template 
in (32). 
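
The generalization just stated (a long consonant is ambisyllabic only when it stands between two vowels) can be illustrated with a minimal sketch. The encoding below is hypothetical and is not part of the analysis itself: each segment is paired with the material it occupies on the CV-tier, and a long consonant is marked "CC". The function name is likewise an assumption made for illustration.

```python
from typing import List, Tuple

# Hypothetical encoding: (segment, CV-tier material), where the tier material
# is "V", "C", or "CC" (a long consonant linked to two C-elements).
Segment = Tuple[str, str]


def ambisyllabic_positions(word: List[Segment]) -> List[int]:
    """Return the indices of long consonants flanked by vowels on both sides."""
    hits = []
    for i, (_, slots) in enumerate(word):
        if slots != "CC":
            continue  # only long consonants are candidates
        prev_is_vowel = i > 0 and word[i - 1][1] == "V"
        next_is_vowel = i + 1 < len(word) and word[i + 1][1] == "V"
        if prev_is_vowel and next_is_vowel:
            hits.append(i)
    return hits


# ?immi 'my mother': the long m is intervocalic.
immi = [("?", "C"), ("i", "V"), ("m", "CC"), ("i", "V")]
# ?imm-na 'our mother': the long m is followed by a consonant.
immna = [("?", "C"), ("i", "V"), ("m", "CC"), ("n", "C"), ("a", "V")]

print(ambisyllabic_positions(immi))   # [2]
print(ambisyllabic_positions(immna))  # []
```

The sketch only checks the segmental environment; it does not build syllable structure or consult the template in (32).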

6. Conclusion 

It has been argued in this paper that syllabification in PA not only 
involves rules to define syllable boundaries and assign structure to the 
syllables contained in the utterance, but also rules, some of which are 
similar in effect to other phonological rules in PA, that have to apply 
at the same time syllable structure is assigned. These rules have been 
placed among the syllabification rules for the simple reason that they, 
unlike phonological rules, are not sensitive to information provided 
by other phonological rules. So, for instance, the Nucleus-Reduction
rule applies prior to the assignment of prominence relations among syllables,
whereas the Vowel-Shortening rule is sensitive to whether long vowels
are stressed or not.

It is my belief that many of the generalizations stated in this 
paper for PA hold as well for other Arabic dialects. It is, however, 
beyond the scope of the present study to deal with the issues considered 
in this paper comparatively. 



*I would like to thank C.C. Cheng, C-W. Kim, M. Kenstowicz, and 
C. Kisseberth for their helpful comments on various earlier versions of 
this paper. The transcription system used here has the following char- 
acteristics: 9 and H stand for the voiced and voiceless pharyngeal 
fricatives, respectively, emphatic consonants are indicated by uppercase 
letters, and long segments are denoted by gemination. 

Classical Arabic, on the other hand, has been reported to have 
this syllable type exemplified by words such as haamm 'important' or 
maarr 'passer-by'. It is pointed out by Al-Ani and May (1973:118) that 
this syllable type is the least frequent of all the syllable types of 
Arabic and that it is restricted in distribution to the final position 
in utterances or in words in their pause form. 

There is no total agreement among linguists as to the nature of 
the hierarchical structure of the syllable and the number of levels that 
may intervene between the segmental string and syllable nodes. Whereas 
the studies cited above display a certain amount of agreement that 
syllables have binary-branching tree structures with no limit on the 
number of levels intervening between segments and syllable nodes, 
Clements and Keyser (1981) propose an n-ary branching structure for the 
syllable, where the segmental tier and the syllable tier are separated 
by a third tier which they call "the CV-tier". In the present study, I 
assume the existence of the CV-tier mediating between the segmental tier 
and the syllable tier, and further assume, following McCarthy (1979a, b) 
and others, that the structure intervening between the CV-tier and the 
syllable tier is binary, rather than n-ary. 

Heavy onsets have been reported to occur in various Arabic dialects, 
as in Tunisian (Maamouri 1967) and Syrian Arabic (Cowell 1964), among others, 

When they occur in other positions, however, the first member of the 
cluster is syllabified with the preceding syllable, as in the following 

lam. ma 'when we drank' 
of. lam, ma 'when' 'we drank' 

Later in section 3, I will argue that CVVCC syllables can be generated
in underlying syllable structures. Their failure to have a
phonetic realization will be viewed as resulting from the application of 
a syllabification rule that has the effect of reducing the nucleus of 
such syllables at the time syllabification takes place. 

/maarr/ cannot be interpreted as the underlying form for maarir 
since there is a prohibition against breaking up geminate consonant 
clusters in PA. The /i/ in maarir is underlying, as argued by Brame

Among the words derived from SanS are SannaS 'to be lucky' and
mSanniS 'lucky'.


This is not to be understood as a rejection of the idea that one 
syllable type may be considered as "derived" from another in the sense 
of stating an implicational relation between the existence of one syllable 
type and another. That is, if a language has a CVCC syllable type, then 
it will have CVC syllables. Similarly, if it has CVC syllables, then it 
will have CV syllables, etc. (cf. Kaye and Lowenstamm 1979 and Clements 
and Keyser 1981). 

heavy onsets are not basic in PA due to their 
limited distribution. Thus, they are not represented in the syllable 
template in (7). Instead, the initial member of the consonant cluster 
will be joined to the following syllable by a syllabification rule which 
will be discussed in detail in section 3. 

Only long vowels may occupy the nucleus when the optional node in 
the template is present. Sequences of nonidentical vowels may not occupy 
that position, and they will be ruled out by a constraint on phonological 
representation prohibiting nonidentical vowels to be dominated by one 
syllable node. 

this syllable type can be generated in phono- 
logical structures through the concatenation of various morphemes to base 
forms. The condition on the template in (7) can thus be viewed as a 
syllabification rule that has the effect of reducing the nucleus in CVVCC
syllables at the time underlying strings are syllabified. This syllabification
rule is different from the Vowel-Shortening rule of PA since
their conditional environments are different, and they apply at different 
stages in the derivation as will be discussed later in section 3. 


Cf. Cairns and Feinstein (1982:194) for an opposite view about
resyllabification where it is claimed that their proposal "obviates the
need for resyllabification following the application of each phonological


Cf. haam.mun 'important' where a CVVC syllable occurs nonfinally
when the inflectional ending -un, standing for the nominative case and
nunation, is added to the word haamm.

The Damascene Arabic examples are taken from Cowell (1964).

Kenstowicz (personal communication) informed me that McCarthy's 
current position would give the following representation for words like 
?imm : 

[Diagram not legible in this scan.]

ignored because they are not crucial to the present discussion. The 
information they provide, however, should be available before any phono- 
logical rule applies. 


Arabic, although more complex than the other types, are, however, clearly 
single syllables by any measure of surface syllabification. Thus, they 
scan as single heavy syllables, not as two syllables, in the meter 
mutadaarik, where they occur most often." 

see Hooper (1972) and Kahn (1976). 


The condition on the syllable template will be viewed later as a
syllabification rule which will guarantee that CVVCC syllables are not
realized phonetically in PA.

The term "long consonant" is used here to refer to what has been 
usually referred to as a "geminate consonant cluster". 

In Abu-Salim (forthcoming), I assume that syllable and higher 
metrical units apply simultaneously on each cycle. Thus, on the inner- 
most cycle, syllables are grouped into feet and the latter into the word 
before we proceed to the other cycles, and so on. 


For further details about a segmental treatment of stress in PA,
see Brame (1973), and for a metrical analysis of stress in PA, see
Kenstowicz (1981) and Abu-Salim (forthcoming).


ABU-SALIM, I. (forthcoming). A reanalysis of some aspects of Arabic
    phonology: a metrical approach. Ph.D. dissertation, University
    of Illinois, Urbana.
AL-ANI, S. and D. May. 1973. The phonological structure of the syllable
    in Arabic. American Journal of Arabic Studies 1.113-125. Reprinted
    in Readings in Arabic Linguistics, ed. by Al-Ani (1978). Bloomington:
    Indiana University Linguistics Club.
BRAME, M. 1973. On stress assignment in two Arabic dialects. In S.
    Anderson and P. Kiparsky, eds.: A Festschrift for Morris Halle,
    pp. 14-25. New York: Holt, Rinehart, and Winston.
______. 1974. The cycle in phonology: stress in Palestinian,
    Maltese, and Spanish. Linguistic Inquiry 5.39-60.
BROSELOW, E. 1979. Cairene Arabic syllable structure. Linguistic
    Analysis 5.345-382.
______. 1980. Syllable structure in two Arabic dialects. Studies
    in the Linguistic Sciences 10:2.13-24.
CAIRNS, C. and M. Feinstein. 1982. Markedness and the theory of syllable
    structure. Linguistic Inquiry 13.193-225.
CLEMENTS, G. and S. Keyser. 1981. A three-tiered theory of the syllable.
    Unpublished paper, Harvard and M.I.T.
COWELL, M. 1964. A reference grammar of Syrian Arabic. Washington, D.C.:
    Georgetown University Press.
FEINSTEIN, M. 1979. Prenasalization and syllable structure. Linguistic
    Inquiry 10.245-278.
FUDGE, E. 1969. Syllables. Journal of Linguistics 5.253-286.
GUERSSEL, M. 1977. Constraints on phonological rules. Linguistic
    Analysis 3.267-305.
HALLE, M. 1978. Metrical structure in phonology. Unpublished paper,
______ and J. Vergnaud. 1979. Metrical phonology (a fragment of
    a draft). Unpublished paper, M.I.T.
HOCKETT, C. 1955. A manual of phonology. Indiana University Publications
    in Anthropology and Linguistics, 11. Bloomington, IN: Indiana
    University Press.
HOOPER, J. 1972. The syllable in phonological theory. Language 38.
INGRIA, R. 1980. Compensatory lengthening as a metrical phenomenon.
    Linguistic Inquiry 11.465-495.
KAHN, D. 1976. Syllable-based generalizations in English phonology.
    Ph.D. dissertation, M.I.T. Distributed by Indiana University
    Linguistics Club, Bloomington.
KAYE, J. and J. Lowenstamm. 1979. Syllable structure and markedness
    theory. Unpublished ms., Universite du Quebec.
KENSTOWICZ, M. and C. Pyle. 1973. On the phonological integrity of
    geminate clusters. In M. Kenstowicz and C. Kisseberth, eds.:
    Issues in phonological theory, 27-43. The Hague: Mouton.
______ and K. Abdul-Karim. 1980. Cyclic stress in Levantine
    Arabic. Studies in the Linguistic Sciences 10:2.55-76.
KIPARSKY, P. 1979. Metrical structure assignment is cyclic. Linguistic
    Inquiry 10.421-441.
______. 1980. Remarks on the metrical structure of the syllable.
    Unpublished paper, M.I.T.
LEBEN, W. 1980. A metrical analysis of length. Linguistic Inquiry 11.
LEHISTE, I. 1978. The syllable as a structural unit in Estonian. In
    A. Bell and J. Hooper, eds.: Syllables and segments, pp. 73-83.
    Amsterdam: North-Holland.
MAAMOURI, M. 1967. The phonology of Tunisian Arabic. Ph.D. dissertation,
    Cornell University.
McCARTHY, J. 1979a. Formal problems in Semitic phonology and morphology.
    Ph.D. dissertation, M.I.T.
______. 1979b. On stress and syllabification. Linguistic Inquiry
______. 1980. A note on the accentuation of Damascene Arabic.
    Studies in the Linguistic Sciences 10:2.77-98.
NEWMAN, P. 1972. Syllable weight as a phonological variable. Studies in
    African Linguistics 3.301-323.
PIKE, K. and E. Pike. 1947. Immediate constituents of Mazatec syllables.
    International Journal of American Linguistics 13.78-91.
SELKIRK, E. 1980a. The role of prosodic categories in English word stress.
    Linguistic Inquiry 11.563-605.
______. 1980b. On prosodic structure and its relation to syntactic
    structure. Distributed by Indiana University Linguistics Club,
______. 1981. Epenthesis and degenerate syllables in Cairene Arabic.
    In H. Borer and Y. Aoun, eds.: Theoretical issues in the grammar of
    Semitic languages, M.I.T. Working Papers in Linguistics 3.209-232.

Studies in the Linguistic Sciences
Volume 12, Number 1, Spring 1982 

Chin-Chuan Cheng 

Although there have been many studies on how Chinese dialects 
relate to each other on the basis of a handful of phonological 
changes within the framework of traditional historical linguistics, 
few have attempted to quantify the degrees of closeness. This paper 
presents a method of quantification of Chinese dialect affinity in 
terms of lexical items. The data consist of 905 words listed in the 
Hanyu Fangyan Cihui compiled by linguists in Peking. The 
corresponding forms in 18 dialects (Beijing, Ji'nan, Shenyang, Xi'an, 
Chengdu, Kunming, Hefei, Yangzhou, Suzhou, Wenzhou, Changsha,
Nanchang, Meixian, Guangzhou, Yangjiang, Xiamen, Chaozhou and Fuzhou) 
from the book have been tabulated. These forms in relation to the 
dialects were dichotomized as either presence or absence. Then 
contingency tables of each pair of dialects were constructed, and 
Pearson's product-moment correlation in statistics was applied to them to obtain
the cross-correlation among the dialects. The correlation 
coefficients are considered the degrees of closeness. The 
coefficients are utilized to subgroup the dialects. The derived 
subgrouping is compared with known facts in Chinese linguistics. 
Further applications of the method and the coefficients as indices of 
distance are discussed. 

1. Dialect Distance 

As early as 1935, Professor Wang Li grouped Chinese dialects into 
Mandarin, Wu, Min, Yue, and Kejia (Wang Li 1935). His grouping was mainly
based on the existence of voiced obstruents; the presence or absence of
syllable endings -m, -p, -t, -k; and the number of tones. Professor Li
Fang-kuei's 1937 classification was also made on the basis of these
phonological characteristics. Thanks to the successful work of Chinese
linguists in the past, phonological details of speech in many localities
are generally available. In Professor Yuan Jiahua's (1960) monumental
work, Northern Dialects (including Northern Mandarin, Northwestern
Mandarin, Southwestern Mandarin, and Xiajiang or Eastern Mandarin), Wu,
Xiang, Gan, Kejia, Yue, Northern Min, and Southern Min are established and
well documented. 

Dialect grouping or subgrouping provides us with the information that 
some speech communities share or do not share certain linguistic 
features. We are thus able to differentiate the varieties in a systematic 
way. However, these distinct dialects are actually related. Indeed, the 
relatedness can be demonstrated by the sharing of various character- 
istics. Yet, the DEGREES of closeness among the dialects and among the 

varieties within the dialects have not been well studied. For example, 
there are no readily available indices for determining whether Kejia as 
spoken in Meixian is closer to Gan than it is to Yue or whether Xi'an is 
closer to Ji'nan than to Beijing. One of the objectives of science is to 
express things in terms of verifiable measurements. It will be further 
progress in linguistics if we are able to precisely quantify dialect 
affinity. Indeed, Wang Yude (1960) was the first scholar to attempt to 
apply glottochronological methods to quantify closeness or remoteness of 
Chinese dialects. His work, however, is limited to five dialects and the 
validity of many assumptions in glottochronology is doubtful. In this 
paper, I will present another method to measure the distance of Chinese 
dialects on the basis of lexical items. Specifically, the statistician 
Karl Pearson's product-moment correlation coefficients are used to measure
the degrees of association among the varieties of dialects as given in the 
Hanyu Fangyan Cihui (Beijing University 1964). 

2. Lexical Correlation 

Syntactic features, phonological characteristics, and lexical 
cognates are used by linguists as well as by speakers of a dialect who 
come into language contact with speakers of another dialect to determine 
dialect relationships. Let us examine how lexical items facilitate such a 
task. Take the words 'sun' and 'moon' as an example. According to Hanyu 
Fangyan Cihui these words exist in the following dialects as given in


(1) [Table of the forms for 'sun' and 'moon' in the dialects compared, given in Chinese characters and phonetic transcription; the entries are not legible in this scan.]

As shown in (1), some dialects have more than one word for the same 
meaning. The sounds vary across the dialects. But internal phonological 
patterning and historical correspondences allow us to identify cognates. 
The Hanyu Fangyan Cihui has identified the cognates and indicates such 
with Chinese characters. Thus on the basis of (1), we can begin to group 
the dialects. For example, the words 太阳 and 月亮 are used in Beijing,
Ji'nan, Suzhou, and Changsha, and hence these dialects can be considered
to form a group as opposed to the other dialects which do not share these
words. Meixian and Guangzhou share 热头, and hence can be considered
closer to each other than they are to the others. However, as we examine
the word 月光, we see that Nanchang, Meixian, and Guangzhou form a
group. The grouping established on the basis of 热头 therefore has to be
reconciled. As more words are taken into consideration, the relationships 
among the dialects are not a matter of existence or nonexistence but a
matter of degree. Counting common vocabulary items is an obvious and old
method for subgrouping. But in order to process a large amount of data, a 
rigorous formulation is necessary. In my view, it is precisely the
degrees of association among dialects that need to be seriously studied at 
this stage of the development of Chinese dialectology. 

In order to derive the degrees of association among dialects, cognate 
data have to be transformed in a certain way. The basis for the grouping 
discussed above is the existence or nonexistence of forms for certain 
meanings. The words in relation to the dialects are dichotomized as 
either presence or absence. If we consistently use "1" for presence and 
"0" for absence, then (1) can be transformed into (2), where the dialects 
are listed in columns and the words in rows. 

(2) Beijing Ji'nan Suzhou Changsha Meixian Guangzhou Xiamen Fuzhou 

[The rows of the table, one per word with a 1 or 0 under each dialect, are not legible in this scan.]














The arrangement in (2) shows clearly that Beijing and Ji'nan, Suzhou and 
Changsha, and Meixian and Guangzhou are perfectly correlated. That is, 
when a "1" or "0" appears in a dialect, it also appears in the other of 
the pair. Other pairs, on the other hand, do not exhibit such a 
relationship. To find the degree of association between two dialects, 2 x
2 tables can be constructed to show (a) the number of words scoring 0 on a
dialect and 1 on the other, (b) the number of words scoring 1 on both
dialects, (c) the number of words scoring 0 on both dialects, and (d) the
number of words scoring 1 on a dialect and 0 on the other. For example,
on the basis of (2), the contingency tables for the pairs Beijing-Ji'nan,
Beijing-Meixian, and Xiamen-Fuzhou are given in (3).









(3) [Contingency tables for the three pairs; the cell counts are not legible in this scan.]









The correlation of the two variables, dialects in our case, of each pair 
can be calculated by using equation (4), which was first derived by the 

statistician Karl Pearson in 1901.




The correlation coefficient so derived is the phi coefficient in 
statistics. Its value varies from +1 to -1. It indicates both the 
direction and the strength of relationship between two variables. The 
sign indicates whether the two variables are positively or negatively
correlated. The value 0 means that the two variables are not correlated.
The higher the value, the stronger the association. 
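
For reference, the phi coefficient for a 2 x 2 table can be written in its standard textbook form as below. This is offered as a hedged restatement, and the notation of the original equation may differ. The cell labels are assumed to be: a, words scoring 0 on the first dialect and 1 on the second; b, words scoring 1 on both; c, words scoring 0 on both; d, words scoring 1 on the first and 0 on the second. The worked line uses assumed counts for a pair of dialects that each have two of eight words and share none.

```latex
\phi = \frac{bc - ad}{\sqrt{(a+b)(c+d)(a+c)(b+d)}}

% Worked example with assumed counts a = 2, b = 0, c = 4, d = 2:
\phi = \frac{0 \cdot 4 - 2 \cdot 2}{\sqrt{2 \cdot 6 \cdot 6 \cdot 2}}
     = \frac{-4}{12} \approx -0.3333
```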

The correlation equation applied to the contingency tables in (3) 
yields the results in (5). 

(5) Beijing-Ji'nan correlation coefficient: 1.0000
    Beijing-Meixian correlation coefficient: -0.3333
    Xiamen-Fuzhou correlation coefficient: 0.5774

Since Beijing and Ji'nan have the same words as shown in (1), they are 
perfectly correlated. Xiamen and Fuzhou are more closely related (0.5774) than
Beijing and Meixian (-0.3333). Since these coefficients are derived on 
the basis of 8 lexical items only, they have no useful generality for 
measuring the degrees of relationship among these dialects. What I have 
presented so far is only an illustration of the quantification procedures 
used in this study. The data base and the calculation procedures are 
presented below. 
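
The illustration above can be sketched in Python. This is a minimal sketch, not the program used in the study: the function names are hypothetical, and the three pairs of 0/1 vectors are invented so that they reproduce the three values reported in (5). They are not the actual rows of (2).

```python
import math


def contingency(x, y):
    """Count the four cells of the 2 x 2 table for two 0/1 vectors.

    a: 0 on x and 1 on y;  b: 1 on both;
    c: 0 on both;          d: 1 on x and 0 on y.
    """
    a = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)
    d = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)
    return a, b, c, d


def phi(x, y):
    """Phi coefficient computed from the 2 x 2 contingency table."""
    a, b, c, d = contingency(x, y)
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("phi is undefined when a marginal total is zero")
    return (b * c - a * d) / denom


# Invented vectors over eight lexical items (assumption, for illustration).
same_1 = [1, 1, 0, 0, 0, 0, 0, 0]
same_2 = [1, 1, 0, 0, 0, 0, 0, 0]       # identical to same_1
disjoint = [0, 0, 1, 1, 0, 0, 0, 0]     # shares no word with same_1
superset = [1, 1, 1, 1, 0, 0, 0, 0]     # contains every word of same_1

print(round(phi(same_1, same_2), 4))    # 1.0
print(round(phi(same_1, disjoint), 4))  # -0.3333
print(round(phi(superset, same_1), 4))  # 0.5774
```

Computing the four cell counts first keeps the calculation close to the contingency-table description in the text.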

The Hanyu Fangyan Cihui contains 905 words in Putonghua. Each page 
contains 2 words. Under each word the corresponding items in 18 
localities are listed in Chinese characters as well as in phonetic forms. 
As said before, the characters represent the identified cognate words. In 
(1) we presented only some of the dialects for illustration. In order to 
describe the data base for this study fully, let us look at the word 
'moon' again. Listed in (6) are all the 18 dialects and the various forms 
of this word in Chinese characters, phonetic transcriptions being omitted 




























(6) [List of the 18 dialects with the forms of 'moon' in Chinese characters; not legible in this scan.]



All the Chinese dialect groups are represented here. Beijing, Ji'nan, and 
Shenyang represent the Northern Mandarin group, Xi'an the Northwestern 
Mandarin, Chengdu and Kunming the Southwestern Mandarin, Hefei and 
Yangzhou the Eastern Mandarin, Suzhou and Wenzhou the Wu, Changsha the 
Xiang, Nanchang the Gan, Meixian the Kejia, Guangzhou and Yangjiang the
Yue, Xiamen and Chaozhou the Southern Min, and Fuzhou the Northern Min.

In preparing the data for computer processing, each distinct item, 
judged on the basis of morpheme identification as represented in Chinese 
characters, was treated as an entry of which the occurrence or 
nonoccurrence in each dialect was marked "1" or "0." Thus the 5 words
corresponding to the Putonghua yueliang 'moon'
were treated as 5 distinct entries, and their presence or absence in each
dialect was marked "1" or "0" as in (7).

(7) 001B01 111111111010001000
001B02 000000010000000000 
001B03 000000000101110000 
001B04 000000000000000111 
001B05 000000000000000110 

The first three digits coded the page number, page 1 in this case. The 
next letter, "A" or "B", represented the first or the second half of the 
page where the word was found. The next two columns were for the distinct
words coded in an arbitrary sequence beginning with "01." After the space
in the seventh column, the "1" or "0" mark was given for each of the
dialects Beijing, Ji'nan, Shenyang, etc. in the sequence as shown in
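
The record layout just described can be decoded with a short Python sketch. It is illustrative only. The five records are those of (7); the dialect order is assumed to follow the order in which the 18 dialects are listed at the beginning of the paper, and the function and field names are hypothetical.

```python
# Assumed column order of the 18 dialects (Beijing, Ji'nan, Shenyang, etc.).
DIALECTS = [
    "Beijing", "Ji'nan", "Shenyang", "Xi'an", "Chengdu", "Kunming",
    "Hefei", "Yangzhou", "Suzhou", "Wenzhou", "Changsha", "Nanchang",
    "Meixian", "Guangzhou", "Yangjiang", "Xiamen", "Chaozhou", "Fuzhou",
]

# The five records given in (7) for the forms of 'moon'.
RECORDS = """\
001B01 111111111010001000
001B02 000000010000000000
001B03 000000000101110000
001B04 000000000000000111
001B05 000000000000000110
"""


def parse_record(line):
    """Split one coded line into page, half-page, item number, and dialects."""
    code, flags = line.split()
    if len(code) != 6 or len(flags) != len(DIALECTS):
        raise ValueError(f"malformed record: {line!r}")
    return {
        "page": int(code[:3]),   # first three digits: page number
        "half": code[3],         # "A" or "B": first or second half of the page
        "item": int(code[4:]),   # arbitrary sequence number starting at 01
        "present": [DIALECTS[i] for i, flag in enumerate(flags) if flag == "1"],
    }


for line in RECORDS.splitlines():
    record = parse_record(line)
    print(record["page"], record["half"], record["item"], record["present"])
```

Under the assumed column order, the third record marks Wenzhou, Nanchang, Meixian, and Guangzhou, and the fourth marks Xiamen, Chaozhou, and Fuzhou.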

Of the 905 Putonghua words, there are a total of 6,454 items for the 
dialects. All the 6,454 items were coded in the same way as just 
described. In the Hanyu Fangyan Cihui, these words are grouped into
nouns, verbs, adjectives, pronouns, measure words, adverbs, and
prepositions and connectives. Correlation coefficients were obtained for
these individual categories as well as for the entire data. The SPSS
(Statistical Package for the Social Sciences) computer package was used to
run the statistics. The phi coefficient, discussed above, is the Pearson
product-moment correlation coefficient calculated on nominal-dichotomous data.
Therefore in the actual computer run the product-moment coefficient
equation given in (8) was used (see Nie et al. 1975).


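(8)  $$ r = \frac{\sum_{i=1}^{N} X_i Y_i - \left(\sum_{i=1}^{N} X_i\right)\left(\sum_{i=1}^{N} Y_i\right)/N}{\sqrt{\left[\sum_{i=1}^{N} X_i^2 - \left(\sum_{i=1}^{N} X_i\right)^2/N\right]\left[\sum_{i=1}^{N} Y_i^2 - \left(\sum_{i=1}^{N} Y_i\right)^2/N\right]}} $$

where X and Y are scores of words on two dialects
whose degree of association is being calculated, and N is the number of words.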
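
A minimal Python sketch of the computational form of the product-moment coefficient follows. It is not the SPSS routine; the function name and the two 0/1 vectors are assumptions made for illustration. On dichotomous scores this computation yields the same value as the phi coefficient.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation from raw scores (computational formula)."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    sum_yy = sum(yi * yi for yi in y)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = math.sqrt((sum_xx - sum_x ** 2 / n) * (sum_yy - sum_y ** 2 / n))
    if denominator == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return numerator / denominator


# Invented 0/1 scores of eight words on two dialects (assumption).
dialect_x = [1, 1, 1, 1, 0, 0, 0, 0]
dialect_y = [1, 1, 0, 0, 0, 0, 0, 0]

print(round(pearson_r(dialect_x, dialect_y), 4))  # 0.5774
```

For these two vectors the 2 x 2 table has 2 words present in both dialects, 2 present only in the first, none present only in the second, and 4 absent from both, so the phi formula gives (2·4 − 0·2)/√(2·6·4·4) ≈ 0.5774 as well.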

The correlation coefficients for the individual categories of nouns, 
verbs, etc. will not be discussed in this paper because of space 
limitation. The correlation coefficients for pairs of the 18 dialects
thus computed with the entire data are given in Tables 1a and 1b. The
level of significance computed is .001. That is, the correlations are
very unlikely to be due to chance. The coefficients indicate strength
of association. The higher the value of the coefficient, the closer the
two dialects are in terms of the lexicon.

To illustrate the use of the tables, let us now answer the question 
whether Xi'an is closer to Beijing or to Ji'nan. We look at line 4 of
Table 1a and find Xi'an heading the row. Its correlation coefficients
with the other dialects are listed across and on to Table 1b. In Table 1a
we find its correlation coefficient with Beijing to be 0.6108 and its
coefficient with Ji'nan to be 0.6076. The answer to the question is that Xi'an
is closer to Beijing. Thus we have provided quantitative indices to
dialect affinity in terms of the lexicon.

The distance among the dialects can be ranked according to the 
coefficients. Take Beijing, for instance: the dialect closest to it is
Shenyang; the next closest is Ji'nan; the next closest is Xi'an; etc. The
rankings in relation to each of the dialects are given in Table 2. In this
table the highest rank is "1" and the lowest is "17". However, with
respect to Changsha, two of the coefficients have the same value. Both 
are ranked number 6, and the lowest rank is "16" rather than "17." To read 
Table 2, first find the name of the dialect in the rows. Then read across 
the columns to find the ranking of the other dialects in relation to this 

All the coefficients can also be ranked. In this ranking we can 
compare readily the position of a particular pair among all the pairs. 
The ranks are given in Table 3. Because of some tie scores, the ranks are 
from 1 to 152. Now we see in the table that the Shenyang-Beijing pair 
ranks first and the Xiamen-Suzhou pair ranks last. 
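
The treatment of ties described above (tied coefficients share a rank and the following rank is not skipped) corresponds to what is usually called dense ranking. The sketch below is a hedged illustration: the function name is hypothetical, the two Xi'an coefficients are the ones quoted in the text, and the remaining values are invented.

```python
from itertools import combinations


def dense_ranks(coefficients):
    """Rank from the highest coefficient (rank 1) down; ties share a rank."""
    distinct = sorted(set(coefficients.values()), reverse=True)
    position = {value: rank for rank, value in enumerate(distinct, start=1)}
    return {name: position[value] for name, value in coefficients.items()}


# Coefficients of Xi'an with Beijing and Ji'nan, as quoted in the text.
xian = {"Beijing": 0.6108, "Ji'nan": 0.6076}
print(dense_ranks(xian))  # {'Beijing': 1, "Ji'nan": 2}

# Invented values showing a tie: Q and R share rank 2, and S is ranked 3.
toy = {"P": 0.50, "Q": 0.42, "R": 0.42, "S": 0.30}
print(dense_ranks(toy))  # {'P': 1, 'Q': 2, 'R': 2, 'S': 3}

# 18 dialects give 153 unordered pairs, so any tie leaves fewer than 153 ranks.
print(len(list(combinations(range(18), 2))))  # 153
```

With 153 pairs in all, a last rank of 152 is what dense ranking would produce if exactly one coefficient value were shared by two pairs.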


[The tables referred to in the text (Tables 1a and 1b, correlation coefficients for pairs of the 18 dialects, and the ranking tables) appear here in the original; their contents are not legible in this scan.]


The statistics neatly show the degrees of lexical association. 
However, we need to pause to consider the validity of the method in 
general and the nature of the statistics in particular. 

3. Validity of Lexicostatistics

In historical-comparative linguistics, inferences about language 
relationships are made on the basis of phonological correspondences, 
morphological features, and lexical cognates. It is somewhat
curious for Annette J. Dobson (1978:58) to state: "Linguists and 
anthropologists make inferences about the relationships and history of 
languages within a family from vocabularies. They count the number of 
cognate words shared by each pair of present-day languages and use these 
data to reconstruct the ancestry of the family." The use of cognate words 
exclusively in the study of genetic relationships in recent linguistics 
history first appeared in the 1950's. Morris Swadesh (1950) first
suggested in an article on Salish internal relationships a statistical 
method which took into consideration the percentage of cognate words 
existing in pairs of related languages to determine the degrees of 
relationship or the time-depths of the language split. At the publication 
of Swadesh's article, many linguists showed an increasing interest in the
application of the mathematical method to historical linguistics. This
particular lexicostatistical method is called glottochronology.

Glottochronology as developed in the 1950's uses about 200 "basic
words" to measure the percentage of shared cognates in a pair of 
languages. It was established on the basis of counts from some languages 
that on the average about 81% of the "basic" words of a language will 
survive as cognates after 1,000 years. This means that two related 
languages 1,000 years after they have split off from the parent language 
will have 66% overlapping (81% of 81%) cognates. This principle is then 
used to determine the time-depths, and by comparing the time-depths of 
languages or dialects one can obtain the degrees of relatedness. 
Generally speaking, the mathematical formula to derive the time in units 
of one thousand years is given in (9). 

(9)  $t = \dfrac{\log C}{2 \log r}$

In (9), t is time; C is the fraction of corresponding test-list 
equivalents that are cognate between the two related languages; r is the 
retention rate assumed. 
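
As a quick check of the arithmetic, the formula in (9) can be evaluated in Python. This is a minimal sketch with a hypothetical function name; the default retention rate of 0.81 is the figure cited above, and the cognate fractions are illustrative inputs.

```python
import math


def time_depth(c, r=0.81):
    """Time since separation, in units of 1,000 years: log C / (2 log r)."""
    if not 0 < c <= 1:
        raise ValueError("C must be a fraction in (0, 1]")
    if not 0 < r < 1:
        raise ValueError("r must be a retention rate in (0, 1)")
    return math.log(c) / (2 * math.log(r))


# 81% of 81% shared cognates corresponds to one unit of 1,000 years.
print(round(time_depth(0.81 ** 2), 4))  # 1.0
# A lower fraction of shared cognates implies a greater time-depth.
print(time_depth(0.5) > time_depth(0.66))  # True
```

The base of the logarithm does not matter here, since it cancels in the ratio.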

In the 1950's, the calibration of time-depths was the most exciting
function of glottochronology. However, criticisms of this particular
lexicostatistical method quickly followed. Criticisms have been
directed especially at the concept of "basic" vocabulary and at the
validity of the retention rate, which assumes a uniform and steady change
over time. C. Douglas Chretien (1962) states that the functions of
glottochronology do not correspond to its hypotheses and concludes that
the results obtained by various glottochronological studies in the
1950's have been illusory. Karl V. Teeter (1963) states that it is only on the
basis of correspondences, validated by the reconstruction of grammatical
systems and their comparison, that we can arrive at statements of genetic
relationship and its details. He further states that words are on the 

surface of language, and they may be freely added or dropped as the 
culture changes. His conclusions are that any regularity to lexical 
attrition is a cultural regularity, and the history of culture is not the 
same as the history of language. I will have more to say about language 
history and culture history later. 

Wang Yude (1960) represents the first attempt to apply
glottochronology to determine the chronological relationships of Chinese
dialects. He prepared a list of 200 words from Beijing, Suzhou, Xiamen,
Guangzhou, and Meixian on the basis of Swadesh's "basic" word list. Then
he tabulated the number of shared cognates in pairs of these five
dialects. Time-depths were then calculated with Swadesh's and Hattori's
modified equations. However, since he covered only five dialects, the
results are of limited use as general indications of dialect affinity.

Other lexicostatistical methods have also been designed to measure
language relationships and to determine subgrouping. They are often built
upon the principles of glottochronology, especially those of measuring the
percentage of shared cognates. One method that deserves special
mention in this context of quantification of Chinese dialect affinity
is Hsin-I Hsieh's (1977) phono-lexico-statistic method. Essentially,
Hsieh counts the number of phonological forms shared by two dialects in
some cognate items but not in others. The basis of this quantification is
not cognates but phonological forms in cognates. He applies this method
to 20 dialect localities in Jiangsu to determine the subgrouping of these
localities. The phonological forms being considered are the various
reflexes of the Even Tone category of Middle Chinese which appear in
syllables which had a voiced initial in Middle Chinese. Using the Report
on the Survey of Dialects in Jiangsu Province, he counts 533 items in this
category for these 20 localities. Of these 533 items, 43 appear in
different categories of tones in these twenty dialects. The sharing of
these forms in pairs of dialects is then tabulated. The tabulated numbers
are then used to determine subgrouping. He further points out that the
ratio of shared items over items compared can be taken as a statistical
measurement of dialect relationships. However, he does not pursue this
point further and thus leaves the tabulation in its raw data form.
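
The ratio mentioned at the end of the preceding paragraph can be sketched in a few lines of Python. The three localities, four items, and tone labels below are invented purely for illustration and are not taken from Hsieh's data; the sketch is also a simplification, since it compares every item supplied rather than first selecting the items whose reflexes vary.

```python
from itertools import combinations

# Hypothetical tone-category reflexes of four cognate items in three localities.
reflexes = {
    "A": {"item1": "T2", "item2": "T2", "item3": "T1", "item4": "T2"},
    "B": {"item1": "T2", "item2": "T1", "item3": "T1", "item4": "T2"},
    "C": {"item1": "T1", "item2": "T1", "item3": "T2", "item4": "T2"},
}

def shared_ratio(x: dict, y: dict) -> tuple:
    """Return (shared, compared, shared / compared) for two localities."""
    items = sorted(set(x) & set(y))          # items attested in both
    if not items:
        raise ValueError("no items in common to compare")
    shared = sum(1 for i in items if x[i] == y[i])
    return shared, len(items), shared / len(items)

# Tabulate every pair of localities.
for p, q in combinations(sorted(reflexes), 2):
    print(p, q, shared_ratio(reflexes[p], reflexes[q]))
```

Dividing by the number of items compared puts pairs with different amounts of data on a common scale, which the raw tabulation does not.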

Now we return to our study of the relationships of the 18 dialects 
given in the Hanyu Fangyan Cihui. Does the product-moment correlation 
calculation yield meaningful coefficients for interpretation of dialect 
affinity? To evaluate the validity of this method, we need to discuss the 
following questions: (a) Is vocabulary a valid basis for determining 
dialect closeness? (b) Is the statistical procedure appropriate for this 
kind of data? (c) What generalizations can one make with the results? 

Is vocabulary a valid basis for determining dialect relationships? 
My feelings are that it all depends on what is meant by relationships. As 
mentioned above, Teeter's view is that words are cultural phenomena and 
hence have little to do with the genetic system of language. So far in 
this paper, I have not yet asserted that my method will yield a genetic 
relationship. Language does not exist apart from culture. Indeed, 
lexical changes often reflect cultural evolution. In the history of 
Chinese, the dialects of cultural centers have played a dominant role in 
lending lexical items to other dialects. It is well known that words of
dominant dialects have been borrowed into other dialects. The lexicon
thus can be considered a composite form of language history and culture. 
By using lexical items as data, what I propose to measure is the dialect 
closeness irrespective of their historical derivation and genetic 
affiliation. The results can be viewed as the synchronic closeness as a 
consequence of language history and cultural interaction. 

Is the statistical procedure appropriate for this kind of data? The
Pearson product-moment coefficient has frequently been used in finding
correlations of bivariate data which are nominal-dichotomous. As I
understand it, no
assumptions are necessary for the computation of a Pearsonian 
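
To illustrate what the computation involves for dichotomous data, the following Python sketch calculates the coefficient in two ways: from the general product-moment formula and from the fourfold-table (phi) formula. It assumes that each dialect is coded as a vector of 1s and 0s over a list of lexical variants; that coding, the function names, and the eight data points are assumptions made for illustration only.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation computed from deviations about the means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def phi(x, y):
    """Phi coefficient computed from the 2 x 2 table of 0/1 codings."""
    a = sum(1 for p, q in zip(x, y) if p == 1 and q == 1)
    b = sum(1 for p, q in zip(x, y) if p == 1 and q == 0)
    c = sum(1 for p, q in zip(x, y) if p == 0 and q == 1)
    d = sum(1 for p, q in zip(x, y) if p == 0 and q == 0)
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical codings: 1 = the variant occurs in the dialect, 0 = it does not.
dialect_1 = [1, 1, 1, 0, 0, 0, 1, 0]
dialect_2 = [1, 1, 0, 0, 0, 1, 1, 0]

print(pearson_r(dialect_1, dialect_2), phi(dialect_1, dialect_2))
```

For 0/1 data the two functions agree up to floating-point rounding, which is the equivalence referred to in the note citing Glass and Stanley.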

Then, how do we interpret the coefficients? That is, what 
generalizations can one make with the results? Interpretation of a given 
correlation coefficient is somewhat elusive. First of all, we know that 
the presence of a correlation between two variables does not necessarily 
mean that a causal relation exists. Therefore, the results are not 
appropriate for interpretation of direction of dialect influence. The 
direction of influence has to be established independently. Secondly, a
coefficient is not the same as a percentage. The coefficients calculated in
this study cannot be regarded as percentages of sharing of cognates 
between the dialects. Thirdly, the correlation coefficient is subject to 
sampling variation. Depending on the nature of a particular sample, the 
calculated coefficient may be higher or lower than it would be in a 
different sample. At this juncture, it is in order to discuss what 
Herbert H. Clark (1973) has called the "language-as-fixed-effect fallacy." 

Herbert H. Clark points out that many investigators of language
statistics have treated language as a fixed effect rather than, correctly,
as a random effect. The nonrandom sampling procedure causes difficulty when
the investigators want to determine exactly what population they can
legitimately generalize their results to. He further points out that the
main problems of random sampling in language statistics are (a) defining
the language population, (b) sampling without bias from this population,
and (c) sampling by a procedure that other investigators can repeat. Now
let us examine our dialect data in the Hanyu Fangyan Cihui. The words are
certainly high-frequency words. But many other words exist in the dialects
which are not included. It is therefore not reasonable to consider this
set of data the population. But then the sampling or selection of words by
the authors of the Hanyu Fangyan Cihui represents their view of what are
common, ordinary lexical items. It is difficult to say that the selection
was without bias in the statistical sense. As the selection was more or
less intuitive, no rigorous, repeatable procedure is available. Clark
acknowledges that an entirely different approach to the study of language
does exist. This different approach is to work from single cases. This
method has had a long and respectable tradition in linguistics. Because of
the sampling problems inherent in our treatment of the data in the Hanyu
Fangyan Cihui, I consider our work here a single case study. That is, the
coefficients are considered as relative degrees of association for the
dialects on the basis of the lexical items given in the Hanyu Fangyan
Cihui. No immediate generalizations are made about the population.
However, this does not mean that the generalizations derived from the data
and about the data cannot corroborate the results of other case studies.
It is hoped that this study presents a useful method and that the results
give a fairly adequate indication of the degrees of synchronic, not
necessarily genetic, closeness of Chinese dialects in lexicon.

4. Lexical Affinity and Genetic Subgrouping 

The degrees of closeness among the dialects with respect to the 
lexicon are given in Tables 1a and 1b. The rankings of closeness
are given in Tables 2 and 3. In this section we will look at the
relatedness from a different angle. The coefficients can be used to 
subgroup the dialects. In the following, we will use the word "level" to 
mean the minimum correlation coefficient. For example, level .65 means 
that the correlation coefficients are .65 or greater. By fixing the level 
at a certain point, we can group together the dialects which have 
correlation coefficients equal to or greater than that point. The process 
of grouping is carried out as follows: If the coefficient of a pair of 
dialects, say A and B, is equal to or greater than the level point, then 
the dialects are linked in a group, say G. If the coefficient of a third 
dialect, say C, and either of the two dialects, say B, is equal to or 
greater than the level point, then the third dialect is similarly linked 
in this group. In this example, A and B are directly linked; so are C and
B. But because of the linking, now A, B, and C are all linked in group G.
If the correlation coefficient of the pair A and C is at that level, then 
of course A and C are also directly linked. If the coefficient is below 
that level, then they are indirectly linked through B. In the subgrouping 
below, we will utilize the notions "directly linked" and "indirectly 
linked" in the discussion. As said before, words are results of language 
history and cultural influence. We therefore do not claim that the 
correlation coefficients derived in this study characterize the closeness 
of the genetic relationships of the dialects. However, comments will be 
frequently made in reference to the known genetic subgrouping. 
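
The linking procedure just described can be sketched in Python as follows. The dialect labels A, B, and C follow the example in the text, but the three coefficients are hypothetical values chosen so that A and C are only indirectly linked at level .60; the function name link_groups and the union-find bookkeeping are likewise one possible implementation, not the program actually used for this study.

```python
def link_groups(dialects, coeffs, level):
    """Group dialects whose coefficients reach `level`, directly or indirectly."""
    parent = {d: d for d in dialects}

    def find(d):
        # Follow parent pointers to the representative of d's group.
        while parent[d] != d:
            parent[d] = parent[parent[d]]
            d = parent[d]
        return d

    direct = set()
    for (a, b), r in coeffs.items():
        if r >= level:                     # the pair is directly linked
            direct.add(frozenset((a, b)))
            parent[find(a)] = find(b)      # merge the two groups

    groups = {}
    for d in dialects:
        groups.setdefault(find(d), []).append(d)
    return sorted(sorted(g) for g in groups.values()), direct

# Hypothetical coefficients for the A, B, C example.
coeffs = {("A", "B"): 0.66, ("B", "C"): 0.62, ("A", "C"): 0.58}

for level in (0.65, 0.60):
    groups, direct = link_groups(["A", "B", "C"], coeffs, level)
    print(level, groups, sorted(sorted(p) for p in direct))
```

With these values, level .65 leaves C outside the group, while at level .60 all three dialects form one group even though the pair A and C is not among the direct links.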

Tables 1a and 1b show that all the coefficients are lower than .70. In
the following, the levels are given at intervals of .05 in decreasing
order. If a coefficient is at a certain level, then the correlated
dialects are given in a group.

(10) Level .65 

(a) Beijing, Ji'nan, and Shenyang. 

(b) The other dialects are not linked to one another 
at this level. 

Beijing, Ji'nan, and Shenyang are close to one another. The other 
dialects do not show such a high degree of correlation. The genetic 
Northern Dialect group is usually subdivided into Northern, Northwestern, 
Southwestern, and Eastern Mandarin (Yuan 1960:24). This level of
correlation shows that the varieties of the Northern Mandarin subdivision 
are fairly close to one another in lexicon. Xi'an belongs to the 
Northwestern Mandarin subdivision and is not linked to Northern Mandarin 
at this level. 

(11) Level .60 

(a) Beijing, Ji'nan, Shenyang, and Xi'an.

(b) Yangzhou and Hefei.

(c) The other dialects are not linked to one another 
at this level. 

At this level, the Northern dialects Beijing, Ji'nan, Shenyang, and Xi'an 
form a group. Genetically Yangzhou and Hefei belong to Eastern Mandarin. 

(12) Level .55 

(a) Beijing, Ji'nan, Shenyang, and Xi'an. 

(b) Chengdu, Kunming, Hefei, and Yangzhou.

(c) Changsha and Nanchang. 

(d) The other dialects are not linked to one another 
at this level. 

Chengdu is directly linked to Kunming. Kunming is directly linked to 
Chengdu and Yangzhou. Therefore Southwestern Mandarin and Eastern 
Mandarin are linked. Nanchang is a Gan dialect, and Changsha belongs to
Xiang. But our study here shows that they are close to each other. This
association lends support to Yuan Jiahua's (1960:128) view that Gan and
Xiang are lexically close. Gan and Kejia, which is represented by Meixian
in this study, are often said to be close to each other because of the 
sharing of the diachronic rule that aspirates the obstruent initials which 
in Middle Chinese were voiced. But in vocabulary, they are linked at a 
point much lower than this level. 

(13) Level .50 

(a) Beijing, Ji'nan, Shenyang, Xi'an, Chengdu, Kunming, 
Hefei, Yangzhou, Changsha, and Nanchang. 

(b) The other dialects are not linked to one another 
at this level. 

The Xiang and Gan dialects are linked to the Northern dialects at this
level. Geographically, Xiang and Gan are situated in the central region 
of the Chinese language area. Historically, cultural dominance has been 
coming from northern regions. The closeness of Xiang and Gan to Northern 
dialects as shown here might very well be a reflection of such history. 

(14) Level .45 

(a) Beijing, Ji'nan, Shenyang, Xi'an, Chengdu, Kunming, 
Hefei, Yangzhou, Changsha, and Nanchang. 

(b) Guangzhou and Yangjiang. 

(c) The other dialects are not linked to one another 
at this level. 

Both Guangzhou and Yangjiang are Yue dialects. They are linked to each 
other for the first time at this level. 

(15) Level .40 

(a) Beijing, Ji'nan, Shenyang, Xi'an, Chengdu, Kunming, 
Hefei, Yangzhou, Changsha, Nanchang, and Suzhou. 

(b) Guangzhou and Yangjiang. 

(c) The other dialects are not linked to one another 
at this level. 

Phonologically, Suzhou is close to Wenzhou. Lexically, Suzhou is linked
through Yangzhou to the Northern dialects, Xiang, and Gan, while Wenzhou
stands alone at this level. Suzhou and Yangzhou are geographically close
to each other.

(16) Level .35 

(a) Beijing, Ji'nan, Shenyang, Xi'an, Chengdu, Kunming,
Hefei, Yangzhou, Changsha, Nanchang, and Suzhou.

(b) Guangzhou and Yangjiang. 

(c) The other dialects are not linked to one another 
at this level. 

The lexical grouping is the same as that established at level .40. But 
the linking of the individual dialects is slightly different. At level 
.40, Suzhou is linked to Kunming indirectly because it is directly linked 
to Yangzhou and Yangzhou in turn is directly linked to Kunming. At level 
.35, Suzhou is directly linked to Kunming. 

(17) Level .30 

(a) Beijing, Ji'nan, Shenyang, Xi'an, Chengdu, Kunming, 
Hefei, Yangzhou, Changsha, Nanchang, Suzhou, and Wenzhou.

(b) Meixian, Guangzhou, and Yangjiang. 

(c) Xiamen and Chaozhou. 

(d) Fuzhou stands alone. 

At this level, Wenzhou is directly linked to Suzhou. But since Suzhou is 
linked to the other dialects in group (a), so is Wenzhou grouped there. 
Also at this level, the Southern Min dialects, Xiamen and Chaozhou, are 
linked. Fuzhou, a Northern Min dialect, stands alone. Meixian, a Kejia 
dialect, is linked at this level to the Yue dialects. Yuan Jiahua
(1960:171) cites geographical proximity as the cause of mutual borrowing
of words between the Yue and Kejia dialects.

(18) Level .25 

All dialects are linked to one another, some directly
and some indirectly. Of special interest are the
following direct links ("--" for "is directly
linked to"):
(i) Suzhou -- Beijing, Ji'nan, Shenyang, Xi'an,
Chengdu, Kunming, Hefei, Yangzhou,
Wenzhou, Changsha, and Nanchang.
(ii) Wenzhou -- Yangzhou, Suzhou, Changsha, and Nanchang.
(iii) Meixian -- Nanchang, Guangzhou, and Yangjiang.
(iv) Guangzhou -- Meixian and Yangjiang.
(v) Xiamen -- Chaozhou and Fuzhou.
(vi) Chaozhou -- Xiamen.
(vii) Fuzhou -- Beijing and Xiamen.

Yangzhou, Changsha, and Nanchang provide the linkage between the Wu
dialects (Suzhou and Wenzhou) and the Northern dialects. Nanchang is the
link between the Northern and the Yue dialects. The link between Beijing
and Fuzhou completes the chain of all the dialects. It is somewhat
curious that Fuzhou is directly linked to Beijing but not to Chaozhou, a
Southern Min dialect, at this level.

(19) Level .20 

All dialects are linked to one another, some directly 
and some indirectly. Fuzhou is now directly linked 
to the Southern Min dialects. 

We have examined dialect subgrouping in terms of the lexicon. The
levels of association have been fixed at uniform .05 intervals to avoid
biased manipulation of the data on the basis of known genetic
subgrouping. It is reassuring to find that none of the subgroupings of the
dialects according to the correlation coefficients contradicts well-known
facts of the Chinese language. In the following we will discuss genetic
boundaries and degrees of lexical correlation. Listed in (20) are the
major dialect groups and the minimum correlation coefficients for the
member dialects. For example, in the Northern Mandarin group, the
Beijing-Ji'nan coefficient is .6715, Beijing-Shenyang .6983, and
Ji'nan-Shenyang .6421. The minimum coefficient that allows these
localities to be linked is therefore .6421.

(20) (a) Northern Mandarin: .6421
Beijing, Ji'nan, Shenyang

(b) Northern and Northwestern Mandarin: .6076
Beijing, Ji'nan, Shenyang, Xi'an

(c) Eastern Mandarin: .6014
Yangzhou, Hefei

(d) Southwestern Mandarin: .5530
Chengdu, Kunming

(e) Yue: .4776
Guangzhou, Yangjiang

(f) Mandarin dialects: .4254
Beijing, Ji'nan, Shenyang, Xi'an,
Chengdu, Kunming, Hefei, Yangzhou

(g) Southern Min: .3380
Xiamen, Chaozhou

(h) (Southern and Northern) Min: .2459
Xiamen, Chaozhou, Fuzhou

Missing in (20) are Kejia, Xiang, and Gan. Since each of these dialect
groups is represented by one locality only, no minimum group figures are
available here. List (20) provides a summary of the degrees of closeness
of dialects within the genetic groups. A generalization can be readily
made: in lexicon, the dialects in the north are much more closely related
than those in the south.
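
The group figures in (20) are minima over all pairs of member dialects. The Python sketch below reproduces the Northern Mandarin figure from the three coefficients quoted in the text; the function name group_minimum is an illustrative choice.

```python
from itertools import combinations

# Coefficients quoted in the text for the Northern Mandarin localities.
coeffs = {
    frozenset(("Beijing", "Ji'nan")): 0.6715,
    frozenset(("Beijing", "Shenyang")): 0.6983,
    frozenset(("Ji'nan", "Shenyang")): 0.6421,
}

def group_minimum(members, coeffs):
    """Smallest pairwise coefficient among the members of a group."""
    return min(coeffs[frozenset(pair)] for pair in combinations(members, 2))

print(group_minimum(["Beijing", "Ji'nan", "Shenyang"], coeffs))
```

This minimum is a stricter figure than the level at which the linking procedure first groups the three localities: at level .65 they already form one group, because Ji'nan and Shenyang are indirectly linked through Beijing.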

6. Conclusions 

In this paper I have presented the use of the Pearson product-moment
correlation to calculate the degrees of lexical association among the 18
dialects given in the Hanyu Fangyan Cihui. The correlation coefficients
are taken as the degrees of association. In the past we had to rely on
intuition to talk about strengths of association. We can now
deal with vocabulary beyond simple counts of the percentage of cognates.
These indices of affinity are given in Tables 1a and 1b. There we see the
relationships of the dialects in depth.

Although this statistical work is intended to be a single case study, 
its coverage of a large amount of lexical data should allow us to see the 
generality of Chinese dialect affinity. Moreover, our lexical subgrouping 
of the dialects does not contradict known linguistic facts. I am 
therefore confident that this method can be used to find syntactic and 
phonological associations as well. And thus I have presented a method for 
quantification of dialect affinity. 

Chinese Year 
Book, Shanghai, 1937. 

These words are taken from the Hanyu Fangyan Cihui, p. 1. As will be
discussed later, the Hanyu Fangyan Cihui includes data for 18 dialects. 
Here in this exposition, I list 9 dialects only. Also, tones for these 
items are omitted as they are not immediately relevant in this 

See Nie et al (1975) for the nature of the statistics package. 

The proof that the phi coefficient is the Pearson product-moment
correlation coefficient between two variables can be found, e.g., in Gene
V. Glass and Julian C. Stanley (1970:159-160).

glottochronology: Robert B. Lees (1953), Shiro Hattori (1953), Gordon H. 
Fairbanks (1955), Morris Swadesh (1955), Sarah C. Gudschinsky (1955), Dell 
H. Hymes (1960), Wang Yude (1960), Karl V. Teeter (1963), C. Douglas
Chretien (1962), Isidore Dyen et al (1967), and Rudolph C. Troike (1969).

Robert L. Oswalt (1973) and David Sankoff (1970). 

See, for example, John B. Carroll (1961).


Beijing University. 1964. Hanyu fangyan cihui [Chinese dialect
     lexicon]. Beijing: Wenzi Gaige Chubanshe.
CARROLL, John B. 1961. The nature of the data, or how to choose a
     correlation coefficient. Psychometrika 26.347-372.
CHRETIEN, C. Douglas. 1962. The mathematical models of
     glottochronology. Language 38.11-37.
CLARK, Herbert H. 1973. The language-as-fixed-effect fallacy: a critique
     of language statistics in psychological research. Journal of Verbal
     Learning and Verbal Behavior 12.335-359.
DOBSON, Annette J. 1978. Evolution times of languages. Journal of the
     American Statistical Association 73.58-64.
DYEN, Isidore, A.T. James, and J.W.L. Cole. 1967. Language divergence
     and estimated word retention rate. Language 43.150-171.
FAIRBANKS, Gordon. 1955. A note on glottochronology. International
     Journal of American Linguistics 21.116-120.
GLASS, Gene V. and Julian C. Stanley. 1970. Statistical methods in
     education and psychology. Englewood Cliffs, New Jersey:
     Prentice-Hall, Inc.
GUDSCHINSKY, Sarah C. 1955. Lexico-statistical skewing from dialect
     borrowing. International Journal of American Linguistics 21.138-
HATTORI, Shiro. 1953. On the method of glottochronology and the
     time-depth of Proto-Japanese. Gengo Kenkyu 22.29-77.
HSIEH, Hsin-I. 1977. A new method of dialect subgrouping. In William
     S-Y. Wang, ed. The lexicon in phonological change, pp. 159-196.
     Mouton Publishers.
HYMES, Dell H. 1960. Lexicostatistics so far. Current Anthropology 1.3-
LEES, Robert B. 1953. The basis of glottochronology. Language 29.113-
LI, Fang-kuei. 1973. Language and dialects of China. Journal of Chinese
     Linguistics 1.1-13. (A condensed version of this article first
     appeared in the Chinese Year Book, Shanghai, 1937.)
NIE, Norman H., C. Hadlai Hull, Jean G. Jenkins, Karin Steinbrenner, and
     Dale H. Bent. 1975. SPSS: Statistical package for the social
     sciences, second edition. New York: McGraw-Hill.
OSWALT, Robert L. 1973. A triadic method of judging linguistic
     relationships. Anthropological Linguistics 15.328-336.
SANKOFF, David. 1970. On the rate of replacement of word-meaning
     relationships. Language 46.564-569.
SWADESH, Morris. 1950. Salish internal relationships. International
     Journal of American Linguistics 16.157-167.
SWADESH, Morris. 1955. Towards greater accuracy in lexicostatistic
     dating. International Journal of American Linguistics 21.121-137.
TEETER, Karl V. 1963. Lexicostatistics and genetic relationship.
     Language 39.638-648.
TROIKE, Rudolph C. 1969. The glottochronology of six Turkic languages.
     International Journal of American Linguistics 35.183-191.
WANG, Li. 1935. Hanyu yinyun xue [Chinese phonology]. Beijing:
     Zhonghua Shuju (1956, 1963, original edition 1935).
WANG, Yude. 1960. Chugoku go dai hogen no bunretsu nendai no gengo
     nendai gakuteki shitan [Lexicostatistic estimation of time depths of
     five major Chinese dialects]. Gengo Kenkyu 38.33-105.
YUAN, Jiahua. 1960. Hanyu fangyan gaiyao [An outline of Chinese
     dialects]. Beijing: Wenzi Gaige Chubanshe.

Studies in the Linguistic Sciences

Volume 12, Number 1, Spring 1982

Chin-chuan Cheng

The use of Esperanto in China since the turn of the century has 
cast Chinese elements onto Esperanto to reflect cultural characteristics. 
From the writings of the monthly magazine El Popola Ĉinio, Chinese
innovations, especially those belonging to word formation and idioms, are 
identified. On the basis of the findings, the relationships between 
international norm and local variation are discussed. 

In the past few years, Esperanto associations, clubs, and classes have been
set up one after another in various parts of China. One can easily see that a
great desire to learn things foreign surged after 10 years of inward searching
during the "cultural revolution" of 1966-76. Naturally, learning of Esperanto, among
other non-Chinese languages, for international communication fits well with such
an outward-looking mentality. Moreover, Chinese Esperantists have enjoyed
relatively strong support from the government and the general public, mainly for
the reason that earlier Esperantists in China earned the credit of being
instrumental in Chinese language reform and in the promotion of left-wing
literature during the first half of the century.

Esperanto was introduced to China more than 70 years ago. The first
translation of 'Esperanto' was 'Wanguo xinyu' ('Ten thousand nations' new
language') by Wu Zhihui and Li Shizeng in 1908 in France in their weekly
magazine Xin Shiji (New Century). Then the Japanese translation 'Shijieyu',
written in Chinese characters meaning 'world language', was adopted in China
and has been used since the 1910s. Japanese Esperantists, however, soon decided
to use a Kana transliteration and abandoned this translation (Qian 1918a, 1918b).

According to Hou (1981b), Harbin in Northeast China had the earliest
Esperanto speakers. At the turn of the century, some Russian merchants from
Vladivostok taught local people Esperanto. The first Chinese Esperanto
organization was established in 1908 (Chen 1957). Among the better known
journals, Dongfang Zazhi (The Eastern Miscellany) published an article on the
world outlook of Esperanto in 1913 (Lu 1913). In 1917 the monthly Xin Qingnian
(New Youth) published a rebuttal by Tao Menghe to Qian Xuantong's promotion of
Esperanto in China. On the eve of the 1919 May Fourth Movement, the Chinese
logographic writing system and the feudal thoughts embedded in classical
literature were viewed most critically. Qian (1918c), among others, took the
extreme view that the Chinese language was no good and that Esperanto would
salvage Chinese language problems. The numerous notes concerning Esperanto
published in Xin Qingnian led to the discussion of romanization of the Chinese
language (Zhu 1918, Qian 1918d, Yao 1918).

In 1921 Beijing University added Esperanto language courses to its curricula; 
the classes were taught by the blind Russian poet V. Eroshenko. Three years later, 
in Beijing an Esperanto school was established in 1925. In 1922 Dongfang Zazhi in 
the May issue published several articles in a section entitled "International
Language Movements". One of the articles was by Hu Yuzhi (1922), who is the
current president of the Chinese Esperanto League. Hu Yuzhi (better known in
Esperanto circles as Hujucz) attended an international Esperanto meeting in 1928
(Chen 1957). During the 1930s, Chinese Esperantists were active in both the
Anti-Japanese War and left-wing literature movements. In 1931 left-wing
Esperantists formed the Chinese Proletarian Esperantist Union (Ĉina Proleta
Esperantista Unio). In the ensuing years, under the slogan "By Esperanto for the
liberation of China" ("Per Esperanto por la liberigo de Ĉinio"), the union published
a newsletter Chinese Proletarian Esperantist (Ĉina Proleta Esperantisto), The
World (La Mondo), China Roars (Ĉinio Hurlas), Voices from the Orient (Voĉoj el
Oriento), Oriental Courier (Orienta Kuriero), and China Herald (Heroldo de Ĉinio)
(People's Daily March 13, 1951; Chen 1978). Because of the Anti-Japanese War,
World War II, and then the Civil War, these publications did not last very long. As
soon as the People's Republic of China was established in 1949, Hu Yuzhi started
to organize an Esperantist association. On March 11, 1951 the Chinese Esperanto
League (Ĉina Esperanto-Ligo) was established in Beijing (People's Daily March 13,
1951). In 1956 a Chinese delegation attended an international Esperanto congress
for the first time since 1949 (Zhang 1956).

The current Chinese Esperanto activity most visible abroad is the publishing of
the monthly El Popola Ĉinio (From People's China). A parallel activity,
Esperanto radio broadcasting, started in December 1964 (Esperanto Section of Radio
Beijing 1980) and continues today. Furthermore, over 100 Chinese works have been
translated into Esperanto in the past 30 years (Hou 1981a). 

During the early years of its use in China, Esperanto was considered unworthy 
of learning by some scholars who argued that as each speaker would use his own 
preferred words, the language would become idiosyncratic and extremely complex 
after a period of time (Zhu 1918). It is sociolinguistically interesting to observe 
that more than 60 years later, the language has not divided into mutually unintelli- 
gible groups (Sherwood forthcoming b). Indeed, the language has been developing in 
lexicon since its creation about a century ago. Chen (1957), for example, lists the 
increase in the size of vocabulary on the basis of I. Lapenna's study as follows: 

1887 904 roots
1894 2,599 roots 
1954 7,866 roots 

What did Chinese Esperanto users contribute to the development of the
language? More than 60 years ago, Qian (1918a) made the following statements 
concerning use of Chinese in Esperanto: (1) Chinese characters could not be mixed 
with the alphabetic, phonetic spelling; (2) Chinese meanings were all so vague that 
incorporation of them would not be appropriate; (3) the multitude of homophones 
made Chinese unsuitable for Esperanto. However, he suggested that some special 
terms of ancient Chinese history should be added to the Esperanto vocabulary. 
Qian's personal view could not necessarily dictate the development of Esperanto in 
China. After all, China has gone through various stages of isolation and outside 
contact in the last hundred years. Foreign languages as part of foreign culture 
were bound to be modified somewhat to accommodate Chinese realities. I have 
discussed elsewhere (Cheng 1982) Chinese elements in English used in China. The 
main purpose of this paper is to examine Chinese use of the Esperanto language. 

Phonological characteristics of Chinese Esperanto can be studied
systematically by use of shortwave broadcasts. Wood (forthcoming) mentions
briefly his impressions of the special characteristics of the Esperanto spoken
by Chinese announcers on Radio Beijing. He points out that the announcers have
manifested a tendency not to distinguish sufficiently between voiced and
voiceless consonants, and not to give the penultimate syllable sufficient
stress. He says moreover that they have difficulty with certain consonant
clusters. My work reported here is entirely based on the writings in El Popola
Ĉinio. Phonological features therefore will not be dealt with further.


El Popola Ĉinio, a monthly Esperanto magazine published in Beijing by the
Chinese Esperanto League, reached its thirtieth year and the 260th issue in May
1980. It was first published on May 4, 1950 by the Beijing Foreign Language
Publishing House (People's Daily, March 13, 1951; Honfan 1980). Its publication
was interrupted in 1954 and then resumed in 1957 with the Chinese Esperanto
League as the publisher (El Popola Ĉinio Editorial Staff 1980, Honfan 1980).

The format of the periodical varied somewhat in the past years, but the 
contents have always been mainly about things Chinese. Indeed, Esperanto news 
from outside of China and current world issues appear regularly; they, however, 
constitute only a minor portion of each issue.

The Esperanto of El Popola Ĉinio is considered excellent and easy to
understand among Esperanto speakers in other parts of the world. Yet, with the
profusion of things Chinese in the issues, it is natural that some Chinese cultural
and linguistic elements are cast onto the Esperanto language. Generally speaking,
the syntax is fairly straightforward for both English and Chinese speakers, but the
vocabulary shows many more Chinese characteristics.


The syntax is predominantly Subject-Verb-Object (SVO). For example, out of 
about 50 accusative cases found in the article "Novjara mesaĝo" (New year's
message) only the following three appeared in an order other than SVO: 

Koncerne al tio ni faros klopodojn kaj esperas, ke diverslandaj 
Esperantaj organizoj kaj aktivuloj nin helpos. (1980.256.2) 
About that we will do our best and hope that Esperanto organizations 
and activists will help us.

Ni esperas, ke niaj legantoj varme nin helpos. (1980.256.3) 
We hope that our readers will warmly help us. 

Ni kore esperas, ke ili daŭre nin subtenos kaj helpos. (1980.256.3)
We sincerely hope that they will continuously support and help us. 

These sentences all involve the pronoun nin 'us'. One also finds non-SVO
ordering in conversational Esperanto spoken by those with European language
backgrounds. Sherwood (forthcoming a), in a statistical analysis of conversational
Esperanto by the Scot William Auld and the Fleming Peter de Smedt, finds that out
of 179 main clauses containing transitive verbs, only 15 had a constituent order
other than subject-verb-object, and only 2 of these involved other than simple
elements such as pronouns. Chinese syntax corresponding to the sentences found in
El Popola Ĉinio does not provide the SOV order for pronouns. One can therefore
conclude that this pattern is a "global" feature of Esperanto.
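
The two counts can be put on a common footing as proportions. The short Python sketch below does only that arithmetic; the figure of 50 is the approximate count given above, and the two samples were counted on different bases (accusative cases in one article versus main clauses with transitive verbs), so the comparison is indicative only.

```python
def share(count: int, total: int) -> float:
    """Proportion of non-SVO orders in a sample."""
    return count / total

# El Popola Ĉinio, "Novjara mesaĝo": 3 of about 50 accusative cases.
print(f"{share(3, 50):.1%}")
# Sherwood's conversational sample: 15 of 179 main clauses.
print(f"{share(15, 179):.1%}")
```

Both proportions fall below one in ten, which is in keeping with the view that the pattern is not specific to Chinese writers.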

A major difference between the syntax of Chinese and that of many European 
languages is the position of subordinate clauses. In Chinese the modifying clause 
comes before the head noun, whereas in many European languages the embedded 
sentence follows the noun. I know of no Esperanto rule that explicitly governs this 
type of structure. However, words such as kiu 'that, which', kio 'that, who', etc. 
which introduce the relative clause would come conveniently after the head noun. 
As far as I can tell, there is no trace of Chinese clause structure in the Esperanto 
of El Popola Ĉinio; all the modifying clauses appear in the position that English 
relative clauses would occupy. This preliminary observation does not rule out the 
possibility that more adjectival forms are used in a place where subordinate 
clauses are also appropriate. But I do not have statistics to warrant suitable 
conclusions. 


In terms of smaller units, Esperanto has a set of rules governing the endings 
for verbs, nouns, adjectives, and adverbs. On the other hand, Chinese has no in- 
flectional endings. I know of no Chinese interference in this area. Most personal 
and place names are given in transliteration without adding the Esperanto noun 
endings. Other types of words that have been transplanted from Chinese are given 
appropriate inflection. Transliteration in El Popola Ĉinio changed in 1979 from an 
Esperanto approximation to the Pinyin system. This change was made uniformly for 
all the foreign language publications printed in the Latin alphabet in China by gov- 
ernment decree. Because of this recent change in romanization, El Popola Ĉinio 
often includes the old spellings in parentheses, for example: 

Nun iras la aktivadoj "Donu plej bonan servon" kaj "Kontentigu la 
klientojn" en la urboj Beijing (Pekino), Tianjin (Tianĝin), Wuhan (Vuhan), 
Shenyang (Ŝenjang) k.a. (1981.278.3) 

Now the activities "Give the best service" and "Satisfy the customers" 
are under way in the cities of Beijing, Tianjin, Wuhan, Shenyang, etc. 

The place names and other nouns that had been incorporated in Esperanto are 
properly inflected. The name Pekino illustrates this practice. The words 
Kuomintango 'Guomindang— Nationalist Party' and juano 'yuan— Chinese monetary 
unit' have never been changed to Pinyin spelling and are always used with an 
Esperanto ending. Since Pinyin was adopted, place names are no longer in- 
flected. When adjectival forms are needed in the text, a hyphen is 
inserted before the Esperanto ending "a", for example: 

Tan Qilong, la unua sekretario de la Kompartia Komitato de Sichuan- 
provinco, You Taizhong, komandanto de la Chengdu-a Milita Regiono, kaj 
aliaj iris respektive al Chengdu, Chongqing, Mianyang, Wenjiang, 
Neijiang, Nanchong, kaj Yongchuan por fari inspektadon kaj direkti savan 
laboron (1981.278.14) 

Tan Qilong, first secretary of the Communist Party Committee of Sichuan Province, 
You Taizhong, commander of the Chengdu Military Region, and others 
went respectively to Chengdu, Chongqing, Mianyang, Wenjiang, Neijiang, 
Nanchong, and Yongchuan to make inspections and to direct rescue work. 



In terms of vocabulary the combination of morphemes to form words and com- 
pounds is largely left to the user; Esperanto dictionaries provide mainly the 
roots and affixes and the words already coined by users. There is no rule against 
the creation of words to make appropriate expressions. In the case of the 
Esperanto of El Popola Ĉinio, we see a few Chinese elements. Following is a list 
of items of this nature that are discussed in this paper. These items are given at 
this point without glosses so that those who know Esperanto can make an 
effort to guess at their meanings. In this way, the influence of Chinese on 
Esperanto as discussed later will become more apparent. 




malgranda Du 
maljuna Lin 
mil- kaj dekmilfoje 
nudpiedaj kuracistoj 
intelekta junulino 
kvarpersona bando 
iranto de la kapitalisma vojo 
ĝis sekiĝo de la maroj kaj putriĝo de la ŝtonoj 
kune floru cent floroj kaj voĉojn donu cent skoloj 

These innovations can be divided into the following categories: (a) Direct borrow- 
ing of Chinese words, (b) words coined specifically for Chinese concepts, (c) 
ordinary words with extended or specific meanings, and (d) Chinese idioms and set 
phrases. 


Direct use of Chinese words in Esperanto, besides place and personal names, 
is fairly limited. The following are some of the more frequent ones: 

Ĝi estas la komuna lingvo de la hana nacieco kaj ankaŭ oficiala lingvo 
de la Ĉina Popola Respubliko. (1981.278.20) 

It is the common language of the Han nationality and also the official 
language of the People's Republic of China. 

...kuomintanga gubernia registaro rigardis ilin kiel ... (1980.256.37) 
...the Guomindang (Nationalist) county government regarded them as... 

...unu aŭtobusa monata bileto kostis 12 juanojn... (1979.253.10) 
...one monthly bus ticket cost 12 yuan... 


Multaj el la ĉeestantoj protestis kaj skribis kolektive dazibaŭ-on 
titolitan... (1979.252.36) 

Many of the participants protested and wrote collectively a 
big-character poster titled... 

The word Kuomintango 'Guomindang, Nationalist Party', capitalized, can be easily 
recognized. The meaning of the adjectival form kuomintanga may not be immediately 
apparent, but a diligent reader will find the word Kuomintango in 
Waringhien (1970) Full Illustrated Dictionary of Esperanto (Plena Ilustrita Vortaro 
de Esperanto). One also finds Hano 'Han Chinese' in the same dictionary. The word 
juano, from Chinese yuan, is a monetary unit. With the word kostis 'cost' occurring in 
the context, its meaning should not be hard to understand. The word dazibaŭ 
'dazibao, big-character poster' signifies something of Chinese invention. 


It is somewhat difficult to identify the second type of Chinese innovation, the 
type of words coined specifically to express Chinese concepts, because such an 
identification requires extensive studies of Esperanto etymology, which I cannot 
undertake at this time. Yet, there are words which look very much like Chinese in 
concept or form. 

En la malforta flava lumo de kerosena lampo ĝemadis la akuŝantino sur 
la terlito. (1979.253.35) 

In the weak yellow light of a kerosene lamp the woman in labor was 
moaning on the earthbed. 

From the context where the word terlito occurs, I assume that it is a direct 
translation of the Chinese word kang, which means 'a heatable brick bed'. 

Antaŭ ol la drakboatoj atingis la lokon, standoj en unu vico altiris al si 
multajn homojn. (1979.253.40) 

Before the dragon boats reached the place, many people had been 
attracted to a row of stands. 

The word drakboato is a translation of the Chinese word longzhou or longchuan 
'dragon boat, a boat with a decorative dragon head for racing'. Waringhien (1970) 
lists drako of Scandinavian origin but not drakboato from Chinese. 

These words of Chinese origin have been formed using simple roots and 
straightforward concatenation. In contrast with words of Latin origin, Claude Piron 
(1977) states that words thus formed are much easier to understand. He cites 'self- 
standing, independent' as an example. He says that memstara, consisting of mem 
'self' and stara 'standing', is a better choice than autonoma 'autonomous', which a 
Chinese reader would first interpret as 'relating to the names of automobiles'. 
Piron calls the simpler style "global Esperanto" ("tutmonda Esperanto") and urges 
that this type be used instead of the Latinized version, which is comprehensible to 
only an elite group. From this point of view, then, the Chinese elements are very 
well incorporated into Esperanto by Chinese Esperantists. 


The third type of Chinese elements in Esperanto involves extension of the 
ordinary meanings of common words. 

Malgranda Du, membro de la Komunisma Junulara Ligo, diplomitiĝis en 
supera mezlernejo, kaj ŝia hejmo estas en la kamparo (1980.256.20) 
Small Du, a member of the Communist Youth League, graduated from a 
senior middle school, and her home is in the countryside. 

...lia gvidanto venis al li kaj diris: "Maljuna Lin, ..." (1980.256.21) 
...his leader came to him and said: "Old Lin, ..." 

The word malgranda 'small' (as defined in Wells 1969 Esperanto Dictionary) is more 
fully explained in Waringhien (1970) as (1) having dimensions less than the ordinary, 
occupying less space; (2) not reaching high growth; (3) not reaching ordinary scale, 
quality, intensity. In the context of the example, none of these meanings applies. 
In Chinese one often addresses close associates or friends with the word lao 'old' 
added before the surname if they are of about the same age. Used in this way, the 
word also means 'long-standing (friend)'. It does not necessarily mean that the 
person addressed is old in age. Similarly, the Chinese word xiao 'small, young' is 
used to address familiar persons of either sex younger than oneself, sometimes 
carrying the tone of endearment. The word in this context does not necessarily 
mean small in size or young in age. The Esperanto words malgranda and maljuna in 
the above examples are used to carry the Chinese senses and thus acquire the 
extended meanings. 

Some kinship terms are also used to address or refer to persons who are not 
necessarily one's own relatives. For example, the following sentence is given in a 
context which makes no reference to aunt-nephew relations: 

Onklino Li aĝas 58 jarojn. Ŝia edzo mortis antaŭ 15 jaroj, kaj ŝi devis 
sola vivteni la tutan familion kun 3 filoj kaj unu filino. (1980.256.29) 
Aunt Li is 58 years old. Her husband died 15 years ago, and she had to 
maintain the entire family with 3 sons and one daughter. 

The following sentences show that other kinship terms are also used to refer to 
people other than one's own relatives: 

Mi rapide aĉetis dekkelkajn kukojn kaj ilin donis al la komunumano 
dirante: "Onklo, vi nenion manĝis tuttage, prenu ilin por manĝi en 
vagono!" (1979.252.35) 

I quickly bought more than 10 cakes and gave them to the commune 
member, saying: "Uncle, you have not eaten anything all day long; take these 
to eat on the train!" 

Tiam ŝi kaj ŝia fratino ploris ĉagrene pro la bonkora avo. (1979.253.38) 
Then she and her sister cried sadly for the good-hearted grandfather. 

Waringhien (1970) glosses onklo 'uncle' as (1) father's or mother's brother or (2) 
husband of aunt. Similarly onklino 'aunt' is glossed as either (1) father's or 
mother's sister or (2) wife of uncle. The word avo 'grandfather' is given as father's 
or mother's father. These are the ordinary meanings of the words, but in the 
examples these senses do not apply. As one reads the passages in which onklino, 
onklo, and avo occur, one cannot find blood relations between the persons. The use 
of these terms for females of the mother's generation (Chinese ayi 'aunt', Esperanto 
onklino) and males of the father's generation (Chinese shushu 'uncle, father's younger 
brother', Esperanto onklo, and Chinese laobo 'uncle, father's elder brother', 
Esperanto avo) is customary in China. The Esperanto sentences given above should 
be understood with consideration of Chinese culture. 

In one of the sentences above, we saw the word komunumano 'commune 
member' used. Its Chinese source, sheyuan 'commune member', does not simply 
mean 'member of a commune'. It specifically refers to a 'peasant' who works in an 
agricultural commune. Thus, the word komunumano has acquired a specific meaning 
in the Chinese social context. 

The word fratineto is another case of specificity of Esperanto in China. 

"Jes! Tute!" ŝi diris vigle, "fratino, mia patro diris en la vagono, ... " Kaj 
ŝi kaptis mian manon kaj vokadis min per "fratino"... Mi pensis, ke estus 
tre bone havi tian fratineton! (1979.252.37) 

"Yes! Entirely!" she said briskly, "Sister, my father said in the car,..." 
And she held my hand and called me "sister"... I thought that it would be 
very good to have such a younger sister! 

In Butler's (1967) Esperanto-English Dictionary the word fraŭlineto is listed and 
glossed as 'lassie, little maid, young miss'. On the other hand, fratineto is not 
listed. The exclusion implies that the meaning of the word can be derived from 
fratino 'sister' and et, the root for 'small'. But then what does the combination of 
these senses mean? Waringhien (1970) gives the word as (1) affectionate address to 
younger sister and (2) title of some nuns. In the example, the sense of 'nun' does 
not apply. It is possible to derive an affectionate meaning here. But a reader who 
understands the Chinese language will immediately think of the Chinese word 
meimei 'younger sister' rather than the affectionate sense of the simple word 
'sister'. In Chinese there are three words corresponding to sister in English. Jiejie 
means sister older than oneself; meimei means sister younger than oneself; and 
jiemei is a term referring to both kinds of female siblings. The words fratino and 
fratineto in the above example have acquired the specific meanings of the Chinese 
words jiejie and meimei. Moreover, these words are not used here as strict kinship terms. 

Another word that illustrates acculturation of Esperanto is kunlernanto. In 
the following sentence the subject 'he' is a teacher: 

Li fojrefoje kontaktiĝis kun Vang Lujan, kaj admonis liajn gepatrojn kaj 
aliajn kunlernantojn multe prizorgi kaj helpi lin. (1978.243.37) 
He time and again met with Vang Lujan, and urged his parents and 
other students to take good care of him and help him. 

Butler (1967) gives 'fellow student' as the meaning of kunlernanto and Waringhien 
(1970) defines it as kamarado de lernejo 'comrade of school'. The word corresponds 
to the Chinese word tongxue. In Chinese, tongxue can mean 'student' or 'fellow 
student'. A teacher can use tongxue to refer to his student. That is the sense in 
which the Esperanto word kunlernanto is used in this example. 

Some words may not be clearly identifiable as being derived from Chinese, 
but they are used in El Popola Ĉinio so often that many readers may make such an 
association. Laborforto 'labor force, able-bodied person' and nacimalplimulto 
'minority nationality' are some examples: 

En la vilaĝo ŝia familio havas plej multajn laborfortojn kaj sekve ilia 
vivo estas rimarkinde pli bona ol antaŭe. (1980.256.29) 
In the village her family has the most labor forces and consequently 
their life is remarkably better than before. 

Li vivas longe en la regiono de nacimalplimultoj en sudokcidenta Ĉinio. 
He has lived for a long time in the region of minority nationalities in 
southwestern China. 


Now we come to the fourth type of Chinese elements in the Esperanto of El 
Popola Ĉinio. Included in this category are Chinese idioms and other set phrases. 
Some of the phrases may be somewhat ordinary while others are rather bizarre to 
Westerners. First let us look at an ordinary one: 

Civilizitaj homoj kutimas interŝanĝi saluton je matena renkontiĝo, 
almenaŭ ili diras la vortojn mil- kaj dekmilfoje ripetitajn de la homoj 
dum la tuta vivo, ... (1981.270.12) 

Civilized men are accustomed to exchanging greetings at a morning 
encounter; at least they say the words repeated thousand and ten 
thousand times by men during their entire life, ... 

Of interest here is the phrase mil- kaj dekmil 'thousand and ten thousand'. This 
phrase simply signifies a great number. A great number in English is expressed as 
hundreds and thousands rather than thousands and ten thousands. From the point 
of view of the English speaker, thousands and ten thousands in this context is a 
curious expression. On the other hand, the corresponding words in Chinese make an 
idiomatic phrase. The reason for the difference between Chinese and English in 
expressing a great number is that the Chinese numeration system is based on the 
4th power of 10 while the English one is based on the 3rd power of 10. In Chinese 
the words are shi 'ten', bai 'hundred', qian 'thousand', wan 'ten thousand', then 
shiwan 'ten ten-thousand', baiwan 'hundred ten-thousand', qianwan 'thousand ten- 
thousand', yi 'ten thousand ten-thousand'. In English the words are ten, hundred, 
thousand, then million. Thus the "third power" system mandates hundreds and 
thousands and the "fourth power" system requires qian wan 'thousands and 
ten-thousands' to express a great amount. Although the Esperanto translation looks 
ordinary on the surface, the linguistic system behind it is quite different from 
European languages. 
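
The contrast between the two numeration systems can be made concrete with a minimal Python sketch. It groups the decimal digits of a number in fours (the Chinese wan, yi units) or in threes (the English thousand, million units) and names each group by its unit. The function names and the short unit lists are assumptions made for this illustration only; they are not drawn from any dictionary cited here, and the output is a schematic reading rather than idiomatic Chinese or English.

```python
ENGLISH_UNITS = ["", "thousand", "million", "billion"]  # successive powers of 10**3
CHINESE_UNITS = ["", "wan", "yi"]                        # successive powers of 10**4


def group_digits(n: int, group_size: int) -> list[int]:
    """Split n into groups of `group_size` decimal digits, most significant first."""
    if n < 0:
        raise ValueError("n must be non-negative")
    base = 10 ** group_size
    groups = []
    while n > 0:
        n, remainder = divmod(n, base)
        groups.append(remainder)
    return groups[::-1] or [0]


def read_number(n: int, group_size: int, units: list[str]) -> str:
    """Name each non-zero digit group by the unit of its position."""
    groups = group_digits(n, group_size)
    if len(groups) > len(units):
        raise ValueError("number too large for the unit names supplied")
    parts = []
    for position, value in enumerate(groups):
        unit = units[len(groups) - 1 - position]
        if value:
            parts.append(f"{value} {unit}".strip())
    return " ".join(parts) if parts else "0"


print(read_number(10_000, 3, ENGLISH_UNITS))      # 10 thousand
print(read_number(10_000, 4, CHINESE_UNITS))      # 1 wan
print(read_number(10_000_000, 3, ENGLISH_UNITS))  # 10 million
print(read_number(10_000_000, 4, CHINESE_UNITS))  # 1000 wan
```

In the grouping by fours, ten thousand is a unit of its own (1 wan), so the next unit above qian 'thousand' is wan, which is why 'thousand and ten thousand' is the natural pairing for a great number. In the grouping by threes, ten thousand is only a multiple of the thousand unit.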

Following are some phrases that require knowledge of recent Chinese social 
and political events to understand their meanings: 

Hodiaŭ en plejparto de la produktaj brigadoj jam troviĝas nudpiedaj 
kuracistoj... (1979.253.6) 
Today in most of the production brigades there are already barefoot 
doctors... 

...iu intelekta junulino laboranta en ilia brigado toksiĝis dum ŝprucigado 
de insekticido... (1979.252.35) 

...some intellectual (female) youth working in their brigade became 
poisoned while spraying insecticide... 

Sed pro detruo de la "kvarpersona bando" la pliproduktado de industrio 
reduktiĝis grandpaŝe de post la jaro 1967. (1979.253.13) 
But because of the sabotage by the "gang of four" the greater 
production of industry was reduced in big strides after the year 1967. 

Intelekta junulo 'intellectual youth', meaning a high school graduate, is the transla- 
tion of the Chinese phrase zhishi qingnian. Nudpiedaj kuracistoj 'barefoot doctors', 
from Chinese chijiao yisheng, refers to paramedics who work in the countryside. 
These phrases have occurred frequently and should not constitute a difficulty in 
comprehension for the reader who has followed closely the Chinese social and 
political events in the past few years. These phrases are somewhat odd but are not 
as bizarre as the 'capitalist roader' in the following text: 

Iu el nia stacio kritikis Sekretarion Lei per dazibaŭ-o dirante, ke li estas 
iranto de la kapitalisma vojo. (1979.251.37) 

Someone from our station criticized Secretary Lei with a big-character 
poster saying that he was a capitalist roader. 

The period of the Cultural Revolution saw great linguistic creativity among the Chinese. 
Many interesting phrases were coined during the political movement. 'Capitalist 
roader', meaning a person who is in favor of capitalism, is one of these phrases. 
The Esperanto translation retains the authenticity of the Chinese language. English 
used in China also keeps such authenticity. But the feelings of strangeness on the 
part of Westerners have made some English teachers in China voice their views 
against such "Chinese English" (see Cheng 1982). In Chinese Esperanto circles, on 
the other hand, no such discussion has been recorded in publications. 

Chinese language elements are particularly apparent when idioms are 
translated literally: 

Ĝis sekiĝo de la maroj kaj putriĝo de la ŝtonoj mi tiel faros dum la tuta 
vivo! (1979.252.36) 
Until the seas dry and rocks crumble I will do so during my entire 
life! 

...por vigligi literaturan kaj artan kreadon estas necese praktiki la 
principon "kune floru cent floroj kaj voĉojn donu cent skoloj". 
(1979.252.10) 
...in order to stimulate literary and artistic creation it is necessary to 
practice the principle "let a hundred flowers bloom and a hundred 
schools contend". 

The Chinese phrases hai ku shi lan 'seas dry and rocks crumble: the seas may dry 
and the rocks may crumble, but I will...' and bai hua qi fang bai jia zheng ming 
'hundred flowers bloom and hundred schools contend' are literally translated into 
Esperanto. The Chinese metaphorical effects are thus transplanted into Esperanto. 



I have shown that the Esperanto used in China has acquired life through the 
introduction of Chinese elements, both cultural and linguistic. Esperanto would 
have been rather pale if no such variation were allowed. Indeed, it is such versati- 
lity that has made the language lively and natural. 

As mentioned earlier, Qian (1918a) thought that the Chinese words most 
appropriate for Esperanto would be ancient Chinese terms. On the contrary, we 
have seen that most of the Chinese elements in Esperanto are related to modern 
Chinese society. In fact, Esperanto translations of Chinese literature are mostly 
from those produced in this century (Hou 1981a). It is safe to say that the involve- 
ment of Chinese Esperantists in current affairs has kept the Esperanto movement 
in China alive. 

Local variation, which provides life to the international language, appears to 
be in conflict with the ideal of a language understood by all nationalities. Wood 
(1979) and Sherwood (forthcoming b), however, state that mutual intelligibility is 
maintained among the varieties of Esperanto spoken in various lands. The 
Esperanto community is "non-ethnic and non-territorial" (Wood 1979), yet the 
instrumental use of the language for international communication makes the users 
strive for an international norm. Wood (forthcoming) mentions that one way in 
which a global style can be attained is by having written and spoken material 
checked by editors of different native language backgrounds before presentation or 
publication. He cites as an example a recent selection of an English Esperantist by 
the Chinese Esperanto League to work with Chinese Esperantists in Beijing as a 
linguistic monitor. 

Wells (1978) approaches the "norm" by way of discussion of "good" pronuncia- 
tion in terms of "practical", "linguistic", "geographical", and "sociological" criteria. 
By "practical", it is meant that a good pronunciation is one which facilitates 
comprehension. The linguistic criterion calls for distinction between phonological 
contrasts. The geographical criterion measures the degree of freedom from geogra- 
phical interference. Wood (forthcoming) correctly points out that "geographical" is 
a misnomer; what is intended is avoidance of influence of native languages. The 
sociological criterion means acceptance of norms emerging in the community of 
Esperanto users. 

In light of the above discussion of Chinese use of Esperanto, we see that 
local variation of the international language is bound to exist. The "practical" and 
"linguistic" criteria are basic requirements of a common language. On the other 
hand, the aim to achieve a pronunciation, or usage in general, which is free from 
native language influences may not be attainable. Most Esperanto speakers have 
learned to speak the language as a second language in adulthood, and learning a 
second language at this stage of life is by no means a simple matter. Moreover, if 
avoidance of native language influence is pursued to extremes beyond the initial 
learning stage, it may easily result in social stigmatization. Esperanto is considered 
neutral and approachable mostly for the reason that it is not "owned" by anyone. 
That is, there are few native speakers to offend with "bad" pronunciation or 
"deviant" usage. A requirement of freedom from ethnolinguistic interference can 
become a deterring factor for the majority of speakers. In the Chinese context, 
this standard in a way contradicts the lofty goals of the Esperanto movement. 
During the early years of use of Esperanto in China, Hu (1922) emphatically 
pointed out that besides providing a tool for international communication, cultiva- 
tion of "international heart" (internacia koro) and eradication of racial prejudices 
were the ultimate ideals of the movement. It seems to me that cultivation of 
"international heart" requires one to actively promote ethno-linguistic adaptation 
in the Esperanto community. 

"Adaptation" means adjustment by the communicators. In our daily contacts 
with others, we make adjustments all the time. For example, in conversation, we 
adjust our perception on the basis of the other speaker's sex, voice quality, 
accent, dialect, or foreign language background. In our experience, we find that 
our ability to adjust perception is much higher than our ability to adjust speech 
production. In the Esperanto context, the speech community in terms of its 
aspiration is "non-ethnic" and "non-territorial" (Wood 1979). On the other hand, if 
one takes into consideration the language reality, the differences are as important 
as the unity for the future development of the Esperanto community. Wells (1978) 
lists various peculiarities of spoken Esperanto; Wood (forthcoming) gives a lengthy 
account of historical evolution of pronunciation; Sherwood (forthcoming b) also 
deals with variation of Esperanto; Chen (1981) gives a short but lively account of 
language, life, and Esperanto; I have just discussed the Chinese linguistic and 
cultural elements in the Esperanto used in China. These findings all point out that, 
in terms of cultural backgrounds and actual use, the Esperanto community is full of 
ethnic and territorial differences. 

The differences manifested in phonology, syntax, semantics, discourse, or 
general style, can be accepted as norms for the following reasons: (1) Variation 
enriches the language. History has shown that the vocabulary of 904 roots about a 
century ago has changed to accommodate scientific innovations and other elements. 
Without infusion of new elements from other languages and cultures, Esperanto 
would have been one of the many "closet" languages that were invented and left 
alone to die. (2) Acceptance of differences among speakers allows the language to 
become that of the masses rather than that of a small elite. In this way, the users 
can exercise their creativity in proper contexts and feel that they "own" the 
language. Sherwood (forthcoming b) also expresses this point clearly in the 
following words: "In any case, the continued strength of the Esperanto movement 
in Asia is likely to insure that the needs of non-European speakers will not be 
neglected, and that Asians will contribute to the evolution of a global style." 

Adaptation requires that Esperanto users make efforts toward understanding 
other cultures. Notice that we do not use the word "tolerance". "Tolerance" is a 
negative concept that is not compatible with the ideals of mutual understanding. 
Mutual understanding should also include understanding of native language 
influences on Esperanto. When an Esperanto speaker with a European language 
background understands that Standard Chinese does not have voiced stops, he is 
able to quickly adapt to the Esperanto spoken by the Chinese and diligently make 
perceptual adjustments on the basis of the context. As another example, when mil- 
kaj dekmilfoje 'thousand and ten-thousand times' is used for 'hundreds of 
thousands of times', Europeans should understand that the Chinese language 
background calls for such an expression and not that a mistake was made. Since 
Esperanto speakers, by nature of the movement, are often aware of linguistic 
differences, they can readily see that understanding linguistic 
differences is one of the processes to achieve the goals of the movement. It is in 
this frame of reference that studies of native language influences on Esperanto 
become most meaningful. 


I would like to thank Bruce Sherwood and Richard E. Wood for their 
comments on an earlier version of this paper. 


BUTLER, Montagu C. 1967. Esperanto-English dictionary. London: British 
Esperanto Association. 
CHEN, Yuan. 1957. Shijieyu qishi nian (Seventy years of Esperanto). Renmin 
Ribao (People's Daily) December 15, p. 7. 
CHEN, Yuan. 1978. Rememoroj pri Ĉina Proleta Esperantista Unio (Memories 
of Chinese Proletarian Esperantist Union). El Popola Ĉinio 235.14-16. 
CHEN, Yuan. 1981. Lingvo, ĉiutaga vivo kaj Esperanto (Language, everyday 
life and Esperanto). El Popola Ĉinio 270.12-14. 
CHENG, Chin-Chuan. 1982. Chinese varieties of English. In Braj Kachru ed. 
The other tongue: English across cultures 125-140. University of Illinois 
Press. 
EL POPOLA ĈINIO EDITORIAL STAFF. 1980. Retrospektivo kaj perspektivo 
(Retrospective and perspective). El Popola Ĉinio 260.2-3. 
ESPERANTO SECTION OF RADIO BEIJING. 1980. Nia Esperanto disaŭdigo 
(Our Esperanto broadcast). El Popola Ĉinio 260.16. 
HOFAN. 1980. 30 jarojn kune kun EPĈ (30 years with EPĈ). El Popola Ĉinio 
HOU, Zhiping. 1981a. Esperanto kaj la ĉina literaturo (Esperanto and Chinese 
literature). El Popola Ĉinio 268.18-20. 
HOU, Zhiping. 1981b. Fremdlandaj esperantistoj kaj la ĉina Esperanto-movado 
(Foreign Esperantists and the Chinese Esperanto movement). El Popola 
Ĉinio 273.11-13. 
HU, Yuzhi. 1922. Guojiyu de lixiang yu xianshi (Ideals and realities of inter- 
national language). Dongfang Zazhi 19:15.77-82. 
LU, Shihui. 1913. Shijieyu zhi shijieguan (The world outlook of Esperanto). 
Dongfang Zazhi 9:7.9-22. 
PIRON, Claude. 1977. La okcidenta dialekto (The occidental dialect). 
Esperanto 7-8.125-126. 
QIAN, Xuantong. 1918a. Esperanto. Xin Qingnian 4:2.173-177. 
QIAN, Xuantong. 1918b. Esperanto. Xin Qingnian 4:4.362-364. 
QIAN, Xuantong. 1918c. Zhongguo jinhou zhi wenzi wenti (Future language 
problems in China). Xin Qingnian 4:4.350-356. 
QIAN, Xuantong. 1918d. Duiyu Zhu Wonong jun liang xin de yijian (Comments 
on two letters from Mr. Zhu Wonong). Xin Qingnian 5:4.424-429. 
RENMIN RIBAO (People's Daily). 1951. Zhongguo Shijieyu yundong (Chinese 
Esperanto movements). Renmin Ribao March 13, p. 3. 
SHERWOOD, Bruce Arne. Forthcoming a. Statistical analysis of conversational 
Esperanto with discussion of the accusative. 
SHERWOOD, Bruce Arne. Forthcoming b. Variation in Esperanto. 
WARINGHIEN, G. 1970. Plena ilustrita vortaro de Esperanto (Full illustrated 
dictionary of Esperanto). Paris: Sennacieca Asocio Tutmonda. 
WELLS, John C. 1969. Esperanto dictionary. New York: David McKay 
WELLS, John C. 1978. Lingvistikaj aspektoj de Esperanto (Linguistic aspects of 
Esperanto). Rotterdam: Universala Esperanto-Asocio. 
WOOD, Richard E. 1979. A voluntary non-ethnic and non-territorial speech 
community. In William Francis Mackey and Jacob Ornstein eds. 
Sociolinguistic studies in language contact: Methods and cases 433-450. 
WOOD, Richard E. Forthcoming. The development of Esperanto pronunciation. 
YAO, Jiren. 1918. Zhongguo wenzi yu Esperanto (Chinese characters and 
Esperanto). Xin Qingnian 5:5.537-542. 
ZHANG, Qicheng. 1956. Yige buyong fanyi de guoji huiyi (An international 
conference without translation). Renmin Ribao (People's Daily) December 
15, p. 7. 
ZHU, Wonong. 1918. Fandui Esperanto (Opposing Esperanto). Xin Qingnian 


Studies in the Linguistic Sciences 
Volume 12, Number 1, Spring 1982 

PASSIVE IN PERSIAN 

Mohammad Dabir-Moghaddam 

In this paper I will be concerned with the question of pas- 
sive in Persian. The existence of passive construction in Per- 
sian has been a controversial issue in the transformational treat- 
ments of Persian. A group of scholars have postulated the exis- 
tence of 'passive' in Persian, while at least one linguist has 
called this construction 'inchoative.' In this paper I discuss 
the 'passive' and the 'inchoative' approaches to the notion 'pas- 
sive voice' in Persian and finally suggest that in addition to 
the category of inchoative constructions there is a syntactic 
category of passive in Persian. I will show that a definable 
subset of passive constructions in Persian may optionally under- 
go another syntactic process that produces a surface structure 
which is potentially ambiguous between an inchoative and a pas- 
sive interpretation. I will claim that the transformational 
rule of passive in Persian is a governed rule in the sense that 
it applies to a semantically definable class of verbs. 

1. Previous Approaches to Passive in Persian 

On the question of passive construction in Persian, transformational 
grammarians take two distinct positions. Marashi (cf. Marashi, 1970:18), 
Palmer (cf. Palmer, 1971:98), Sheintuch (cf. Sheintuch, 1973:54), Soheili- 
Isfahani (cf. Soheili-Isfahani, 1976:164) and Hajati (cf. Hajati, 1977:17) 
postulate (with very little discussion) a passive rule in Persian. In his 
discussion of S-Raising, Soheili-Isfahani utilizes an argument based on 
passivization and suggests the following processes to be involved in the 
derivation of a passive from its active counterpart. (The items in square 
brackets are mine.) 

(1) ... The direct object is promoted to the subject posi- 
tion, and the [underlying] subject is generally deleted 
in Persian. As an illustration, consider the following 
active sentence in (67) [i.e., (2) below] and its passive 
form in (68) [i.e., (3) below]: 

Soheili-Isfahani (1976:164) 

(2) Iranian ferdowsi-ra bozorg-tarin sa?er-E hemasi mi-semar-and.
    Iranians Ferdowsi+obj. greatest poet epic reckon+subj.

    'Iranians reckon Ferdowsi as the greatest epic poet.'

(3) ferdowsi bozorg-tarin sa?er-E hemasi semorde mi-sav-ad.
    Ferdowsi greatest poet epic reckon become+subj.

    'Ferdowsi is reckoned as the greatest epic poet.'

The second position is illustrated by Moyne (1974), which claims that
there is no passive construction in Modern Persian and that all instances
which have been referred to as 'passive' are in fact 'inchoative.' Before I
discuss this position I would like, for the sake of convenience of discussion,
to refer to the first approach as the 'passive' approach and to the second 
approach as the 'inchoative' approach. 

Moyne (1974) addresses the question of passivization in Persian and, on
the basis of, among others, a historical observation and the claim that the
'by-phrase' in Persian has an instrumental sense, draws the following
conclusion:

(4) In conclusion the facts presented in this paper suggest 
that there is no active-passive opposition in Persian. . ., 
but there are certain inchoative structures in sodan. 

Moyne (1974:265) 

Moyne (1974) can be criticized from several different perspectives. In view
of limitations of space, I will discuss only two aspects of the paper:
'historical' and 'empirical.' First, a note on the 'historical' aspect 
of Moyne (1974). 

1.1 Historical 

Moyne makes the following claim with respect to a historical develop- 
ment in the passive construction in the Persian language: 

(5) Viewed through the historical development, the lack of 
passive in Persian should not come as a surprise. In 
Old Persian, lexical or morphological passive forms 
were common. Specially, the auxiliary kart- was pas- 
sive and took an oblique subject (cf. Meillet, 1911). 
In Middle Persian (Pahlavi) , we can trace the demise 
of the Middle and the Passive as lexical processes. 
In particular, krt- {=kart->kard-) becomes active and 
it is passed into New Persian as an active verb. 

ibid: 250 

Although what Moyne claims in the above quotation is true, it is only part 
of the truth. Evidence from four historical stages in the history of the 
Persian language, i.e., Old Persian (6th c. B.C.-3rd c. B.C.), Middle
Persian (Pahlavi) (224-651 A.D.), Early Modern Persian (864-1005 A.D.), and
Modern Persian, suggests the following. In Old Persian there existed two
different ways of passive formation: (a) 'inflectional,' i.e., addition of
the suffix -ya- to the active form (Kent, 1950:73(#220)-74, 88(#275)), and
(b) 'periphrastic,' i.e., a combination of past participle and 'be' or just

the past participle without 'be' (Kent, 1950:88(#275)). Similarly, in
Middle Persian (Pahlavi), there existed both a category of inflectional
passive formed by the morpheme -ih-(-yh-), which appears following the
present stem (Heston, 1976:161), and periphrastic passive formed through the
combination of past participle with 'be' (Heston, 1976:177) as well as past
participle without 'be' (Brunner, 1971:240).

Thus, we see a trace of the Old Persian passive morpheme -ya- with 
the change of a > h in Pahlavi. The fact that the Old Persian -ya- morpheme
appears without any phonetic change in Sogdian (cf. Heston, 1976:162), an
eastern Middle Iranian language (as opposed to Pahlavi, which is a western
Middle Iranian language), further confirms the postulation of -ya- as the
historical antecedent of the Pahlavi -ih-(-yh-). In Early Modern Persian,
there is no clearly identifiable passive morpheme (Heston, 1976:161), which
means that in the process of historical development this morpheme either
has lost its function and productivity or simply has been lost because of
sound change. This is not a terribly surprising phenomenon for the
following two reasons. First, in Pahlavi we see a fairly 'restricted' use
of verbs with this passive suffix (Nyberg, 1974:282; Heston, 1976:161;
Brunner, however, takes the position that they are 'used freely' (Brunner,
1971:239)), and it is, in fact, the periphrastic class which constitutes a
large category (Brunner, 1971:238). Second, the spread of the periphrastic
category as the only category for passivization seems to be part of a
general historical development between Old and Modern Persian in which
there is a tendency away from agglutination toward lexicalization/isolation.
As a result, there is no passive suffix in Modern Persian.

With regard to the periphrastic passive construction, two very
interesting historical developments can be observed. The first of these is
the reinterpretation of the Old and Middle Persian periphrastic passives as
active constructions in Early Modern Persian (Brunner, 1971:246). Although
this is clearly attested only in Early Modern Persian, its origin can be
traced back to Middle Persian (ibid.). This process is, more or less, what
Moyne refers to in his historical comment quoted earlier. What Moyne does
not mention, however, is the second historical development that takes place
in Early Modern Persian. This historical development is the emergence of
the single periphrastic category of passive, which contains the auxiliary
verb sudan 'to become; to go' in Early Modern Persian, and the continuation
of that, in the form of sodan 'to become', in Modern Persian. In Early
Modern Persian texts (just as is the case in Modern Persian), there are two
patterns of passive formation: (1) the combination of a past participle
with a form of sdn 'become', and (2) the combination of an adjective with
sdn (Heston, 1976:180-182). The following are examples of the first pattern
in different tenses, e.g., past, past perfect, and present (the examples are
given in their transliterated form):

(6) w 'z hr dw sp'h bsy'r ksth sd. 

'and many from both armies were killed.' 

(7) w 'yn zryr br dst 'w ksth sdh bwd. 

'and this Zarir had been killed by his hand.' 

(8) f d'nsth §w2. 

'So that (the version of each group) will be known.' 

Heston (1976:181) 

The following sentence exemplifies the second pattern of passive
formation in Early Modern Persian:

(9) zn'n msr 'zyn k'r "g'h sdnd. 

'The women of Egypt became aware of this matter. ' 


There are at least three pieces of evidence for the claim made above
that the periphrastic category of passive containing the auxiliary verb
sudan in Early Modern Persian is an innovation. First, in Middle Persian
(Pahlavi), the verb sudan serves only as an intransitive 'motion' verb
(Heston, 1976:183) in both inflectional and periphrastic tenses. The
following examples illustrate this point:

(10) Inflectional Tense

       Present-future     sawed              'He goes, is going, will go'

     Periphrastic Tenses

       Present-perfect    sud ested          'He has gone.'

       Perfect            sud (hem)          'He went.'

       Plu-perfect        sud bud (hem)      'He had gone.'
                          sud estad (hem)

Brunner (1971:239) 

Second, the fact that the verb sudan in Pahlavi occurs neither with
participles of transitive verbs nor with adjectives (Heston, 1976:183)
further supports the claim that in Middle Persian this verb is always
a verb of motion and not a passive auxiliary. This means that in Early
Modern Persian the Middle Persian intransitive motion verb sudan 'to go'
acquired a 'specialized' usage as a passive auxiliary. Third, in Early
Modern Persian, there are instances where sudan still carries its earlier
function as an intransitive motion verb, e.g.,

(11) bh krm'n sd. 

' (he) went to Kerman' 

Heston (1976:232) 

Thus, the double function of sudan (as a verb of motion and as a passive
auxiliary) suggests that Early Modern Persian mirrors a stage in which we
can observe the beginning of a syntactic innovation. I would like to claim
that in Modern Standard Persian, this syntactic innovation is completed and
sodan functions only as an auxiliary verb.

Therefore, the conclusion that can be drawn from the above discussion
is that although, as Moyne has claimed, there was a process by which Old
and Middle Persian passive constructions were reinterpreted as active (in
Early Modern Persian) and were passed as such into Modern Persian, at
the same time (i.e., in Early Modern Persian) a new category of passive
emerged as a consequence of a change in the function of an intransitive
motion verb into an auxiliary verb. Thus, as I pointed out earlier, Moyne's
'historical' comment is only part of the truth; in fact, the other part of
the truth that I discussed above constitutes an argument against Moyne's
conclusion as to the non-existence of a passive construction in Modern
Persian. In the following paragraphs I discuss the 'empirical' aspect of
Moyne (1974).

1.2 Empirical 

Empirically speaking, I do not share many of the intuitive judgments
expressed in Moyne's paper. Thus, I cannot accept some of the conclusions
that were drawn on the basis of those judgments. I have asked several
native speakers of Persian for their intuitive judgments of the data which
are unacceptable to me, and all of them are in agreement with my judgment.

For instance, the following discussion by Moyne, which leads to the
claim that the expressed logical subjects in the 'so-called' passive
constructions are instrumental, i.e., that they have 'the instrumental
sense' (Moyne, 1974:252), is based on intuitive judgments that I do not
share.

(12) As a general rule the so-called passive constructions in 
Persian has no agent indicated. . . There are, however, 
some instances in modern usage where the so-called passive
constructions are used with an agent indicated.

Moyne (1974:250) 

In order to support that claim, Moyne provides the following examples.
(The numbers on the right-hand side refer to the example numbers used in the
original paper.)

(13) a. az dast-e ali kosta sod                               (3a)
        from hand-of Ali killed became

        'he was killed by Ali.'

     b. emsab avaz tavassot-e banu Parvana xanda              (3b)
        tonight song through madam P. sung will-become

        'tonight songs will be sung by Miss Parvana.'

     c. ali bevasile-e mamur-e dadgostary dastgir             (3c)
        Ali by-means-of agent-of justice arrested

        'Ali was arrested by an agent of the Justice Department.'

ibid (250-251)

He further adds that 

(14) Notice that the examples in (3) [e.g., (13) above] are 
awkward and relatively new in the language. . . These 
instrumental constructions do not clearly specify an 
agent for the action. For example (3a) [i.e., (13a) 
above] means that Ali was instrumental in the killing 
of someone, but it does not necessarily mean that he 
personally performed the killing. 

ibid (251) 

Basically, I agree with Moyne that in passive sentences the agent remains 
generally unexpressed. However, I postpone any further elaboration of this 
point until the last section in this paper. In this section I will only 
concentrate on instances mentioned by Moyne where the agent is indicated. 
My objection to Moyne's claim that the 'by phrase' in Persian is
instrumental is based upon my disagreement with his data in (13) above as
well as his interpretation of that data. As far as my intuition is
concerned, sentence (13a) is totally unacceptable and should be preceded by
an asterisk. However, there is a sentence, (15), that resembles (13a) and is
grammatical:

(15) be dast-e ali kost-e sod-∅
     at hand-of Ali killed became-subj.

     'He was killed by Ali.'

If that is the case, then the phrase be dast-e ali 'by Ali' in the above
example necessarily means that Ali personally performed the act of killing
and that he is, in fact, the agent of the action. Sentences (13b) and (13c)
are both well-formed, but in neither of these sentences do I get any
instrumental interpretation as suggested by Moyne. In both of these
sentences, the 'tavassot-E phrase' and the 'bevasile-E phrase' are stylistic
variants of each other and both phrases have only the agentive connotation.
The most widely used equivalent of the English 'by phrase' is the
'tavassot-E phrase' in Persian, and all of the examples in (13) can be
equally expressed with the 'tavassot-E phrase':

(16) a. (u) tavassot-E ali kost-e sod-∅
        he/she by Ali killed-part. became-subj.

        'He/she was killed by Ali.'

     b. emsab avaz tavassot-E banu parvane xand-e mi-sav-ad.
        tonight song by madam P. sang-part. IMPF.-become-subj.

        'Tonight songs will be sung by Miss Parvane.'

     c. ali tavassot-E mamur-E dadgostary dastgir sod-∅
        Ali by agent justice dep. arrested became-subj.

        'Ali was arrested by an agent of the Justice Department.'

Support for my position, as opposed to Moyne's position, that in the
'so-called' passive constructions the preposition 'tavassot-E' marks an
agentive phrase comes from at least two different sources. One piece of
evidence is derived from the 'semantic contradiction' which is produced
should each of the passive sentences in (16) be conjoined with its
corresponding active counterpart in the negative. If the passive sentences
in (16) were indeed 'instrumental constructions,' as Moyne would like to
claim, then such 'semantic contradiction' would not be expected. In that
case the first conjoined sentence would simply introduce the underlying
subject as an instrument in the fulfillment of the action expressed by the
verb and the second conjoined sentence would reinforce that instrumental
sense by denying the agentive role of the underlying subject. The following
non-sentences exemplify the semantic contradiction just alluded to:

(17) a. *(u) tavassot-E ali kost-e sod-∅ {va/vali}
         he by Ali killed-part. became-subj. {and/but}

         ali u-ra na-kost-∅
         Ali he-DO not-killed-subj.

         'He was killed by Ali {and/but} Ali did not kill him.'

     b. *emsab avaz tavassot-E banu parvane xand-e
         tonight song by madam P. sang-part.

         mi-sav-ad {va/vali} emsab banu parvane
         will-become-subj. {and/but} tonight madam P.

         avaz ne-mi-xan-ad.
         song not-will-sing-subj.

         'Tonight songs will be sung by Miss Parvane {and/but} Miss
         Parvane will not sing songs tonight.'

     c. *ali tavassot-E mamur-E dadgostary dastgir
         Ali by agent Justice Dep. arrested

         sod-∅ {va/vali} mamur-E dadgostary
         became-subj. {and/but} agent Justice Dep.

         ali-ra dastgir na-kard-∅
         Ali-DO arrested not-made-subj.

         'Ali was arrested by an agent of the Justice Department
         {and/but} an agent of the Justice Department did not arrest Ali.'

The second piece of evidence supporting my position comes from the
'functional distribution' of the prepositions ba 'with', bevasile-E 'by; by
means of', and tavassot-E 'by' in Persian. In an active sentence such as
(19a) below all of the above three prepositions can precede the instrumental
noun phrase; however, there is a left to right 'hierarchy of preference'
with respect to the occurrence of these prepositions. The 'hierarchy of
preference' is expressed in (18).

(18) Hierarchy of Preference 

ba is the most preferred preposition to occur with an 
instrumental nominal, tavassot-E is the least preferred 
preposition to occur with an instrumental nominal, and 
bevasile-E stands between these two extremes. 

(19) a. qatel maqtul-ra ba/bevasile-E/tavassot-E caqu kost-∅
        murderer victim-DO with/by means of/by knife killed-subj.

        Lit. 'The murderer killed the victim with/by means of/by
        a knife.'

In the passive counterpart of the above active sentence, where two
occurrences of the above prepositions are expected, the 'functional
distribution' of the prepositions manifests itself. As the following
examples clearly show, the co-occurrence of ba or bevasile-E before the
instrumental NP and tavassot-E before the logical subject is permitted (cf.
(19b)), whereas the co-occurrence of tavassot-E before the instrumental NP
and bevasile-E before the underlying subject is not allowed (cf. (19c)).
(19c) also shows that the co-occurrence of ba before the instrumental NP and
bevasile-E before the underlying subject is allowed. The examples in (19b)
and (19c) also indicate that two occurrences of bevasile-E or of tavassot-E
are not permitted.


(19) b. maqtul ba/bevasile-E/*tavassot-E caqu tavassot-E
        victim with/by means of/by knife by

        qatel kost-e sod-∅
        murderer killed-part. became-subj.

        Lit. 'The victim was killed by the murderer with/by means
        of/*by a knife.'

     c. maqtul ba/*bevasile-E/*tavassot-E caqu bevasile-E qatel
        kost-e sod-∅

        Lit. 'The victim was killed by the murderer with/*by means
        of/*by a knife.'

Before I proceed, I would like to point out that the ungrammaticality of 
(19c) in which the occurrence of tavassot-E precedes that of bevasile-E,
and the grammaticality of (19b) in which the occurrence of bevasile-E comes 
before that of tavassot-E is a natural consequence of the hierarchy of 
preference stipulated above. This, I believe, further strengthens the 
postulation of that hierarchy. Now, notice further that in the passive 
counterparts of an active sentence which does not contain an instrumental 
NP, the occurrence of the preposition ba before the underlying subject is 
not permitted whereas bevasile-E and tavassot-E can equally appear before 
it. The following examples in which the (e) member is the passive form of 
(d) are illustrative: 

(19) d. qatel maqtul-ra kost-∅
        murderer victim-DO killed-subj.

        'The murderer killed the victim.'

     e. maqtul *ba/bevasile-E/tavassot-E qatel kost-e sod-∅

        'The victim was killed *with (instrumental)/by the murderer.'

If the observations made above are correct, then an analysis which is
capable of accounting for the native speaker's intuition with respect to the
'hierarchy of preference' alluded to above will be in a position to
automatically predict the distributional differences observed in
(19a)-(19e). If an analysis claims that the 'by phrase' in Persian is
instrumental (i.e., that it has instrumental sense) and is prepared to go
beyond the simple description (i.e., listing) of the distributions observed
in the above data, then it has to claim that ba only marks inanimate
instrumental NPs whereas tavassot-E primarily marks animate instrumental
NPs and bevasile-E shares some of the properties of the former and some of
the properties of the latter instrumental prepositions. This
characterization provides a justification based on the feature of animacy
for our 'hierarchy of preference.' Notice that although the above
characterization is a step beyond mere listing of the observed
distributions, it is still a step away from explaining the phenomenon. An
explanation of the phenomenon is obtained only when two further facts are
considered as well: tavassot-E primarily marks those NPs which in
underlying representation are agents (e.g., consider the grammaticality of
(19b) and (19e)) rather than instruments (e.g., the ungrammaticality of
(19b) and (19c)), whereas ba marks those NPs which in underlying
representation are always 'instrumental' (e.g., (19a), (19b), and (19c) and
the ungrammaticality of (19e)) and never 'agentive' (cf. (19f) below, which
is anomalous).

(19) f. *caqu maqtul-ra kost-∅
         knife victim-DO killed-subj.

         'The knife killed the victim.'

The general explanation which takes into account the two facts just
mentioned can be suggested along the following lines: The only function of
ba is to mark instrumental NPs, whereas the primary (i.e., major) function
of tavassot-E is to mark agentive phrases; however, tavassot-E shares a
small fraction of the function of ba as well (cf. (19a)). In this analysis,
the preposition bevasile-E is claimed to share part of the 'functional
distribution' of ba (cf. (19a) and (19b)) and part of the 'functional
distribution' of tavassot-E (cf. (19c) and (19e) as well as (19g), which
shows the 'semantic contradiction' discussed earlier). In other words, the
semantic domain of bevasile-E intersects with part of the semantic domain
of ba and part of the semantic domain of tavassot-E.

(19) g. *maqtul bevasile-E qatel kost-e sod-∅ {va/vali}
         victim by murderer killed became-subj. {and/but}

         qatel maqtul-ra na-kost-∅
         murderer victim-DO not-killed-subj.

        *'The victim was killed by the murderer {and/but} the murderer
         did not kill the victim.'

The discussion of the above functional distribution of prepositions (1) ac- 
counts for the native speaker's intuition with respect to the 'hierarchy of 
preference' and (2) predicts the forms observed in (19a) - (19e). 
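
The distribution just summarized can be restated as a small executable check. The following is a minimal sketch in Python and is not part of the original analysis: the numeric ranks, the function name `acceptable`, and the rule that the instrumental preposition must stand strictly to the left of the agent-marking preposition in the hierarchy are assumptions introduced here to encode (18) together with the judgments reported in (19a)-(19e).

```python
# Illustrative encoding only. Ranks follow the left-to-right hierarchy of
# preference in (18): ba > bevasile-E > tavassot-E as instrument markers.
INSTRUMENT_RANK = {"ba": 0, "bevasile-E": 1, "tavassot-E": 2}

# Assumption drawn from (19e): ba never marks the underlying subject (agent).
AGENT_MARKERS = {"bevasile-E", "tavassot-E"}


def acceptable(instrument_prep=None, agent_prep=None):
    """Return True if the combination of prepositions matches the
    judgments reported in (19a)-(19e).

    instrument_prep: preposition before the instrumental NP, or None.
    agent_prep: preposition before the underlying subject, or None
                (None in an active clause or an agentless passive).
    """
    if instrument_prep is not None and instrument_prep not in INSTRUMENT_RANK:
        return False
    if agent_prep is not None and agent_prep not in AGENT_MARKERS:
        return False
    if instrument_prep is not None and agent_prep is not None:
        # Hypothetical rule: the instrument marker must be strictly more
        # "instrumental" (further left in (18)) than the agent marker.
        return INSTRUMENT_RANK[instrument_prep] < INSTRUMENT_RANK[agent_prep]
    return True


if __name__ == "__main__":
    for agent in ("tavassot-E", "bevasile-E"):
        for instrument in INSTRUMENT_RANK:
            print(instrument, "+", agent, "->", acceptable(instrument, agent))
```

On this encoding, the starred forms in (19b) and (19c) are exactly the combinations in which the instrument marker does not outrank the agent marker, which is one way of reading the statement that the pattern follows from the hierarchy.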

These two pieces of evidence suggest that the characterization of the 
'by-phrase' in Persian as instrumental is not valid. 

Before I proceed, the discussion presented hitherto may be summarized 
as follows: 

(20) a. Evidence based on the diachronic syntax of Persian suggests
        that since Early Modern Persian the Middle Persian
        intransitive motion verb sudan 'go' has acquired a specialized
        usage as a passive auxiliary, i.e., sodan 'become'.

     b. Evidence based on 'semantic contradiction' and 'functional
        distribution' related to the tavassot-E NP 'by-phrase' suggests
        that the preposition tavassot-E marks an agentive
        phrase in Persian.

In the following section (1) I will attempt to show why passive in 
Persian has been called inchoative, and (2) I will present my analysis of 
passive in Persian. 

2. A New Proposal for the Treatment of Passive in Persian 

As a first step toward presenting my account of passive/inchoative in 
Persian, I consider it essential to try to understand the factors which 
were responsible for the emergence of two different approaches (passive 
versus inchoative) with regard to the syntactic construction under investi- 
gation. A comparison of the following English sentences with their corre- 
sponding Persian counterparts is helpful: 

(21) a. The water is cool. 

b. The water cooled. 

c. The water became cool. 

d. *The water became cool by John. 

e. John cooled the water. 

f. The water was cooled (by John).

(22) a. ab sard ast-∅
        water cool is-subj.
        'The water is cool.'

     b. ab sard sod-∅
                became-subj.
        'The water cooled/became cool.'

     c. ab sard sod-∅

     d. ab tavassot-E mahmud sard sod-∅
        water by M. cool became-subj.
        *'The water became cool by Mahmud.'

     e. mahmud ab-ra sard kard-∅
        M. water-DO cool made-subj.
        'M. cooled the water.'

     f. ab (tavassot-E mahmud) sard sod-∅
        water by M. cool became-subj.
        'The water was cooled (by Mahmud).'

A comparison of the (b) and (c) sentences in the English paradigm with the
(b) and (c) sentences in the Persian paradigm clearly indicates that what
is traditionally called an inchoative construction (Lakoff, 1970:32) in
English is expressed by the auxiliary sodan 'to become' in Persian. Now if
we compare the English sentence (f) with its corresponding sentence in
Persian, we notice that the Persian passive again contains the auxiliary
sodan 'to become.' Thus, the occurrence of a single auxiliary in
constructions which correspond to English 'inchoative' and 'passive'
constructions is one major factor for the existence of a 'passive' and an
'inchoative' approach for Persian. Notice that the overlap between the
inchoative and passive auxiliary in Persian explains the well-formedness of
the Persian (d), and the disparity of the two constructions in English
justifies the ill-formedness of the English (d). A comparison of the (f)
sentences in the two paradigms reveals that the deletion of the 'by-phrase'
in the Persian passive sentence (which is generally favored) produces a
surface structure which is identical to the inchoative sentence (b),
whereas obviously this is not the case in English. Thus, after the deletion
of the 'by-phrase' in item (f) of the Persian paradigm the sentence would be
ambiguous/non-committal between a reading in which 'the water became cool
of its own accord' (i.e., an inchoative reading), e.g., items (b) and (c),
and a reading in which 'the water was cooled as a result of some deliberate
action by someone' (i.e., a passive reading), e.g., item (f). Thus, the
omission of the 'by-phrase' in sentences such as (22f) in Persian creates an
ambiguous surface structure, which is another factor responsible for the
existence of a 'passive' and an 'inchoative' approach. (For further
illustration of the ambiguity see pages 15-18.)

The analysis that I will argue for in the following paragraphs sug- 
gests that there is a syntactic category of passive in Persian independent 
of the inchoative constructions. Furthermore, I will claim that the appli- 
cation of the passive rule to the structures underlying active sentences 
always produces unambiguous passive constructions. I will show that a de- 
finable subset of passive sentences such as item (22f) in which after the 
deletion of the 'by-phrase' we arrive at ambiguous surface structures in 
fact represents the output of another syntactic process which may apply to 
the output of the passive transformation. Finally, I will study some verbs 
which do not allow passivization to apply to them. In light of this obser- 
vation, I will claim that the rule of passive in Persian is a governed rule. 
I begin this discussion with what I will eventually call 'unambiguous/ 
transparent passive constructions.' 

In addition to the 'historical' argument that I presented earlier,
which supports the emergence of a new periphrastic category of passive
since Early Modern Persian, there is other evidence which strengthens the
postulation of passive in Modern Persian. One piece of evidence comes from
the characteristics reflected by the active sentences and their passive
counterparts in Persian, which are in line with the characteristics of the
'universals of passivization' as stipulated in the proposals of Perlmutter
and Postal (1977). This proposal suggests that a universal characterization
of passivization in terms of 'word order', 'case', or 'verbal morphology'
is not possible; instead an appeal to such largely traditional relational
notions as 'subject of' and 'direct object of' paves the way for the
expression of two universals which underlie the process of passivization.
The two universals that they propose are the following:

(23) i. A direct object of an active clause is the (super- 
ficial) subject of the 'corresponding' passive. 

Perlmutter and Postal (1977:399) 

ii. The subject of a monostratal active sentence is a 
chomeur in the second stratum of the corresponding 
bistratal passive. 

ibid (1977:409) 


Perlmutter and Postal further claim that some of the consequences of uni- 
versal (ii) are, in turn, universal and some of the consequences of it are 
language-particular. (iii) and (iv) below reflect a universal and a 
language-dependent consequence of (ii), respectively: 

iii. In the absence of another rule permitting some
     further nominal to be direct object of the clause,
     a passive clause is a (superficially) intransitive
     clause.

ibid (1977:399) 

iv. The marking of the passive chomeur totally depends
    on individual languages. Some mark it with a
    preposition (e.g., English), some with postpositions
    (e.g., Turkish (cf. Aissen, 1974b), and
    Eskimo with an instrumental postposition (cf.
    Perlmutter and Postal, 1977:397)), some with case
    (instrumental in Russian, and ergative in Basque
    (cf. Perlmutter and Postal, 1977:397-398)), and
    some not at all (e.g., Malagasy (cf. Perlmutter
    and Postal, 1977:395)).

Keeping the above universals and language-dependent properties of
passivization in mind, in the following paragraph I demonstrate that the 
(a) and their corresponding (b) sentences in the Persian data below reflect 
these properties and from that I conclude that there is no reason why the 
members of each of these and similar pairs should not be called 'active' 
and 'passive' counterparts. 

(24) a. ma?mur-an-E savak yek ostad-E danesgah-ra kost-and.
        agent-plu. SAVAK a professor university-DO killed-subj.

        'The SAVAK agents killed a university professor.'

     b. yek ostad-E danesgah (tavassot-E ma?mur-an-E savak) kost-e sod-∅
        a professor university by agent SAVAK killed-part. became-subj.

        'A university professor was killed (by the SAVAK agents).'

(25) a. mardom name-?-i-ra be re?is jomhur nevest-and.
        people letter-a-DO to head republic wrote-subj.

        'The people wrote a letter to the president.'

     b. name-?-i (tavassot-E mardom) be re?is jomhur nevest-e sod-∅
        letter-a by people to head republic wrote-part. became-subj.

        'A letter was written to the president (by the people).'

(26) a. re?is-E danesgah darxast-E ostad-an-ra paziroft-∅
        head university request professor-plu.-DO accepted-subj.

        'The head of the university accepted the request of the
        professors.'

     b. darxast-E ostad-an (tavassot-E re?is-E danesgah) paziroft-e sod-∅
        request professor-plu. by head university accepted-part. became-subj.

        'The request of the professors was accepted (by the head
        of the university).'

(27) a. ma?mur-an mottaham-ra be dadgah avard-and.
        agent-plu. accused-DO to court brought-subj.

        'The agents brought the accused person to the court.'

     b. mottaham (tavassot-E ma?mur-an) be dadgah avard-e sod-∅
        accused by agent-plu. to court brought-part. became-subj.

        'The accused person was brought to the court (by the
        agents).'

The (a) member of each of the above examples shows three basic features of
Persian syntax. First, the direct object carries the accusative case
marker -ra, and the subject carries the nominative -∅ case marking. Second,
the 'person' and 'number' of the subject are always copy-marked on the verb.
Third, the basic word order of Persian is SOV. A comparison of the (a) and
(b) members in each pair clearly indicates that the direct object in the (a)
sentence, by losing its accusative case marker -ra, acquiring the nominative
-∅ marking, being copy-marked on the verb, and finally by occurring in the
initial position of the sentence, assumes the role of (superficial) subject
of the corresponding (b) sentence. (Notice that in sentences (24a), (25a),
and (27a) the subject is plural, hence the verb bears the plural subject
marking -and 'they'; whereas in the corresponding (b) sentences, the
superficial subject, which is the ex-direct object, is singular, hence the
verb bears the singular subject marking -∅ 'he/she/it'. In sentences (26),
however, the subject of both the (a) and (b) members is singular, hence the
verb bears the -∅ ending in both cases.) This promotion of the direct
object of the (a) sentences to the subject of the corresponding (b)
sentences is exactly the phenomenon that universal (i) refers to.
Incidentally, the fact that the subject in all (a) sentences above bears the
'1-relation' (to use the terminology of Relational Grammar), whereas
the direct object of the (a) sentences bears the '1-relation' in the
corresponding (b) sentences, suggests that the subject of the (a) sentences
is a chomeur in the (b) sentences (cf. Perlmutter and Postal, 1977:408). This
is exactly what is stated in universal (ii). The surface structure of all
of the (b) sentences can be represented as s[NP (by-phrase) V], which is the
surface representation of an intransitive clause. This situation is
thoroughly compatible with the assumption of universal (iii). Finally, as
should be clear by now, the passive chomeur in Persian (when it appears
on the surface) will be marked with a preposition (cf. iv). To conclude, the
complete compatibility of the Persian data with the assumptions of
universals (i) and (ii) and the consequences of those universals suggests
that, contrary to the claims of the 'inchoative' approach (cf. item (4), for
instance), the grammatical category of passive should be postulated as part
of the grammar of the Persian language.

Earlier, I claimed in relation to the items in (22) that after the deletion
of the 'by-phrase' the passive sentence, e.g., (22f) (which structurally
consists of an adjective + sodan), would be ambiguous/noncommittal
between an inchoative and a passive reading. In what follows, I intend to
illustrate that all of the (b) sentences that were mentioned in the pre- 
vious paragraph (whose verb morphology consists of a 'Past Participle + 
sodan'), with the by-phrase omitted, indicate that the proposition took 
place as a result of some deliberate action by someone else. Hence, I con- 
clude that these constructions are definitely 'passive' without having any 
overlap with the 'inchoative' constructions. 

The following test is useful to show that the (24b) - (27b) sentences 
express propositions which took place as a consequence of some deliberate 
action by someone else. In all of these and similar sentences, if we insert
the adverbial phrase xod be xod 'gratuitously; on his/her/its own accord' in
the position where the 'by-phrase' might otherwise occur, the resulting
construction will be contradictory:

(24') b. *yek ostad-E danesgah xod be xod kost-e sod-ø
          a professor university self with self killed-part. became-subj.
          Lit. *'A university professor became killed gratuitously.'

(25') b. *name-?-i xod be xod be re?is jomhur nevest-e sod-ø
          letter-a self with self to head republic wrote-part. became-subj.
          Lit. *'A letter became written to the president gratuitously.'

(26') b. *darxast-E ostad-an xod be xod paziroft-e sod-ø
          request professors self with self accepted-part. became-subj.
          Lit. *'The request of the professors became accepted
          gratuitously.'

(27') b. *mottaham xod be xod be dadgah avard-e sod-ø
          accused self with self to court brought-part. became-subj.
          Lit. *'The accused person became brought to the court
          gratuitously.'

The contradiction in the above sentences arises because all of those con- 
structions necessarily imply an agent, whereas the presence of the adver- 
bial phrase xod be xod cancels that implication. Therefore, it could ap- 
propriately be claimed that in sentences (24b) - (27b) and similar sentences 
(with the 'by-phrase' omitted), where the verb morphology consists of a 'Past
participle + sodan', the only possible reading is that there is an agent 
implied. I am going to refer to these and all sentences with similar char- 
acteristics as 'unambiguous/transparent passives.' I claim that the struc- 
ture underlying item (a) and the structure underlying its corresponding (b) 
member in sentences (24) - (27) and similar sentences are related to each 
other by an optional transformational rule of passive. The transformational 
rule of passivization (a) promotes the direct object of the active sentence 
into the superficial subject position, (b) demotes the underlying subject 
of the active clause to a position after the derived subject, preceded by 
the inserted preposition tavassot-E 'by', and (c) changes the verb of the 
active sentence into its past participial form and inserts an appropriate 
form of the passive auxiliary sodan (depending on the tense of the sentence 
and the number and person of the derived subject) after the past participle. 
This statement of passive formation (taking into consideration a wider range 
of data than those presented here) can be formalized as follows: 

(28) Passive Formation Rule (Optional)

SD:  X - NP - Y - NP - Z - V
     1   2    3   4    5   6

SC:  1, 3, 4, (5), tavassot-E+2, (5), 6 + sodan
         [-acc]                     [+Past Part.]

Condition: (i)  2, 4, and 6 are clausemates.

           (ii) 5 can appear either before or after the
                'by-phrase.'
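
The structural change in (28) can also be read procedurally. The following Python sketch is only an illustration under simplifying assumptions: a clause is reduced to a flat record of terms 2, 4, 5 and 6, the clausemate condition (i) is taken to be satisfied, the past participle is supplied ready-made, and agreement on sodan is limited to the third person past forms that occur in the examples. The names Clause and passivize are hypothetical labels introduced here, not part of the analysis itself.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Clause:
    subject: str                 # term 2: underlying subject NP
    direct_object: str           # term 4: accusative NP, written without -ra
    participle: str              # past participle of term 6, e.g. "paziroft-e"
    other: Optional[str] = None  # term 5: optional material such as a PP
    object_plural: bool = False  # number of term 4, for agreement on sodan


def passivize(c: Clause, other_first: bool = False) -> List[str]:
    """Structural change of (28): 4[-acc], (5), tavassot-E+2, (5), 6 + sodan."""
    words = [c.direct_object]            # promoted object, case marker dropped
    agent = ["tavassot-E", c.subject]    # demoted subject as 'by-phrase'
    other = [c.other] if c.other else []
    # Condition (ii): term 5 may precede or follow the 'by-phrase'.
    words += other + agent if other_first else agent + other
    # The auxiliary agrees with the derived subject (term 4).
    aux = "sod-and" if c.object_plural else "sod-ø"
    return words + [c.participle, aux]


active_26 = Clause(subject="re?is-E danesgah",
                   direct_object="darxast-E ostad-an",
                   participle="paziroft-e")
print(" ".join(passivize(active_26)))
```

Under these assumptions the call reproduces the word order of the passive with an overt 'by-phrase'. Verb morphology proper (deriving the participle from the finite verb, tense on the auxiliary) is deliberately left outside the sketch.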

On the other hand, the auxiliary causative constructions in Persian
exemplified in the (a) members of sentences (29) - (31) may be related to
three different sodan 'become' constructions, as shown by items (b) - (d).
Of these three constructions, items (b) and (d) are quite well-known to us
since they illustrate the inchoative and passive counterparts of items (a),
respectively. With respect to items (c), I claim that these sentences are
derived from their corresponding items (d) by a syntactic process that
optionally deletes the kard-e part of the participial form. (I will return
to the deletion of kard-e in the next paragraph.) It may be noted that the
omission of the 'by-phrase' in constructions like items (c) makes these
constructions identical to the inchoative sentences in items (b), whereas the
deletion of the 'by-phrase' in the constructions in items (d) does not
produce any potentially ambiguous grammatical structure. Thus, after
'by-phrase' deletion in sentences (29c) - (31c) and similar sentences where the
verb morphology consists of an 'adjective + sodan', we expect to arrive at
potential ambiguity between an 'inchoative' reading and a 'passive' reading.
These constructions may be called 'ambiguous/opaque passives.'

(29) a. nasrin bomb-ra monfajer kard-ø
        Nasrin bomb-DO exploded made-subj.
        'Nasrin exploded the bomb.'

     b. bomb (xod be xod) monfajer sod-ø
        bomb self with self exploded became-subj.
        Lit. 'The bomb exploded (gratuitously).'

     c. bomb (tavassot-E nasrin) monfajer sod-ø
              by Nasrin
        Lit. 'The bomb exploded (by Nasrin).'

     d. bomb (tavassot-E nasrin) monfajer kard-e sod-ø
                                          made-part. became-subj.
        'The bomb was exploded (by Nasrin).'

(30) a. hamsaye-ha masin-ra pancar kard-and
        neighbor-plu. car-DO flat made-subj.
        Lit. 'The neighbors made the car's tire flat.'

     b. masin (xod be xod) pancar sod-ø
        car self with self flat became-subj.
        Lit. 'The car's tire became flat (gratuitously).'

     c. masin (tavassot-E hamsaye-ha) pancar sod-ø
               by neighbor-plu.
        Lit. 'The car's tire became flat (by the neighbors).'

     d. masin (tavassot-E hamsaye-ha) pancar kard-e sod-ø
                                                    became-subj.
        'The car was made flat (by the neighbors).'



(31) a. nasrin panjere-ra baz kard-ø
        Nasrin window-DO open made-subj.
        'Nasrin opened the window.'

     b. panjere (xod be xod) baz sod-ø
                 self with self open became-subj.
        Lit. 'The window opened (gratuitously).'

     c. panjere (tavassot-E nasrin) baz sod-ø
                 by Nasrin
        Lit. 'The window opened (by Nasrin).'

     d. panjere (tavassot-E nasrin) baz kard-e sod-ø
                                        made-part. became-subj.
        'The window was opened (by Nasrin).'

It may be noted that if the claim that items (c) in sentences (29) - (31)
are derived from their corresponding items (d) by a syntactic process of
kard-e deletion is correct, then it predicts that corresponding to item
(f) of sentence (22), repeated in (32a) below, in which the omission of the
'by-phrase' would produce a potentially ambiguous surface structure, there is
an unambiguous/transparent passive construction. Item (32b) supports this
observation.

(32) a. ab (tavassot-E mahmud) sard sod-ø
        water by Mahmud cool became-subj.
        'The water was cooled (by Mahmud).'

     b. ab (tavassot-E mahmud) sard kard-e sod-ø
        water by Mahmud cool made-part. became-subj.
        'The water was made cool (by Mahmud).'

That sentences (29d) - (31d) as well as item (32b) with the 'by-phrase' 
omitted where the verb morphology consists of a 'past participle + sodan' 
are unambiguous/transparent passives with an agent implied is shown by the 
contradictory nature of items (29') - (32') in which the adverbial phrase 
xod be xod 'gratuitously; on his/her/its own accord' occurs. 

(29') *bomb xod be xod monfajer kard-e sod-ø
       bomb self with self exploded made-part. became-subj.
       *'The bomb was exploded gratuitously.'

(30') *masin xod be xod pancar kard-e sod-ø
       car self with self flat made-part. became-subj.
       *'The car was made flat gratuitously.'

(31') *panjere xod be xod baz kard-e sod-ø
       window self with self open made-part. became-subj.
       *'The window was opened gratuitously.'

(32') *ab xod be xod sard kard-e sod-ø
       water self with self cool made-part. became-subj.
       *'The water was made cool gratuitously.'


With regard to the optional deletion of kard-e in sentences (29d) - (31d)
and item (32b), it is sufficient to mention that this deletion seems to be
part of a more general process of kardan deletion in Persian. Evidence
comes from infinitival nominalization. As a result of this process, (1) the
verb of the clause appears in its non-finite form, (2) the infinitivized
verb becomes the head of the Ezafe (i.e., genitival) construction, (3) the
ex-direct object loses its case marking and appears immediately after the
Ezafe morpheme, and (4) the ex-subject, preceded by the agentive
preposition tavassot-E 'by', optionally follows the ex-direct object.
Applied to the structures underlying sentences (33a) and (34a), infinitival
nominalization produces items (33b, c) and (34b, c), respectively: in the
(b) members the infinitive kardan appears, whereas in the (c) members it
has been deleted. (For further elaboration on infinitival nominalization
in Persian see Dabir-Moghaddam, 1982.)

(33) a. nasrin in mowzu?-ra barrasi kard-ø
        Nasrin this issue-DO investigation made-subj.
        'Nasrin investigated this issue.'

     b. barrasi kard-an-E in mowzu? (tavassot-E nasrin)
        investigation made-INF. this issue by Nasrin

     c. barrasi-y-E in mowzu? (tavassot-E nasrin)
        'The investigation of this issue (by Nasrin).'

(34) a. nasrin name-ra post kard-ø
        Nasrin letter-DO mail made-subj.
        'Nasrin mailed the letter.'

     b. post kard-an-E name (tavassot-E nasrin)
        mail made-INF. letter by Nasrin

     c. post-E name (tavassot-E nasrin)
        'The mailing of the letter (by Nasrin).'

It seems to me that the deletion of kard-e/kardan is determined by pragmatic
considerations in order to lessen the degree of the agentive force implied 
in the sentence. That the deletion of kard-e in the passive sentences 
(29d) - (31d) as well as item (32b) produces ambiguous/opaque structures 
between an inchoative and a passive reading supports this observation. Now 
I return to the discussion of passive. 

There are auxiliary causative constructions in Persian that behave
differently from the auxiliary causative constructions discussed above.
That is, there are auxiliary causative constructions in Persian, such as
items (a) below, for which there is a corresponding inchoative form (cf.
items (b)) and a corresponding inchoative form with a reason adverb
(cf. the grammatical version(s) of items (c)), but no corresponding
passive form (cf. the ill-formedness of items (d) and the ungrammatical
version(s) of items (c)). The existence of these sentences suggests that
the transformational rule of passive in item (28) should be constrained
such that it would not apply to the sentences in (35a) - (38a). I claim
that passive is a governed rule in Persian in the sense that it only
applies to verbs that express a volitional act. Since the verbs in
sentences (35a) - (38a) express non-volitional acts, they may not undergo the
transformational rule of passive. It may be noted that all the sentences
in (24) - (27) and sentences (29a) - (31a) as well as (22e) that underwent
passivization have verbs that express volitional acts.

(35) a. nasrin ali-ra ranjide kard-ø
        Nasrin Ali-DO offended made-subj.
        'Nasrin offended Ali.'

     b. ali (xod be xod) ranjide sod-ø
        Ali self with self offended became-subj.
        Lit. 'Ali became offended (gratuitously).'

     c. ali (az dast-E nasrin / az nasrin / *tavassot-E nasrin) ranjide sod-ø
        Ali of hand Nasrin of Nasrin by Nasrin offended became-subj.
        Lit. 'Ali became offended (of Nasrin/*by Nasrin).'

     d. *ali (az dast-E nasrin/az nasrin/tavassot-E nasrin)
        ranjide kard-e sod-ø
        Lit. *'Ali was offended (of Nasrin/by Nasrin).'

(36) a. nasrin ali-ra narahat kard-ø
        Nasrin Ali-DO angry made-subj.
        'Nasrin made Ali angry.'

     b. ali (xod be xod) narahat sod-ø
        Ali self with self angry became-subj.
        Lit. 'Ali became angry (gratuitously).'

     c. ali (az dast-E nasrin / az nasrin / *tavassot-E nasrin) narahat sod-ø
        Ali of hand Nasrin of Nasrin by Nasrin angry became-subj.
        Lit. 'Ali became angry (of Nasrin/*by Nasrin).'

     d. *ali (az dast-E nasrin/az nasrin/tavassot-E nasrin)
        narahat kard-e sod-ø
        Lit. *'Ali was made angry (of Nasrin/by Nasrin).'

(37) a. in daru badan-E u-ra za?if kard-ø
        this medicine body he-DO weak made-subj.
        'This medicine weakened his body.'

     b. badan-E u (xod be xod) za?if sod-ø
        body his self with self weak became-subj.
        Lit. 'His body became weak (gratuitously).'

     c. badan-E u (az in daru / *tavassot-E in daru) za?if sod-ø
                   of this medicine by
        Lit. 'His body became weak (of this medicine/*by this
        medicine).'

     d. *badan-E u (az in daru / tavassot-E in daru) za?if kard-e sod-ø
                                                           made-part.
        Lit. *'His body was made weak (of this medicine/by this
        medicine).'

(38) a. garma-Y-E sadid gol-ha-ra pazmorde kard-ø
        heat severe flower-plu.-DO fade made-subj.
        'The severe heat faded the flowers.'

     b. gol-ha (xod be xod) pazmorde sod-and
                self with self became-subj.
        Lit. 'The flowers faded (gratuitously).'

     c. gol-ha (az garma-Y-E sadid / *tavassot-E garma-Y-E sadid)
                of heat severe by
        pazmorde sod-and
        Lit. 'The flowers faded (of severe heat/*by severe heat).'

     d. *gol-ha (az garma-Y-E sadid/tavassot-E garma-Y-E sadid)
        pazmorde kard-e sod-and
        Lit. *'The flowers were made to fade (of severe heat/by
        severe heat).'

Further support for the claim that only verbs that express volitional acts 
undergo the passive rule in (28) is provided by the non-existence of a pas- 
sive form for a construction with the non-causative transitive dust dastan 
'like' and a construction with the morphological causative verb xoskan(i)dan 
'dry'. Clearly, both of these verbs express non-volitional acts. Sentences 
(39) and (40) exemplify this observation. 

(39) a. nasrin ali-ra dust dard-ø
        Nasrin Ali-DO liking has-subj.
        'Nasrin likes Ali.'

     b. *ali (tavassot-E nasrin) dust dast-e mi-sav-ad
         Ali by Nasrin liking had-part. IMPF.-become-subj.
        'Ali is liked (by Nasrin).'

(40) a. nasrin tamam-E gol-ha-y-E tuy-E baqce-ra xosk-an-(i)d-ø
        Nasrin all flower-plu. in garden-DO dry-cause-past-subj.
        'Nasrin caused all the flowers in the garden to dry.'

     b. *tamam-E gol-ha-y-E tuy-E baqce (tavassot-E nasrin)
         all flower-plu. in garden by
        xosk-an-(i)d-e sod-ø
        dry-cause-past-part. became-subj.
        *'All the flowers in the garden were made to dry (by
        Nasrin).'

To summarize, the discussion in the foregoing paragraphs suggests the
following:

(41) a. There is a transformational rule of passive in Persian
        that relates the structure underlying an active sentence
        to the structure underlying its passive counterpart.

     b. The application of the transformational rule of passive
        produces 'unambiguous/transparent passives' in the sense
        that regardless of the presence or deletion of the
        'by-phrase' they have only a passive interpretation. The
        passive constructions which are formed on the auxiliary
        causative constructions in Persian may also optionally
        undergo a syntactic process of kard-e deletion. The
        deletion of the 'by-phrase' in the constructions without
        kard-e produces 'ambiguous/opaque passives' which are
        potentially ambiguous between an inchoative and a passive
        reading.

     c. The transformational rule of passive in Persian is a
        governed rule in the sense that it only applies to
        the class of verbs that express volitional acts.
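
The generalizations in (41), together with the data in (29) - (38), can be restated as a small decision procedure. The Python sketch below is an illustrative summary only: the function name readings and its three boolean parameters are hypothetical, an empty result stands for an ill-formed sentence, and the 'adjective + sodan' case with a tavassot-E phrase is treated as a passive that has undergone kard-e deletion, as argued above.

```python
def readings(participle_form: bool, by_phrase: bool, volitional: bool = True) -> set:
    """Readings available to a sodan construction, following (41).

    participle_form: True for 'past participle + sodan',
                     False for 'adjective + sodan'.
    by_phrase:       True if a tavassot-E phrase is present.
    volitional:      True if the verb expresses a volitional act.
    """
    if participle_form:
        # Only the passive rule yields this form, and it is governed (41c).
        return {"passive"} if volitional else set()
    if by_phrase:
        # Adjective + sodan with tavassot-E: passive after kard-e deletion;
        # ill-formed with non-volitional verbs, cf. (35c) - (38c).
        return {"passive"} if volitional else set()
    # Adjective + sodan without a 'by-phrase'.
    return {"inchoative", "passive"} if volitional else {"inchoative"}


print(readings(participle_form=False, by_phrase=False))
```

On this encoding, only the 'adjective + sodan' form without a 'by-phrase' and with a volitional verb comes out as ambiguous, which is the 'ambiguous/opaque passive' of (41b).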

3. Demoted Agent 

All passive examples that I have presented in the previous sections in
which the 'by-phrase' is expressed belong to written standard Persian.
In colloquial standard Persian, the agentive phrase is omitted, because
it is generally the case that the agent is either recoverable from the
context (linguistic, extra-linguistic, prior knowledge, etc.), or
unknown to the speaker, or known but the speaker desires to avoid
mentioning it. The speaker might avoid mentioning the agentive phrase because,
for instance, he considers this piece of information to be
irrelevant, or he wants to be polite, or he intends to be sarcastic. Thus,
sentence (26b) above, with the 'by-phrase' omitted, for instance, will be
felicitously used by a speaker as first hand news to inform another person
who is aware (i.e., has prior knowledge) that 'the university professors
had given a request to the head of the university.' In the same vein,
items (31c, d) without the 'by-phrase' may be felicitously uttered as
first hand information when the hearers are aware of the fact that 'Nasrin
was trying to open the window.' Similarly, in the following examples, where
the agent is unknown to the speaker (cf. 42), or is known but he desires to
avoid mentioning it (cf. 43), the passive construction is highly preferred to
its corresponding active form:

(42) a. in masjed dar zaman-E safaviye saxt-e sod-e ast-ø
        this mosque during period Safavid built-part. became-part. is-subj.
        'This mosque has been built during the Safavid period.'

     b. saxsi/yeki in masjed-ra dar zaman-E safaviye saxt-e ast-ø
        someone this mosque-DO during period Safavid built-part. is-subj.
        'Someone (builder/architect) has built this mosque during
        the Safavid period.'

(43) a. vaqti dar havapeyma bud-am s[in xabar be man ettela? dad-e sod-ø]
        when in airplane was-subj. this news to me information gave-part. became-subj.
        'When I was in the airplane, this news was given to me.'

     b. vaqti dar havapeyma bud-am s[saxsi in xabar-ra be man ettela? dad-ø]
        when in airplane was-subj. one this news-DO to me information gave-subj.
        Lit. 'When I was in the airplane, someone gave this news
        to me.'

If the above observations are correct, then they suggest that in cases 
where the agent of the proposition is 'redundant' (i.e., it is either re- 
coverable from the context, or unknown, or intentionally undisclosed by the 
speaker) the preferred 'strategy' is to use the passive construction rather 
than its corresponding active. In fact, in traditional grammars of Persian 
the term siqe-E majhul which means a construction with 'unknown agent' is 
used to refer to the passive voice (Phillott, 1919:285).

Thus, the syntactic correlation between the active sentences and their 
corresponding passives through the postulation of the Passive Formation Rule 
(28) has to be supplemented with the following rule of 'by-phrase' Deletion 
which is intrinsically ordered with respect to the transformational passive
rule:


(44) 'By-Phrase' Deletion Rule (Optional)

SD:  X - tavassot-E + NP - Y
     1         2          3

Furthermore, I claim that active sentences and their passive counterparts 
are associated with different functions in the use grammar (Chomsky, 1957: 
102) of Persian. The former construction is used when the speaker assumes 
the mentioning of the agent as crucial as the mentioning of the predicate 
(i.e., the rest of the sentence), whereas in the latter type of construc- 
tion, the superficial subject and the predicate are assumed to be crucial, 
hence the speaker intends to draw the hearer's attention to those rather 
than the agent which he considers to be 'redundant' (in the sense specified 
in the last paragraph). As a result, he downgrades (i.e., deemphasizes) 
it either by moving it from the topic position, or by not expressing it. 
All this suggests that the choice made by the speaker of Persian on whether 
to utter an active sentence or its corresponding passive form is pragma- 
tically determined. 
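
The pragmatic conditions just described can be condensed into a toy decision function. This Python sketch is a deliberately crude illustration: the name preferred_voice and its three flags are hypothetical, the three flags simply encode the three senses of 'redundant' agent given above, and the result is a stated preference rather than a grammaticality judgment.

```python
def preferred_voice(agent_recoverable: bool = False,
                    agent_unknown: bool = False,
                    agent_withheld: bool = False) -> str:
    """Return the construction a speaker is claimed to prefer.

    The agent counts as 'redundant' if it is recoverable from context,
    unknown to the speaker, or intentionally left undisclosed.
    """
    redundant = agent_recoverable or agent_unknown or agent_withheld
    # A redundant agent is downgraded: passive, typically with the
    # 'by-phrase' deleted. Otherwise agent and predicate are equally
    # crucial and the active is used.
    return "passive" if redundant else "active"


print(preferred_voice(agent_unknown=True))
```

The sketch leaves out the register difference noted earlier (overt 'by-phrases' in written standard Persian), which would need an additional parameter.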

4. Conclusion 

In this paper I addressed the question of passive in Persian. The 
question of passive has been a controversial issue in the transformational 
treatments of Persian. While a group of scholars have postulated the exis- 
tence of passive in Persian, Moyne (1974) has called this construction in- 
choative. In the first half of this paper I challenged the following two 
claims made in Moyne (1974). (1) Moyne takes the demise of the Old Persian
passive forms as active forms in Middle Persian, and the continuation of
that form in Modern Persian, as an indication of
the non-existence of passive in Modern Persian. I argued that although
this characterization is true, it is only part of the truth. I claimed
that evidence based on the diachronic syntax of Persian suggests that since
Early Modern Persian the Middle Persian intransitive motion verb sudan 'go'
has acquired a specialized usage as a passive auxiliary, i.e., sodan 'become'.
(2) I illustrated that, contrary to Moyne's claim as to the function of the 
'by-phrase' as an instrumental phrase in Persian, the 'by-phrase' indeed 
marks an agentive phrase. In the second half of this paper, I argued for 
a new proposal for the treatment of passive in Persian. In particular, I 
claimed that there is a syntactic category of passive independent of the 
inchoative constructions. I claimed that the application of the passive 
rule to the structures underlying active sentences always produces unambig- 
uous passive constructions. I argued that the output of the application of 
the passive rule to the auxiliary causative constructions in Persian (with 
the 'by-phrase' deleted) may optionally undergo another syntactic process. 
The resulting construction, then, would be ambiguous between an inchoative 
and a passive interpretation. I argued that the rule of passive in Persian 
is a governed rule in the sense that it applies to verbs that express voli- 
tional acts. Finally, I claimed that the choice made by the speaker of 
Persian on whether to utter an active sentence or its passive counterpart 
is pragmatically determined. 


I would like to thank Professor Yamuna Kachru for her valuable
comments and criticism on earlier versions of this paper.

was done "by analogy with the pattern of the intransitive forms, in which 
the agent is likewise the subject" (Brunner, 1971:246). The process of
analogy was made possible by two tendencies observed in the language:
(1) "An agent pronominal suffix is occasionally attached to a perfect form"
(Brunner, 1971:246), and (2) "auxiliary verbs sometimes show elision after the
perfect participle" (Brunner, 1971:247). The concatenation of these two
tendencies, then, was sufficient to force the reinterpretation of
periphrastic passives as active forms on analogy with the active forms in the

In Modern Persian an equivalent of this sentence, e.g., (i), is
ambiguous between an inchoative and a passive reading. (For a discussion of
ambiguous passives see pages 15-18 in the text.) The same could have been
the case in Early Modern Persian.

(i) zan - an -E mesr az in kar agah sod - and 
woman-plu. Egypt of this matter aware became-subj. 
'The women of Egypt became aware/were made aware of this matter.' 

The uppercase 'E' in the above example and in the examples in the text
indicates the 'Ezafe' (i.e., genitival) morpheme.

In this sentence it is possible to get a reading, but that reading 
does not contradict the issue under discussion. The possible reading is 
that 'The request of the professors became accepted automatically.' Sen- 
tence (26'b) is still acceptable if we have both the adverbial phrase as 
well as the 'by-phrase' in the same sentence. This is shown below: 

(26'') b. darxast-E ostadan xod be xod tavassot-E re?is-E danesgah
          request professors self with self by head university
          paziroft-e sod-ø
          accepted-part. became-subj.
          Lit. 'The request of the professors became accepted by
          the head of the university automatically.'

The claim that only accusative objects may be passivized in Persian 
is supported by the ungrammaticality of items (ib) and (iib) in which a 
dative and an oblique object have been passivized, respectively. Items 
(ic) and (iic) , on the other hand, are grammatical since in both cases an 
accusative object has been passivized. 

(i)  a. nasrin ketab-ra be ali dad-ø
        Nasrin book-DO to/dat. Ali gave-subj.
        'Nasrin gave the book to Ali.'

     b. *ali (tavassot-E nasrin) ketab-ra dad-e sod-ø
         Ali by Nasrin book-DO gave-part. became-subj.

     c. ketab (tavassot-E nasrin) be ali dad-e sod-ø
        'The book was given to Ali (by Nasrin).'

(ii) a. xoda xatar-ra az nasrin gozar-an-(i)d-ø
        God danger-DO from Nasrin pass-cause-past-subj.
        Lit. 'God passed the danger from Nasrin.'

     b. *nasrin (tavassot-E xoda) xatar-ra gozar-an-(i)d-e sod-ø
                 by God danger-DO pass-cause-past-part. became-subj.

     c. xatar (tavassot-E xoda) az nasrin gozar-an-(i)d-e sod-ø
        Lit. 'The danger was passed from Nasrin (by God).'


That condition (i) must be met in order to derive a grammatical pas- 
sive sentence is shown by the ungrammaticality of item (ib) whose deriva- 
tion violates this condition. In the derivation of the passive item (ib) 
from the structure underlying item (ia) the derived subject is taken from 
the embedded clause, whereas the logical subject and the verb that appears
in its participial form belong to the matrix clause, a clear
violation of condition (i) in item (28). For convenience of reference I have
specified the SD and SC of the passive rule in item (28), ignoring
condition (i), in the (a) and (b) members, respectively.


(i) a. S0[ bacce-ha vaqe?iyat-ra S1[ ketab-ra be xater-E anha nevest-ø
                    fact-DO          this book-DO for sake their wrote-subj.
                                     accepted-subj.
       'The children accepted the fact that he wrote this book for
       their sake.'

    b. *in vaqe?iyat-ra ke u in ketab be xater-E nevest-ø
        this fact-DO that he this book for sake wrote-subj.
        (tavassot-E bacce-ha) paziroft-e sod-ø
        by child-plu. accepted-part. became-subj.


of the combination of a predicate adjective with the causative auxiliary 
kardan 'do; make'. For a discussion of the syntax and semantics of the 
auxiliary (and other) causative constructions in Persian see Dabir- 
Moghaddam (1982). 

I would like to emphasize that the well-formed sentences in items (c) 
are inchoative constructions containing a reason adverb. They are not 
passive since (1) they do not allow the tavassot-E phrase, and (2) they do
not have a corresponding 'unambiguous/transparent passive' form (cf. the 
ungrammaticality of items (d)). 


AISSEN, Judith. 1974b. Verb raising. Linguistic Inquiry, 4:325-366.

BRUNNER, Christopher J. 1971. A syntax of Western Middle Iranian. Unpub- 
lished Ph.D. dissertation. University of Pennsylvania. 

CHOMSKY, Noam. 1957. Syntactic structures. Mouton. 

DABIR-MOGHADDAM, Mohammad. 1979. A diachronic and synchronic study of 
passive and causative constructions in Persian. Unpublished paper. 
University of Illinois. 

DABIR-MOGHADDAM, Mohammad. 1982. Syntax and semantics of causative con- 
structions in Persian. Unpublished Ph.D. dissertation. University 
of Illinois, Urbana. 

DAVISON, Alice. 1980a. Peculiar passives. Language, 56:42-66. 

DAVISON, Alice. 1980b. On the form and meaning of Hindi passive sen- 
tences. Manuscript paper. University of Illinois. 

HAJATI, Abdol-Khalil. 1977. Ke-constructions in Persian: Descriptive and
theoretical aspects. Unpublished Ph.D. dissertation. University of 

HESTON, Wilma Louise. 1976. Selected problems in fifth to tenth century 
Iranian syntax. Unpublished Ph.D. dissertation. University of 

KENT, Roland G. 1950. Old Persian. New Haven, CT: American Oriental 

LAKOFF, George. 1970. Irregularity in syntax. New York: Holt, Rinehart 
and Winston, Inc. 

LI, Charles, and Sandra Thompson. 1976. Subject and topic: A new typol- 
ogy of language. In Li, 437-489. 

MARASHI, Mehdi. 1970. The Persian verb: A partial description for peda- 
gogical purposes. Unpublished Ph.D. dissertation. University of 

MOYNE, John Abdel. 1974. The so-called passive in Persian. Foundations 
of Language, 12:249-267. 

NYBERG, Henrik S. 1974. A manual of Pahlavi. Wiesbaden: Otto Harrasso- 

PALMER, Adrian Shuford. 1971. The Ezafe construction in Modern Standard
Persian. Unpublished Ph.D. dissertation. University of Michigan. 


PANDHARIPANDE, Rajeshwari. 1981. Syntax and semantics of the passive con- 
struction in selected South Asian languages. Unpublished Ph.D. disser- 
tation. University of Illinois, Urbana. 

PERLMUTTER, David M., and Paul M. Postal. 1977. Toward a universal char-
acterization of passivization. Proceedings of the 3rd Annual Meeting, 
Berkeley Linguistic Society, 394-417. 

PHILLOTT, D. C. 1919. Higher Persian grammar. Calcutta. 

SHEINTUCH, Gloria. 1973. Periphrastic verbs in Persian. Unpublished 
M.A. thesis. Tel-Aviv University. 

SOHEILI-ISFAHANI, Abalghasem. 1976. Noun phrase complementation in Per- 
sian. Unpublished Ph.D. dissertation. University of Illinois, Urbana. 

Studies in the Linguistic Sciences
Volume 12, Number 1, Spring 1982 

Hans Henrich Hock 

This paper shows that in three geographically separate areas 
of the world (Europe, Kashmir, and West Africa) a major-constitu- 
ent word order change from SOV to SVO has been initiated by the 
shifting of clitic AUX to clause-second position. This shift, in 
turn, is followed by the tendency for other finite verbs to move 
to the same position. A final step may be the generalization 
that all members of the constituent Verb shift to second position, 
bringing about the order SVO. 

The movement of clitics to clause-second position has been 
shown to be a universal tendency by Steele (1975, 1977a, b). The
fact that such a clitic shift of AUX has led to a change from 
SOV to SVO in three geographically disconnected areas suggests 
that AUX-cliticization and the movement of clitic AUX to second 
position is one of the most important mechanisms by which SOV 
languages change their order to SVO. 

There is now a plethora of proposed explanations for the change from
SOV to SVO syntax. These explanations cover the following range: 

(1) LOSS OF CASE MARKINGS: Across-the-board neutralizations of case 
distinctions through phonological change result in systematic ambiguities 
in SOV languages such that in an excessively large number of contexts it 
will become impossible to tell subjects and objects apart. This
systematic ambiguity then is said to be remedied by a shift toward SVO order, which
permits a clear distinction between subjects and objects in terms of their 
relative position next to the verb (Vennemann 1973, 1975). As Hyman (1975) 
has pointed out, however, the difficulty with this explanation is that there 
are languages with perfectly stable SOV syntax, but without any evidence for 
ever having had case markers. To this might be added the evidence of Indo- 
Aryan, where the pervasive loss of case endings led, not to a change toward 
SVO, but rather to a rigidification of the basic SOV order of earlier Indo- 

(2) AFTERTHOUGHT: As Hyman (1975) observed, SOV languages share a cer- 
tain difficulty as compared to VSO and SVO languages. Whereas in the latter 
types of languages it is always possible to add 'afterthoughts' at the end 
of an otherwise complete sentence (such as adverbial NPs , as in Engl. He got 
r\m over by a car ... yesterday, just around the corner from here ) , without 
disrupting the basic ordering principle, this is not possible in SOV langu- 
ages. For the addition of such an afterthought in an SOV language, i.e. of 
an NP to the right of the verb, would contradict the basic verb-final ordering
principle. If, then, in spite of this basic principle, we do add an
afterthought, it is possible to reinterpret the resulting S + O + V + Adv.NP 
as indicating a non-verb-final grammar; and by generalization from structures 
of this sort it is then possible to arrive at structures of the type S + V + O
(+ Adv.NP), with all non-subject NPs to the right of V. A possible difficulty
with this claim is alluded to in Vennemann 1975, namely that 'afterthought'
(or extraposition) is found in many quite 'stable' SOV languages 
(such as Japanese and, one might add, Dravidian), i.e. in languages which 
(a) exhibit all the SOV features recognized by Greenberg (1966) and (b) 
do not offer any evidence for an incipient change toward SVO typology. One 
might well ask under what conditions extraposition will lead to SVO
structures, if it has failed to do so in languages like Japanese. Note, however, 
that it is probably impossible to predict in a general, non-ad-hoc fashion, 
under what conditions reinterpretation of any given potentially ambiguous 
structure will take place, in any language. The 'afterthought' explana- 
tion, therefore, at this point cannot be rejected out of hand. 

(3) GRAMMATICALIZATION OF SERIAL-VERB CONSTRUCTIONS: This cause of the
change has been proposed by Hyman (1975). However, as Hyman himself observes,
he is not aware of any language in which serial-verb constructions 
have led to a change from SOV to SVO order. Without any actual evidence 
for such a change, however, this proposed cause must be considered entire- 
ly speculative. 

(4) TARGET STRUCTURE: This is the somewhat mystical concept proposed 
by Haiman (1974) as the explanation of the shift to SVO in German (and some 
similar developments in early English, French, and Romansch). Haiman's 
starting point is a proto-language with inherited superficial verb-final 
order. '(T)he successor languages, among them Old High German, rapidly 
abandoned it in favor of verb-initial and verb-second order. ' The first 
step is a language with underlying VSO order; and from this, the 'V/2 tar- 
get' produces, through various changes, the modern structure SVO, whose 
verb-second position satisfies the target. There are several problems with 
this proposal. First of all, it is not clear how the ancestral SOV of
Proto-Indo-European changed to the postulated underlying VSO of Proto-Germanic.
Secondly, there is no cogent evidence for postulating an intermediate stage 
with VSO between earlier SOV and later SVO (cf. e.g. the order of major
constituents in early Germanic given below). Finally, as Steele (1977b) aptly 
observed, '(t)argets are an artifact of the theoretical framework in which 
H[aiman] works.' According to Steele, it is perfectly possible, and natural,
to explain the change toward SVO as the result of generalization of the 
placement of clitic AUX in clause-second position, for there is a
cross-linguistic tendency for clitics and AUX to be placed in clause-second
position (cf. also Steele 1975, 1977a). And since this tendency can be
observed even in (otherwise) solidly SOV languages like Luiseño, a Uto-Aztecan
language of Southern California,⁵ there is in her view no need to invoke 
an intermediate stage with VSO in order to get from SOV to SVO. 

As the following discussion will show, Steele's proposal that AUX-CLIT- 
ICIZATION is the motivation for the change from SOV to SVO, although made 
only in passing and without an examination of actual data, is in fact the 
most fruitful and accurate one, not only for Germanic, but also for the 
neighboring Romance, Slavic, and Baltic languages. Moreover, and more im- 
portantly, there is evidence for such a development also in Kashmiri, a lan- 
guage surrounded by SOV languages and belonging to pretty solidly SOV stock, 
and probably also in the West African Gur languages. The fact that this 
change, thus, seems to have taken place independently in three geographic- 
ally quite distant and disconnected areas suggests that AUX-cliticization 
and the movement of clitic AUX to clause-second position may be one of the 


most important mechanisms for the change from SOV to SVO. 

It is generally agreed that Latin, the ancestor of the Romance languages,
especially in its earliest attestations, had SOV as its basic
major-constituent order. There is good reason to believe that also Germanic, in 
its earliest attestations, had this order.⁷ Lithuanian likewise has
predominant SOV order in its earliest documents.⁸ For Slavic, the situation is 
slightly more complex. As Berneker (1900:58, 158-9, and passim) observed, 
there are two major patterns in early Slavic: Verb-initial (i.e. VSO) in 
lively narrative vs. verb-final (i.e. SOV) in descriptions, reflections, 
didactic prose.⁹ Since VSO or verb-initial order is found as a marked
pattern also in the other languages,¹⁰ frequently in the context of lively 
narration (but also in other contexts), it is possible to argue that Slavic 
here has simply exploited, in a larger number of contexts, a device which 
was available also to the other Indo-European languages, and that its
(original) basic word order was the other attested major pattern, SOV, just as 
it was in the other languages.

In early Latin, as well as in early Germanic, the ordering of auxiliaries
(AUX) in respect to their main verb (MV) was consonant with the basic 
SOV order of these languages. Thus, even in Plautus, known for his greater 
tendencies toward using VO patterns than other writers of Latin, we find a 
ratio of 18 : 6 for MV + AUX vs. AUX + MV in a sample passage from the
Captivi. And the two relevant attestations in pre-600 A.D. Runic Germanic both 
have the order MV + AUX: 

(i) flagda faikinaR ist 'is menaced by evil spirits' 


(ii) brawijan haitinaR was 'was destined for the throes' 


Also in the Gothic of the Skeireins this is the normal order (cf. Smith 1971). 

A clear change, however, can be observed in post-600 A.D. Runic: 

(iii) ni s solu sot, ni s Akse stAin skorin
         AUX    MV      AUX          MV
      '(it) is not hit by the sun, the stone is not cut with a sharp stone'

Here the AUX appears in second position in the clause, while the MV is left 
"stranded" at the end of the clause. 

Early Runic offers only example (iii) above, but the Old English of Beo- 
wulf provides parallel evidence for the tendency of AUX to move into second 
position. Later, but still moderately early Old English (as reflected in 
the Anglo-Saxon Chronicle) shows a further shift, namely an increasing, but 
not as progressed tendency for other finite verbs to appear in clause-second 
position, with the modals, which also in other respects tend to take an
intermediate position between AUX and full verbs, showing intermediate
behavior. (Cf. Fourquet 1938.) The final development, then, is the shift to the 
Modern English order, with AUX, modal, and full verb in second, or post- 
subject position, and with MV directly following its AUX or modal. 

What we can observe here is a clear 'staging' of the shift from SOV to 
SVO. The first element to move is AUX. This is no doubt to be attributed 
to its clitic status and to the tendency of clitic elements to move to second 
position. (Cf. the cross-linguistic arguments and evidence in Steele 1975,
1977a, b.¹²) What is especially important, although not to my knowledge 
emphasized in the literature, is that there is independent, phonological 
evidence for the clitic status of AUX. Compare the difference between the 
full form ist 'is' of (i) above with the s of (iii), which clearly shows 
the effects of clitic shortening; compare also the voicing and rhotacism in 
Old Norse er 'is', and the clitic-shortening loss of final t in Old English 
is 'is'. Evidence like this shows that the assumption that AUX was clitic, 
and therefore moved into second position, is not simply ad hoc. 

The next stage shows a generalization, such that other finite verbs 
shift to second position, with the modals tending to be in the vanguard of 
this further development. At this stage, the language probably in effect 
becomes SVO. 

The final step is the generalization that all members of the constituent 
Verb occur in second position, thus removing any remaining exceptions to the 
basic SVO order. And since the language now is SVO, the ordering of AUX 
and MV in that clause-second position will be AUX + MV, the order which is 
consonant with SVO. 
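
The staging just described can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the function name `linearize`, the stage numbering 0 to 3, and the flat treatment of a main clause as subject, object, main verb, and optional auxiliary are hypothetical conveniences, not part of the analysis itself.

```python
# Hypothetical sketch of the staged shift from SOV to SVO in main clauses.
# Stage 0: inherited SOV, MV + AUX clause-final.
# Stage 1: clitic AUX moves to second position, MV stays clause-final.
# Stage 2: other finite verbs generalize to second position.
# Stage 3: the whole constituent Verb stands in second position (AUX + MV).

def linearize(stage, subject, obj, main_verb, aux=None):
    """Return the surface order of a main clause as a list of words."""
    if aux is None:
        # A simple finite verb stays final until stage 2.
        if stage <= 1:
            return [subject, obj, main_verb]
        return [subject, main_verb, obj]
    if stage == 0:
        return [subject, obj, main_verb, aux]
    if stage in (1, 2):
        # AUX in second position, MV "stranded" at the end of the clause.
        return [subject, aux, obj, main_verb]
    return [subject, aux, main_verb, obj]


if __name__ == "__main__":
    for stage in range(4):
        print(stage, " ".join(linearize(stage, "S", "O", "MV", aux="AUX")))
```

On these assumptions, stage 2 gives a pattern with the finite verb in second position and the non-finite main verb clause-final, while stage 3 gives the Modern English order with MV directly following its AUX.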

It should be noted, however, that each step in this development is a 
separate change and does not by necessity lead to the other steps. A case 
in point is Modern Standard German, which has failed to complete the last 
step and still offers the pattern of finite verbs in second position (in main 
clauses), with MV left "stranded" at the end of the clause: 

(iv) Er liebt seine Frau 'he loves his wife' 


(v) Er hat seine Frau geliebt 'he has loved his wife' 


Along this path from SOV to SVO there are a number of other, more minor 
patterns which can frequently be observed and which, especially together 
with independent, phonological evidence for AUX-cliticization, can be used 
as evidence for a change from SOV to SVO even where there is no direct evid- 
ence for the initial step of shifting (only) cliticized AUX to second posi- 
tion. These are: 

(a) Patterns like (iii) and (v) above, or (vi) below, with AUX in second 
position and MV "stranded" at the end of the clause (or after its object). 

(vi) OE her waes Crist ahangen 'here Christ was hanged' 


(b) Patterns with AUX + MV in clause-final position, as in (vii) below. 
These may cooccur with the old MV + AUX pattern (cf. ibid.) and can be ex- 
plained as a compromise between this MV + AUX pattern and the new pattern 

with AUX in second position and thus preceding its MV, and MV "stranded" at 
the end of the clause. 

(vii) OHG dhazs ir man uuardh uuordan 'that he had become man' 
beside dhazs ir man uuordan uuardh (id.) 


(c) The tendency for dependent clauses to retain verb-final, SOV syntax 
longer than main clauses; cf. (viii) below with (vii) above. 

(viii) OHG ir uuardh man uuordan 'he had become man' 


It is the appearance of patterns like these, together with the early
evidence for SOV, which suggests that also the Romance languages, Slavic,¹⁵ and 
Baltic acquired their present SVO order through (generalization of)
AUX-cliticization and shift to second position. 

Thus, early Romance shows patterns like (ix) below, with AUX in second 
position and MV "stranded" at the end of the clause. Moreover, many early 
Romance languages offer evidence for SOV persisting longer in dependent 
clauses; cf. Richter 1903. Finally, we also have independent, phonological 
evidence for clitic shortening in Lat. habet > It. ha, Fr. a, etc. or est > 
It. è, Port. é, Roman. e (beside non-shortened, presumably originally
non-clitic este). 

(ix) OFr. vertet est de terre nee   'truth is born from the earth'
                 AUX          MV

For early Slavic, note the occurrence of patterns like (x) , with AUX in sec- 
ond position and MV "stranded", beside patterns like (xi), in which MV has 
joined its AUX in second position. 

(x) staresinistvo esi s mene sn.lalu   'you have taken (my) birthright from me'

(xi) cemu este sunjali su mene spaciku   'why have you taken (my) shirt off me'
          AUX  MV

Finally, in Old Lithuanian, SVO is limited to the verb 'be', modals, and 
AUX. Moreover, MV may be left "stranded" at the end of the clause (cf. 
(xii)), or AUX and MV may cooccur (in that order) at the end of the clause 
(cf. (xiii)). 

(xii) kaip butu man pats Diewas apreischkies   'as if God himself had revealed (it) to me'
           AUX                  MV

(xiii) pats sawe vi yq esti dawes   'he has given himself for it'
                       AUX  MV

Evidence for a shift from SOV to SVO through AUX-cliticization is not
restricted to this group of neighboring, European languages which may well
exhibit an areal spread of a single word order shift. We find it also outside 
of Europe. 

One example is that of Kashmiri , a language surrounded by SOV languages 
and belonging to the otherwise quite solidly SOV stock of Indo-Aryan. Like 
these other languages, Kashmiri has the order Genitive + Noun and Noun + 
postposition, patterns which normally go along with SOV order. Unlike the 
other languages, however, Kashmiri exhibits the following peculiarities: 
Finite verbs occur in clause-second position, with some variation between 
second and final position in dependent clauses. If AUX is in second position, 
the MV may be "stranded" at the end of the clause or it may directly follow 
the AUX. In dependent clauses , AUX + MV is found beside MV + AUX in the 
clause-final position. Cf. (xiv) - (xvi) below. 

(xiv) hi chus ^ith' lekhan 
beside b^ j?hus lekhan cith' 

'I am writing a letter' 


(xv) asokan von ki su yi n4 gulmargi   'Ashok said that he will not come to G.'
                'that' 'come'

beside b4 chus so6an ki gulmargi gatsh^ha   'I think that I should go to G.'
                     'that'      'go'

(xvi) yath sodur khon i^hi vanan 'which they call S.Kh.' 

beside yiman . . . mugjrl gardin vanan ^hi 'which they call M.G. ' 

In addition to these patterns which are typical of a language changing from 
SOV to SVO via AUX-cliticization, Kashmiri also offers independent,
phonological evidence for the clitic status of 'be', its auxiliary verb: MIAr. 
acch- has undergone clitic shortening to ch- (as in chus). 

A final, probable example of the word order shift under discussion comes 
from (some of) the West African Gur languages and has been discussed in 
greater detail in Garber 1980. I will only summarize the major points. 

In the language family of which the Gur languages are a part, the
following word order types are found: 

Dogon:          S + O + Adv.NP + { V  |  MV + AUX }

Mande/Senufo:   { S + O + V  |  S + AUX + O + MV } + Adv.NP

(other) Gur:    S + { V  |  AUX + MV } + O + Adv.NP

Note further that some of the SVO Gur languages exhibit vacillation between 
SVO and SOV patterns. Finally, all the languages in question have the order 
Genitive + Noun and Noun + Postposition, arrangements normally found in SOV 
languages . 

Given what we now know, this geographical record can be interpreted very 
much like the chronological record of Germanic, namely as one of
AUX-cliticization and shift of clitic AUX to second position (cf. Mande/Senufo),
followed by generalization of the whole constituent Verb to second position. 
The only potential difficulty is the fact that concurrent with the AUX-in- 
second-position stage we also find obligatory extraposition of Adv.NPs. It 
is therefore possible to argue that this pattern, with the verb in non-final 
position, was at least in part responsible for the further shift toward SVO 
in some of the Gur languages. (Cf. Hyman's 'afterthought' principle.) How- 
ever, it is also possible that this obligatory, rather than optional extra- 
position was at least in part made possible by the fact that AUX had shifted 
into second position, thus creating exceptions to the verb-final pattern. 
At any rate, however, there is good reason to believe that AUX-cliticization 
was one of the factors, if not the single most important factor, motivating 
the shift from SOV to SVO. 

This paper has presented evidence that AUX-cliticization (and shift of 
clitic AUX to clause-second position) is an important factor in the shift 
from SOV to SVO. Given the fact that other theories concerning the causes 
of such a shift are either dubious (loss of case markings, target structures), 
or merely speculative (grammaticalization of serial verb constructions), or 
at best possible, but (as yet) without any conclusive supporting evidence 
(afterthought), it is possible, but not strictly speaking necessary, to con- 
clude that — excluding substratum-induced changes — AUX-cliticization may be 
the only motivation for the shift from SOV to SVO, wherever such a change 
occurs.^ I will leave this as a challenge for further research by other 
linguists . 

¹Research on this paper has been in part supported by a 1979/80 grant 
from the University of Illinois Research Board. Earlier versions have been 
read at the 10th Anniversary Meeting of the Dravidian Linguistics Society 
(Delhi, 10-12 July 1980) and at the Department of Linguistics, Osmania
University (Hyderabad, 12 March 1981). I am grateful for comments received at 
these occasions. Needless to say, the responsibility for any errors is 
entirely my own. 

²Vennemann (1974, 1975) adds as another element that of topicalization: 
Once morphological distinctions between subjects and objects are lost, it is 
in his view no longer possible in an SOV language to topicalize objects 
without ambiguity, since it will be impossible to distinguish SOV from OSV 
(with fronted, topicalized object). Also this claim is dubious, since lan- 
guages may use different devices, such as stress and intonation, or extra- 
position of the subject to the right of the verb, in order to accomplish 

the same goal. (All of these devices are employed in languages like Hindi, 
without there being a change from SOV to SVO.) 

³This is especially true for the relative order of object and verb, while 
as observed in note 2, subjects may extrapose to the right of the verb. 

⁴Haiman (p. 148) makes passing reference to this possibility, but without 
realizing (or acknowledging) the full implications, such as the fact that 
this development makes the assumption of an intermediate VSO stage unnecessary.

⁵For data and additional references see Steele 1975. 

⁶Even Friedrich (1975, 1977), who is highly sceptical concerning the view 
that PIE was SOV, agrees on this point. 

⁷For the early Runic inscriptions and the Gothic of the Skeireins, cf. 
above all the evidence in Smith 1971. A recent paper by Ebel (1978), appar- 
ently independently, also comes to the conclusion that Skeireins Gothic had 
predominant SOV order. See also Fourquet 1938, Ries 1907 for the SOV evid- 
ence of the Old English Beowulf. Finally, compare the statistics in notes 
10 and 11 below. Friedrich (1975, 1976, 1977), ignoring this evidence, con- 
centrates on the VO patterns of later Germanic and on the evidence of gapping 
to buttress his repeated claim that Proto-Germanic was VO. However, the 
later VO pattern clearly must be an innovation, given the evidence of early 
Runic, Skeireins Gothic, and Beowulf Old English. And as Subbarao (1972, 
1974) has conclusively shown, gapping and other intimately related movement 
rules provide no certain evidence whatsoever for basic or underlying con- 
stituent order — unless contrary to all the other available evidence, we are 
prepared to accept Hindi, or even the Dravidian languages, as having basic 
or underlying VO. 

⁸In a sample from the Old Lithuanian catechism of Baltramiejus Vilentas, 
I have found the following ratios (where X = a short adverbial constituent 
or a conjunction): 

OV/V#:       42 = 65%
#(X)V(SO):   17 = 26%
SVO:          5 = 7.5%
Other:        1 = 1.5%

⁹Friedrich (1975, 1976, 1977) distorts this position. 

¹⁰For Lithuanian, cf. note 8 above. For Latin, cf. the following ratios 
from a sample of Caesar's Bellum Gallicum. 

OV/#:        74 = 80%
#(X)V(SO):    7 = 7.4%
SVO:          7 = 7.4%
Other:        5 = 5.2%

Older Runic: 

OV/#:        22 = 61%
#V(SO):       6 = 17%
SVO:          8 = 22%

Beowulf (1-59): 



Note especially the difference between the epic Beowulf, with its frequent 
passages of lively narrative and its higher ratio of verb-initial patterns, 
and the prosaic Runic inscriptions, with a rather low incidence of #V. 
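
The Older Runic percentages in this note can be rechecked from the raw counts. The following is a minimal sketch, assuming simple rounding to whole percent; the function name `shares` is hypothetical and the dictionary keys merely repeat the pattern labels used above.

```python
# Recompute percentage shares from raw counts of word order patterns.
# Assumption: shares are rounded to the nearest whole percent.

def shares(counts):
    """Map each pattern label to its rounded percentage of the total."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


older_runic = {"OV/#": 22, "#V(SO)": 6, "SVO": 8}

if __name__ == "__main__":
    for label, pct in shares(older_runic).items():
        print(f"{label}: {pct}%")
```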

¹¹Ries (1907) gives the following statistics (excluding verb-initial 
clauses) : 

               V(X)#        SV

Full verb      = 79%        = 21%
Modal          = 83%        = 17%
AUX            = 26.6%      = 73.3%

At this point, the major division seems to be between AUX and Modal/Full 
Verb. The difference between the latter two is negligible. Moreover, the 
incidence of SV is as yet not significantly higher than in Older Runic. 

¹²Note however that similar conclusions were reached, on narrower 
grounds, by earlier scholars, such as Ries (1907:316-17) and especially 
Fourquet (1938:194-5). 

¹³I do not think that this development requires the additional motivating
factors suggested by Stockwell (1977), namely the fact that extraposition
was possible (or in some cases was required) in Old English. The fact 
that Modern German offers very similar patterns of extraposition, without 
shifting its "stranded" MV next to the second-position AUX, suggests that 
extraposition is not a sufficient motivation for the shift. 

¹⁴Cf. Fourquet 1938, who also offers parallel evidence from Old English. 

¹⁵Not all the dialects have shifted to SVO. 

¹⁶Note that the Latin original offers Veritas de terra orta est, with 
MV + AUX, both occurring clause-finally. 

¹⁷These data are drawn from Kachru 1973. 

¹⁸Note that other languages, such as Nepali, have undergone a similar 
change of clitic shortening, without shifting AUX to second position. This 

suggests that the shift to second position is a separate step from
AUX-cliticization and that it is a tendency, not a necessary consequence of
cliticization.

and thus no difference between finite and non-finite, the opportunity for 
an intermediate pattern, with only finite verbs in second position, does 
not arise. 

Armenian may be a language with SVO (from PIE SOV) for which AUX-clit- 
icization does not provide a likely motivation, for as the data in Jensen 
1959:120-1 suggest, Armenian synchronically offers the order MV + AUX, usu- 
ally in clause-final position. (Since in Armenian, MV = participle, i.e. a 
verbal adjective, and AUX = the verb 'to be', this pattern of MV + AUX is 
consonant with the fact that also elsewhere, adjectives and predicate nouns 
directly precede clause-final 'be'.) 


BERNEKER, Erich. 1900. Die Wortfolge in den slavischen Sprachen. Berlin: 

EBEL, Else. 1978. Zur Folge SOV in den Skeireins. Sprachwissenschaft 3. 

FOURQUET, J. 1938. L'ordre des éléments de la phrase en germanique ancien. 
(Publications de la Faculté des Lettres de l'Université de Strasbourg, 
86.) Paris: Les Belles Lettres. 

FRIEDRICH, Paul. 1975. Proto-Indo-European syntax. (Journal of Indo-European
Studies, Monograph 1.) Butte, Montana. 

-----. 1976. Ad Hock. Journal of Indo-European Studies 4.207-20. 

-----. 1977. The devil's case: PIE as type II. Linguistic studies offered
to Joseph Greenberg on the occasion of his sixtieth birthday, 3, ed. 
by A. Juilland, 463-80. Saratoga, CA: Anma Libri. 

GARBER, Anne. 1980. Word order change and the Senufo languages. Studies 
in the Linguistic Sciences 10:1.45-57. 

GREENBERG, Joseph. 1966. Some universals of grammar with particular
reference to the order of meaningful elements. Universals of language, 2nd 
ed., edited by J. Greenberg, 73-113. Cambridge, MA: MIT Press. 

HAIMAN, John. 1974. Targets and syntactic change. The Hague: Mouton. 

HYMAN, Larry. 1975. On the change from SOV to SVO: evidence from
Niger-Congo. Li 1975:113-47. 

JENSEN, Hans. 1959. Altarmenische Grammatik. Heidelberg: Winter. 

KACHRU, Braj B. 1973. An introduction to spoken Kashmiri. Urbana:
University of Illinois, Department of Linguistics. 

LI, Charles N., ed. 1975. Word order and word order change. Austin:
University of Texas Press. 

-----, ed. 1977. Mechanisms of syntactic change. Austin: University of Texas
Press.

RICHTER, Elise. 1903. Zur Entwicklung der romanischen Wortstellung aus der 
lateinischen. Halle: Niemeyer. 

RIES, John. 1907. Die Wortstellung im Beowulf. Halle: Niemeyer. 

SMITH, Jesse Robert. 1971. Word order in the older Germanic dialects.
Urbana: University of Illinois Ph.D. dissertation. 

STEELE, Susan. 1975. Some factors that affect and effect word order. Li 

STEELE, Susan. 1977 (a). Clisis and diachrony. Li 1977:539-79. 

-----. 1977 (b). Review of Haiman 1974. Language 53.209-12. 

STOCKWELL, Robert P. 1977. Motivations for exbraciation in Old English. 
Li 1977:291-314. 

SUBBARAO, K. V. 1972. Extraposition and underlying word order. Papers 
in Linguistics 5.476-85. 

-----. 1974. Is Hindi a VSO language? Papers in Linguistics 7.325-50. 

VENNEMANN, Theo. 1973. Explanation in syntax. Syntax and semantics, 2, 
ed. by J. Kimball, 1-50. New York: Seminar Press. 

-----. 1974. Topics, subjects, and word order: from SXV to SVX via TVX. 
Historical linguistics, 1, ed. by J.M. Anderson and C. Jones, 339-76. 
Amsterdam: North-Holland Publishing. 

-----. 1975. An explanation of drift. Li 1975:269-305. 

Studies in the Linguistic Sciences 
Volume 12, Number 1, Spring 1982 

GEMINATION AND SPIRANTIZATION IN TIGRINYA

Michael Kenstowicz 
University of Illinois 

In this paper we look at the behavior of geminate conso- 
nants in Tigrinya under a rule that spirantizes postvocalic 
velar stops , arguing that it provides evidence for the claim 
that phonological structure can make a distinction between 
one versus two segmental units linked to two adjacent posi- 
tions in the syllable structure tree. Our results can be 
viewed as another illustration of the general point that 
phonological structure cannot necessarily be characterized 
solely in terms of the inventory and linear arrangement of 
segments but must also take into account their grouping into 
higher-order constituents. The paper opens with a brief 
review of the literature on gemination and spirantization in 
Biblical Hebrew, showing the motivation for Leben's (1980) 
analysis in which geminate consonants are expressed as the 
linking of a single phonological segment to two positions 
in the syllable structure tree. In the next section we see 
that quite comparable data exist in Tigrinya motivating a 
similar analysis, but with the interesting twist that the 
Obligatory Contour Principle is in fact a language-specific 
parameter. The following section provides an analysis for 
the complex patterns of gemination found in Tigrinya at the 
junction of a verb stem and object suffix. In the final 
section we look at two rules of complete assimilation and 
argue that they are best treated in terms of rules that 
delete a phonological unit from the segmental tier with the 
resultant long consonant arising from general autosegmental 
principles of reassociation. 

Biblical Hebrew exhibits a number of phenomena illustrating the classic 
problem of the treatment of geminate consonants that have been the subject 
of much recent discussion in the literature. Sampson (1973) argued that 
the contrast between long versus short consonants in Biblical Hebrew should 
be represented in terms of the feature [long] on the basis of the behavior 
of these segments under the rule spirantizing postvocalic obstruents.
Specifically, an underlying stop will spirantize intervocalically or in
postvocalic final position; when initial in a consonant cluster it will
spirantize so long as it is not identical to a following consonant: e.g. /kataba/ 
→ /kaθava/ (eventually ka:θva), yi-xtov, but gibbor (=gib:or) not *givbor 
'hero'. Sampson argues that if gib:or is represented as a geminate /gibbor/, 
then the spirantization rule would have to be given the complex reformulation 
in (1) to prevent it from affecting the initial consonant in a cluster just 
in case it is not identical with the following consonant, which is a round- 
about way of saying that spirantization does not apply to long consonants. 




(1)  ... → [+continuant] / V ___ { [+sonorant] ; [-αcoronal] ... }

But if long consonants are represented as [+long], then the rule can be 
restricted to just [-long] consonants. 

However, Barkai (1974) points out that there are numerous places in the 
phonology of Biblical Hebrew where long consonants pattern with consonant 
clusters, thus supporting the geminate notation. For example, vowels are 
regularly reduced to schwa in the plural of nouns when in the context 
___CVCV (cf. malk-i 'my king' but məlax-im from /malak-im/ 'kings'). Reduction 
is blocked by a following cluster (galgal-im 'wheels') as well as by a 
following long consonant (sap:ir-im 'sapphires'). If long consonants are 
treated as geminates, then the reduction rule will automatically fail to 
apply in sappir-im. Barkai also points out that in some cases long
consonants clearly originate from underlying consonant clusters. For example, 
in the perfect of the verb the 1 and 2 sg. suffixes begin with /t/. When 
suffixed to a stem ending in /t/, a long consonant arises that fails to 
spirantize: /karat+ti/ is realized as karat:i 'I made a covenant'.
Similarly, the derivational prefix /hit-/ normally appears as hiθ- in hitpaʕʕel 
verbs (e.g. hiθ-gaddel 'became great'), but fails to spirantize when the 
following root begins with a /t/: /hit+tammem/ 'he acted uprightly' is 
realized as hit:ammem not *hiθ-tammem. Finally, the derivational morphology
of the language exhibits processes showing the equivalence of geminates 
and consonant clusters. Most quadriradical verbs appear in the Piʕʕel class: 
gilgel 'rolled', kirsem 'gnawed', kirbel 'clothed'. The causative of
triradical verbs is formed by geminating the middle radical; the resultant 
verb appears in the Piʕʕel class: cf. gaðal 'he grew up', but giddel 'he 
brought up (educated)'. If the long medial consonant of giddel is treated 
as a geminate, then the fact that it appears in the same verbal class 
(Piʕʕel) as quadriradicals such as kirbel is explained. 

We thus have a rather paradoxical state of affairs . Rules such as 
vowel reduction show that long consonants behave like clusters (motivating 
the geminate notation), while the spirantization rule shows that long con- 
sonants do not behave as consonant clusters and is thus inconsistent with 
the geminate notation (motivating the feature notation for length). But 
the same segment cannot be consistently represented in two different ways. 

This dual behavior of long consonants is bound to remain a paradox in 
a framework where phonological structure is conceived of as a single linear 
string of feature matrixes. However, with the advent of multi-linear rep- 
resentations, we can begin to make some sense of this dual behavior. Leben 
(1980) argues, convincingly in my opinion, that the problem of long conso- 
nants in Biblical Hebrew can be solved if phonological structure is conceived 
as involving syllable structure trees linked to the linear string of feature 
matrixes. With this dual level of structure we can represent a long segment 
as one that is associated with two consecutive nodes in the syllable structure 
tree. The similarities and differences among kirbel, giddel, and gaðal can 
be successfully represented as in (2).¹ 



(2)   k i r b e l      g i  d  e l      g a ð a l
      | | | | | |      | | / \ | |      | | | | |
      C V C C V C      C V C C V C      C V C V C

In these multi-linear representations kirbel and gid:el share a CVCCVC 
syllable pattern, with the long /d:/ of gid:el linked to two consonantal 
positions, while gid:el and gaðal are equivalent in terms of their consonantal 
structure on the segmental tier, sharing the radicals /gdl/. The
morphological relationship between gid:el and gaðal can thus be viewed as the
substitution of the CVCCVC syllable pattern for the basic CVCVC pattern (with 
attendant differences in vowel melodies; see McCarthy 1981 for discussion). 

The other factors cited by Barkai to motivate the geminate notation in 
the linear theory can also be viewed as making reference to the CV syllable 
tier. So, for example, given that məlax-im (from /malak-im/), galgal-im, 
and sap:ir-im are represented as in (3), the reduction rule can be given the 
formulation in (4). 

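(3)   m a l a k - i m      g a l g a l - i m      s a  p  i r - i m
      | | | | |   | |      | | | | | |   | |      | | / \ | |   | |
      C V C V C   V C      C V C C V C   V C      C V C C V C   V C

(4)   [+vocalic] → ə / ___ C V C V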

Since the reduction rule (4) requires the reducing vowel to be in an open 
syllable (i.e. followed by just one consonant), the rule will correctly fail 
to apply to sap:ir-im because its initial vowel is followed by two conso- 
nants in the syllable tree and is thus not in an open syllable. 
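
The role of the CV tier in blocking reduction can be illustrated with a short sketch. It assumes, as a simplification, that the reduction context is a V slot followed by the slots C V C V, and that each word is given only as a string of C and V slots; the function name `reducible_vowels` is hypothetical.

```python
# Hypothetical sketch: vowel reduction stated over the CV tier.
# A V slot is a candidate for reduction when it stands before C V C V.

def reducible_vowels(skeleton):
    """Return the indices of V slots that stand in the context ___CVCV."""
    return [
        i
        for i, slot in enumerate(skeleton)
        if slot == "V" and skeleton[i + 1:i + 5] == "CVCV"
    ]


# A long consonant occupies two C slots, so sap:ir-im has the same
# skeleton as the cluster word galgal-im.
skeletons = {
    "malak-im": "CVCVCVC",
    "galgal-im": "CVCCVCVC",
    "sap:ir-im": "CVCCVCVC",
}

if __name__ == "__main__":
    for word, skeleton in skeletons.items():
        print(word, reducible_vowels(skeleton))
```

If the long consonant were instead a single [+long] segment on one slot, sap:ir-im would have the skeleton CVCVCVC and its first vowel would wrongly qualify for reduction, which is the point of the two-slot representation.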

Given that a long consonant is represented as a segment that is mapped 
to two C slots while a short consonant is one that is mapped to just a single 
C slot, we can now distinguish the length of a segment in terms of the number 
of positions in the syllable tree that it is associated with and hence can 
dispense with the feature [long]. The spirantization rule can now be
formulated as in (5), where a line drawn through the association link means that 
the consonant is associated with just one position in the syllable tree, i.e. 
that it is short. 

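(5) [-sonorant] → [+continuant] / [+vocalic] ___   [in the original display the target is drawn with a single, crossed association line to one C slot]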


In order to be consistent with the fact that geminate consonants arising
from the juxtaposition of identical consonants across morpheme boundaries fail
to spirantize, Leben proposes extending the Obligatory Contour Principle
developed in tonal phonology to the treatment of geminates. By this principle a
sequence of identical phonological segments is automatically restructured
into a single element on the segmental tier linked to two positions in the
syllable tree, as indicated below. By virtue of this principle /karat+ti/
will escape the effects of spirantization.

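(6) Obligatory Contour Principle

[Diagram: two identical segments x + y, each linked to its own C slot across a morpheme boundary, are restructured as a single segment z linked to both C slots (x=y=z). Example: karat+ti, with CV tier CVCVC + CV, becomes karat:i, with a single t linked to two adjacent C slots of CVCVCCV.]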

1. Tigrinya Spirantization 

In this paper we shall look at some analogous data from another Semitic
language: Tigrinya, a South Semitic language of Eritrea. Like Biblical
Hebrew, Tigrinya also has a rule of spirantization that is sensitive to the
length of consonants. Although the two languages are genetically related,
the Tigrinya spirantization rule is historically unconnected to that found 
in Biblical Hebrew. In addition many of its geminate consonants occur in 
patterns unlike those found in Biblical Hebrew. The similarities between 
the languages are thus on the typological rather than the historical level. 
We shall argue that the Tigrinya data also support the basic conclusion of 
Leben's study, i.e. that consonant gemination is best treated as the linking 
of a single phonological segment to two consonantal positions in the 
syllable tree. 

The Tigrinya spirantization rule is a fully general process that
spirantizes the velar stops k and q postvocalically.³ The rule applies both
within morphemes (cf. mə-rkab infinitive, raxaba 3sg.m. perfect 'find') and
across morpheme and word boundaries (cf. kalbi 'dog', ?axalab 'dogs',
?əti xalbi 'the dog'). In this paper we shall be concentrating on the
behavior of this rule with respect to geminate consonants, but in the course 
of doing so we will make several observations on the syllable structure and 
the gemination rules to be found in the language. For purposes of discussion 
we can distinguish four categories of geminate consonants: i) underlying 
geminates, ii) geminates arising from a rule of gemination, iii) geminates 
arising from a rule of complete consonant assimilation, iv) geminates arising 
from the juxtaposition of identical consonants across morpheme boundaries. 
We will see that spirantization fails to affect geminates from any of the
first three categories but will apply to those in (iv).

An example of the first type is provided by a verb such as fakkara
'boast', which displays the CVCCVC root structure of Biblical Hebrew giddel.
Leslau (1941) classifies Tigrinya verbs into four major categories: Type A
verbs are simple triradicals with the root shape CaCaC in the perfect (e.g.
raxab-a 'he found'); Type B show the middle radical geminated (e.g. fakkar-a
'he boasted'); in Type C the initial root vowel is a (e.g. barax-a 'he
blessed'); and Type D are the quadriradicals (e.g. maskar-a 'he witnessed').
For our purposes here the most important point is that when the second
radical of a Type B verb is a velar k it never spirantizes to x, showing
that the spirantization rule must be restricted to short nongeminate
consonants. The rule is thus essentially identical to the Biblical Hebrew
rule (5), differing just in its restriction to the velars.

(7) [k,q] → [+continuant] / [+vocalic] ___

Given this formulation, the rule will apply to the underlying representa- 
tions of Type A /rakab-a/ and Type C /barak-a/ , but will fail to apply to 
Type B /fakkar-a/ and Type D /maskar-a/. 






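(8) [Diagram: underlying representations of Type A /rakab-a/, Type B /fakkar-a/, Type C /barak-a/, and Type D /maskar-a/, each linked to the CV tier and grouped into syllables (σ); the kk of /fakkar-a/ is a single segment linked to two adjacent C slots. Association lines are not recoverable from the source.]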
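The way rule (7) interacts with these representations can be sketched in a few lines of Python. This is only an illustrative model, not part of the analysis itself: the `Segment` class, the function names, and the convention that a doubled letter inside a morpheme stands for one melody linked to two C slots are all assumptions made for the sketch, and "X" is an arbitrary stand-in for the spirantized counterpart of q.

```python
from dataclasses import dataclass

VOWELS = set("aeiou")
# "X" is an arbitrary ASCII stand-in for the spirantized counterpart of q.
SPIRANT = {"k": "x", "q": "X"}


@dataclass
class Segment:
    melody: str      # unit on the segmental tier
    slots: int = 1   # number of CV-tier positions the unit is linked to


def link(form):
    """Map a morpheme-internal string to segments. A doubled consonant is
    treated as ONE melody linked to two C slots (an underlying geminate)."""
    segs = []
    for ch in form:
        if segs and segs[-1].melody == ch and ch not in VOWELS:
            segs[-1].slots += 1
        else:
            segs.append(Segment(ch))
    return segs


def spirantize(segs):
    """Rule (7): a velar stop linked to exactly one slot spirantizes
    when the preceding segment is a vowel."""
    out = []
    for i, seg in enumerate(segs):
        after_vowel = i > 0 and segs[i - 1].melody in VOWELS
        melody = seg.melody
        if melody in SPIRANT and seg.slots == 1 and after_vowel:
            melody = SPIRANT[melody]
        out.append(melody * seg.slots)
    return "".join(out)


for underlying in ("rakaba", "baraka", "fakkara", "maskara"):
    print(underlying, "->", spirantize(link(underlying)))
```

Under these assumptions the rule affects the Type A and Type C forms only: the k of /fakkar-a/ is linked to two slots, and the k of /maskar-a/ is not postvocalic.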


Before turning to one of the other categories of geminates in Tigrinya,
let us look briefly at the syllable structure of the language. Towards that
end, examine the possessed noun paradigms in (9). The paradigms for garat
'bed' and katama 'town' show that the possessive suffixes for the second
person begin with the velar stop /k/, while those for the third person begin
with a vowel. After a vowel-final stem such as katama we find that the
underlying /k/ has spirantized to x, while the third person suffixes show a glottal
stop as a hiatus breaker.
























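(9) [Possessed noun paradigms for garat 'bed', katama 'town', kalbi 'dog', waddi 'son', and marax 'calf'. Most of the table is not recoverable from the source. Surviving forms: katama-xi, katama-?a, katama-xum, katama-?om, katama-?an; wadd-ay; marax-ay, marax.]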

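Tigrinya syllable structure is quite rigid: only CV and CVC syllables are
possible. The hiatus breaking rule can thus be viewed as a reflex of a process
that inserts a glottal stop to supply an onset to a syllable that lacks
one.

(10) [Rule inserting a C onset, realized as a glottal stop, at the beginning of an onsetless syllable. The formal statement and the accompanying diagrams are not recoverable from the source.]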

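Let us now turn to the paradigm for kalbi in (9), which has the stem
/kalb/ (cf. ?axalab 'dogs'). The final consonant of the stem will form the
syllable onset with a vowel-initial suffix (11). Before a consonant-initial
suffix such as the 1pl. we have the underlying representation of (12).

(11), (12) [Diagrams: /kalb/ followed by a vowel-initial suffix, with the stem-final b syllabified as an onset (11); /kalb+na/, with the stem-final b left unsyllabified (12). Association lines are not recoverable from the source.]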




Due to the rigid syllable canons of the language the stem-final /b/ can
appear neither in the coda of the first syllable nor in the onset of the
second syllable because onsets and codas can contain at most one consonant
in Tigrinya. Let us suppose that there is a universal constraint to the
effect that every segment must be affiliated with some syllable in order to
be pronounced. Accordingly, a vowel must be added to the CV tier to support
the orphaned consonant. We can formulate this rule as one that inserts a
supporting schwa vowel after any consonant that has not been linked to a σ
node. As a result of the application of the epenthesis rule in (13), the
underlying representation of /kalb+na/ in (12) will be derived as in (14).
We assume that the syllable structure rules of the language apply to the
output of epenthesis, constructing a CV syllable out of the inserted vowel
and the orphaned consonant.

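(13) ∅ → V / C ___   (where C is a consonant not linked to a σ node)

(14) [Derivation of /kalb+na/: underlying representation, then epenthesis (13) after the unsyllabified b, then syllabification of the output into three syllables (σ σ σ). Association lines are not recoverable from the source.]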

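The epenthesis rule also applies in the derivation of the bare stem form
kalbi from /kalb/. Its application is slightly obscured by a subsequent
general rule of Tigrinya phonology that turns schwa to i at the end of a
word.

(15) [Derivation of kalbi from /kalb/: epenthesis (13) after the unsyllabified b, followed by the change of word-final schwa to i. Diagram not recoverable from the source.]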


Note that second person forms such as kalbaxa 'your dog' show that spiranti- 
zation applies to the output of epenthesis. 

Consider now the paradigm for 'son' in (9). Given that geminates are 
represented as a single consonantal segment linked to two C positions on the 
CV tier, we see that /wadd/ is equivalent to /kalb/ on the CV tier and hence 
exhibits exactly the same pattern of syllabification, requiring epenthesis 
before the consonant-initial suffixes and to the bare stem. 





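(16) [Derivations for the paradigm of 'son' (stem /wadd/, with a single d linked to two C slots), including the bare form waddi. Rows: underlying representation; epenthesis (13); ə → i / ___ #. Two of the steps are marked inapplicable. Association lines are not recoverable from the source.]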

It is worth noting the interesting mismatch among the number of elements on
each of the three separate tiers in the underlying representation of waddi
in (16). There are three units on the segmental tier and due to the syllable
canons of the language a σ node can dominate at most three segments. But
these two levels of representation are mediated by the CV tier, which contains
four elements CVCC. It is precisely this mismatch that invokes the
application of the epenthesis rule (13). The parallel behavior of the paradigms
for 'dog' and 'son' in (9) is thus our first example of the equivalence of
consonant clusters and geminates in Tigrinya phonological structure.
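The parallel treatment of /kalb/ and /wadd/ can be illustrated with a small sketch. It is a simplification made under stated assumptions: forms are plain strings in which each letter occupies one CV-tier position (so that dd counts as CC, as on the CV tier), the epenthetic vowel is written ə, and the function names `epenthesize` and `surface` are hypothetical. Spirantization is deliberately left out.

```python
VOWELS = set("aeiouə")


def epenthesize(form):
    """Insert ə after any consonant that can be neither an onset (no vowel
    follows) nor a coda (no vowel immediately precedes), scanning left to
    right so that an inserted vowel can license a following coda."""
    out = []
    for i, seg in enumerate(form):
        out.append(seg)
        if seg in VOWELS:
            continue
        next_is_vowel = i + 1 < len(form) and form[i + 1] in VOWELS
        prev_is_vowel = len(out) >= 2 and out[-2] in VOWELS
        if not next_is_vowel and not prev_is_vowel:
            out.append("ə")  # orphaned consonant: supply a nucleus
    return "".join(out)


def surface(underlying):
    """Epenthesis followed by the word-final change of ə to i."""
    s = epenthesize(underlying.replace("+", ""))
    return s[:-1] + "i" if s.endswith("ə") else s


for underlying in ("kalb", "kalb+na", "wadd", "wadd+na", "wadd+ay"):
    print(underlying, "->", surface(underlying))
```

Because the check is stated over positions rather than over melodies, the geminate of /wadd/ and the cluster of /kalb/ trigger epenthesis in exactly the same environments.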

Let us now turn to the paradigm for 'calf' in (9). When combined with
a second person possessive suffix (e.g. /marak+ka/) we have a sequence of
identical consonants separated by a morpheme boundary. But unlike in Biblical
Hebrew, the first consonant spirantizes in Tigrinya. Underlying /marak+ka/
is realized as marax+ka. The second person forms for 'calf' show two things.
First, the spirantization rule will have to be able to distinguish underlying
geminates in forms such as Type B verbs like fakkara from those that arise from
morpheme juxtaposition. In terms of the analysis we have proposed, this
distinction can be made on the basis of a single unit of the segmental tier
associated with two adjacent consonantal positions in the CV tier versus two
successive segmental units each of which is associated with a separate unit
on the CV tier.




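(17) [Diagram: fakkara, with a single k linked to two adjacent C slots, versus /marak+ka/, with two successive k segments each linked to its own C slot. Association lines are not recoverable from the source.]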


Given these representations, spirantization (7) will apply to the second but
not to the first. Secondly, in order to maintain the above distinction, we
will have to assume that Leben's Obligatory Contour Principle is in fact a 
parameter whose setting must be stipulated for individual grammars (operating 
in Biblical Hebrew but not in Tigrinya). 

In the face of these data one might attempt to save the Obligatory Con- 
tour Principle in its full generality by claiming that the spirant x in marax 
has been lexicalized so that the underlying representation is in fact /marax/. 
In general, such an analysis is possible for nouns, since the stem-final 
consonant never alternates with /k/ due to the relatively stable syllable 
structure of noun stems. Nevertheless such an analysis would be at odds 
with the fact that /x/ and /k/ are in complementary distribution phonetically 
and all instances of /x/ can be derived from /k/ by the natural spirantiza- 
tion rule (7). More importantly, when we turn to the much richer system of 
the Tigrinya verb, we see that such a restructuring analysis is not possible 
in any case and thus that the difference between Tigrinya and Biblical Hebrew 
with respect to the Obligatory Contour Principle is a genuine case of para- 
metric variation. 

In the perfect system of the Biblical Hebrew verb the suffixes that mark
the first person singular and the second person singular and plural begin
with the consonant /t/. Barkai (1974) cites karat:i from /karat+ti/ 'I made
a covenant' and sihat:a from /sihet+ta/ 'you corrupted' with spirantization
blocked (*karaθ+ti, *sihaθ+ta). As we have seen, it is precisely
these forms that led Leben to invoke the Obligatory Contour Principle. In
the South Semitic languages, on the other hand, these suffixes begin with
a velar stop. The complete paradigms for the perfect of mə-sbar 'to break'
and mə-btax 'to cut' are given below.
































Biblical Hebrew and Tigrinya are thus minimally different with respect to 
the imposition of the Obligatory Contour Principle. In both languages there 
is a rule spirantizing post-vocalic consonants that must be prevented from 
applying to geminates; in one language (Biblical Hebrew) the geminates arising 
from morpheme juxtaposition block spirantization, while in the other (Tigrinya) 
these geminates undergo spirantization in exactly the same grammatical context. 
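The parametric difference can be made concrete with a short sketch in which the Obligatory Contour Principle is a boolean switch. Everything here is an illustrative assumption rather than a claim about either grammar: the `[melody, slots]` list encoding, the function names, and the deliberately tiny spirantization tables (only t → θ for Biblical Hebrew, only the velars for Tigrinya, with "X" as an arbitrary stand-in for spirantized q).

```python
VOWELS = set("aeiou")


def link(morphemes, ocp):
    """Return [melody, n_slots] pairs for the concatenated morphemes.
    A doubled consonant inside a morpheme is one melody with two slots.
    Identical consonants meeting at a morpheme boundary are fused into
    one doubly linked melody only when ocp is True."""
    segs = []
    for morpheme in morphemes:
        for j, ch in enumerate(morpheme):
            same = bool(segs) and segs[-1][0] == ch and ch not in VOWELS
            at_boundary = j == 0
            if same and (not at_boundary or ocp):
                segs[-1][1] += 1
            else:
                segs.append([ch, 1])
    return segs


def spirantize(segs, table):
    """Spirantize a singly linked postvocalic stop listed in `table`."""
    out = []
    for i, (melody, n) in enumerate(segs):
        after_vowel = i > 0 and segs[i - 1][0] in VOWELS
        if melody in table and n == 1 and after_vowel:
            melody = table[melody]
        out.append(melody * n)
    return "".join(out)


HEBREW = {"t": "θ"}             # simplified: only the stop relevant here
TIGRINYA = {"k": "x", "q": "X"}

print(spirantize(link(["karat", "ti"], ocp=True), HEBREW))     # OCP on
print(spirantize(link(["batak", "ka"], ocp=False), TIGRINYA))  # OCP off
```

With the switch on, /karat+ti/ surfaces with an unspirantized geminate; with it off, /batak+ka/ surfaces as batax-ka, while a morpheme-internal geminate such as that of fakkar-a is protected under either setting.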

The only point remaining to be established is that the final radical in
batax-a does in fact derive from an underlying stop. Unlike in the noun
marax, the final radical of a Type A verb can be found in a post-consonantal
context due to a rule of allomorphy regulating the distribution of the root
shapes CaCCaC and CaCC in the imperfect of all Type A verbs. The latter
shape appears when the root is followed by a vowel and the former shape
appears elsewhere. Thus, the 3sg.m. and 3pl.m. imperfect forms of mə-sbar
'to break' are yə-sabbar and yə-sabr-u; the corresponding forms for mə-btax
'to cut' are yə-battax but yə-batk-u, with underlying /k/ appearing on the
surface. Since all Type A verbs with final radical /k/ show the k-x
alternation, we are justified in assuming that the x appearing in the second
person perfect forms such as batax-ka does indeed derive from an
underlying /k/ by the spirantization rule.

To summarize briefly, in this section we have seen that underlying
morpheme-internal geminates do not spirantize (e.g. fakkar-a) while geminates
arising across morpheme boundaries are susceptible to this rule (e.g.
batax-ka from /batak+ka/). In the next section we look at geminates
that arise from a phonological rule of gemination.

2. Consonant Gemination in Tigrinya 

The verb inflection in Tigrinya offers another case of alternation
between k and x due to the operation of several complex patterns of
gemination arising from the addition of object suffixes to a preceding verbal
word. In Tigrinya the object suffixes are the same series of suffixes as
the possessive on nouns, except for the 1sg., which is -ni on verbs and
-ay as a possessive on nouns. For purposes of discussion, the gemination
can be broken up into two major subdivisions depending on whether the
preceding verbal word ends in a vowel or a consonant.

Let us look at vowel-final stems first. Here there are two basic
patterns. In pattern 1 a consonant-initial object suffix is added directly
to the stem, while a vowel-initial suffix shows a hiatus breaker: the
hiatus breaker is a glottal stop if the stem ends in a, and a glide w if
the stem ends in u. (We cite -a 'her' as representative for vowel-initial
object suffixes and, where possible, -ka 2sg.m. and -ni 'me' for
consonant-initial object suffixes. In some cases the final vowel of the
stem is an augment that does not appear on bare verbs.)

(18)  3pl.f. perfect     3pl.m. perfect
      qatal+a            qatal+u
      [qatal+a]?a        [qatal+u]wa
      [qatal+a]xa        [qatal+u]xa
      [qatal+a]ni        [qatal+u]ni

In pattern 2 we find that a consonant-initial object suffix will have
its initial consonant geminated. For a vowel-initial suffix we find that
w is the hiatus breaker if the stem ends in u and y is the hiatus breaker
elsewhere, i.e. after a, a_, and 9_. The hiatus breaker is also geminated in 
pattern 2. 

(19)  2sg.m. perfect     1pl. perfect
      qatal+ka           qatal+na
      [qatal+ka]yya      [qatal+na]yya
      [qatal+ka]nni      [qatal+na]kka

      2sg.f. perfect
      [qatal+kə]yya
      [qatal+kə]nni

      2sg.f. imperfect
      [ta+qatl+a]nni

      1sg. gerundive     [forms not recoverable from the source]

      3sg.m. gerundive   [forms not recoverable from the source]


Unlike in possessed nouns, where the hiatus breaker is consistently a glottal
stop regardless of the nature of the surrounding vowels (cf. (9)), the phonetic
content of the hiatus breaker in verbs is variable. Accordingly, the rule
for verbs can be interpreted as adding an onset on the CV tier that is not
associated with a phonological unit.

(20) ∅ → C / [σ ___

Since there are no adjacent consonants from which the inserted onset can
acquire a phonetic content, one of the adjacent vowels must contribute a
feature to the consonantal onset. In Tigrinya the quality of the hiatus
breaker is determined by the preceding rather than by the following vowel.
(An informal survey of hiatus breaking in the languages known to me suggests
that this is the unmarked case and is presumably a reflection of the general
tendency for unassociated "anchors" to link to an autosegment on the left
instead of the right (cf. Clements 1981).) The inserted onset is w if the
preceding vowel is u (i.e. [+round]) and y elsewhere (i.e. if the preceding
vowel is [-round]). I will thus assume that the appearance of the inserted
onset as w versus y is a reflex of linkage to the preceding vowel, as indicated
by the dashed lines in the partial derivations for [qatil+u]wwa, [qatal+kə]yya
and [qatal+na]yya below.

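(21) [Partial derivations of [qatil+u]a, [qatal+kə]a, and [qatal+na]a. Underlying representation: the stems, with CV tier CVCVC V (for qatil+u) or CVCVC CV, followed by the suffix vowel V. Hiatus breaking (20): an unassociated C onset is inserted before the suffix vowel, giving CVCVC VCV and CVCVC CVCV, and a dashed line links the inserted C to the preceding vowel. Association lines are not recoverable from the source.]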

The gemination process can be viewed as simply the reflex of a rule that 
increases the length of the stem-final syllable from one to two moras by in- 
serting a postvocalic consonant on the CV tier to give a CVC syllable. This 
inserted C will link to the following consonantal element of the segmental 
tier without stipulation on the reasonable assumption that a C slot will link 
to an adjacent consonant of the segmental tier when one is available. 

The rule of gemination may thus be expressed as in (22).

(22) ∅ → C / V ___ ]σ,α [+segment]   (σ = syllable, α = verb)

We assume that the object suffixes are adjoined to the verb so that the
structure of the complex word is [[verb] ], with both the inner and the outer
bracket labeled verb. The rule is thus to be interpreted
as increasing the length of the stem-final syllable of a verb by one mora when
that syllable is followed by a suffix (i.e. by phonological material
indicated by the feature [+segment]). Rule (22) thus converts the underlying
representation [qatil+u]ni into [qatil+u]nni as indicated in (23).

(23) [qatil+u]ni → [qatil+u]nni

[Diagram: the CV tier CVCVC V CV becomes CVCVC VC CV; the inserted C closes the stem-final syllable and is linked (dashed line) to the suffix-initial n.]

When there is no adjacent consonantal element on the segmental tier for the
C inserted by (22) to link to, we must assume that it will link to the
preceding vowel by the logic of our earlier treatment of the hiatus breaker.
The derivations for [qatil+u]wwa, [qatal+kə]yya, and [qatal+na]yya are thus
completed by the application of gemination (22) to the representations in (21).


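(24) [Completed derivations of [qatil+u]wwa, [qatal+kə]yya, and [qatal+na]yya. Hiatus breaking (20): output as in (21), with CV tiers CVCVC VCV and CVCVC CVCV. Gemination (22): a second C is inserted after the stem-final vowel and is likewise linked to that vowel, so that the glide occupies two C slots. Association lines are not recoverable from the source.]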

Now what about the forms in pattern 1 (18) where we find no gemination?
According to Leslau (1939) the final vowels in these verb forms derive from
etymological long vowels. In present-day Tigrinya phonetics ə and a are
shorter than all other vowels, but this is the only consistent length
difference. In particular, there is no discernible phonetic difference between
the final vowel in pattern 1 qatal+a (3pl.f. perfect) versus pattern 2
qatal+na (1pl. perfect) nor between pattern 1 qatal+u (3pl.m. perfect) and
pattern 2 qatil+u (3sg.m. gerundive). Hence the pattern 1 forms will have
to be marked as morphological exceptions to the gemination rule (22).
Nevertheless if Leslau is correct in tracing these vowels back to
etymological long vowels, it is clear why rule (22) would not have applied in
these categories: their final syllables would already be long.

We can now turn to the second subdivision in the overall pattern of
gemination: verbs that end in a consonant. Here what we find is that the
final consonant of the stem is geminated if it is preceded by a vowel and
the following suffix begins with a vowel. But if the suffix begins with
a consonant or the stem ends in two consonants (e.g. 3sg.m. imperfect
[yə+qatl]a 'he kills her'), then no phonological change takes place.



(25) [Table only partly recoverable from the source. Surviving forms: [yə+qtal]ka; 3sg.f. perfect [qatal+att]a.]

The gemination rule (22) that lengthens the final syllable of the verb stem
by one mora automatically explains these forms. It will fail to apply in
[yə+qatl]a and [yə+qtal]ka since the stem-final syllable is closed, but will
apply in [yə+qtall]a and [qatal+att]a, since the stem-final syllable is open.⁵


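(26) [Derivations of [yə+qtal]ka and [yə+qtal]a. Underlying representation: both stems linked to the CV tier and syllabified. Gemination (22): inapplicable in [yə+qtal]ka, whose stem-final syllable is closed; in [yə+qtal]a a C slot is inserted after the stem-final vowel and linked to the following l, giving [yə+qtall]a. Association lines are not recoverable from the source.]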
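The combined effect of hiatus breaking (20) and gemination (22) on the forms discussed so far can be sketched as follows. This is an illustrative approximation under stated assumptions, not a restatement of the rules: forms are plain strings, "+" marks stem-internal boundaries, the function name `attach` and the flag `geminating` (False for the exceptional pattern 1 stems) are hypothetical, and the choice of ? versus y after non-round vowels simply encodes the two patterns as described. Spirantization is not modeled here.

```python
VOWELS = set("aeiouə")


def attach(stem, suffix, geminating=True):
    """Add an object suffix to a verbal stem.

    geminating=False marks the pattern 1 stems, which are lexical
    exceptions to gemination (22)."""
    stem = stem.replace("+", "")
    # Hiatus breaking (20): supply an onset whose quality comes from the
    # preceding vowel (w after u; otherwise y, or ? in pattern 1).
    if stem[-1] in VOWELS and suffix[0] in VOWELS:
        if stem[-1] == "u":
            onset = "w"
        else:
            onset = "y" if geminating else "?"
        suffix = onset + suffix
    word = stem + suffix
    if not geminating:
        return word
    # Gemination (22): if the last stem vowel stands in an open syllable
    # (exactly one consonant before the next vowel), insert a C slot after
    # it; the slot is filled by the following consonant.
    v = max(i for i, ch in enumerate(stem) if ch in VOWELS)
    j = v + 1
    while j < len(word) and word[j] not in VOWELS:
        j += 1
    if j - (v + 1) == 1 and j < len(word):
        word = word[: v + 1] + word[v + 1] + word[v + 1:]
    return word


print(attach("qatil+u", "ni"))   # consonant-initial suffix, pattern 2
print(attach("qatil+u", "a"))    # vowel-initial suffix, pattern 2
print(attach("yə+qtal", "a"))    # consonant-final stem, open final syllable
print(attach("yə+qtal", "ka"))   # closed final syllable: no change
```

Stating the condition as "exactly one consonant before the next vowel" is one way of approximating an open stem-final syllable without building syllable structure explicitly.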

Having analyzed the gemination process, we can now turn to its bearing 
on spirantization. When the 2sg.m. object suffix /ka/ is added to a stem 
that triggers the gemination rule, an extra C slot will be added to which 
the initial k of the suffix will be linked creating a branching structure 
which will escape spirantization (7). Thus, [qatal+na]kka is derived from 
[qatal+na]ka as in (27). 

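(27) [qatal+na]ka → [qatal+na]kka

[Diagram: gemination (22) inserts a C slot after the stem-final vowel; the suffix-initial k is linked to both the inserted slot and its own C slot. Association lines are not recoverable from the source.]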

Now what about the jussive form of a stem ending in an underlying velar? 
Our formulation of gemination predicts that a branching structure will 
result here as well and so no spirantization should occur. In fact, this 
is correct. Compare the jussive forms of ma-btax 'sever'. 


(28)  3sg.m. jussive                      [form not recoverable]    'let him sever'
      3pl.m. jussive                      yə-btax-u                 'let them sever'
      3sg.m. jussive + 3sg.m. object      [yə-btakk]o               'let him sever it'

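In terms of our analysis yə-btax-u and [yə-btakk]o receive the derivations
in (29).

(29) [Derivations of yə-btax-u and [yə-btakk]o. Rows: underlying representation; gemination (22), which applies only in the form with the object suffix, linking the stem-final k to an inserted C slot; spirantization (7), which applies only to the singly linked k of yə-btax-u. Association lines are not recoverable from the source.]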

3. Complete Assimilation 

Let us now turn to the final source of geminate consonants in Tigrinya:
those arising from a rule of complete consonant assimilation. The first case
of this type occurs in the passive. Compare the active and passive 3sg.m.
forms of the Type A verb raxaba 'find'.



















The passive is marked by the prefix ta- in the perfect, gerundive, and
imperative, and by the gemination of the root-initial consonant in the jussive
and imperfect. A reasonable hypothesis is that the passive morphology in the
jussive and imperfect consists of the prefix /t-/ placed between the root
and the person prefix and thus that jussive yərraxab arises from /yə+t+rakab/.
The absence of surface gemination in the imperfect of Type A verbs (yərəkkab)
is phonologically governed, as can be seen by a comparison of the passive
forms for type B, C, and quadriradical verbs.







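(31) [Table of passive forms for Type B, Type C, and quadriradical verbs; only the column labels "Type B" and "Type C" survive in the source.]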


A type C verb does show gemination in the passive imperfect (yəbbarax).
We can account for the superficial absence of gemination in the passive
imperfect of types A, B, and quadriradical verbs by invoking a special rule
to degeminate the stem-initial geminate if the following syllable ends in a
consonant. The rule will take the form of deleting a C element from the CV
tier when followed by a closed syllable.

(32) C → ∅ / [α ___ C V C   (α = passive imperfect; the deleted C is one of the two slots linked to the stem-initial [+cons] segment)

Note that this is another case where geminates pattern with consonant 
clusters in the phonology of Tigrinya. 

There are two pieces of evidence in favor of this approach. First, 
the stem-initial geminate posited as an intermediate stage in the deri- 
vation of the passive imperfect does show up in the frequentative of all 
verbs. The frequentative is formed by infixing a syllable of the shape 
Ca before the second radical of the root, with C= the second radical: 
e.g. sabara and sabiru have the frequentative forms sababara and 
sababiru respectively. Since this operation opens the stem-initial 
syllable, the geminate posited as an intermediate stage in the derivation 
of the passive imperfect may appear on the surface. Compare the passive 
frequentative gerundive and imperfect forms of the following type A, B, 
and C verbs in (33). 


(33)              Type A           Type B           Type C
      gerundive   ta-sababiru      ta-badaddilu     ta-bararixu
      imperfect   yə-ssababar      yə-bbadaddal     yə-bbararax
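The frequentative operation described above is simple enough to state as a one-line string manipulation. The sketch below assumes triradical stems written as plain strings; the function name is hypothetical, and the unprefixed bases baddilu and barixu are inferred here from the passive frequentative gerundives in (33) rather than cited in the text.

```python
VOWELS = set("aeiou")


def frequentative(stem):
    """Infix Ca before the second radical, where C copies that radical.
    The second radical is taken to be the consonant immediately after
    the first vowel of the stem."""
    first_vowel = next(i for i, ch in enumerate(stem) if ch in VOWELS)
    k = first_vowel + 1                      # position of the second radical
    return stem[:k] + stem[k] + "a" + stem[k:]


for stem in ("sabara", "sabiru", "baddilu", "barixu"):
    print(stem, "->", frequentative(stem))
```

Since the infix always ends in a vowel, the syllable containing the stem-initial consonant is open in every frequentative, which is why the stem-initial geminate is free to surface in the passive imperfect.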

Secondly, there is a phonetic constraint in Tigrinya that the gutturals
[h, ʕ, ħ, ?] may not appear geminate. These consonants only appear as short.
When one of these consonants occupies the second radical position in the
passive imperfect, we find that the stem-initial geminate appears: cf.
yə-ssahab 'he is pulled' from the verb sahaba 'pull'.

There are thus good reasons for supposing that the stem-initial short
consonant in the passive imperfect yə-rəkkab derives from /yə-rrəkkab/ by
the special degemination rule (32). Now what of the rule completely
assimilating the passive prefix /t-/? In a theory which treats phonological
structure as a single linear string of feature matrices the rule
would be given the formulation in (34).

(34)  t           →  [αF's]  /  ___  C
      [+passive]                     [αF's]

But given the multi-linear representation of the syllable adopted in this
study, another formulation of the rule is possible, namely to delete the
[t] of the passive morpheme from the segmental tier, leaving an unassociated
C slot on the CV tier.

(35) t → ∅ / ___ [+cons]   (t = the passive prefix; its C slot remains on the CV tier)


By general autosegmental principles of reassociation (Clements 1981), this
C slot will then link to the following consonant of the segmental tier. In
terms of this analysis the passive imperfect yə-rəkkab and jussive yə-rraxab
receive the derivations in (36).



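(36) [Derivations of the passive imperfect and jussive of 'find'. Underlying representation: prefix, passive t, and stem, with CV tiers CV C CVCCVC (imperfect) and CV C CVCVC (jussive). Passive deletion (35): the t is deleted and its C slot links to the stem-initial r in both forms. Degemination (32): applies in the imperfect only, leaving CV CVCCVC. Spirantization (7): applies to the singly linked k of the jussive, giving yə-rraxab. Association lines are not recoverable from the source.]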

These two possible formulations of the complete assimilation rule (i.e.
(34) and (35)) thus differ in terms of the formal structure assigned to the
geminate r in /yə-rəkkab/: two successive r segments, each linked to its own
C slot, versus a single r linked to two C slots. We have seen that the Tigrinya rule
of spirantization is sensitive to precisely this difference in structure,
applying in the former case, but blocked in the latter. The spirantization
process can thus be exploited to test which of the alternative formulations
of the complete assimilation is correct. It turns out that it is the second
one that must be assumed to operate in Tigrinya. Geminate stops arising from
(35) fail to spirantize (unless they have been degeminated by rule (32)).
This is shown by the 3sg.m. forms in (37).


(37)                        'open'       'pay retribution'    'prohibit'
      active perfect        kafata       kahasa               kalkala
      passive perfect       taxafta      taxahsa              taxalkala
      passive gerundive     taxafitu     taxahisu             taxalkilu
      passive imperfect     yəxaffat     yəkkəhas             yəxəlkal
      passive jussive       yəkkafat     yəkkahas             yəkkalkal

The passive imperfect and jussive forms yəxaffat and yəkkafat are thus
derived as follows.

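(38) [Derivations of the passive imperfect and jussive of 'open'. Underlying representation: prefix, passive t, and stem, with CV tiers CV C CVCCVC (imperfect) and CV C CVCVC (jussive). Passive deletion (35): the t is deleted and its C slot links to the stem-initial k in both forms. Degemination (32): applies in the imperfect only, leaving a singly linked k. Spirantization (7): applies to the singly linked k of the imperfect, giving x; inapplicable to the doubly linked k of the jussive. Association lines are not recoverable from the source.]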


Note that we cannot maintain rule (34) by simply ordering spirantization
before (34) since spirantization must follow degemination (32) and
degemination must follow the rule assimilating (or deleting) the passive morpheme
(i.e. (34) or (35)).
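The ordering argument can be checked mechanically with a small sketch that runs the three rules in the required order and lets the two formulations of assimilation be compared. All names here are hypothetical, the `[melody, slots]` encoding is an assumption of the sketch, the quality of the stem vowels is assumed, and "X" is an arbitrary stand-in for spirantized q.

```python
VOWELS = set("aeiouə")
SPIRANT = {"k": "x", "q": "X"}


def tiers(form):
    """[melody, n_slots] pairs; a doubled consonant is one melody, two slots."""
    segs = []
    for ch in form:
        if segs and segs[-1][0] == ch and ch not in VOWELS:
            segs[-1][1] += 1
        else:
            segs.append([ch, 1])
    return segs


def closed_after(root):
    """True if the syllable headed by the first stem vowel is closed,
    i.e. at least two C slots intervene before the next vowel."""
    slots = 0
    for melody, n in root[2:]:
        if melody in VOWELS:
            return slots >= 2
        slots += n
    return slots >= 1


def passive(prefix, stem, imperfect, autosegmental=True):
    pre, root = tiers(prefix), tiers(stem)
    first = root[0][0]
    # Degemination (32) is limited to the passive imperfect.
    degeminate = imperfect and closed_after(root)
    if autosegmental:
        # (35): t is deleted; its C slot relinks to the stem-initial consonant.
        initial = [[first, 1 if degeminate else 2]]
    else:
        # (34): t becomes a separate copy of the stem-initial consonant.
        initial = [[first, 1]] if degeminate else [[first, 1], [first, 1]]
    segs = pre + initial + root[1:]
    out = []
    for i, (melody, n) in enumerate(segs):   # spirantization (7), applied last
        after_vowel = i > 0 and segs[i - 1][0] in VOWELS
        if melody in SPIRANT and n == 1 and after_vowel:
            melody = SPIRANT[melody]
        out.append(melody * n)
    return "".join(out)


print(passive("yə", "kaffat", imperfect=True))
print(passive("yə", "kafat", imperfect=False))
print(passive("yə", "kafat", imperfect=False, autosegmental=False))
```

On these assumptions the deletion-plus-relinking formulation yields the attested jussive with kk, whereas the copy formulation wrongly lets the first of two separate k's spirantize.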

There is one other rule of complete assimilation operative in Tigrinya
that permits a similar test with respect to the spirantization rule. In
contrast to the rule for the passive prefix, this rule is not
morphologically restricted and even operates (optionally) across word boundaries.
By this rule /g/ and /q/ assimilate completely to a following /k/, giving
a geminate /kk/. This geminate does not spirantize. Compare the possessed
paradigms for ?a?dug (plural of ?adgi 'donkey'), sanduq 'box', and marax
'calf' (from /marak/). The symbol £ stands for the spirantized realization
of /q/ produced by (7).




























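[Possessed paradigms for ?a?dug 'donkeys', sanduq 'box', and marax 'calf'; only the forms sanduq-na and sanduq-an survive in the source.]

      ?a?dug kafilu ~ ?a?duk kafilu      'he paid (in) donkeys'
      sanduq kafitu ~ sanduk kafitu      'he opened a box'
      marax kafilu ~ *marak kafilu       'he paid a calf'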

The contrasting behavior of the forms ?a?duk-ka (from /?a?dug+ka/)
and sanduk-ka (from /sanduq+ka/) versus marax-ka (from /marak+ka/) can be
explained if we assume that the rule of complete assimilation for the velars
involves the deletion of [g,q] before [k] on the segmental tier, leaving
behind an unassociated C slot on the CV tier, which then associates with the
following /k/ of the possessive suffix.




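[Derivations of /?a?dug+ka/, /sanduq+ka/, and /marak+ka/. Rows: underlying representation; [g,q] → ∅ / ___ k, which deletes the stem-final g or q and links its C slot to the following k (inapplicable in /marak+ka/); spirantization (7), which applies only to the singly linked stem-final k of /marak+ka/. Association lines are not recoverable from the source.]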
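A minimal sketch of this derivation, assuming noun stems without internal geminates and using hypothetical names throughout, is given below. As before, each segment is a `[melody, slots]` pair and "X" is an arbitrary stand-in for the spirantized realization of q.

```python
VOWELS = set("aeiou")
SPIRANT = {"k": "x", "q": "X"}


def possessed(stem, suffix):
    """Join a noun stem and a possessive suffix, then spirantize."""
    segs = [[ch, 1] for ch in stem]
    suf = [[ch, 1] for ch in suffix]
    # Velar assimilation: g or q is deleted before k on the segmental tier,
    # and the stranded C slot relinks to the following k.
    if segs[-1][0] in ("g", "q") and suf[0][0] == "k":
        segs.pop()
        suf[0][1] += 1
    segs += suf
    out = []
    for i, (melody, n) in enumerate(segs):
        # Spirantization (7): singly linked postvocalic velar stop.
        after_vowel = i > 0 and segs[i - 1][0] in VOWELS
        if melody in SPIRANT and n == 1 and after_vowel:
            melody = SPIRANT[melody]
        out.append(melody * n)
    return "".join(out)


for stem in ("?a?dug", "sanduq", "marak"):
    print(stem + "+ka", "->", possessed(stem, "ka"))
```

The identical sequence k+k of /marak+ka/ is untouched by the assimilation rule, so its two k's remain separate melodies and the first one spirantizes.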

In this paper we have reviewed some evidence from Tigrinya which
supports the view that quantity distinctions are best represented in terms of
the relationship between the segmental tier and the CV tier. Specifically,
we have seen that Tigrinya makes a distinction between the first three of
the four possible relations that can exist between successive units of the
two tiers.


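[Diagram of the four possible relations between the segmental tier and the CV tier: a single k linked to one C slot; a single k linked to two C slots; two successive k's, each linked to its own C slot; and, marked with "?", two successive k's linked to a single C slot.]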


(It is an interesting question whether evidence will ever be found for the fourth
type of case. There are of course many examples in which two successive
segmental elements count as single units of quantity (e.g. affricates,
diphthongs, etc.). But I am not aware of any cases in which these can be argued
to derive from an underlying sequence of identical phonological segments.)
Theories of phonology in which phonological structure is described solely
in terms of the inventory and linear arrangement of phonological elements
are in principle incapable of making this kind of distinction and hence
Tigrinya spirantization can be taken as strong support for the multilinear
view of phonological structure.


¹ Leben (1980) actually assumed that phonological segments were mapped
to terminal nodes in the syllable tree labeled strong and weak. We follow
more recent developments in the theory of syllable structure (in particular
Clements and Keyser 1981) in assuming that phonological segments are mapped
to the CV tier which in turn is organized into syllables. We do not take
any stand in this paper on whether it is necessary to assume that
subconstituents of the syllable such as rime, onset, or coda are represented in
terms of labeled nodes in the tree.

² The research on Tigrinya reported in this study was conducted during
the 1979-1980 academic year. I should like to acknowledge the patient
assistance of Efrem Mehretaeb who served as my consultant. A recent paper
by Schein (1981) independently reaches conclusions similar to mine on the
relevance of the Tigrinya spirantization rule to the analysis of geminate
consonants.

³ Spirantization also applied, though less systematically, to /b/ in
the speech of my consultant. This paper is limited to the discussion of
spirantization as it affects just the velars. It should be noted that
spirantization is not a neutralization rule in Tigrinya. All occurrences
of the velar spirants x and £ can be derived from /k/ and /q/ by the
spirantization rule.

⁴ When the second radical of a type A verb is a velar, it will show the
stop-spirant alternation in the imperfect: e.g. mə-rkab 'to find' has the
3sg.m. and 3pl.m. imperfect forms yə-rakkəb and yə-raxb-u respectively.

⁵ ... of the gemination rule to these forms. Recall that the intuition
underlying our analysis is that the stem-final syllable is increased by one mora
from CV to CVC when the verbal word is followed by an object suffix. The
problem is that the notion "stem-final syllable" mixes notions from two
distinct categories: "stem" is a grammatical notion while "syllable" is
a prosodic one. These notions correspond in a representation such as
[qatil+u]ni (which appears as qatilunni), since the syllable that gets
increased by one mora does exhaust the final portion of the stem. But in
a representation such as [yə-qtal]a (which appears as yə-qtalla) the end
of the stem and the final syllable of the stem do not coincide, since the
stem-final segment l is an onset to the following syllable whose nucleus
is the suffixal vowel. This problem is not of course particular to our
analysis alone but will arise whenever a rule makes reference to a
stem-terminal syllable that does not happen to coincide with the stem exactly
(e.g. rules syncopating a stem-final open syllable or stressing a
stem-final syllable).

An alternative description of Tigrinya gemination that avoids this
particular problem is available within the general theoretical framework
of our analysis. On this analysis the C slots arising from our rules
of gemination (22) and hiatus breaking (20) would instead be treated as
unassociated C slots in the underlying representation of the relevant
verbal suffixes and object suffixes respectively. Thus, in this approach
a verbal stem terminating in a pattern 2 suffix such as the 3sg.m. gerundive
qatil+u would have the underlying representation of (ia), while an object
suffix such as the 3sg.f. /-a/, which we have treated as vowel-initial, would
be analyzed as containing an unassociated C onset in the underlying
representation, as in (ib).


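(i) a. qatil+u, CV tier CVCVC+VC (final C slot unassociated) 
    b. -a, CV tier CV (onset C slot unassociated) 
[Association lines of the original diagrams are not recoverable.] 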


When a pattern 2 stem combines with a suffix containing an underlyingly 
associated onset we get gemination, as in (iia). And when a suffix with an 
underlyingly unassociated onset combines with a stem ending in a single 
consonant, we get gemination of the stem-final consonant, as in (iib). 
Finally, when a stem ending in an underlyingly unassociated consonant is 
combined with a suffix containing an underlyingly unassociated onset, we 
get both C slots mapped to the preceding vowel, just as in our original 
analysis (iic). 


(ii) a. [qatil+u] ni     CV tier: CVCVC VC CV 
     b. [qatal+at] a     CV tier: CVCVC VC CV 
     c. [qatil+u] a      CV tier: CVCVC VC CV 
[Association lines and syllable structure of the original diagrams are not recoverable.] 




On this analysis a pattern 1 suffix such as the 3pl.m. perfect that evokes 
no gemination can be represented as not containing an unassociated underlying 
C slot. 


(iii) a. [qatal+u] ni 
      b. [qatal+u] a 
[Association lines of the original diagrams are not recoverable.] 


While this alternative analysis would appear to avoid the problem our gemin- 
ation rule has in making proper reference to the notion "stem", it nevertheless 
faces problems of its own. First, there are rules of the morphology regula- 
ting the distribution of vowel augments that are sensitive to whether 
the following object suffix begins with a consonant or vowel. These rules 
consistently treat as vowel-initial the object suffix that the alternative 
analysis represents with an underlying unassociated consonantal onset on the 
CV tier. For example, the 3sg.m. imperfect stem /ya+qatl/ takes the augment 
/a/ before consonant-initial suffixes (and being a pattern 2 vowel, the 
augment induces gemination of the following consonant): e.g. ya+qatl+a+nni. 


But before vowel-initial suffixes no augment appears: ya+qatl+a. Thus, 
under the alternative analysis we will have to say that the distribution of 
the augment is determined by the segmental level (in effect ignoring the 
underlying unassociated onset of the 3sg.f. object suffix), appearing before 
[+consonantal] segments but absent before [+vocalic] segments. But then 
the underlying representation for yə+qatl-a will appear as in (iv), with 
three successive C-slots on the CV tier. 

(iv) [ye+qatl] a 
[Diagram not recoverable; the CV tier shows three successive C slots.] 

We have seen that Tigrinya has a rule of epenthesis that applies in 
exactly this environment, converting CCC to CCaC. Why doesn't this rule 
apply in (iv)? The alternative analysis thus requires an ad hoc rule to 
delete the underlyingly unassociated onset slot from the segmentally 
vowel-initial suffixes precisely in the environment where an independently 
motivated rule of the language should apply. The original analysis of the 
text does not face this problem since it treats the suffix as vowel-initial 
on both the segmental and the CV tier. The final consonant of the stem 
forms an onset to the syllable whose nucleus contains the suffixal vowel 
and the gemination rule fails to apply since the stem-final syllable is 

(v) [ya+qatl] a 
[Diagram not recoverable.] 


BARKAI, M. 1975. On duration and spirantization in Biblical Hebrew. 
    Linguistic Inquiry 5. 456-459. 
CLEMENTS, G. 1981. Akan vowel harmony: a nonlinear analysis. In G. Clements, 
    ed. Harvard Studies in Phonology 2. Indiana University Linguistics Club. 
CLEMENTS, G. and S. KEYSER 1981. A three-tiered theory of the syllable. 
    Occasional Paper 19, Center for Cognitive Science, MIT. 
LEBEN, W. 1980. A metrical analysis of length. Linguistic Inquiry 11. 497-510. 
LESLAU, W. 1939. Essai de reconstruction des desinences verbales du Tigrigna. 
    Revue des Etudes Semitiques. 70-99. 
LESLAU, W. 1941. Documents Tigrigna. Collection Linguistique, Societe de 
    Linguistique de Paris XLVIII. 
MCCARTHY, J. 1981. A prosodic theory of nonconcatenative morphology. 
    Linguistic Inquiry 12. 373-418. 
SAMPSON, G. 1973. Duration in Hebrew consonants. Linguistic Inquiry 4. 101-104. 
SCHEIN, B. 1981. Spirantization in Tigrinya. In H. Borer and Y. Aoun, eds. 
    Theoretical Issues in the Grammar of Semitic Languages (MIT Working Papers 
    in Linguistics 3). 32-41. 

Studies in the Linguistic Sciences 
Volume 12, Number 1, Spring 1982 


Chin-Chuan Cheng 


Charles W. Kisseberth 

This paper examines the tonological status of nasal 
consonants (specifically, pre-consonantal nasals) in two 
dialects of Makua, a Bantu language spoken in Mozambique and 
Tanzania. We demonstrate that nasals derived from sequences 
consisting of a nasal followed by a vowel regularly function 
as "tone-bearing" units in both of the dialects examined. On 
the other hand, nasals which cannot (synchronically at least) 
be shown to derive from nasal plus vowel sequences do not 
exhibit such uniformity in their behavior. In the Ikorovere 
dialect, most such non-derived pre-consonantal nasals are not 
tone-bearing, but there are certain exceptional cases; in the 
Imithupi dialect, there are many more cases of tone-bearing 
non-derived pre-consonantal nasals. A comparison of the two 
dialects reveals in part at least how the present-day situation 
came into being. On the basis of the data presented, we 
conclude that in Makua one cannot synchronically predict 
whether a given nasal consonant is tone-bearing or not. 
Consequently, the notion "tone-bearing" is not characterizable 
in purely phonetic terms, but instead represents a more abstract 

In the autosegmental approach to the analysis of tone, tonal specifi- 
cations (henceforth, "tones") are located on one tier of the phonological 
representation (the "tonal tier") and are associated, by means of "association 
lines," with units on another tier (the "tone-bearing units"). Although it 
has sometimes been proposed in the literature on tone that the syllable is 
the tone-bearing unit, within the autosegmental framework it has been more 
common to find vowels and syllabic nasals being identified as the tone- 
bearing units. (Of course, insofar as each syllable has one and only one 
vowel or syllabic nasal, the two approaches may be indistinguishable.) 
The present paper explores the notion "tone-bearing" as it pertains to the 
tonological structure of Makua, a Bantu language spoken in northern 
Mozambique and southern Tanzania. In particular, we will be concerned in 
detail with the tonological status of pre-consonantal nasal consonants in 
two dialects of Makua, Ikorovere and Imithupi. We will argue that there 
are two classes of tone-bearing units in Makua: vowels and pre-consonantal 
nasals. However, not all pre-consonantal nasals are tone-bearing, and it is 
not possible to predict in all cases which ones are tone-bearing and which 
ones are not. In other words, the distinction between tone-bearing and 
non-tone-bearing must be included as part of the underlying representation 
and cannot be derived from any independent phonological fact. (It should be 
noted that none of the pre-consonantal nasals discussed in this paper are 
syllabic consonants.) 

In our discussion of the notion "tone-bearing" in Makua it will be 
necessary to refer to a number of principles of Makua tonology without 
providing a complete exposition of these principles. For additional 
discussion, the interested reader is referred to Cheng and Kisseberth (1979, 
1980, and 1981). A complete description of Makua phonology and morphology 
is under preparation. 

The cornerstone of Makua tonology is a principle that we refer to as 
High Tone Doubling (HTD). This rule simply says that a high tone associated 
with one tone-bearing unit is also manifested on the immediately following 
tone-bearing unit. We shall refer to the first high tone in the resulting 
pair of high tones as the "primary high" and the second as the "doubled 
high". The primary high may either be (a) an unpredictable fact about the 
pronunciation of a given lexical item or grammatical formative that must 
be included as an idiosyncratic part of the underlying phonological 
representation or (b) a predictable aspect of the pronunciation that can 
be assigned to the phonological representation by virtue of a rule (the 
rules in question being triggered by morphological considerations rather 
than strictly phonological factors). The doubled high, of course, is not 
part of the underlying representation but simply the reflex of HTD. The 
doubled high is not, however, phonetically manifested in all environments 
since there are two tonological principles that delete the doubled high. 
A rule of Phrase-Final Lowering (PFL) lowers a doubled high when it occurs 
at the end of a phrase. A rule of Long Fall (LF) lowers a doubled high on 
the second of two consecutive tone-bearing units when it is followed by 
just one additional tone-bearing unit in the phrase. The rules of HTD, PFL, 
and LF are common to both Ikorovere and Imithupi. Imithupi exhibits two 
additional phenomena. The Short Fall (SF) rule converts the doubled high 
into a falling tone when the doubled high is followed by just one tone-bearing 
unit in the phrase. (Note that the difference between LF and SF is that 
the former rule affects only the doubled high on the second of two 
successive tone-bearing units whereas the latter rule affects a doubled 
high even if it is not immediately preceded by another tone-bearing unit.) 
The Mid-Tone (MT) rule converts a primary high to mid when it is (a) not 
preceded by a high and (b) neither followed by a high on an immediately 
adjacent tone-bearing unit nor followed by a low on the next tone-bearing 
unit (immediately adjacent or not). 
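
The interaction of HTD, PFL, and LF can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not the authors' formalism: the `TBU` record, its `follows_tbu` flag (true when no consonant separates a tone-bearing unit from the preceding one, as with the second mora of a long vowel), the `word` helper, and the function name `ikorovere_tones` are all hypothetical. A phrase is modelled as the list of all its tone-bearing units, and the rules are applied in the order HTD, PFL, LF.

```python
from dataclasses import dataclass


@dataclass
class TBU:
    """One tone-bearing unit (hypothetical record, not from the paper)."""
    primary: bool = False      # carries a primary high tone
    follows_tbu: bool = False  # no consonant between this unit and the previous one


def ikorovere_tones(phrase):
    """Return surface tones ('H' or 'L') for a whole phrase of TBUs."""
    n = len(phrase)
    tones = ["H" if t.primary else "L" for t in phrase]
    doubled = [False] * n

    # HTD: a primary high is also manifested on the next tone-bearing unit.
    for i in range(n - 1):
        if phrase[i].primary and not phrase[i + 1].primary:
            tones[i + 1] = "H"
            doubled[i + 1] = True

    # PFL: a doubled high on the phrase-final unit is lowered.
    if n >= 1 and doubled[n - 1]:
        tones[n - 1] = "L"

    # LF: a doubled high on the second of two adjacent units is lowered
    # when exactly one more tone-bearing unit follows in the phrase.
    if n >= 2 and doubled[n - 2] and phrase[n - 2].follows_tbu:
        tones[n - 2] = "L"

    return tones


def word(spec):
    """Build TBUs from a compact string (assumed notation):
    P = primary high, a = toneless unit adjacent to the previous one, x = other."""
    return [TBU(primary=(c == "P"), follows_tbu=(c == "a")) for c in spec]


# ki-noo-lim-a, phrase-final: units ki, no, o, li, a
print(ikorovere_tones(word("xPaPx")))    # ['L', 'H', 'H', 'H', 'L']
# ki-noo-leeh-a, phrase-final: units ki, no, o, le, e, a
print(ikorovere_tones(word("xPaPax")))   # ['L', 'H', 'H', 'H', 'L', 'L']
# ki-noo-tipur-a, phrase-final: units ki, no, o, ti, pu, ra
print(ikorovere_tones(word("xPaPxx")))   # ['L', 'H', 'H', 'H', 'H', 'L']
```

In the third form the doubled high survives because the unit carrying it is separated from the preceding unit by a consonant, so LF does not apply, and it is not phrase-final, so PFL does not apply either.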

Let us now illustrate the rules mentioned above (HTD, PFL, LF, SF, and 
MT) and, in so doing, demonstrate that vowels must be included in the category 
of tone-bearing units in Makua. Consider the following examples from 
Ikorovere. 

(1) ki-no-vah-a...       'I'm giving s.t. away' 
    ki-no-luul-a...      'I'm spitting s.t. out' 
    ki-no-thumíh-a...    'I'm selling s.t.' 

In these examples, as well as in all other examples, three dots after a 
citation indicate that the pronunciation given is the one appropriate for 
phrase-internal position. The absence of dots at the end of a citation 
indicates that the pronunciation is appropriate for phrase-final position. 
The examples in (1) illustrate one of the verb tenses where a primary high 
tone is assigned to the first vowel of the verb stem. The primary high tone 
induces a doubled high on the following tone-bearing unit. In the case of 


ki-no-vah-a... the doubled high appears on the /a/ vowel which occurs at 
the end of this and many other verbal constructions in Makua. In the case 
of ki-no-luul-a... the doubled high appears on the second mora of the long 
vowel in the stem /luul/. In the case of ki-no-thumíh-a... the doubled high 
appears on the second vowel of the stem /thumih/; the second vowel of this 
stem can be analyzed readily as part of the causative morpheme /ih/. 

The examples in (1) illustrate a couple of points. First, given that a 
primary high tone is located on a particular vowel, the doubled high will occur 
on the next tone-bearing unit regardless of the morphological relationship 
of that unit to the unit bearing the primary high. Thus the high tone on 
the first stem vowel in the /no/ tense doubles onto the grammatical elements 
/a/ and /ih/ as well as onto a root element in the case of /luul/. Second, 
the data in (1) show that the tone-bearing unit in Makua is the vowel rather 
than the syllable. This fact is demonstrated by the example ki-no-luul-a ... 
If the tone-bearing unit were the syllable, then we would expect a high tone 
to be associated with the first syllable of the stem and then to double onto 
the following syllable. But this is not what happens. Rather, the high tone 
is associated with the first vowel of the stem and doubles onto the second 
vowel, even when that vowel is part of the same syllable as the first. (Note 
that in Makua two successive identical vowels belong to the same syllable 
and are pronounced simply as one long vowel.) 

Analogous examples from Imithupi are given in (2). 

(2) ki-no-cis-a...      'I'm carrying s.t.' 
    ki-no-wííh-a...     'I'm bringing s.t.' 
    ki-no-hukul-a...    'I'm sieving s.t.' 

These examples differ from Ikorovere only in terms of the effects of the 
application of MT. Recall that in Imithupi a primary high is realized as 
mid when it is not preceded by a high and not followed by either a high on 
an immediately adjacent tone-bearing unit or a low on the next tone-bearing unit 
(adjacent or not). Thus the primary high in ki-no-cis-a... and 
ki-no-hukul-a... is realized as mid since it is preceded by a low and followed 
by a high (that is not on an immediately successive tone-bearing unit). 
On the other hand, the primary high in ki-no-wiih-a... is blocked from under- 
going MT since it is followed by a high on an immediately adjacent tone-bearing 
unit. 
Another verbal construction is illustrated in (3). These data are 
from Ikorovere. 

(3) ki-nóó-lím-a...      ki-noo-lím-a      'I'm cultivating' 
    ki-noo-tipur-a...    ki-nóó-típur-a    'I'm hoeing deeply' 
    ki-noo-leeh-a...     ki-noo-leeh-a     'I'm saying farewell' 

The tense marker here is /noo/ and it has a primary high tone located on 
its first mora. A doubled high appears on the second mora of /noo/. In 
this construction, as in the /no/ tense, a primary high tone is assigned 
to the first vowel of the verb stem. The primary high tone doubles onto 
the vowel that follows the first stem vowel. The left-hand column, which 
reflects the phrase-medial pronunciation of these verbs, shows the effects 
of HTD clearly. The right-hand column, which reflects the phrase-final 


pronunciation of these forms, manifests the application of PFL (in the case 
of ki-noo-lim-a) and LF (in the case of ki-noo-leeh-a). Recall that PFL requires 
a doubled high to lower if it is on a phrase-final vowel, and LF requires 
a doubled high to lower if it is located on the second of two adjacent tone- 
bearing units when there is just one more tone-bearing unit in the phrase. 
Note that the doubled high in ki-noo-tipur-a does not undergo either PFL or 
LF since it is neither located on a phrase-final vowel nor located on 
the second of two successive tone-bearing units. 

Comparable data from Imithupi are cited in (4). 

(4) ki-noo-rup-a...       ki-noó-rup-a       'I'm sleeping' 
    ki-noo-terekh-a...    ki-noó-terekh-a    'I'm cooking' 
    ki-noo-wiíh-a...      ki-noo-wíih-a      'I'm bringing' 

These data differ from those in Ikorovere only in showing the effect of 
SF in the case of ki-noo-terekh-a. Recall that in Imithupi SF converts 
a doubled high to fall when the doubled high is located on a tone-bearing 
unit that is followed by just one more tone-bearing unit in the phrase. 
Incidentally, notice that MT does not apply to the primary high tone on 
the first stem vowel since the preceding tone-bearing unit has a high tone; 
MT applies only to primary highs that are not preceded by a high. (MT does 
not apply to the primary high on the first mora of /noo/ either, since a 
primary high does not undergo MT when it is followed by a high on an immediately 
adjacent tone-bearing unit.) 
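
The Imithupi system, with SF and MT added to HTD, PFL, and LF, can be sketched in the same spirit. The code below is a hedged illustration, not the authors' formal statement. Several things in it are assumptions of the sketch: the representation of each tone-bearing unit as a `(primary, follows_tbu)` pair, the ordering of MT after the rules that affect doubled highs, the choice to let LF take precedence over SF when both environments are met, the decision to leave a primary high untouched when no unit follows it (a case the text does not discuss), and the names `imithupi_tones` and `units`.

```python
def imithupi_tones(phrase):
    """phrase: list of (primary, follows_tbu) pairs, one per tone-bearing unit.
    Returns 'H', 'M', 'L' or 'F' (falling) for each unit."""
    n = len(phrase)
    tones = ["H" if primary else "L" for primary, _ in phrase]
    doubled = [False] * n

    # HTD: copy a primary high onto the next tone-bearing unit.
    for i in range(n - 1):
        if phrase[i][0] and not phrase[i + 1][0]:
            tones[i + 1] = "H"
            doubled[i + 1] = True

    # PFL: lower a phrase-final doubled high.
    if n >= 1 and doubled[n - 1]:
        tones[n - 1] = "L"

    # LF / SF: a doubled high followed by exactly one more unit is lowered
    # if it is adjacent to the preceding unit (LF), otherwise it falls (SF).
    if n >= 2 and doubled[n - 2]:
        tones[n - 2] = "L" if phrase[n - 2][1] else "F"

    # MT: a primary high becomes mid unless it is preceded by a high,
    # followed by a high on an adjacent unit, or followed by a low.
    for i in range(n):
        if not phrase[i][0] or i + 1 >= n:
            continue
        preceded_by_high = i > 0 and tones[i - 1] == "H"
        nxt = tones[i + 1]
        blocked = nxt == "L" or (nxt == "H" and phrase[i + 1][1])
        if not preceded_by_high and not blocked:
            tones[i] = "M"

    return tones


def units(spec):
    """Assumed shorthand: P = primary high, a = toneless unit adjacent to the
    previous one, x = any other toneless unit."""
    return [(c == "P", c == "a") for c in spec]


# ki-noo-terekh-a, phrase-final: units ki, no, o, te, re, kha
print(imithupi_tones(units("xPaPxx")))   # ['L', 'H', 'H', 'H', 'F', 'L']
# ki-no-cis-a..., phrase-medial, with two following toneless units assumed
print(imithupi_tones(units("xxPxxx")))   # ['L', 'L', 'M', 'H', 'L', 'L']
```

The second call models phrase-medial position simply by appending two toneless units; any longer continuation would give the same result for the verb itself.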

Up until this point we have cited only verbal words in illustration 
of HTD, PFL, LF, SF, and MT. But nouns also exhibit the same phenomena. 
In Makua, every noun has (at the lexical level) at least one primary high 
tone. Some nouns have two or more primary high tones. Generally speaking, 
these primary high tones are never located on adjacent tone-bearing units 
(in other words, room is left for the primary high to double onto the following 
tone-bearing unit). While Ikorovere shows a considerable degree of predictability 
with respect to the location of the primary highs, there are still significant 
problems with claiming that these highs are assigned by rule; in any case, 
Imithupi exhibits many lexical tonal contrasts in nouns and there is no 
question but that the location of the primary highs in Imithupi nouns is 
part of their underlying structure. 

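Consider the following data (K=Ikorovere and M=Imithupi): 

(5) [Example set only partly recoverable from the source.] 
    n-luto... K       n-luto... M       n-luto K,M 
    ni-piipa... K,M   ni-piipa K,M      'oil drum' 
    ni-parari... M    ni-parari M       'rib, side' 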

Each of these nouns has a primary high tone on the first vowel of the noun 
stem. (Makua nouns consist of a prefix, which in some instances may be 0, 
and a stem. We have separated the prefix from the stem by means of a hyphen.) 

This primary high tone is followed by a doubled high on the second vowel 
of the stem. However, the doubled high is absent if the second vowel is 
phrase-final (cf. n-luto K,M versus n-luto... K and n-luto... M) or if it 
is the second of two successive tone-bearing units and is followed by just 
one more tone-bearing unit in the phrase (cf. ni-piipa K,M versus ni-píípa... 
K,M). Furthermore, in Imithupi the primary high is realized as mid in 
n-luto... and ni-parari.../ni-parari since it is not preceded by a high and 
not followed either by a high on an immediately successive tone-bearing unit 
or a low. Notice that in n-luto M, the primary high is not lowered to mid 
since it is followed by a low (that low being the consequence of PFL). 
The primary high in ni-píípa... M is not lowered to mid since it is followed 
by a high on an immediately adjacent tone-bearing unit, and the primary 
high in ni-píipa is not lowered to mid since it is followed by a low (that 
low being the consequence of LF). 

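In (6) we give some examples of nouns with two primary high tones. 

(6) [Example set only partly recoverable from the source; the surviving 
    fragments are a form glossed 'bird dung' (K), the form i-huruusi..., 
    and a form glossed 'sp. lizard'.] 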

In the three examples from Ikorovere, there is a primary high tone on the 
first vowel of the noun stem and on the third vowel. The primary high tones 
double onto the second and fourth stem vowels. The doubled high on the second 
stem vowel always surfaces since it is never in the environment to undergo 
PFL or LF. The doubled high on the fourth stem vowel is lowered in phrase- 
final position as a consequence of PFL. Notice, incidentally, that in 
n-t6ndoo¥f . . . K, the second primary high is on the second mora of the long 
vowel. The first mora of that long vowel gets a doubled high on it. Notice 
also that in n-t6nd66¥i K, the primary high on the second mora of the long 
vowel does not undergo LF (cf. ni-píípa... K vs. ni-piipa K). LF must be 
restricted to a doubled high. 

In the case of the Imithupi example ni-huruusi..., there is a primary 
high on the prefix and on the second stem vowel (which happens to be the first 
mora of a long vowel). The first primary high is subject to MT but the 
second primary high is immune from this rule on two grounds (it is preceded 
by a high and it is also followed by a high on an immediately adjacent tone- 
bearing unit). The doubled high on the first stem vowel is immune from all 
the rules affecting doubled highs (PFL, LF, and SF), as is the doubled high 
on the third stem vowel. The example ni-huruusi differs only in that the 
doubled high on the third stem vowel is affected by LF. Turning to ni-kutukuthu..., 
we see that this example also has a primary high on the prefix, but the second 
primary high is on the third stem vowel. Both of the primary high tones are in 
the environment for MT to apply. The doubled highs on the first stem vowel and 
on the fourth stem vowel are immune from the rules affecting doubled highs. 
ni-kutukuthu differs in that (a) the doubled high on the fourth stem vowel is 
lowered as a consequence of PFL and (b) the primary high on the third stem 
vowel escapes MT (since it stands before a low as a result of the application 
of PFL). 

We have illustrated several principles of Makua tonology, and we have 
seen that on the basis of these principles and their applicability to the 
examples discussed so far, it makes sense to claim that vowels are tone- 
bearing units in Makua. We now turn our attention to nasal consonants. 
First of all, it should be noted that there is no evidence that pre-vocalic 
nasals are tone-bearing units. For instance, examine the data in (7). 

(7) i-phome... K    i-phome... M    'blood' 
    i-phome         i-phome 

    i-khuní... K    i-khuní... M    'firewood' 
    i-khuni         i-khuni 

These nouns have a primary high tone on the first vowel of the stem. 
The first vowel of the stem is followed by a pre-vocalic nasal. Notice 
that the doubled high appears not on the pre-vocalic nasal but on the 
vowel that follows it. Evidence that the doubled high is associated with 
the vowel rather than the nasal comes from the fact that the doubled high 
is subject to PFL (which applies only to tone-bearing units at the end of 
the phrase). If the doubled high were associated with the nasal, it would 
not be in the environment for PFL. Another piece of evidence that the 
doubled high is associated with the final vowel rather than the nasal 
comes from the fact that in Imithupi MT does apply to the primary high in 
i-phome... and i-khuní... Remember that MT does not ordinarily affect a 
primary high if that high is followed by an immediately adjacent tone- 
bearing unit containing a high. If the nasal in i-phome... and i-khuní... 
were indeed a tone-bearing unit, we would expect MT to be inapplicable 
since the primary high would be followed by an adjacent high. 

The case of pre-consonantal nasals is not, however, so straightforward. 
First, it can easily be demonstrated that some pre-consonantal nasals are 
not tone-bearing. (8) offers some relevant examples. 

(8) ni-kwinjiri.../ni-kwinjiri K     'brass or iron bracelet' 
    ni-kwinjíri.../ni-kwinjiri M 

    ni-phumbulu.../ni-phumbulu K     'sucker (of a plant, tree)' 
    ni-phumbulu.../ni-phumbulu M 

    n-tambwaari.../n-tambwaari K     'cassava flour' 
    n-tambwaarí.../n-tambwaari M 

Each of the nouns in (8) has a primary high tone on the first stem vowel. 
This vowel is followed by a pre-consonantal nasal. But in each case the 
evidence is clear that it is the following vowel, rather than the pre- 
consonantal nasal, that receives the doubled high tone as a result of HTD. 
For instance, consider ni-kwínjiri... K and ni-phumbulu... K. There is a 
high tone on the second stem vowel. This high tone cannot be a primary high — 
since if it were a primary high, it would have to double onto the next tone- 
bearing unit. But it does not. Thus if the second stem vowel has a doubled 
high, the pre-consonantal nasal cannot be tone-bearing. If this nasal were 
tone-bearing, it would receive the doubled high and not the second stem vowel. 



The Imithupi forms ni-kwinjiri and ni-phumbulu provide confirmation that the 
high tone on the second stem vowel is a doubled high since both of these 
examples reveal that this high tone is subject to SF (a rule which applies 
only to doubled highs, not primary highs). The noun n-tambwaari... K differs 
from the first two examples by virtue of the fact that in addition to a 
primary high on the first stem vowel there is also a primary high on the 
third stem vowel. The high tone on the second stem vowel must be interpreted 
as a doubled high, which means that the pre-consonantal nasal between the first 
and second stem vowels must not be tone-bearing. 

It is important at this juncture in the exposition to make one point. 
When we say that the pre-consonantal nasals in (8) are not tone-bearing we 
do not mean that in phonetic representation they are not pronounced on any 
tone. Being sonorant elements, these nasals must be pronounced on some 
pitch level. However, the pitch of these nasals is entirely predictable: 
they are pronounced on the same pitch as the preceding vowel. Thus if the 
preceding vowel is low, such a nasal will be low; if the preceding vowel is 
high, such a nasal will be high; if the preceding vowel is mid, such a nasal 
will be mid. By "tone-bearing" we do not refer to phonetic pitch. Rather 
we refer to whether or not the unit is capable of being associated with a 
tone at the phonological level. 
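
The distinction between phonological tone and phonetic pitch can be illustrated with a short sketch. It assumes a toy representation that is not in the paper: each unit is a `(label, tone)` pair, and a non-tone-bearing nasal is given the tone `None`. The function name `phonetic_pitch` is likewise hypothetical.

```python
def phonetic_pitch(units):
    """Fill in the pitch of non-tone-bearing nasals (tone None) by copying
    the pitch of the immediately preceding unit, as the text describes."""
    out = []
    previous = None
    for label, tone in units:
        if tone is None:
            tone = previous  # predictable pitch: same as the preceding vowel
        out.append((label, tone))
        previous = tone
    return out


# ni-kwinjiri... (Ikorovere, phrase-medial): primary high on kwi, doubled high on ji
form = [("ni", "L"), ("kwi", "H"), ("n", None), ("ji", "H"), ("ri", "L")]
print(phonetic_pitch(form))
```

The nasal ends up on a high pitch here only because the vowel before it is high; nothing is associated with it on the tonal tier.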

We have given examples of pre-consonantal nasals that are not tone- 
bearing. There are, in addition, such nasals that are tone-bearing. In 
particular, in both Ikorovere and Imithupi, nasals which can plausibly be 
argued to derive (synchronically) from a sequence consisting of a nasal 
plus a vowel are regularly tone-bearing. Due to space considerations, we 
cannot give an exhaustive survey of pre-consonantal nasals which derive from 
NV sequences, but we will give sufficient data to establish clearly that such 
nasals are indeed tone-bearing. 

Nominal morphology provides two relevant cases. Makua nouns consist of 
a prefix plus a stem. Nouns can be sorted into different "noun classes" 
(in part) on the basis of the phonological shape of their prefixes. Typically 
the noun classes are paired such that when a given stem is used with the 
characteristic prefix of one of these paired noun classes, the noun will have 
a singular meaning, whereas when the stem is used with the characteristic 
prefix of the other member of the pair, the noun will have a plural meaning. 
These observations can be illustrated by the following examples. 

(9) ni-hute K,M     'cloud'       ma-hute K,M     'clouds' 
    ni-khuva K      'bone'        ma-khuva K      'bones' 
    ni-khuva M 
    ni-kosa K       'bracelet'    ma-kosa K       'bracelets' 
    ni-kosa M                     ma-kosa M 
    ni-vala K       'sp. rat'     ma-vala K       'rats' 
    ni-vala M                     ma-vala M 
    ni-piipa K,M    'oil drum'    ma-píipa K,M    'oil drums' 


From these data we can see that there is a noun class characterized by 
the prefix /ni/ used to form singular nouns which is paired with another 
noun class characterized by the prefix /ma/ used to form plural nouns. How- 
ever, this picture is slightly complicated by examples such as the following 
from Ikorovere: 






















(10) [Examples not recoverable from the source; the only surviving gloss is 'palm'.] 



In these examples, we find the prefix /n/ used in the singular and the 
prefix /ma/ in the plural. There is, however, an obvious connection 
between the /ni/ prefix of (9) and the /n/ prefix of (10): they differ 
only in that the former contains a vowel that is missing in the latter. 
Thus the possibility exists that the /n/ is simply a phonological variant 
of the /ni/. This possibility is supported by the observation that the 
/ni/ and /n/ shapes are in complementary distribution. The /ni/ occurs 
before consonants other than coronals whereas the /n/ occurs before coronals. 
Thus a rule can be posited that deletes the vowel of /ni/ when that prefix 
stands before a coronal. (There is additional evidence that /ni/ and /n/ 
are in fact instances of the same noun class prefix; this evidence derives 
from the fact that each noun class governs a particular pattern of grammatical 
agreement, and the singular nouns in (9) and (10) govern precisely the same 
pattern of agreement. We will not, however, provide the pertinent data 
illustrating this point here.) 
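
The complementary distribution of /ni/ and /n/ amounts to a simple conditional, sketched below. The set of coronal onsets is an assumption limited to the stem-initial consonants actually attested with /n/ in the examples of this paper (t, th, l, s, r); the full coronal inventory of Makua is not given here. The function name `singular_with_ni` is hypothetical, and tone is ignored.

```python
# Stem-initial coronals attested with the /n/ alternant in the cited examples.
# This list is an assumption of the sketch, not a complete inventory.
CORONAL_ONSETS = ("th", "t", "l", "s", "r")


def singular_with_ni(stem):
    """Attach the singular prefix /ni/, deleting its vowel before a coronal."""
    prefix = "n-" if stem.startswith(CORONAL_ONSETS) else "ni-"
    return prefix + stem


print(singular_with_ni("luto"))   # n-luto
print(singular_with_ni("sepa"))   # n-sepa
print(singular_with_ni("kosa"))   # ni-kosa
print(singular_with_ni("hute"))   # ni-hute
```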

The /n/ alternant of /ni/ is a tone-bearing unit. This is immediately 
evident from a consideration of Imithupi. In Imithupi (unlike Ikorovere), 
the noun class prefixes /ni/ and /ma/ may be either low-toned or high-toned. 
Compare the examples in (11) with those in (12). 

(11) ni-vasi (ma-vasi)          'scar'           cf. ni-vasi... 
     ni-menjo (ma-menjo)        'fish-hook'          ni-menjó... 
     ni-phavela (ma-phavela)    'lung'               ni-phavela... 
     ni-piipa (ma-píipa)        'oil drum'           ni-piipa... 

(12) ni-khoci (ma-khoci)        'snail shell'    cf. ni-khoci... 
     ni-vali (ma-vali)          'potshard'           ni-vali... 
     ni-phooru (ma-phooru)      'foam, scum'         ni-phooru... 
     ni-pothe (ma-pothe)        'boil, abscess'      ni-pothe... 

In (11) there is a primary high tone on the first vowel of the stem. This 
primary high is realized as a mid tone when it is in the environment to undergo 
MT. A doubled high appears on the second stem vowel. This doubled high 
undergoes PFL in ni-vasi and ni-menjo; it undergoes SF in ni-phavela; and 
it undergoes LF in ni-piipa. In (12), the primary high is located on the 
prefix and is subject to MT. A doubled high appears on the first vowel of 
the stem. This doubled high undergoes SF in the examples ni-khoci, ni-vali, 
and ni-pothe. 



In (13) and (14) we see that the /n/ alternant likewise can be either 
low-toned or high-toned. 

(13) n-líli (ma-líli)          'sleeping mat'            cf. n-lilí... 
     n-sílo (ma-sílo)          'lower grinding stone'    cf. n-siló... 
     n-thuli (ma-thuli)        'piece of meat'           cf. n-thulí... 
     n-tendehu (ma-tendehu)    'hornet'                  cf. n-tendehu... 

(14) n-sepa (ma-sepa)          'valley'                  cf. n-sepa... 
     n-thavi (ma-thavi)        'hunting net' 
     n-thuwa (ma-thuwa)        'cooking stone' 
     n-rama (ma-rama)          'cheek' 

The data in (13) and (14) are obviously entirely parallel to those in 
(11) and (12), except that whereas in the former pair the prefix has a 
vowel in it, in the latter pair the prefixal vowel is missing. The nasal 
prefix in examples like n-sepa must be associated with a primary high tone 
that can undergo MT and this primary high tone must be able to double onto 
the following tone-bearing unit. 

There is another noun class prefix in Makua that also has an alternant 
that consists of just a nasal consonant. Note the data in (15) and (16). 

(15) mw-eto ... 
(16) m-puuno ... 
[Examples only partly recoverable from the source; the surviving glosses 
are 'month, moon' and 'bee sting'.] 

The examples in (15) apparently involve noun stems which are vowel-initial, 
while those in (16) involve consonant-initial stems. The noun class prefix 
in the plural examples is obviously /mi/; the vowel /i/ obviously glides 
before the /a/-initial and /e/-initial stems shown in (15). But what is the 
underlying shape of the singular prefix? The most likely candidate is /mu/. 
Such an underlying representation would quite plausibly produce the alternant 
/mw/ in pre-vocalic position by a gliding process. In pre-consonantal position, 
however, the vowel of /mu/ must undergo a process of vowel elision, with the 
nasal /m/ assimilating the point of articulation of the following consonant. 
The elision of the vowel of /mu/ creates, therefore, another case of a pre- 
consonantal nasal that originates from a NV sequence. 
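
The two realizations of /mu/ described above (gliding before a vowel; vowel elision plus place assimilation before a consonant) can be sketched as follows. This is an illustrative sketch only. The place table covers just the stem-initial consonants attested with this prefix in the forms cited here (p, l, s); for any other consonant the sketch returns a placeholder `N` rather than guessing a transcription. The names `PLACE_NASAL` and `singular_with_mu` are hypothetical, stems are assumed to be non-empty, and tone is ignored.

```python
VOWELS = "aeiou"

# Nasal homorganic with the stem-initial consonant. Only onsets attested in
# the cited forms (m-puuno, n-lapa, n-somero) are listed; this is an assumption.
PLACE_NASAL = {"p": "m", "l": "n", "s": "n"}


def singular_with_mu(stem):
    """Realize the singular prefix /mu/ before a given (non-empty) stem."""
    first = stem[0]
    if first in VOWELS:
        return "mw-" + stem              # gliding: /mu/ surfaces as mw before a vowel
    nasal = PLACE_NASAL.get(first, "N")  # elision + assimilation; N = place unspecified
    return nasal + "-" + stem


print(singular_with_mu("eto"))     # mw-eto
print(singular_with_mu("puuno"))   # m-puuno
print(singular_with_mu("lapa"))    # n-lapa
```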

In Imithupi (unlike Ikorovere), the noun class prefix /mu/ may either 
be low-toned or high-toned. In (17) we give examples of the low-toned form 
of the prefix, while in (18) we give examples of the high-toned form. In 
all these examples the /mu/ prefix appears simply as a pre-consonantal nasal. 


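(17) [Forms and glosses only partly recoverable from the source.] 
     ...-hukulo (mi-hukulo) 
     n-somero (mi-somero) 
     n-muto (mi-muto) 
     n-phíta (mi-phíta) 
     ...-theko (mi-theko) 

(18) n-lapa ... 
     ... (mi-mini) 

Surviving glosses: 'beer sifter', 'a kind of drinking vessel', 'task, work', 
'baobab tree', 'sp. black ant' (cf. ...-cooc...). 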

In an example such as n-somero it is clear that the primary high is on the 
first stem vowel and that a doubled high appears on the second stem vowel 
(for in Imithupi the mid tone results only from a primary high and the 
falling tone on a short vowel results only from a doubled high). On the 
other hand, in an example such as n-lapa it is equally clear that the 
primary high must be on the prefix /n/ and the doubled high on the first 
stem vowel (for as we just pointed out, mid tones come uniquely from primary 
highs and short falling tones come uniquely from doubled highs). 

In Ikorovere, the nasal consonant derived from /ni/ and /mu/ is low- 
toned since in this dialect these prefixes are regularly low-toned (i.e. 
contrasts such as those shown above for Imithupi do not exist in this 
dialect). Nevertheless, the nasal consonant in question must be regarded 
as tone-bearing, even if it is regularly a low tone that it bears. This 
can be seen from the fact that when a noun such as n-lako 'door' occurs 
after another word, the /n/ will be pronounced on a low tone even if the 
preceding word ends in a high tone. If the nasal were not tone-bearing, 
it would automatically be pronounced on the same pitch level as the 
preceding vowel. Thus the fact that the low-toned realization of the /n/ 
in n-lako is independent of the nature of the preceding vowel demonstrates 
that this /n/ is tone-bearing. 

We have cited two instances in Makua where a pre-consonantal nasal 
derives from a NV sequence and that nasal is tone-bearing. As mentioned 
earlier, in all parallel cases, the nasal is tone-bearing. Given such 
examples, one might be tempted to claim that these pre-consonantal nasals 
are not underlyingly tone-bearing. Underlyingly, it is the vowel of the
prefixes /ni/ and /mu/ which is tone-bearing. The nasal becomes tone-bearing 
only in surface structure. Given this view, one could maintain that in 
underlying structure only vowels are tone-bearing and that one can predict 
which nasals will superficially appear to be tone-bearing. 

There is, however, one case of a grammatical formative that consists 
just of a nasal which is tone-bearing but does not clearly derive from a 
NV sequence. In Ikorovere there is a suffixal element /al/ that follows a 
verb stem and precedes a final vowel /e/. Some examples of this formative: 


(19) k a-a-pah-al-e 'he hasn't burned s.t.'
     k a-a-rap-al-e 'he hasn't bathed'
     tupul-al-e(. . .) 'he hasn't cut it'

In the construction illustrated, a primary high tone is assigned to the 
second vowel of the verb stem. This primary high tone will of course
double onto the following vowel (and the doubled high will undergo lowering
if it is phrase-final, etc.).

Items such as those in (19) regularly have an alternative form in 
Ikorovere where the /al/ formative is absent and in its place a nasal 
consonant appears before the last consonant of the stem. Compare (20): 

(20) k, a-a-pa^h-e. . ./k, a-a-pa^h-e 'he hasn't burned s.t.' 
k^a-a-ramp-e. . . /k a-a-ramp-e 'he hasn't bathed' 
k a-a-tupunl-e. . ./k"a-a-tupunl-e 'he hasn't cut it' 

The nasal that occurs before the final stem consonant in these examples 
is homorganic with that consonant. It is also tone-bearing. When this 
nasal is preceded by just one vowel in the stem, the nasal receives the 
high tone that is assigned to the second tone-bearing unit in the stem. 
When the nasal is located after two stem vowels, the nasal receives a 
doubled high from the primary high that is located on the second stem vowel. 
Furthermore, the doubled high on the nasal is subject to the LF rule in an 
example such as k^a-a-tupunl-e . 

In Imithupi, the suffix is /il/ rather than /al/, and it is
the case that not all stems allow this /il/ to be replaced by a homorganic
nasal located in front of the final consonant of the stem. Nevertheless,
the facts presented above for Ikorovere have their counterparts in Imithupi
as well.

We have shown that the pre-consonantal nasal in (20) is tone-bearing.
We also know that in some sense this nasal is an alternative to the suffix
-al- (-il- in Imithupi). But what is not so clear is that this nasal
should be phonologically derived from the sequence -al- (-il-). If the
nasal is not phonologically derived, then we would have a case where at
least one pre-consonantal nasal must be underlyingly tone-bearing. If the
nasal is phonologically derived from -al- (-il-), then we could continue
to claim that only vowels are underlyingly tone-bearing. This would then be
a case where the sequence VC yields a tone-bearing nasal.

There are other problematic cases of pre-consonantal tone-bearing nasals. 
In both Ikorovere and Imithupi there is a large set of nouns which have no 
prefix in the singular and a high-toned /a/ prefix in the plural. Some examples: 

(21) k^apa K,M 'tortoise' a-k apa K, a-k apa M 'tortoises' 
ret^e K,M 'sp. shrew' a-ret^e K, a-ret e M 'shrews' 
hukula K, hukula M 'hare' a-hukula K, a-hukula M 'hares' 

k araka K,M 'potato' a-k araka K, a-k arSka M 'potatoes' 

Within this set of nouns, however, there is an unexpectedly large group whose 
stem begins with the syllable /na/, suggesting that perhaps this /na/ was at 
some point in time a grammatical element of some type. (22) provides a few
examples:


(22) nakopo K,M 'sp. tree' a-nakopo K, a-nakopo M 'trees' 

nakuluwe K, nakuluwe M 'sp. bean' a-nakuluwe K, a-nakuluwe M 

' beans ' 

nap ulu K,M 'frog' a-nap ulu K, a-nap ulu M 'frogs' 

nasinuku M 'porcupine' a-nas£nuku M 'porcupines' 

Within the group of /na/-initial nouns of this type, there is a
substantial number where a pre-consonantal nasal follows and this nasal 
is tone-bearing. Examples: 

(23) nampembere K 'giant' a-nampembere K 'giants' 

naiiipya K,M 'sp. bird' a-nampya K, a-nampya M 'birds' 

nampaap i K 'leaves of sp. bean' a-nampaap"i K 'pi. leaves' 

nancuwa K, nancuwa M 'sp. snake' a-nancuwa K, a-nancuwa M 'snakes' 

nanlume K, nanlume M 'male elephant' a-nanlume K, a-nanlume M 

'male elephants' 
nanhoko M 'sp. snake' a-nafjhoko 'snakes' 

The nasal consonant following /na/ in these examples must be tone-bearing 
since it is pronounced with a high tone (or a mid tone, as a consequence of 
MT in Imithupi) regardless of whether the preceding vowel in /na/ is low-toned
(as in the singular) or high-toned (as in the plural). Recall that pre-consonantal
nasals that are not tone-bearing are pronounced on the same pitch level as 
the preceding vowel. 

It is quite possible that at one stage in the history of Makua the 
formative /na/ was prefixed to a noun containing a prefix and that the 
pre-consonantal tone-bearing nasals in (23) were actually just
reduced forms of /ni/ or /mu/. While such an historical possibility exists, 
the fact remains that synchronically the tone-bearing nasals in (23) cannot 
plausibly be derived from a NV sequence. However, it does appear to be true 
that pre-consonantal nasals following /na/ are regularly tone-bearing, so it 
could be claimed that these data do not lead to the conclusion that there is 
no way to predict whether a given pre-consonantal nasal is tone-bearing or not. 

There is, however, evidence that the ability to bear tone is an
unpredictable aspect of pre-consonantal nasals. Let us begin by considering
verbal stems.

(24) u-menj-a... / u-menj-a K 'to catch fish with hook and line' 
u-menj-a... / u-menj-a M 

u-plnd-a... / u-p£nd-a K 'to twist' 
u-plnd-a... / u-p£nd-a M 
u-pangac-a(. ..) K 'to fix, repair' 
u-pangac-a. . ./ u-paqgac-a M 

The infinitive form of the verb in Makua regularly has a primary high tone 
assigned to the first vowel of the verb stem. This high tone will naturally 
double onto the next tone-bearing unit. In (24) it is obvious that the 
pre-consonantal nasal following the first stem vowel is not tone-bearing 
since it is the following vowel that receives the doubled high. 

Most of the verbs in Makua containing a stem-internal pre-consonantal 
nasal are like those in (24) in that the nasal is not tone-bearing. However, 
there are a few examples where such a nasal is tone-bearing.



(25) u-punth-a... / u-punth-a 'to pick out, pry out, remove using
                               a sharp instrument'
     u-hankw-a... / u-hankw-a 'to go into the bush to defecate'
     u-thint-a... / u-thint-a 'to play with water'

The fact that these pre-consonantal nasals (a) receive a doubled high tone
in examples like u-punth-a..., u-hankw-a..., and u-thint-a... and (b) undergo
LF in examples such as u-punth-a, u-hankw-a, and u-thint-a demonstrates
unequivocally that they are tone-bearing. But there is no way to predict that
the nasals in (25) are tone-bearing whereas those in (24) are not.
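
The contrast between u-menj-a and u-punth-a can be sketched in a few lines of Python. This is only an illustration: the representation of a stem as a list of (symbol, tone-bearing flag) pairs, the function name `infinitive_tones`, and the labels "primary H" and "doubled H" are assumptions made here, and phrase-final lowering (LF) is ignored.

```python
from typing import List, Optional, Tuple

# A segment is (symbol, is_tone_bearing). The flag on a pre-consonantal
# nasal stands for a lexical marking; this encoding is an assumption.
Segment = Tuple[str, bool]


def infinitive_tones(stem: List[Segment]) -> List[Optional[str]]:
    """Put a primary high on the first tone-bearing unit of the stem and
    a doubled high on the next tone-bearing unit, if there is one."""
    tones: List[Optional[str]] = [None] * len(stem)
    tbus = [i for i, (_, bears_tone) in enumerate(stem) if bears_tone]
    if tbus:
        tones[tbus[0]] = "primary H"
    if len(tbus) > 1:
        tones[tbus[1]] = "doubled H"
    return tones


# u-menj-a: the nasal is not tone-bearing, so doubling skips it.
MENJ_A = [("m", False), ("e", True), ("n", False), ("j", False), ("a", True)]
# u-punth-a: the nasal is tone-bearing, so it receives the doubled high.
PUNTH_A = [("p", False), ("u", True), ("n", True), ("th", False), ("a", True)]

if __name__ == "__main__":
    for name, stem in (("menj-a", MENJ_A), ("punth-a", PUNTH_A)):
        symbols = [symbol for symbol, _ in stem]
        print(name, list(zip(symbols, infinitive_tones(stem))))
```

The two stems differ only in the flag on the nasal, which is the point of the argument: nothing in the segmental make-up of the nasal decides where the doubled high lands.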

When one turns to nouns, the problem of unpredictability, especially
in Imithupi, becomes even more striking. Consider the examples in (26) and
(27).
(26) non-tone-bearing pre-consonantal nasals

     ni-wambwe M
     -th urjkwa M
     -t^ei^ga K,M

     'trip' (cf. u-rendo... K, u-rendo... M)
     'the long side of a loin cloth'
     'a kind of arrow' (cf. n-soonga... K)
     'a kind of musical instrument' (cf. rilmba.
     tree' (cf. i-rfnth a...K, i-rinth a
     branch of
     'message, messenger'
     'a kind of walking staff' (cf. n-toondo. . .K)
     'type, fashion'
     'pocket of grass within a burnt area'
     'area along river for cultivating rice'
     'bedpost'

     n-ronth o
     n-cfinga K

     'stick used for poking'
     'reed mat'
     'shallow lake formed after rains'
     'bunch of bananas'
     'piece of cloth'
     'kind of boiled maize'
     'a kind of drum'

(27) tone-bearing pre-consonantal nasals

     ni-venkwa M
     n-fyanfyo K

     'a kind of musical instrument'
     'a kind of horn'
     'edge, shore'
     'pocket of grass within a burnt area'
     'a kind of drum'
     'shallow lake formed after rains'
     'a kind of walking staff' (cf. ii-tonto...)
     'a kind of arrow' (cf. n-s6nka...)
     'kind of boiled maize'
     'dried saliva'
     'a kind of squash'
     'green cashew nut'
     'whip, small stick used to beat s.o.'

Comparison of the data in (26) and (27) provides some insight into
the source of many of the tone-bearing pre-consonantal nasals in Imithupi.
There are numerous examples where Ikorovere has a long vowel followed by
a cluster of nasal plus voiced consonant where Imithupi has a short vowel
followed by a nasal plus voiceless consonant. For example, we have riimba K
but rimpa M; cuuncu K but cuncu M; n-taanda K but n-tanta M. What seems
to have happened is the following: these items originally contained a
long vowel (= two tone-bearing units); when this long vowel shortened in
Imithupi, the number of tone-bearing units was retained by converting the
nasal into a tone-bearing element. Although this provides a plausible
account of the origin of many of the stem-internal tone-bearing nasals in
Imithupi, it does not alter the fact that from a synchronic point of view
one cannot predict whether a pre-consonantal nasal will be tone-bearing or
not. Thus in Imithupi we have sequences of vowel-nasal-voiceless consonant
where the nasal is tone-bearing (as in n-tanta) but also cases where it is
not (as in i-rintha, n-rontho, and ni-kumpha).

We have suggested earlier that some instances of tone-bearing
stem-internal nasals may have originated from prefixes of the shape /ni/ or
/mu/. Some evidence for this claim is provided by a comparison of forms
in Ikorovere with those in Imithupi. For example, Ikorovere has the noun
thoro 'field mouse' (a-thoro 'field mice'); but in Imithupi we find nthoro
as the singular form and a-nthoro as the plural form. The stem-initial /n/ in the
Imithupi form bears a low tone in the case of nthoro since it is underlyingly
low-toned, but receives a doubled high from the prefix /a/ in the case of
a-nthoro. The Ikorovere item demonstrates that originally the stem was simply
/thoro/ and that the nasal in Imithupi represents an element added to the
stem. The only elements that are regularly added before the stem are noun
class prefixes. Thus it seems most likely that the /n/ in nthoro originated
as a prefix. Synchronically, however, there is no evidence whatsoever for
identifying this nasal as a prefix. From the point of view of the speaker
of Imithupi, nthoro is unanalyzable.

It should perhaps be noted (in connection with the discussion immediately
above) that not all stem-initial pre-consonantal nasals are tone-bearing.
Consider the example ndeembe K 'cockerel'. The plural form of this stem
involves prefixing /a/, which bears a primary high tone: a-ndeembe. Notice
that the primary high of the prefix doubles onto the first vowel of the stem;
in other words, the pre-consonantal nasal is skipped over, showing that it
is not tone-bearing. A parallel example is provided by mbwaani K, M 'sp.
cassava' which has the plural form a-mbwaani K, a-mbwaani M. Once again we
see that the high tone of the prefix doubles onto the first vowel of the stem
/mbwaani/, skipping over the pre-consonantal nasal. The data presently
available to us suggest that a nasal in a stem-initial NC cluster is ordinarily
not tone-bearing when the consonant after the nasal is voiced, while the nasal
is ordinarily tone-bearing if the consonant is voiceless. Whether we are
dealing here with a tendency or an absolute rule is not entirely clear at
present.

Let us summarize what we have demonstrated about the status of pre- 
consonantal nasals in Makua. Nasals that are derived (synchronically) from 
NV sequences are always tone-bearing in Makua. It is not necessary to assume 
that such nasals are underlyingly tone-bearing, although they must be tone- 
bearing in surface structure. There are, however, many instances (particularly 
in Imit^upi) of pre-consonantal nasals that are tone-bearing and cannot be 
claimed to derive (synchronically) from a NV (or VN) source. Such nasals 
must, apparently, be underlyingly tone-bearing. But not all pre-consonantal 
nasals in Makua are tone-bearing. Consequently, whether a pre-consonantal
nasal is tone-bearing or not does not follow from any segmental information
about the nasal itself. In other words, the ability of a pre-consonantal 
nasal to bear tone is a fact that is independent of the segmental structure 
of the nasal. It is an abstract property that must be included in the under- 
lying representation. 

Notice that whether or not a pre-consonantal nasal is tone-bearing in
Makua cannot be indicated simply by having a tone associated with that nasal
in underlying structure. That is, one cannot represent the difference
between tone-bearing and non-tone-bearing by whether or not the element in
question has an underlying tone. The reason for this is
extremely simple. In Makua, verbal stems do not have lexical tone. There
are no lexical tone contrasts in Makua verbal stems. The tonal shape of
a stem is entirely predictable in terms of the morphological construction in
which the verb stem appears. Thus u-menj-a 'to fish with a hook and line'
(with a non-tone-bearing nasal) and u-punth-a 'to pick out, pry out' (with
a tone-bearing nasal) do not have any tones associated with their stems in
underlying structure. But even though these stems have no tones associated
with them, they must have a difference in their representation such that
the nasal in /menj/ is not tone-bearing while that in /punth/ is tone-bearing.
We conclude, therefore, that the difference between tone-bearing and
non-tone-bearing is an abstract property of underlying structure that is
present independently of whether or not the unit in question has an
underlying tonal specification.

In Makua, all vowels predictably have the abstract property of being 
tone-bearing. Non-nasal consonants predictably lack this property. Pre- 
vocalic nasals predictably lack this property. Pre-consonantal nasals 
predictably have the property of being tone-bearing when they derive from 
a NV sequence. Other (non-derived) pre-consonantal nasals may or may not 
be tone-bearing. They must be lexically marked as having the property or
not.
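
The summary above amounts to a small decision procedure, sketched here in Python. The function name `is_tone_bearing`, its parameters, and the simplified vowel and nasal inventories are hypothetical choices made for this illustration; only the ordering of the cases comes from the text.

```python
from typing import Optional

VOWELS = frozenset("aeiou")
NASALS = frozenset("mn")  # simplified inventory, an assumption of this sketch


def is_tone_bearing(segment: str, next_segment: str,
                    derived_from_nv: bool = False,
                    lexically_marked: Optional[bool] = None) -> bool:
    """Decide whether a segment is tone-bearing, given the segment that
    follows it and, for pre-consonantal nasals, its source or marking."""
    if segment in VOWELS:
        return True            # all vowels are tone-bearing
    if segment not in NASALS:
        return False           # non-nasal consonants never are
    if next_segment in VOWELS:
        return False           # pre-vocalic nasals never are
    if derived_from_nv:
        return True            # nasals from an NV sequence always are
    if lexically_marked is None:
        # Non-derived pre-consonantal nasal: nothing predicts the value.
        raise ValueError("a lexical marking is required for this nasal")
    return lexically_marked
```

For example, `is_tone_bearing("n", "th", lexically_marked=True)` corresponds to the nasal of /punth/ and `is_tone_bearing("n", "j", lexically_marked=False)` to the nasal of /menj/. The error branch reflects the claim that the property cannot be computed for such nasals and has to be stored.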


We would like to thank the University of Illinois Research Board, the 
African Studies Program of the University of Illinois, and the National 
Science Foundation (Grant No. BNS-7924523) for their support of the research 
that we have reported on in this paper. 

¹The data from Ikorovere were provided by S.A.C. Waane (of Tunduru
district in Tanzania) while the data from Imithupi were provided by J.A.R.
Wembah-Rashid (of Masasi district in Tanzania). We would like to thank both
of our consultants for their tireless patience and enthusiasm. The dialects
of Makua have been little explored and we cannot at present determine to what
extent the Tanzanian dialects described here extend into Mozambique. Since
the Makua apparently migrated into Tanzania from Mozambique, there is good
reason to expect that forms of Ikorovere and Imithupi are to be found inside
Mozambique as well.

CHENG, Chin-Chuan and Charles W. Kisseberth. 1979. Ikorovere Makua tonology
(Part 1). Studies in the Linguistic Sciences 9.1, 31-63.

CHENG, Chin-Chuan and Charles W. Kisseberth. 1980. Ikorovere Makua tonology
(Part 2). Studies in the Linguistic Sciences 10.1, 14-44.

CHENG, Chin-Chuan and Charles W. Kisseberth. 1981. Ikorovere Makua tonology
(Part 3). Studies in the Linguistic Sciences 11.1, 181-202.

Studies in the Linguistic Sciences 
Volume 12, Number 1, Spring 1982 

PROBLEMS IN A PROSODIC ANALYSIS OF HEBREW MORPHOLOGY

Shlomo Lederman

This paper applies McCarthy's (1979, 1981) autosegmental 
theory of Semitic morphology to modern Hebrew data, and points 
out some methodological and empirical weaknesses in it. 

First a brief sketch of McCarthy's theory is given (§3), 
illustrated by McCarthy's analysis of Arabic (§4). The rest of 
the paper is devoted to its application to Hebrew (§5). 

In §3 it is shown that the second universal convention of 
association is redundant and that the third convention must be 
able to re-apply, yet this re-application must be restricted 
in some way. 

In §4 it is argued that an i-deletion analysis of mapping
the vowel melodies onto Arabic prosodic templates is preferable 
to McCarthy's Vowel Association analysis. 

When McCarthy's theory is applied to modern Hebrew data it
is shown that modifications have to be introduced into the system 
because of the loss of medial gemination and the lack of 
phonological motivation for positing medial or final semivowels 
in some verbal roots. 

Although the principles and devices proposed in McCarthy's 
analysis of Arabic also handle large parts of Hebrew verbal 
morphology, it is argued that his treatment of medial gemination 
and reduplication - both prevalent in Semitic languages - is no
more restrictive and no less powerful than a transformational
treatment of these phenomena.*


McCarthy (1979, 1981) adopts an independently developed and motivated 
theory of autosegmental phonology, and in terms of this theory, together 
with what he claims to be slight modifications and natural extensions of 
the theory, he analyzes Standard Arabic morphology and some aspects of 
Biblical Hebrew. 

A major motivation for McCarthy's analysis is his desire to do away 
with transformational morphological processes that are called for in 
segmental analyses of such processes as reduplication. Allowing for 
transformational morphological rules, argues McCarthy, is too unrestrictive 
and powerful in that it makes possible the formulation of rules not attested 
in any natural language. McCarthy shows that employing a prosodic analysis 
eliminates the need for transformational notation in morphological rules. 
He also shows that such an analysis explains hitherto unexplained and 
inexpressible facts about Arabic and Semitic morphology, such as the 
impossibility of identical consonants occupying first and second root 
consonant slots in a triradical root versus the possibility of identical 
consonants in second and third positions, first noted in Greenberg (1950). 

Moreover, the autosegmental theory is so constrained as to enable 
only the generation of occurring words while making impossible the derivation 
of non-occurring ones. This is to be contrasted with a theory allowing 
transformational notation, a device that makes non-occurring derivations 
theoretically possible. Thus, the choice of the formal apparatus of a theory 
has substantive empirical consequences, as pointed out in Kiparsky (1968). 

2. AIM 

The aim of the present paper is to show some methodological and
empirical weaknesses in McCarthy's analysis. It will be argued that some
of the modifications of autosegmental theory, dictated by the facts of
Semitic morphology, do not follow from the theory, and that the
autosegmental theory, so modified, becomes so unconstrained and permissive
as to allow the formulation of processes that would generate
non-occurring derivations, thus diminishing the empirical content of the
theory. The constraints built into the initial autosegmental theory have
to be relaxed whenever the facts so dictate, considerably weakening the
theory.

After a succinct outline of the autosegmental theory adopted by 
McCarthy I will critically discuss the modifications dictated by the facts 
of Arabic morphology and then evaluate this theory in light of Hebrew 
morphology. I will restrict myself mostly to verbal morphology. 


McCarthy assumes the theory of autosegmental phonology of Clements 
and Ford (1979), a refinement of Goldsmith (1976).

The universal conventions of association are formulated in terms of 
mapping of melodic elements (units on an autosegmental tier) onto melody- 
bearing elements (units on the segmental tier). There are three such 
conventions, illustrated in (1) by the association of lower case melodic 
elements with upper case melody-bearing elements: 

i. If there are several unassociated melodic elements and several
unassociated melody-bearing elements, the former are associated
one-to-one from left to right with the latter. This transforms (1a) to (1b).

ii. If, after application of the first convention, there remain one
unassociated melodic element and one or more unassociated melody-bearing
elements, the former is associated with all the latter. This transforms
(1c) to (1d).

iii. If all melodic elements are associated and if there are one or
more unassociated melody-bearing elements, all of the latter are assigned
the melody associated with the melody-bearing element on their immediate
left if possible. This principle, which has the effect of automatic
spreading, will alter (1e) to (1f).

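(1) One-to-one from left to right
    (A-x marks an association line between A and x)

    a. A B C      melody x y z; no associations
    b. A B C      A-x, B-y, C-z

    c. A B C D    melody x y z; A-x, D-z; y, B and C unassociated
    d. A B C D    A-x, B-y, C-y, D-z

    e. A B C D    melody x y; A-x, B-y; C and D unassociated
    f. A B C D    A-x, B-y, C-y, D-y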

It is very important to note that (1c) can never be the immediate
output of convention (i), although convention (ii) is so formulated as to
be fed by (i). The immediate output of the first convention on an input
like (2a) would be (2b), to which the third convention would apply to
yield (2c):

(2) a. A B C D    melody x y z; no associations
    b. A B C D    A-x, B-y, C-z; D unassociated
    c. A B C D    A-x, B-y, C-z, D-z

To get the configuration (1c) we must assume that it is derived from (2c)
by the erasure of the associations of z to C and y to B, since only then 
can the second convention apply. Given, however, the necessary application 
of conventions (i) and (iii) before convention (ii), there is no need for 
convention (ii) - it is a special case of convention (iii) (in conjunction 
with the erasure of the association of z to C, which is needed anyway). 
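
The interaction of conventions (i) and (iii) described above can be made concrete with a short Python sketch. The list representation (one entry per melody-bearing element, `None` for an unassociated one) and the function names `one_to_one` and `spread` are assumptions introduced for illustration; they are not part of the theory under discussion.

```python
from typing import List, Optional

# links[i] is the melodic element associated with melody-bearing element i,
# or None if that element is unassociated.
Links = List[Optional[str]]


def one_to_one(melody: str, n_slots: int) -> Links:
    """Convention (i): associate left to right, one element per slot.
    Melodic elements beyond the last slot stay floating and are dropped."""
    return [melody[i] if i < len(melody) else None for i in range(n_slots)]


def spread(links: Links) -> Links:
    """Convention (iii): an unassociated slot takes the melody of the slot
    on its immediate left. Assumes no melodic element is left floating."""
    out = list(links)
    for i in range(1, len(out)):
        if out[i] is None and out[i - 1] is not None:
            out[i] = out[i - 1]
    return out


if __name__ == "__main__":
    step_2b = one_to_one("xyz", 4)   # A-x, B-y, C-z, D unassociated
    step_2c = spread(step_2b)        # D receives z by spreading
    erased = list(step_2c)
    erased[2] = None                 # erase the association of z to C
    print(step_2b, step_2c, spread(erased))
```

Under these assumptions, erasing only the link of z to C and letting `spread` re-apply gives A-x, B-y, C-y, D-z, which is the output that convention (ii) was meant to produce. This is the sense in which convention (ii) looks like a special case of convention (iii).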

McCarthy modifies earlier versions in that no provision is made for 
the automatic association of an unassociated melodic element with a melody- 
bearing element that already has an association. The representation in (3) 
is therefore well-formed:

(3) A B C
    | | |
    w x y z

If z remains unassociated throughout the derivation, it receives no phonetic 
realization. Floating melodic elements are not anchored. McCarthy calls 
this the "prohibition against many-to-one associations". 

The notion of autosegmental tier is enriched to allow for consonantal 
roots and vocalic melodies in Arabic to be represented on separate auto- 
segmental tiers. This will require an additional convention that associates 
vowel melodies with V-slots on the segmental template and root melodies 
with C-slots. 

McCarthy also adopts a revised version of Leben's Obligatory Contour 
Principle. Leben's (1973) principle says that no autosegmental tier can 
contain adjacent identical elements. McCarthy revises this principle so 
that a grammar having adjacent identical elements is less highly valued 
than a grammar not having adjacent identical elements on an autosegmental
tier.

We will illustrate how this theory accounts for Arabic morphology. 
The stem form of First Binyan^ perfect katab is mapped as follows i^ 

(4)      a          - perfective active melody
        / \
     C V C V C      - First Binyan prosodic template
     |   |   |
     k   t   b      - root melody

The root is associated to the template by the first convention, and the 
vowels are associated by the first and third conventions. 

In those binyanim having prefixes there is an additional autosegmental 
tier for the prefix which is associated before all other material. 

In Binyanim IX and XI the final reduplicated consonant is readily 
accounted for by spreading, as we see in (5) (after application of the 
first and third conventions) : 

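(5) (C-slots are numbered from left to right)

    a. IX   C C V C V C (ktabab)
            root ktb: k-C1, t-C2, b-C3 and, by spreading, C4

    b. XI   C C V V C V C (ktaabab)
            root ktb: k-C1, t-C2, b-C3 and, by spreading, C4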

The same phenomenon of spreading also explains the possibility of 
identical second and third consonants in a triradical root as against the 
impossibility of first and second identical consonants. Thus samam is 
analyzed as derived from a biradical root sm (Leben's Obligatory Contour 
Principle rules out a root * smm) , associated with First Binyan template 
by the first and third conventions, as follows: 

(6) C V C V C (samam)
    root sm: s linked to the first C, m linked to the second C and,
    by spreading, to the third C


Verbs with identical second and third (or more generally, last and 
penultimate) radicals ( verba mediae geminatae ) are very common in Semitic, 
and the prosodic analysis neatly accounts for them. Thus we have in Akkadian
naparruru 'disband', in Hebrew^ madad 'measure', zamam 'plot', komem
'rouse', 'orer 'awaken'. In Ethiopic languages final reduplication is
prevalent: Tigre hassa 'be weak', halla 'bray', fartata 'crumble, v.t.',
maadada 'spread'.

On the other hand, identical first and second consonants are ruled
out by the first convention (one-to-one left to right association) which
rules out *sasam being derived from the biradical root sm. Notice that in
this theory we cannot posit a root ssm: it is ruled out by the Obligatory
Contour Principle, so that *sasam cannot be derived from it by the first
convention, or, under the revised Obligatory Contour Principle, a grammar
containing the root ssm is less highly valued than a grammar which has no
such roots.
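
The asymmetry between samam and *sasam can be checked mechanically. The following Python sketch assumes the strict form of the Obligatory Contour Principle (roots with adjacent identical consonants are simply excluded) and a fixed vowel a; the helper names `violates_ocp`, `first_binyan` and `derivable_stems` are hypothetical.

```python
from itertools import product
from typing import Iterable, Set


def violates_ocp(root: Iterable[str]) -> bool:
    """True if the root contains two adjacent identical consonants."""
    consonants = list(root)
    return any(a == b for a, b in zip(consonants, consonants[1:]))


def first_binyan(root: str, vowel: str = "a") -> str:
    """Map a non-empty root onto CVCVC: one-to-one from left to right,
    then spread the last consonant rightward onto any empty C-slot."""
    consonants = list(root[:3])
    while len(consonants) < 3:
        consonants.append(consonants[-1])
    c1, c2, c3 = consonants
    return f"{c1}{vowel}{c2}{vowel}{c3}"


def derivable_stems(alphabet: str) -> Set[str]:
    """All CVCVC stems derivable from bi- and triradical roots over
    `alphabet` that respect the Obligatory Contour Principle."""
    stems = set()
    for size in (2, 3):
        for root in product(alphabet, repeat=size):
            if not violates_ocp(root):
                stems.add(first_binyan("".join(root)))
    return stems
```

With the two consonants s and m, `derivable_stems("sm")` contains samam but not sasam: the only roots that could give sasam are ssm, which the principle excludes, or sm with leftward spreading, which the conventions do not provide.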

However, in the Second and Fifth Binyanim we find medial gemination
(II kattab, V takattab), for which the theory is inadequate. Without any
modification in our analysis medial gemination is ruled out and we would
instead expect to find *katbab and *takatbab, respectively, as illustrated
in (7), after application of the first and third conventions:

(7) (C-slots are numbered from left to right)

    a. C V C C V C  - second binyan template (*katbab)
       root ktb: k-C1, t-C2, b-C3 and C4

    b. C V C V C C V C  - fifth binyan template (*takatbab)
       prefix t: t-C1
       root ktb: k-C2, t-C3, b-C4 and C5

To account for the gemination in the Second and Fifth Binyanim McCarthy 
must formulate a rule erasing the first association of the last root 
consonant. This rule, restricted to Second and Fifth Binyanim, will convert 
(7) into (8): 

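(8) Erasure

    a. C V C C V C
       root ktb: k-C1, t-C2, b-C4; C3 unassociated

    b. C V C V C C V C
       prefix t: t-C1
       root ktb: k-C2, t-C3, b-C5; C4 unassociated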






The third convention will now re-apply and associate the unassociated
melody-bearing element with the melody associated with the melody-bearing
element to its immediate left, i.e., t, and will yield the correct forms,
as in (9):

(9) Reassociation

    a. C V C C V C (kattab)
       root ktb: k-C1, t-C2 and C3, b-C4

    b. C V C V C C V C (takattab)
       prefix t: t-C1
       root ktb: k-C2, t-C3 and C4, b-C5
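
The three steps of the derivation (universal association, Erasure, re-application of the third convention) can be traced in Python. This is a sketch under simplifying assumptions: every V-slot is filled with a, the prefix t is simply placed in the first C-slot, and the function names are invented for the illustration.

```python
from typing import List, Optional

Links = List[Optional[str]]  # one entry per C-slot, None if unassociated


def spread(links: Links) -> Links:
    """Third convention: an empty slot copies the slot on its left."""
    out = list(links)
    for i in range(1, len(out)):
        if out[i] is None and out[i - 1] is not None:
            out[i] = out[i - 1]
    return out


def map_root(root: str, n_c_slots: int) -> Links:
    """First and third conventions applied to the C-slots of a template."""
    return spread([root[i] if i < len(root) else None for i in range(n_c_slots)])


def erasure(links: Links, root: str) -> Links:
    """Erase the first (leftmost) association of the last root consonant."""
    out = list(links)
    out[out.index(root[-1])] = None
    return out


def spell(template: str, consonants: Links, vowel: str = "a") -> str:
    """Fill C-slots in order ('_' if unassociated) and V-slots with `vowel`."""
    filler = iter(consonants)
    out = []
    for slot in template:
        if slot == "C":
            out.append(next(filler) or "_")
        else:
            out.append(vowel)
    return "".join(out)


if __name__ == "__main__":
    linked = map_root("ktb", 4)               # k, t, b, b
    repaired = spread(erasure(linked, "ktb")) # k, t, t, b
    print(spell("CVCCVC", linked), spell("CVCCVC", repaired))
    print(spell("CVCVCCVC", ["t"] + repaired))
```

Run as a script, this prints katbab and kattab for the second binyan and takattab for the fifth. Note that `erasure` is an ordinary function that could just as easily target a different association, which is the worry raised in the discussion of the rule.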


The autosegmental theory loses much of the predictive and restrictive
power it has if it allows the formulation of the Erasure rule. The theory,
without the Erasure rule, predicts the impossibility of medial gemination.
This is a very strong, falsifiable empirical claim, and indeed it is
falsified in the Second and Fifth Binyanim in Arabic. Notice that McCarthy
offers no motivation for the Erasure rule, except for the motivation
of making the theory agree with the facts. Notice that once the theory
allows language-specific rules such as the Erasure rule (at some cost to
the grammar, of course), then the theory offers no principled way of
disallowing rules of erasure that would, e.g., erase the second rather than
the penultimate root consonant, yielding such non-occurring strings as
*sasam from sm, *kaktab from ktb, thereby losing whatever predictive and
restrictive power it had.

Therefore, the introduction of the rule of Second and Fifth Binyanim
Erasure is not a slight modification in the theory, and it can be incorporated
into it only at tremendous cost, unless the theory can specify, from
independently motivated general principles, which language-specific rules
are allowed and which are not, allowing the Erasure rule formulated by
McCarthy and rejecting the erasure rule that gives rise to initial gemination.

Notice also that, given the Second Universal Association Convention,
we can derive gemination by a rule erasing in (7) the second and third
associations. Such a rule would do away with the need for the third
convention to re-apply to (8). A theory in which more than one equally
plausible analysis is possible is not constrained enough, and the theory
in question has no evaluation measure to tell us which analysis is to be
preferred.

It should be pointed out that in order for McCarthy's analysis to 
work the third convention must be able to re-apply. Obviously, the auto- 
matic application of this convention must be restricted in some way, other- 
wise it would wrongly re-fill slots emptied by morphological rules. Thus, 
in Tigre, verbs with a medial semivowel lose the semivowel in the perfect 
and jussive (imperative) of type A verbs and the derivative verb type 'a-A -
a morphological rule, as the semivowel is not elided in the imperfect and 
in the other verb types, although the semivowel appears in exactly the 
same phonological environment. Thus we find geda 'he hurried' versus legayed 
'he hurries' (root gyd ) . After semivowel drop the derivation of 'he hurried' 
should look like (10) : 

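(10) C V C V C V
     root gyd after semivowel drop: g linked to the first C, d linked to
     the third C; the second C is unassociated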


The third convention should automatically re-apply to (10) to yield * gagda 
'he hurried'. (For the vowel alternations see Kisseberth, 1978, and Raz, 
1980a, b). But instead we have geda , with the second C of the template un- 
associated. Therefore, the third convention cannot apply after semivowel 
deletion - a morphological, not phonological, rule (assuming that any 
morphologically restricted rule is morphological). Hence, the third con- 
vention cannot automatically re-apply as suggested by McCarthy. 

Recall McCarthy's rejection of transformational morphological rules
as too powerful and unrestrictive. It is not clear at all how to compare
the relative power of such disparate theories as McCarthy's and a segmental
theory allowing transformational power. Given this and the need to enrich
McCarthy's theory, it is not at all clear that it is the more restrictive
and less powerful in accounting for such morphological processes as
reduplication.

Up to now we have discussed the mapping of the root melody onto the binyan
template. The same conventions map the vowel melodies, yet here too a
crucial language-specific rule is formulated to ensure the correct output.

McCarthy isolates the vowel melodies for all binyanim except I, as
follows: a active perfective, ui passive perfective, ua passive participle,
and uai active participle.

The universal conventions of association alone are not sufficient to 
ensure the correct mapping of these melodies with the V-slots of the prosodic 
template. If we take the root ktb , mapping the active participle melody 
uai in the sixth binyan, after application of the first and third conventions 
we would get (11) : 

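(11) C V C V C V V C V C   *mutakiitib
     melody uai: u linked to the first V, a to the second V, i to the
     third V and, by spreading, to the fourth and fifth V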



However, the correct form is mutakaatib. To accommodate facts such as
this McCarthy formulates a rule of Vowel Association taking precedence
over all the universal conventions and associating the melodic element
i with the final V-slot of the melodic template. Then the universal conventions
of association apply to give the correct results such as mutakaatib.**

Notice however that given this theory we can account for the data
another way. We can let the universal conventions apply first, yielding
mutakiitib, as shown in (11). Then we apply a rule of i-deletion, deleting
all but the last of the association lines of i, as formulated in (12) (the
rule applies iteratively):

(12) i-deletion

V V 

I ^ / (C)V 

Spreading would then automatically re-apply, yielding the correct mutakaatib.
It might be argued that because of simplicity the Vowel Association analysis
is to be preferred to the i-deletion analysis. However, in the Vowel
Association analysis we must stipulate that the Vowel Association rule
applies before all universal association conventions. To the extent that
McCarthy's theory does not constrain language-specific rules to apply
either only before or only after the universal conventions, his theory is
too powerful and not easily falsifiable, which considerably weakens his claim
to the superiority of his analysis over a transformational one. Notice also
that the i-deletion analysis exactly parallels the Erasure rule of the second
and fifth binyanim, making this analysis quite feasible within McCarthy's
theory.

There is another advantage of the i-deletion analysis over the Vowel
Association one proposed by McCarthy. In the perfective passive (ui)
of the eighth binyan we have the form ktutib. To derive such forms only
the universal conventions need apply, whereas in the Vowel Association
analysis the claim is made that even in such forms as ktutib the i is
first associated by the language-particular rule, and only then do the
universal conventions apply. If universal conventions can do exactly the
work that a language-particular rule does, then there seems to be no reason
for that language-particular rule to exist. The i-deletion rule, on the
other hand, treats all binyanim uniformly, applying only when its structural
description is met, i.e., when there is more than one i.
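
The alternative ordering can be sketched the same way. The code below is a rough illustration, assuming hypothetical helper names (`spread`, `universal`, `i_deletion`, `realize`) and a flat list of consonants; it also removes all non-final links of i in one pass rather than iteratively, which gives the same result here.

```python
def spread(slots):
    """Fill each empty V-slot with the element of the slot to its left."""
    out = list(slots)
    for i in range(1, len(out)):
        if out[i] is None:
            out[i] = out[i - 1]
    return out


def universal(melody, n_slots):
    """Left-to-right one-to-one association followed by spreading."""
    slots = [None] * n_slots
    for i, element in enumerate(melody[:n_slots]):
        slots[i] = element
    return spread(slots)


def i_deletion(slots):
    """Delete every link of i except the last, then let spreading re-apply."""
    positions = [i for i, element in enumerate(slots) if element == "i"]
    out = list(slots)
    for i in positions[:-1]:
        out[i] = None
    return spread(out)


def realize(template, consonants, vowels):
    c, v = iter(consonants), iter(vowels)
    return "".join(next(c) if slot == "C" else next(v) for slot in template)


# Sixth binyan active participle: the rule's structural description is met.
sixth = realize("CVCVCVVCVC", "mtktb", i_deletion(universal("uai", 5)))
# Eighth binyan perfective passive: only one i, so the rule does nothing.
eighth = realize("CCVCVC", "kttb", i_deletion(universal("ui", 2)))
```

The point of the sketch is that the same rule is simply vacuous for ktutib, where the melody i is linked only once.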

As to the vowel melody of the active imperfective (again, except for
the first binyan), it is uai in II, III, IV and QI; ai in the seventh through
fifteenth binyanim and in QIII and QIV; and a in V, VI and QII. One may assume,
as McCarthy does, that the active imperfective melody is uai and then
formulate partly morphologically conditioned rules deleting the u and i
melodic elements. Note that the u deletion rule in such a case will
introduce a new type of rule into the theory. To prevent, for example, aktabib
from being realized as *uktabib it would not be enough for the u deletion
rule simply to delete the association of the u to the first V-slot, because
if we formulate the rule thus, the first convention would automatically
re-apply to give *uktabib once more (remember that there is nothing
against the conventions re-applying, and in fact at least the third
convention must automatically re-apply in McCarthy's analysis). Therefore,
the rule of u deletion must delete the u melody from the vowel melody tier,
not just erase an association line - a rule unlike all other rules in
McCarthy's analysis. Likewise, the i deletion rule in such active
imperfective forms as atakattab - the erasure of only the association line
would ensure the reassociation of the i by the reapplication of the first


Although many generalizations about Hebrew morphology can be captured 
in McCarthy's theory the same way it was done for Arabic, there are some 
divergences that pose grave problems for the theory. I will discuss these 
problematic areas, some of which are discussed in McCarthy (1981).^ 


McCarthy claims that his autosegmental analysis solves the otherwise
unexplainable asymmetry in the distribution of the root consonants in
Semitic described in Greenberg (1950) and explained on page 144 of this
paper. Yet Schwarzwald (1974) provides a convincing phonological explanation
of these facts.

Greenberg (1950) arrived at the following three results in his
survey of the Semitic verbal root: a) in first and second place
there are no identical or homorganic consonants; b) in second and third
place there are no homorganic consonants (with many exceptions), but there
are identical consonants; c) in first and third place the constraint against
identical or homorganic consonants still holds, but not absolutely.

Schwarzwald points out that the many exceptions to constraints (b) and
(c) are not surprising, when we bear in mind that between the second and
third consonant in the verb there intervenes a vowel (where there is no
phonetic vowel, there is an intervening vowel underlyingly), and a vowel
also separates the first and third consonants. Therefore, these are only
apparent exceptions to (b) and (c). The generalization here is that identical
or homorganic consonants can occur when there is a vowel between them.

On the other hand, the first and second root consonants occur many times
in Hebrew as a consonant cluster in the verbal conjugation: in the future
(imperfective) and the imperative of Pa'al Binyan (e.g., 'ešmor 'I'll guard';
šmor 'guard, imp.', from šmr), in the past (perfective) and present
(participle) of Nif'al Binyan (e.g., nišbar 'was, is broken', from šbr), in
Hif'il Binyan (e.g., hilbiš 'cause to dress', from lbš), and in Huf'al
Binyan (e.g., hulbaš 'was made to dress', from lbš).

Schwarzwald therefore reformulates Greenberg's constraints as one: 
"Identical and homorganic consonants cannot occur in a consonant cluster" 
(p. 132). 

There are some verbs in Hebrew with identical first and second root
consonants. This contradicts the absolute version of the Obligatory Contour
Principle. Under the revision of this principle suggested by McCarthy, such
adjacent identical elements in the autosegmental tier are allowed only at
great cost to the grammar. However, under Schwarzwald's analysis these
occurrences are quite natural and pose no problem, as they appear only in
the Pi'el, Pu'al and Hitpa'el Binyanim where there is always an intervening
vowel between the first and second root consonants. Thus we have in Hebrew
verbs like: mimeš 'realize, execute' (root mmš), mimen 'finance' (root mmn),
dida 'toddle' (root ddy), kixev 'star' (root kkb), lilyen 'do acrobatics'
(root llyn), mimzer 'bastardize' (root mmzr), šiša 'divide into six' (root
šš), šišben 'be a sponsor, best man' (root ššbn), sisgen 'variegate' (root
ssgn).

It should be emphasized that the surface adjacency of the identical
consonants is not a result of the deletion of an underlyingly intervening
consonant by a morphological or phonological rule, as sometimes happens
with the deletion of semivowels. There is no reason to assign the above
verbs roots with an intervening consonant. Such an analysis could perhaps
work for Biblical Hebrew in such examples as the denominative kikkev 'star'
where we might posit the root kwkb. However, in modern Hebrew there is no
evidence for such an abstract representation, the semivowel never showing up
on the surface. In any case, kikkev is not attested in Biblical Hebrew; it
is a recent neologism.

Recall also that while McCarthy claims to explain only the distribution 
of identical adjacent consonants, Schwarzwald's analysis covers both 
homorganic and identical consonants with one and the same principle (identical 
consonants being a special case of homorganic consonants) . 

Schwarzwald shows that the constraint against consonant clusters
applies also at the level of the phonetic word, and that therefore the
constraint should be formulated as a constraint on phonetic output and
not as a constraint on the root. Thus there are many nouns and inflected
verbs where a vowel intervenes between homorganic or identical consonants:
nouns like buba 'doll', bimuy 'stage-directing', dad 'udder', diduy 'toddling',
lulyan 'acrobat', tatran 'one lacking olfactory sense', sason 'celebration',
and inflected verbs like lamadəti 'I studied' (/lamad+ti/), yaladət 'you
(f.) gave birth' (/yalad+t/). In the last two forms note the epenthetic
schwa, introduced to break the homorganic consonant cluster. Compare this
with šamart (/šamar+t/) 'you (f.) guarded', which is the regular case
without epenthesis. It is also important to note that in the last two
examples the constraint holds across morpheme boundaries, and is thus not
restricted to the root only.
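
Stated as a condition on phonetic output, the constraint is easy to sketch. The following Python fragment is only an illustration: the place classes in `PLACE_CLASSES` are a rough assumption made for this example (they are not taken from Schwarzwald or Greenberg), and the names `homorganic`, `violations` and `epenthesize` are hypothetical.

```python
# Assumed, simplified place classes; liquids are kept apart from dental stops.
PLACE_CLASSES = [set("bpmfv"), set("dt"), set("szcš"), set("lrn"), set("gkx")]
VOWELS = set("aeiouə")


def homorganic(c1, c2):
    """Identical consonants count as a special case of homorganic ones."""
    return c1 == c2 or any(c1 in cls and c2 in cls for cls in PLACE_CLASSES)


def violations(word):
    """Return adjacent consonant pairs that form a homorganic cluster."""
    return [
        (a, b)
        for a, b in zip(word, word[1:])
        if a not in VOWELS and b not in VOWELS and homorganic(a, b)
    ]


def epenthesize(word):
    """Break every offending cluster with a schwa."""
    out = word[0]
    for prev, cur in zip(word, word[1:]):
        if prev not in VOWELS and cur not in VOWELS and homorganic(prev, cur):
            out += "ə"
        out += cur
    return out


studied = epenthesize("lamadti")   # d+t is a homorganic cluster
gave_birth = epenthesize("yaladt")
guarded = epenthesize("šamart")    # r+t is not homorganic under the assumed classes
```

Because the check runs over the whole string, it applies across the morpheme boundary in lamad+ti just as it would inside a root.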


In his analysis of the Arabic verbal system McCarthy shows that each
binyan has a unique prosodic template, arguing convincingly that
traditionally separate binyanim having identical prosodic templates actually
belong to one and the same binyan. Thus, daḥraj (root dḥrj), traditionally
assigned to binyan QI, is simply a quadriradical root mapped onto binyan II
(cf. kattab, with the triradical root ktb in binyan II). Likewise, binyan
QII (e.g., tadaḥraj) is nothing but a quadriradical of binyan V (e.g.,
takattab), QIII (e.g., dḥanraj) belongs to XIV (e.g., ktanbab), and QIV
(e.g., dḥarjaj) belongs to XI (e.g., ktaabab).

On the other hand, when McCarthy's analysis is applied to Hebrew Pa'al 
Binyan, we are forced to posit three distinct prosodic templates for Pa'al 
Binyan (CVCVC for regular Pa'al verbs, e.g., lamad 'study'; CVC for mono- 
syllabic biradical Pa'al verbs, e.g., kam 'get up'; and CVCV for final 
semivowel Pa'al verbs, e.g., kana 'buy') unless we want to posit very 
abstract roots for the last two verb groups, i.e., derive kam from the 
root kwm and kana from the root kny, mapped onto the regular CVCVC Pa'al 
prosodic template, with later loss of the semivowel and certain vowel 
adjustments. We will discuss these verb groups in this and the next sections. 

Within McCarthy's analysis Hebrew Pa'al Binyan (First Binyan) has the
prosodic template CVCVC, e.g., lamad 'study', 'axal 'eat', safar 'count'.
This template, together with automatic spreading, also accounts for the
reduplication of the final consonant of biradical roots in Pa'al, such as
madad 'measure' (root md), galal 'roll' (root gl), tasas 'effervesce'
(root ts), gavav 'pile up' (root gb), gazaz 'shear' (root gz), balal 'mix'
(root bl), 'afaf 'surround' (root 'p), bazaz 'loot' (root bz), zalal
'devour' (root zl), xafaf 'shampoo' (root xp) and many more. Given the
biradical root, the universal conventions of association will automatically
yield the occurring forms, so that there is nothing special or unnatural
about these verba mediae geminatae, and the lexicon (or the root list)
need not stipulate that the second consonant is reduplicated.
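
The way spreading yields the doubled final consonant can be sketched as follows. This is a minimal illustration with a hypothetical function name (`map_root`); it fixes the vowel melody as a and does not model Spirantization, so it would give gabab rather than gavav for the root gb.

```python
def map_root(root, template="CVCVC", vowel="a"):
    """Map root consonants left to right onto C-slots, spreading the last one."""
    n_c = template.count("C")
    if len(root) > n_c:
        raise ValueError("more root consonants than C-slots")
    # Left-to-right one-to-one linking, then spreading of the final consonant.
    linked = list(root) + [root[-1]] * (n_c - len(root))
    consonants = iter(linked)
    return "".join(next(consonants) if s == "C" else vowel for s in template)


triradical = map_root("lmd")   # every C-slot has its own consonant
biradical = map_root("md")     # the d spreads to the last C-slot
```

No marking on the root md is needed: the same mapping that gives lamad for lmd gives madad for md.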

However, there is a small group of about fifteen monosyllabic
biradical verbs, mostly intransitive, that do not undergo this reduplication,
e.g., we have šat 'sail' (root št) and not *šatat. These verbs are: gar
'reside', ba' 'come', sam 'put', rac 'run', kam 'get up', cac 'emerge',
šav 'return', šar 'sing', zaz 'move', šat 'sail', met 'die', boš 'be
ashamed', sav 'turn', xaš 'feel', nax 'rest'.

Corresponding to gar 'reside' we find garar 'drag', and corresponding
to rac 'run' we find racac 'crush'. The semantic difference between the two
forms derived presumably from the biradical roots gr and rc, respectively,
indicates that this is a case of root homonymy and we really have two roots
gr₁ and gr₂, and likewise for rc.

Traditionally these monosyllabic verbs are analyzed as Pa'al verbs.
This classification arises in part, but not exclusively, from diachronic
and comparative Semitic studies. Historically some of these verbs had
a medial semivowel, and the synchronic conjugation seems to indicate that
some verbs may have had different underlying semivowels, e.g., 'ašir 'I'll
sing' (from *šyr) but 'akum 'I'll get up' (from *qwm). But apart from
these vowel alternations there is nothing in the phonology or morphology
of these forms in modern Hebrew to suggest that they have a synchronic
triradical root (in sharp contrast to corresponding medial semivowel roots
in Arabic and Ethiopic).

For the sake of the uniformity of the Pa'al prosodic template one
might suggest analyzing these monosyllabic verbs as having a triradical
root with medial semivowel, mapping these roots onto the familiar CVCVC
Pa'al template. Later rules will delete the semivowel and adjust the
adjacent vowels. However, this analysis should be rejected in that it
posits a rule of absolute neutralization, without any compelling motivation,
as the purported semivowel doesn't show up in any of the conjugations or
in any other derivationally related words. The only motivation for
semivowel drop is paradigm regularity and the alternations in the adjacent
vowels.

An alternative would be to posit a different prosodic template for 
the monosyllabic verbs, namely, CVC. The question then arises whether this 
template belongs to Pa'al Binyan (as an alternate to the regular CVCVC 
Pa'al template) or constitutes a separate binyan. If we include this 
template in Pa'al Binyan, for there are many morphological characteristics 
these verbs share with regular Pa'al verbs, we will have to distinguish in 
our root list those biradical roots that are mapped onto a CVCVC template (e.g. 
garar 'drag', from the root gr ) and those mapped onto a CVC one (e.g., gar 
'reside', from the root gr) . 


Apart from the regular triradical Pa'al (e.g., šavar 'break') and the
monosyllabic biradical Pa'al (e.g., kam 'get up') there is a third group
of Pa'al verbs showing on the surface two root consonants, for which we
will have to posit a third Pa'al template, namely CVCV. These are verbs
such as: raca 'want', kana 'buy', šata 'drink', 'asa 'do', maxa 'protest',
zaxa 'win', mana 'count', baxa 'cry', xala 'fall ill', saxa 'swim', dama
'be similar', raza 'lose weight', and bana 'build'.

If we assume that these verbs have biradical roots, then these roots
must be marked in the lexicon as being obligatorily mapped onto the CVCV
template and none of the other Pa'al templates. We do find homophonous
biradical roots mapped onto the various Pa'al templates, e.g., rc in
racac 'crush', rac 'run', and raca 'want'. Although the root is ambiguous,
none of the conjugated verbs is.

Another analysis, more in agreement with McCarthy's analysis, would
posit no new template, but would derive this third group from a triradical
root, mapping it onto the CVCVC template and later dropping the final
consonant. Indeed such is the traditional treatment, deriving all these verbs
from a root with the third consonant being the semivowel y, which drops
stem-finally.

However, for modern Hebrew there is weak phonological motivation
for such an analysis, as the y doesn't show up in any of the inflectionally
related forms of the verb. The only case where a final y does show up is
in the passive participle, e.g., kanuy 'bought', šatuy 'drunk', 'asuy 'done'
from the roots kny, šty, 'sy, respectively (compare these with the passive
participle of regular verbs, e.g., šavur 'broken' versus šavar 'break',
from the root šbr). However, only very few of the roots in this group
can be instantiated in the passive participle, casting doubt as to whether
kanuy, šatuy, etc., are derived rather than lexical (i.e., we do not find
*zaxuy from zaxa, *razuy from raza, *xaluy from xala, etc.).

Nevertheless, we do find support for the final semivowel analysis in
deverbal action nominals derived from Pa'al verbs. With a regular verb
this Word Formation Rule yields zrika 'throwing' from zarak 'throw', šmira
'guarding' from šamar 'guard'. With a putative final semivowel the y shows
up in the deverbal nouns, as in yəriya 'shooting' from yara 'shoot', kniya
'buying' from kana 'buy', zxiya 'winning' from zaxa 'win'.

This kind of support for the final semivowel is not very convincing
if we bear in mind the irregular semantic relations between many deverbal
nouns and their bases and the absence of many deverbal nouns corresponding
to existing bases. This suggests that we are not dealing here with a
synchronic derivational relationship, i.e., in modern Hebrew the deverbal
noun is lexical rather than derived, and we cannot motivate a final y in
the verb merely because it surfaces in a form that is not related
to the verb by a synchronic derivational process. Thus we do not find
*məxiya from maxa 'protest', nor *rəziya from raza 'lose weight', etc.
Also, štiya means 'beverage' as well as 'drinking', kniya means 'purchase'
(from kana 'buy'), yəriya means 'shot, n.' (cf. yara 'shoot'), and rə'iya
means 'eyesight' (cf. ra'a 'see'). On the other hand, we have many CCiCa
noun forms without a corresponding CaCaC verb.

The two Pa'al templates CVCVC (e.g., šamar 'guard') and CVCV (e.g.,
kana 'buy') can be collapsed into one template CVCV(C), but the root list
will have to indicate which root is mapped onto which of the two possible
realizations of the template. A better solution is to collapse both
templates in terms of syllable structure: both templates are disyllabic, the
last syllable capable of being either closed or open. Analysis in terms
of syllable structure allows greater generalization than that captured by
the notion of the prosodic template.
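
The collapsed template can be checked mechanically against surface forms. The sketch below is illustrative only; the names `skeleton` and `fits_collapsed_paal` are hypothetical, each character of the transcription is treated as one segment, and the optional final C is written as a regular expression over the CV skeleton.

```python
import re

VOWELS = set("aeiouə")
# The collapsed template CVCV(C): disyllabic, final syllable open or closed.
COLLAPSED_PAAL = re.compile(r"CVCV(C)?")


def skeleton(form):
    """Reduce a transcribed form to its CV skeleton."""
    return "".join("V" if ch in VOWELS else "C" for ch in form)


def fits_collapsed_paal(form):
    return COLLAPSED_PAAL.fullmatch(skeleton(form)) is not None


regular = skeleton("šamar")           # closed final syllable
final_open = skeleton("kana")         # open final syllable
monosyllabic = fits_collapsed_paal("kam")
```

Both šamar and kana fit the collapsed template, while monosyllabic kam does not, so the CVC group discussed earlier still needs separate treatment.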


Pi'el and Hitpa'el in Hebrew correspond to the second and fifth
binyanim, respectively, of Arabic. In Biblical Hebrew the second root
consonant is geminated, as in Arabic, so we can have here exactly the same
analysis McCarthy proposed for Arabic: specifically, the Pi'el template is
CVCCVC, the Hitpa'el template is CVCCVCCVC, and the Erasure rule
ensures medial gemination. Only the vowel melodies are different: ia is
the melody for Pi'el and Hitpa'el perfective active (the a is changed to e
word-finally, e.g., Biblical Hebrew sipparti 'I told' vs. sipper 'he told';
hitlabbašti 'I got dressed' vs. hitlabbeš 'he got dressed'). The
gutturals ', ʕ, ḥ, h and r are never geminated. So within the prosodic
analysis these will first be geminated, and later there must be a rule that
degeminates them.

In contrast to Biblical Hebrew, modern Hebrew has no phonetic
gemination anywhere morpheme-internally. Thus we have siper 'tell' (cf. Biblical
sipper), hitlabeš 'get dressed' (cf. Biblical hitlabbeš), etc. Since
gemination never shows up on the surface there is no motivation to posit
gemination at some step in the derivation. This will lead us to the revision
of the Pi'el and Hitpa'el templates for modern Hebrew. McCarthy noted that a
rule such as Second and Fifth Binyanim Erasure is very costly, and the
apparent loss of this rule in modern Hebrew would confirm such a
view. However, if the rule of Erasure had been lost in modern
Hebrew, the template remaining unchanged, we would get derivations as
follows (after application of the first and third conventions):

(13) Pi'el:    root spr mapped onto CVCCVC                  (*sifrer)
     Hitpa'el: prefix ht, root lbš mapped onto CVCCVCCVC    (*hitlavšeš)



However, in modern Hebrew we have siper and not *sifrer, hitlabeš and not
*hitlavšeš. Clearly these will have to be derived from new prosodic
templates, namely CVCVC for Pi'el and CVCCVCVC for Hitpa'el as follows:

(14) a. Pi'el template CVCVC, root spr                      (siper)
     b. Hitpa'el template CVCCVCVC, prefix ht, root lbš     (hitlabeš)

What is remarkable about this output is that the medial stop remains
unspirantized although there is a rule of Spirantization in Hebrew
spirantizing the stops p, b, and k after a vowel. Thus we have safar 'count'
from spr, šavar 'break' from šbr in Pa'al vs. šiber 'break into pieces'
from šbr in Pi'el. This failure of the medial root consonant in Pi'el and
Hitpa'el to spirantize can be readily accounted for if we assume that when
Spirantization applies the medial stop is not preceded by a vowel, but
rather by a consonant, motivating an analysis in which Pi'el and Hitpa'el
templates have not been changed in modern Hebrew; rather, they are identical
to Biblical Hebrew templates. Only after Spirantization applies does
Degemination apply.
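
The puzzle can be made concrete with a small sketch of Spirantization as a purely post-vocalic rule. The function name `spirantize` and the pre-Spirantization strings fed to it (sapar, šabar) are assumptions made for illustration; the rule is applied blindly, with no reference to binyan or template.

```python
SPIRANT = {"p": "f", "b": "v", "k": "x"}
VOWELS = set("aeiou")


def spirantize(form):
    """Spirantize p, b, k whenever the preceding segment is a vowel."""
    out = []
    for i, ch in enumerate(form):
        if i > 0 and form[i - 1] in VOWELS and ch in SPIRANT:
            out.append(SPIRANT[ch])
        else:
            out.append(ch)
    return "".join(out)


paal_count = spirantize("sapar")   # Pa'al, root spr
paal_break = spirantize("šabar")   # Pa'al, root šbr
piel_tell = spirantize("siper")    # Pi'el mapped onto CVCVC
```

Applied this way the rule gives the attested Pa'al forms but turns siper into the unattested sifer, which is exactly why the medial stop of Pi'el needs some further explanation.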


Such an analysis would require an odd degemination rule since after
Spirantization we have the intermediate stage sifper and hitlavbeš, the
first member of the geminate meeting the structural description of
Spirantization. Degemination would then apply to non-identical segments.
However, it can be argued that Degemination is a functional phonological
rule having the effect of preventing homorganic consonant clusters, akin
to the constraint discussed by Schwarzwald (1974). If we accept this
analysis we have a problem in Biblical Hebrew, since it too had the rule
of Spirantization (covering the stops g, d, t as well), so we must somehow
prevent sipper from becoming *sifper. (The environment for Spirantization
is clearly post-vocalic, not inter-vocalic. E.g., savta, or more commonly
after devoicing
assimilation safta, 'grandmother' (/ sabta/ ) , sifrer 'put numbers', from 

Another solution to the failure of Spirantization to apply in Pi'el
and Hitpa'el is to say that Pi'el and Hitpa'el in modern Hebrew have the
templates as in (14) and that the rule of Spirantization is morphologically
restricted in that it does not apply in Pi'el and Hitpa'el. This solution
is preferable, more natural and less abstract in light of modern Hebrew
morphology and phonology.

However, quadriradical roots conjugated in Pi'el and Hitpa'el,
reduplicated biradical roots and the phenomena of spreading of triradical
roots in Pi'el and Hitpa'el dictate positing for modern Hebrew the old
Biblical templates, namely, CVCCVC and CVCCVCCVC, in addition to the
templates in (14). Thus, in Pi'el, we find quadriradical roots like šilhev
'set alight' (root šlhb), sirbel 'make awkward' (root srbl), 'imlen 'starch'
(root 'mln), kirsem 'munch' (root krsm); reduplicated biradical roots like
gilgel 'roll, v.t.' (root gl), sigseg 'prosper' (root sg), bilbel 'confuse'
(root bl); and spread triradical roots like sifrer 'put numbers' (root spr),
cixkek 'chuckle' (root cxk), 'išrer 'ratify' (root 'šr). We find exactly
parallel phenomena in Hitpa'el: quadriradicals - hitbargen 'become
bourgeois' (root brgn); reduplicated biradicals - hitbalbel 'become confused'
(root bl); spread triradicals - hictaxkek 'chuckle to oneself' (after
metathesis of c and t; root cxk). To illustrate:

(15) a. CVCCVC, root šlhb (šilhev)      b. CVCCVC, root cxk (cixkek)

The prohibition against many-to-one associations prevents the above verbs
from being derived from the template in (14a). It seems, then, that the
template for modern Hebrew Pi'el should be captured in terms of syllable
structure, stating simply that the Pi'el template is bisyllabic, rather
than positing the conflicting templates CVCVC (e.g., siper) and CVCCVC (e.g.,
šilhev). Similarly in Hitpa'el.

The spreading phenomenon of triradical roots in Pi'el seems to
constitute strong evidence for McCarthy's analysis of the Arabic Second and
Fifth Binyanim (and the corresponding Pi'el and Hitpa'el in Hebrew). It
seems that the spread verbs are just cases where the putative rule of
Erasure simply fails to apply. Interestingly we find cixkek 'chuckle' (see
15b) where the Erasure rule has failed to apply, alongside cixek 'laugh'
with application of Erasure, both derived from the root cxk. Other examples
are: 'ivrer 'ventilate' (root 'vr, from 'avir 'air, n.'), kidrer 'dribble
(a ball)' (root kdr, from kadur 'ball'), šiklel 'weigh (statistics)' (root
škl, cf. šakal 'weigh'), 'irbev 'mix' (root 'rb, cf. Pi'el 'irev 'mix' from
the same root), šixrer 'free' (root šxr). There are around thirty such verbs
showing spreading of the third consonant in Pi'el.

We can therefore mark these roots as not undergoing Erasure. The
question then arises as to why Erasure fails to apply in these forms, a
minority of Pi'el verbs. Certainly there is no free variation in the
application of the Erasure rule. A possible answer may lie in the facts that
the spread triradical roots are more characteristic of modern as against
Biblical Hebrew verbs and that the spread verbs are mostly denominative,
i.e., they are usually derived from a noun functioning as the base.


Multiradicals in Pi'el seem to militate strongly against the
prohibition against many-to-one associations and thus pose a problem for the
proper formulation of the Pi'el prosodic template. Pi'el is the productive
binyan par excellence in modern Hebrew, accommodating many multiradical
roots and deriving many denominatives (cf. Bolozky, 1978).

Consider the denominative tilgref 'telegraph, v. ' (root tlgrf , from 
telegraf 'telegraph, n.'). Its mapping would look like (16): 

(16) CVCCVC -Pi'el template 
tlgrf ( tilgref ) 

Such mapping is ruled out by the prohibition against many-to-one
associations. The principles of the autosegmental theory predict the following
derivation, with f floating:

(17) CVCCVC, root tlgrf, with f left floating (*tilger)

*tilger is the wrong output. The mapping in (16) is even more problematic
for the theory because we do not simply get the anchoring of the floating
f to the last C-slot (i.e., we do not have *tilgerf) after application of
the first convention. The form tilgref is an apparent counterexample to
the first universal convention of association in that a root-internal
consonant is associated with an already occupied C-slot.
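
The predicted derivation with a floating consonant can be sketched as below. The function name `associate_root` is hypothetical and the conventions are again reduced to left-to-right one-to-one linking plus spreading; consonants that find no free C-slot are simply returned as floating.

```python
def associate_root(root, template, vowels):
    """Link root consonants one-to-one, left to right; return (form, floating)."""
    n_c = template.count("C")
    linked, floating = list(root[:n_c]), list(root[n_c:])
    # Spreading only matters when the root is shorter than the template.
    linked += [linked[-1]] * (n_c - len(linked))
    c, v = iter(linked), iter(vowels)
    form = "".join(next(c) if slot == "C" else next(v) for slot in template)
    return form, floating


predicted, floating = associate_root("tlgrf", "CVCCVC", "ie")
quadriradical, none_left = associate_root("srbl", "CVCCVC", "ie")
```

With five root consonants and four C-slots the sketch yields tilger with f unlinked, whereas the attested verb is tilgref.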

Some additional multiradicals mentioned in Yannay (1974a, b) are:
'ibstrekt 'abstract, v.' (root 'bstrkt, from 'abstrakt 'abstract, n.' -
student slang); sinxren 'synchronize' (root snxrn, from sinxrun
'synchronization'); cintrafeg 'centrifuge, v.' (root cntrfg, from centrifuga
'centrifuge, n.'); stingref 'take shorthand' (root stngrf, from stenografya
'shorthand'). It should be noted that in many of these multiradicals the
presence of a sonorant (n, r or l) in the consonant cluster eases the
pronunciation of the cluster. This seems to be the constraint on the
formation of multiradical Pi'el verbs.


In this group we also find a spread quadriradical: flirtet 'flirt, v.'
(root flrt, from flirt 'flirt, n.'), for which we have the mapping as in
(18), which is ill-formed in McCarthy's theory:

(18) CVCCVC (flirtet)

The autosegmental theory instead predicts the derivation in (19):

(19) CVCCVC (*filret)

*filret is the wrong output.

Bolozky (1978) notes that at least for borrowed words innovators 
tend to preserve the relationship between nouns and verbs derived from 
them, "so as to have a kind of paradigmatic uniformity across syntactic 
categories... Since realization as * filret would obliterate the original 
clustering of the noun, the innovator resorted to reduplication of the 
last consonant and sticking the stem-final vowel between the two identical 
consonants - flirtet 'flirt (V)'... Reduplication is an accepted device 
in Hebrew to express diminution." (Bolozky, 1978, p. 122). 

The idea of a derivational process having access to the complete 
source word (i.e., flirt as input to flirtet rather than only the root 
flrt) is impossible within McCarthy's theory, where it is only the root 
which is mapped onto a new prosodic template. 


Verbs with a reduplicated biradical root, very frequent in Hebrew 
and widespread in Tigre, pose a very serious problem for the autosegmental 
theory, to solve which McCarthy (1981) introduces a powerful new device 
of root morpheme reduplication (assuming, as it were, an auto-autosegmental 
level). This is not a simple slight extension of the theory, but a major 
departure from it, weakening considerably its empirical content. 

The problematic verbs are such as gilgel 'roll (v.t.)', and hitgalgel
'roll oneself' (root gl; cf. galal 'roll (v.t.)'). The mapping of the root
gl onto Pi'el required to yield the correct result would be as in (20):

(20) CVCCVC (gilgel)

However, this derivation is ill-formed in that the autosegmental theory
prohibits the crossing of lines of association. We could therefore relax
this prohibition in the case of Pilpel and Hitpalpel, with concomitant
weakening of the theory. It is also implausible that the crossing of lines
of association is simply very costly for the grammar employing it (that is,
if (20) is the only solution to reduplicated roots), as this phenomenon is
very prevalent in Tigre, for example, and Hebrew too exhibits many such verbs.

A second solution is to posit a quadriradical root, e.g., glgl for
gilgel. This will entail a great loss of generality as many of these verbs
are clearly related to biradical roots, e.g., gilgel 'roll (v.t.)' vs.
galal 'roll (v.t.)', both derived from the root gl.

McCarthy (1981) opts for a process whereby "the root is reduplicated 
by one-to-one morpheme-to-morpheme association, and then elements of these 
morphemes are mapped onto the prosodic template... reduplication is 
accomplished here by mapping one root morpheme onto two root morpheme 
positions in a separate tier." (p. 408). For example: 

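(21) CVCCVC (gilgel) (McCarthy's, 1981, 52a)
     [diagram: a single root gl is linked to two root-morpheme positions
     on a separate tier, and the resulting gl gl is mapped onto the template]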



This kind of solution is a radical departure from the autosegmental theory 
initially adopted by McCarthy. There is no motivation for it within the 
theory and certainly there is no independent motivation for the process 
invoked to account for Pilpel and Hitpalpel. There isn't much that distinguishes 
this process from a transformational treatment of reduplication, which leads 
us to conclude that transformational notation is needed for at least some 
morphological processes. 
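
The difference between plain spreading and root reduplication can be seen in a short sketch. It is an illustration only: `map_consonants` and `pilpel` are hypothetical names, root reduplication is modelled as simple string doubling before mapping, and the vowels i, e are filled in directly.

```python
def map_consonants(consonants, template="CVCCVC", vowels="ie"):
    """Left-to-right one-to-one mapping, spreading the last consonant."""
    n_c = template.count("C")
    linked = list(consonants[:n_c])
    linked += [linked[-1]] * (n_c - len(linked))
    c, v = iter(linked), iter(vowels)
    return "".join(next(c) if slot == "C" else next(v) for slot in template)


def pilpel(root):
    """Reduplicate the whole root first, then map it onto the template."""
    return map_consonants(root * 2)


spreading_only = map_consonants("gl")   # no reduplication step
reduplicated = pilpel("gl")
```

Without the extra copying step the sketch gives gillel; only the reduplicated root gives gilgel, and that copying step is what resembles a transformational operation.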

5.7. Pə'al'al

There is a very limited number of verbs in Hebrew showing a reduplicated
last syllable, belonging to a binyan called Pə'al'al. Thus corresponding to
Pa'al saxar 'go about' we find the reduplicated form sxarxar 'palpitate',
both from the root sxr. Other Pə'al'al verbs are: 'ahavhav 'flirt' (from
'ahav 'love'); xavarvar 'become slightly pale' (from xavar 'pale, v.i.')
(Yannay, 1974a, b, lists seven such verbs).

McCarthy (1981) notes that the prosodic template of Pə'al'al is
anomalous in Hebrew, since it involves an otherwise nonoccurring CVCVCCVC
prosodic template. McCarthy suggests that it is derived from the CVCVC
template of Pa'al by the suffixation of the syllable CVC, and that the
syllables of Pa'al are mapped - left-to-right - onto the syllables of
this new template. Thus "a further extension of this theory also handles
the forms [of Pə'al'al]" (p. 409).
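
Viewed simply as copying of the final syllable, the process can be sketched as follows. This is not McCarthy's syllable-to-syllable mapping but a plain string operation under stated assumptions: the function name is hypothetical, the stem is taken to end in a CVC syllable, and vowel reduction (as in sxarxar from saxar) is not modelled.

```python
VOWELS = set("aeiou")


def reduplicate_final_syllable(stem):
    """Append a copy of the stem-final CVC syllable."""
    final = stem[-3:]
    shape = "".join("V" if ch in VOWELS else "C" for ch in final)
    if shape != "CVC":
        raise ValueError("stem does not end in a CVC syllable")
    return stem + final


pale = reduplicate_final_syllable("xavar")
flirt = reduplicate_final_syllable("'ahav")   # the glottal stop ' is a consonant
```

The result has the CVCVCCVC shape that the text describes as otherwise nonoccurring among Hebrew verbal templates.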

This, of course, introduces a new type of rule to the theory - syllable
reduplication - unlike any other rule previously posited. Recall that
McCarthy posits a rule of Root Morpheme Reduplication to account for
Pilpel and Hitpalpel forms. Intuitively, the reduplication in Pilpel,
Hitpalpel and Pə'al'al is of one and the same type - final syllable
reduplication - yet in McCarthy's analysis they are two unrelated
phenomena, generated by distinct rules.

The formulation of the syllable reduplication rule for Pə'al'al makes
theoretically possible a process of non-final syllable reduplication - a
process never attested in Hebrew and Tigre (i.e., a process allowing such
forms as papa'al, etc.). Indeed, that the theory is not so constrained
is suggested by McCarthy (1981) in his analysis of the reduplicated
Tagalog pag-lalakad 'walking' from um-lakad 'walk' (p. 413, number 59).

In short, McCarthy started with a nice theoretical framework and ends 
up with an unconstrained, overly permissive one, severely undermining his 
ostensible aim of restricting the power of morphological rules. 

Although in Hebrew there are very few Pə'al'al verbs, there are quite
a few nouns and adjectives with reduplicated last syllable, usually
having a diminutive effect on the base meaning. E.g., yarakrak 'light green'
from yarok 'green'; bcalcal 'small onion' from bacal 'onion'; zkankan
'small beard' from zakan 'beard'; xataltul 'kitten' from xatul 'cat';
cmarmoret 'tremor' (cf. camar 'shiver, v.').

It is not at all obvious that the rules invoked by McCarthy to account
for reduplication phenomena such as the above are in any way less powerful
and to be preferred to the transformational approach taken by Aronoff (1976)
and Lieber (1980).


It is crucial for McCarthy's system that when a word is derived from 
another, only the root of the base or source word is available for the 
derivation of the new word. Thus, in a case of a verb in the second binyan 
of Arabic derived from a first binyan verb, the second binyan derivation 
need only take the root morpheme of the source word, the template of the 
second binyan being unique to this binyan. It is this very behavior of the 
root functioning as an independent unit in word formation that motivated 
McCarthy's formulation of a separate autosegmental tier for the root 
morpheme . 

However, there are word formation rules in Hebrew suggesting that the 
Word Formation Rule has access to the whole word, rather than solely to 
the root of the base, which is then mapped onto the prosodic template of 
the new word. We find this behavior in denominal verbs. In Hebrew, 
corresponding to the roughly seven verb conjugations (binyanim), we find 
a considerably larger number of noun declensions (mishkalim 'weights, 
matrices', in Hebrew) - noun prosodic templates onto which are mapped roots, 
affixes and vowel melodies to yield nouns. 

One such template is CVCCVC^^, and with the vowel melody a it defines 
professionals (very much like the English -er). E.g., ganav 'thief', sapan 
'seaman', zamar 'singer', zaban 'seller', napax 'blacksmith', sapar 'barber', 
sartat 'draftsman'. 


Another template is CCVCV and with the vowel melody ia it defines the 
deverbal noun of Pa'al verbs. E.g., šmira 'guarding', from šamar 'guard'. 
Given a verb in Pa'al the deverbal noun Word Formation Rule need have 
access only to the verb's root morpheme to map it onto the deverbal noun 

Some of these noun templates have an affix added, e.g., abstract nouns 
formed from the template CVCCVC with the prefix t- and the vowel melody au, 
giving words like tašlum 'payment' (cf. šilem 'pay'), talmud 'study' (cf. 
leimad 'study, v.'), tanxum 'condolence' (cf. nixem 'condole'). 

Many of these nouns can serve as the base for forming new denominal 
verbs. McCarthy's theory seems to predict that such derived verbs will take 
only the root of the base and map it onto a given verbal template. However, 
in many of these verbs all the consonants of the base are used in the 
derived verb, root plus affixes. 

For example, mafte'ax 'key' is derived from a template CVCCVC^^, 
the prefix m-, the vowel melody ae, and the root ptx (cf. patax 'open'). 
Other examples in this mishkal are: mazrek 'syringe' (cf. zarak 'throw'); 
mašpex 'funnel' (cf. šafax 'spill'); maklet '(radio) receiver' (cf. kalat 
'receive'). This mishkal, then, forms tools and appliances. 

The noun mafte'ax 'key, index' can serve as a base for a verb in Pi'el, 
but instead of the derived verb mapping only the root ptx to yield pite'ax 
we get the verb mifte'ax , in which both the prefix and the root are mapped 
onto the Pi'el template. This verb means 'use a key, make a key, index'. 
It might be argued that there already is a Pi'el verb having the root ptx , 
namely pite'ax 'develop', but usually in cases of occupied slots the deri- 
vation is simply blocked or another binyan is chosen. 

Likewise, the noun misgeret 'frame, n.' (cf. sagar 'close, shut') 
yields the verb misger 'frame'. Here there is no occupied slot in Pi'el, 
i.e., siger is possible and unoccupied. The noun takciv 'budget' (cf. 
kacav 'allot, ration') yields likewise the verb tikcev 'budget'. 

Such word formation processes support a word-based morphology, as 
against the root-based rules within McCarthy's analysis. One can, of 
course, claim that for such denominal verbs the root base is reanalyzed as 
including the prefix (i.e., the root of misgeret 'frame, n.' is msgr and 
not sgr, etc.). Such an analysis is not easily accommodated in McCarthy's 
autosegmental analysis, as the prefix and the root are represented on 
separate autosegmental tiers. 


We saw that the principles and devices proposed in McCarthy's analysis 
of Arabic also handle to a large extent Hebrew morphology. Yet here as 
there crucial reference is made to the rule of Erasure to account for medial 
gemination and to morpheme and syllable reduplication rules to account for 
reduplication in Semitic, and we raised serious questions about the 
unconstrained character of these rules and the lack of independent 
motivation for such types of rules. 


Although McCarthy's analysis offers many insights into Semitic morphology, 
it is not clear that by the modification of autosegmental theory to account for 
the phenomena of gemination and reduplication, we end up with a theory that 
is more restrictive and less powerful than a theory that treats these 
phenomena by resorting to transformational morphological rules. 

*I would like to thank Michael Kenstowicz and Charles Kisseberth for 
their comments on an earlier version of this paper. 

McCarthy adopts the use of the Hebrew word binyan for Verb Conjugation 
(pl., binyanim) as more felicitous than the Latin "conjugation", which 
has misleading connotations when applied to Semitic morphology. 

Henceforth, the μ (for 'morpheme') notation will not be indicated 
on melodies. However, it will continue to be assumed that it is there. 

Unless otherwise stated, all Hebrew examples are from modern Hebrew, 
and all Hebrew verbs are given in their third person singular masculine 
past form, which usually coincides with the past stem form. 

Given such a rule as Vowel Association, one might also posit a similar 
Consonant Association rule in the second and fifth binyanim, which asso- 
ciates the last root consonant prior to the universal conventions of asso- 
ciation. E.g., this rule would convert (22a) into (22b), to which the first 
and third conventions would apply to yield the geminated form in (22c) : 

(22) a. CVCCVC     b. CVCCVC     c. CVCCVC (kattab) 

This analysis would do away with the need for the Erasure rule in the Second 
and Fifth binyanim. 

Although McCarthy discusses Biblical Hebrew, the phenomena he discusses 
occur in modern Hebrew as well. 

The second k^ is spiranti; 
nant in Pi 'el is not spirantized. We would expect to find kikev. However, 
this is a denominative verb derived from the noun koxav 'star', and the 
spirantized k is kept in the derived verb. Likewise tilfen 'phone, v.', from 
telefon 'telephone, n.', and not * tilpen . The b spirantizes word-finally. 

^The relation between QIV and XI is not so obvious. Cf. McCarthy (1981, 
p. 395). 

This is an apparent counterexample to Schwarzwald's analysis. The 
homorganic t^ and £ appear in a consonant cluster, e.g., 'etsos 'I'll effervesce' 
hitsis 'cause to effervesce, incite'. Schwarzwald notes that there are a 

few instances of the clusters ts^, t^ , tc , tz , mainly word-initially, where 
in Biblical Hebrew there was an epenthetic schwa breaking the cluster. In 
modern Hebrew Schwarzwald proposes that these clusters be analyzed as one 
phoneme £ or i_. This analysis is unavailable in 'etsos , where _ts is 
obviously not one phoneme (cf. tasas ) . However, such cases are indeed rare. 

Under McCarthy's analysis cac and zaz would be uniradical, derived 
from the roots £ and z_, respectively (unless one would posit in these 
phonetically biradical verbs an underlying medial semivowel). In traditional 
terms, uniradical roots are extremely rare in Semitic. 

It was suggested by some Semitists (cf. Moscati, 1969, p. 72, 
Frajzyngier, 1979) that proto-Semitic had both biradical and triradical 
roots with corresponding biradical and triradical templates. Some time 
thereafter the triradical template took precedence, and the biradical 
roots were mapped onto triradical templates, yielding the verba mediae 
geminatae. It is certain that at that stage the monosyllabic verbs like 
qam 'get up', sam 'put', were triradical (having a medial semivowel) and 
that the semivowel was lost after the reduplication of biradical roots 
occurred, otherwise we would have had qamam from qam, samam from sam, etc. 
But if we assume that the proto-Semitic biradical roots are biradical also 
synchronically, as McCarthy's analysis seems to claim they are, then the 
Hebrew monosyllabic verbs pose a problem. 

reasonably derived from a triradical root. Moreover, we do find some 
triradical roots with a surface medial vowel, e.g., gawa^ 'die', cawax 
'shriek', 'ayav 'be a foe', forms which would prevent us from assuming 
a rule of semivowel drop in the monosyllabic verbs. (It should be noted 
that the semivowel w is phonetically realized in modern Hebrew as the 
fricative v, except in some Oriental dialects.) 

Another difference lies in the different prefixes of Hitpa'el and the 
Arabic Fifth Binyan. 

The output shows the effect of Spirantization on £ and b. 


For modern Hebrew the rule of Spirantization cannot be stated so 
simply. There are too many apparent counterexamples. 

There are two k's in modern Hebrew, one that spirantizes (the reflex 
of the Biblical and proto-Semitic 1^), and one that doesn't (the reflex of 
Biblical and proto-Semitic £) . Cf. sakar 'survey' (from *£^) vs. saxar 
'dam up' (from *lc ) . 

After Spirantization. 

The schwa is not underlying; it is sometimes inserted to break word- 
initial consonant clusters. McCarthy's discussion of these forms in Biblical 
Hebrew suggests that he assumes that the schwa is underlying. 


^^Although Littmann and Hoefner (1956-1962) list many C1C2C3C2C3 
forms in Tigre, I found no C1C2C1C2C3 forms. Hebrew has only one Papa'al 
form. This is the verb yafyafa 'be extremely beautiful' (first attested 
in Psalm 45,3 and retained in modern Hebrew) from yafa 'be beautiful' 
(root ypy). One can doubt, of course, the triradicality of this root, 
assuming a biradical root yp instead, thus dissolving the counterexample. 


The t is the feminine singular morpheme. 

This is the same template posited for the Biblical Hebrew Pi'el binyan 
(only the vowel melody differs). As with Pi'el forms, the examples, taken 
from modern Hebrew, do not show gemination and spirantization. 


The epenthetic a in mafte'ax is inserted before a word-final x 
which is the reflex of the Biblical h. It is not inserted before a 
spirantized k, e.g., melex 'king' (root mlk). 


ARONOFF, M. 1976. Word formation in generative grammar. Cambridge, MA: 
    MIT Press. 
AVINERY, I. 1976. The book of mishkalim (in Hebrew). Tel Aviv: Jezre'el. 
BARKALI, S. 1977. Complete verb tables (in Hebrew). Jerusalem: Rubin Mass. 
BLAU, J. 1974. Phonology and morphology (in Hebrew). Hakibbutz Hame'uchad. 
BOLOZKY, S. 1978. Word formation strategies in the Hebrew verb system: 
    denominative verbs. Afroasiatic Linguistics 5, 111-136. 
CARRIER, J. 1979. The interaction of morphological and phonological rules 
    in Tagalog. Ph.D. dissertation, MIT. 
CLEMENTS, G. N. and K. Ford. 1979. Kikuyu tone shift and its synchronic 
    consequences. Linguistic Inquiry 10, 179-210. 
EVEN SHOSHAN, A. 1975. The new dictionary (in Hebrew). Jerusalem: Kiryath 
FRAJZYNGIER, Z. 1979. Notes on the R1R2R2 stems in Semitic. Journal of 
    Semitic Studies 24, 1-12. 
GESENIUS, W. 1909. Hebraeische Grammatik, 28. von E. Kautzsch umgearbeitete 
    Auflage. Leipzig: Vogel. 
GOLDSMITH, J. 1976. Autosegmental phonology. Ph.D. dissertation, MIT. 
GREENBERG, J. H. 1950. The patterning of root morphemes in Semitic. Word 
    VI, 162-181. 
KIPARSKY, P. 1968. Linguistic universals and linguistic change. In E. Bach 
    and R. Harms, eds.: Universals in linguistic theory. New York: Holt. 
KISSEBERTH, C. 1978. Notes on Tigre phonology and morphology. MS., University 
    of Illinois. 
LEBEN, W. 1973. Suprasegmental phonology. Ph.D. dissertation, MIT. 
LIEBER, R. 1980. On the organization of the lexicon. Ph.D. dissertation, MIT. 
LITTMANN, E. and M. Hoefner. 1956-1962. Woerterbuch der Tigre-Sprache. Wiesbaden: 
MANDELKERN, S. 1896. Veteris testamenti concordantiae: Hebraicae et Chaldaicae. 
    Leipzig: Veit. 
McCARTHY, J. 1979. Formal problems in Semitic phonology and morphology. Ph.D. 
    dissertation, MIT. 
McCARTHY, J. 1981. A prosodic theory of nonconcatenative morphology. 
    Linguistic Inquiry 12, 373-418. 
MOSCATI, S. (Ed.). 1969. An introduction to the comparative grammar of the 
    Semitic languages. Wiesbaden: Otto Harrassowitz. 
RAZ, S. 1980a. The morphology of the Tigre verb (Mansaʿ dialect), I. 
    Journal of Semitic Studies 25, 66-84. 
RAZ, S. 1980b. The morphology of the Tigre verb (Mansaʿ dialect), II. 
    Journal of Semitic Studies 25, 205-238. 
SCHWARZWALD, O. 1974. Roots, patterns, and morpheme structure (in Hebrew). 
    Lešonenu 38, 131-137. 
YANNAY, I. 1974a. Multiradical verbs in the Hebrew language (in Hebrew). 
    Lešonenu 38, 118-130. 
YANNAY, I. 1974b. Multiradical verbs in the Hebrew language (conclusion) 
    (in Hebrew). Lešonenu 38, 183-194. 

Studies in the Linguistic Sciences 
Volume 12, Number 1, Spring 1982 


Statistical Analysis of Conversational Esperanto, 
with Discussion of the Accusative 

Bruce Arne Sherwood 

Taped Esperanto conversations among skilled speakers 
were transcribed and statistically analyzed. The 
frequencies of phonemes, two-phoneme sequences, and 
grammatical categories were obtained. Statistics on the use 
of compound and derived words are presented. The most 
interesting data deal with the use of the accusative. It is 
shown that spoken Esperanto is dominantly SVO, and that 
other constituent orders are quite rare and restricted to 
special constructions. There is a discussion of the 
sociolinguistic and language-planning consequences of the 
observed accusative usage. 

1. Introduction 

Statistical analyses of Esperanto usage have typically been based on 
written texts (e.g., van Themaat 1977). Sociolinguists have emphasized not 
only that speech can be quite different from writing, but that informal 
speech is in many ways the most representative of the various ways in which 
people speak. Sociolinguists go to great lengths to devise interview 
strategies which will elicit informal speech. It would be interesting to 
analyze informal Esperanto speech. Of general interest, there might be 
special effects due to native-language influence, the greater emphasis on 
writing and reading due to the nature of Esperanto usage, and the possibly 
greater conscious awareness of formal rules on the part of Esperanto 
speakers. More narrowly, it is interesting to see how the language is 
actually spoken, in possible contrast to its prescriptions. 

The audio tape service of the Esperanto League for North America (Box 
1129, El Cerrito CA 94530) includes in its catalog of recordings not only 
formal speeches but also conversations at the 1972 World Esperanto Congress 
held in Portland, Oregon. These conversations are presumably more self- 
conscious than would be ideal for some purposes, but they at least provide 
a more natural data base than writings or formal speeches, which may simply 
be readings of written papers. One of the participants (Duncan Charters, 
private communication) reports that during the conversations headphones and 
special microphones were worn in order to improve the audio quality, which 
may have tended to make the speech more self-conscious than desirable for 
the present purpose. However, another participant (William Auld, private 
communication) feels that the conversation between him and Peter de Smedt 
was rather natural despite the apparatus, partly because the taped 
conversation was a continuation of a conversation already in progress. 
Only the pauses seemed awkward, due to the consciousness that one should 
keep going, and some slight embarrassment is noticeable at these points on 
the tape. 

It was hoped that contrasts could be obtained for diverse native- 
language backgrounds. Unfortunately, among the relatively small number of 
taped conversations there are no native speakers of non-Indo-European 
languages, which is a disappointment. Nevertheless, the present analytical 
study may provide a framework and a stimulus for more field work to obtain 
samples from more diverse and more representative types of speakers. When 
this study was nearly completed, it was learned that the International 
Cultural Service (Amruieva 5/1, 41000 Zagreb, Jugoslavia) is presently 
processing fifty thousand words of speech recorded at recent Esperanto 
World Congresses, in order to obtain conversational word frequencies for 
designing better teaching materials. As they are using a computer for this 
analysis, perhaps they will extend their work to other aspects of these 

2. Methodology 

The conversational material was transcribed and typed into a computer 
file for statistical processing. In order to facilitate this processing, 
the first letters of morphemes internal to words were indicated by capital 
letters, so that nekredeble 'unbelievably', for example, was entered in the 
form neKredEble. Note that grammatical endings were not indicated unless 
they were internal to a word, as in unuAFoje 'for the first time' or 
siNTeno 'attitude', since it is possible to identify final grammatical 
endings by computer program (Sherwood 1981a, 1981b). Also to facilitate 
computational processing, participle endings were indicated by 
capitalization, as in neKredAnto 'unbeliever' or korektIta 'corrected'. 
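
The capitalization convention just described lends itself to a simple mechanical split. The following Python sketch is illustrative only and is not the program used in this study; the function name `split_morphemes` is hypothetical. It assumes that every capital letter marks the start of a new morpheme and, as in the transcription convention, it leaves any final grammatical ending attached to the last morpheme.

```python
def split_morphemes(word):
    """Split a transcribed word at each internal capital letter.

    Assumption (not from the original program): a capital letter always
    marks the first letter of a new morpheme, and nothing else does.
    """
    parts = []
    current = ""
    for ch in word:
        # Start a new morpheme when a capital follows existing material.
        if ch.isupper() and current:
            parts.append(current)
            current = ""
        current += ch.lower()
    if current:
        parts.append(current)
    return parts


for w in ["neKredEble", "unuAFoje", "neKredAnto"]:
    print(w, split_morphemes(w))
```

A word that yields more than one part is a candidate compound; deciding which parts are grammatical endings would need the separate ending-recognition step mentioned above.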

The frequencies of compound words, grammatical categories, and 
phonemes were extracted by computer program. Statistics on the order of 
major constituents (subject, verb, object) were determined by hand, as it 
appeared too difficult to do this by program. In Esperanto, the dominant 
order is SVO but, at least in writing, other orders are quite common, and 
it is of interest to have some measure of this property for conversations. 
Although the direct object is marked by the accusative ending "-n", there 
are some other uses of the accusative case, which makes it difficult to 
identify constituent order by computer program. Inflected words (such as a 
verbal root used as a noun) formed another class of items treated by hand. 

Compound words for the purposes of this treatment were defined simply 
as words which contained more than one morpheme, not counting grammatical 
endings. Included were not only the relatively few compounds composed of 
full roots such as laborRitmo 'work-rhythm' but also the many compounds 
made by the highly productive roots traditionally called "affixes", such as 
junUloj 'young people' and movAdo 'movement', and even the words formed 
with participle endings, such as kredAnto. The rationale for considering 
junUloj as a compound is that ulo by itself is a person, and since ul is as 
free a morpheme as jun 'young' it seems inappropriate to call ul a suffix, 
although that is what it has traditionally been called. The example of 
movAdo 'movement' is more problematical, in that it may seem more 
appropriate to analyze the word as having ad 'continued action' as a suffix 
which modifies mov 'move', rather than to think of the word in terms of 
"move-like continued action". 

In individual cases the distinction between compounding and 
suffixation seems to involve somewhat subjective judgement, within the 
system of Esperanto word-formation. It seems preferable to call all multi- 
root words "compounds", especially since the so-called "affixes" are 
actually morphemes which are as free as any other content morphemes. (By 
"content" morphemes I mean those which require grammatical endings, unlike 
the "function" morphemes such as prepositions which do not require 
grammatical endings.) As for participle endings, despite the fact that 
anto or into 'present or past actor' are almost never used by themselves in 
Esperanto, such usage is not "ungrammatical" and is similar to the use of 
"ism" in English. In any case, all "compound" words are listed in Appendix 
4 so that the reader may re-analyze them as desired. 

In this analysis "inflected" words are simply those whose grammatical 
endings are different from the basic grammatical category of the root, as 
determined by checking in a standard dictionary (Wells 1969). For example, 
Wells lists the adjective simila 'similar' as the base form, so the verb 
similas 'is similar to' is counted as an inflected word. In the 
introduction to his dictionary Wells (1969:7-9) gives a good summary of the 
traditional theory that each Esperanto content root has an inherent 
grammatical category. Szerdahelyi (1976, 1978) has criticized this theory, 
arguing instead that full words are borrowed from national languages, from 
which a category-less root is formed by autonomous Esperanto processes. It 
appears that both theories lead to similar practical results. There is at 
least some psychological reality to the notion of inherent category, in 
that I nearly always found I had guessed the category correctly when I 
checked doubtful cases in the dictionary. A likely cause is that the 
Esperanto roots, coming mainly from European languages, have overtones of 
grammatical category that speakers of European languages will usually find 
natural. Another cause may be that usage patterns within Esperanto itself 
have over the years confirmed the category assignments of its roots, even 
for non-European speakers. That is, simila is a commonly-used adjective 
and is more common than similas, whereas movas 'moves' is much more common 
than mova 'motional'. 

3. Data and results 

The Scot William Auld, one of the most outstanding poets writing in 
Esperanto as well as a perceptive essayist on literary matters, and the 
Flemish Peter de Smedt were recorded in an interesting conversation dealing 
with the Hungarian Esperanto poet and novelist Julio Baghy, with the nature 
of translations of poetry into Esperanto, and with the need for definitive 
historical studies of the Esperanto literature. In a thirty-minute 
conversation each man spoke for almost exactly fifteen minutes. Auld spoke 
1782 words (119 words per minute) and De Smedt 1948 words (130 words per 
minute). In no case did there seem to be any ambiguity as to how to break 
up the speech stream into "words", basically because of the invariant 
penultimate stress and the distinctive grammatical endings. 

Appendix 1 gives the phoneme frequencies for this conversation, 
Appendix 2 gives the frequencies of two-phoneme sequences, and Appendix 3 
gives the frequencies of the various grammatical categories. These data 
are presented mainly because they were easily obtainable, but they can be 
of use in certain applications. For example, the frequencies of two- 
phoneme sequences have been used to plan the construction of a diphone 
library for purposes of speech synthesis in our laboratory. 

In Appendix 4 are listed the compound words, including participles and 
words made with productive "affixes". Estimating the standard deviation as 
the square root of the number of compound words found, (7.6±0.7)% of Auld's 
words were compounds (136 out of 1782), and (8.9±0.7)% of de Smedt's (173 
out of 1948), which are equal frequencies within the estimated errors. It 
is noteworthy that most of these compound words are themselves quite 
common, being encountered frequently in texts and often even as specific 
entries after the root in large dictionaries. There are very few really 
novel coinages made on the spot during the conversation. Rather, the two 
men pretty much limited themselves to compounds that have already been used 
extensively in the language. I estimate that only about three percent of 
the compounds are novel, in that about that many words struck me as fresh 
and unusual. 
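
The error estimate quoted above can be reproduced in a few lines. This is a minimal Python sketch, not the program used for the study; the function name `percent_with_error` is hypothetical, and the only inputs are the word and compound counts given in the text. The standard deviation of a count is taken to be its square root, as stated above.

```python
import math


def percent_with_error(count, total):
    """Return (percentage, error), taking sqrt(count) as the standard deviation."""
    pct = 100.0 * count / total
    err = 100.0 * math.sqrt(count) / total
    return pct, err


# Counts of compound words and total words, as given in the text.
for speaker, compounds, words in [("Auld", 136, 1782), ("de Smedt", 173, 1948)]:
    pct, err = percent_with_error(compounds, words)
    print(f"{speaker}: ({pct:.1f} +/- {err:.1f})%")
```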

Of Auld's 1782 words, 211 or (11.8±0.8)% were inflected words (e.g., a 
verbal root forming a noun), including 6 nouns made with "adjectival" 
participle endings, such as kredAnto 'believer'. Of these 211 inflected 
words, 50 were compound (multi-root) words found in Appendix 4, such as 
florEce 'in a flowery way'. For de Smedt, out of 1948 words, 232 or 
(11.9±0.8)% were inflected (of which 8 were participial nouns), with 60 of 
these appearing among the compounds listed in Appendix 4. The frequencies 

                        Auld     de Smedt 

inflected only           9.0%      8.8% 
compounded only          4.8%      5.8% 
inflected compounds      2.8%      3.1% 

totals                  16.6%     17.7% 
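
The rows of this table follow from the counts given in the text by simple subtraction. The Python sketch below is an illustration under that reading, with a hypothetical function name `breakdown`; it treats the 50 (Auld) and 60 (de Smedt) inflected compounds as the overlap between the inflected and the compound words. The totals are computed as the sum of the rounded rows, which is what reproduces the figure of 16.6% for Auld (the unrounded total, 297 out of 1782, would round to 16.7%).

```python
def breakdown(words, inflected, compounds, both):
    """Percentages of words that are inflected only, compounded only, or both."""
    def pct(n):
        return round(100.0 * n / words, 1)

    rows = {
        "inflected only": pct(inflected - both),
        "compounded only": pct(compounds - both),
        "inflected compounds": pct(both),
    }
    # Total taken as the sum of the rounded rows, as the table appears to do.
    rows["totals"] = round(sum(rows.values()), 1)
    return rows


print("Auld    ", breakdown(1782, 211, 136, 50))
print("de Smedt", breakdown(1948, 232, 173, 60))
```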

All the inflected words are shown in Appendix 5. Like the compounds, the 
inflected words for the most part are forms which have often been seen 
before. Only a few percent of the words were novel coinages, though at one 
point in the conversation the two men were themselves amused at de Smedt's 
invention of the word papere 'in a paper-like way'. 

The most interesting data concern constituent order, usually called 
word order, the latter being a somewhat misleading term for describing the 
sequence in the sentence of the major constituents subject, verb, and 
object. There were 179 main clauses where a transitive verb (either finite 
or infinitive) referred to a direct object. Despite the potential freedom 
of constituent order in Esperanto, which is often exploited in writing, 
there were only 15 examples out of 179 in which the order was not subject- 
verb-object (SVO; VO in the case of infinitives). Moreover, of these 15 
cases, eight (examples 1-7 and 10 below) involved merely moving the 
demonstrative object pronoun tion, of which seven were of the OSV form with 
tion followed by a personal pronoun followed by a verb, and the other 
(example 7) was SOV, also with a personal pronoun subject (vi tion faris 
'you that did'). Seven of these eight tion examples were due to Auld. 

Four examples (11-14 below) involved a personal object pronoun or the 
common word multon 'much' preceding an infinitive, all due to de Smedt, who 
also moved the simple object nenion 'nothing' before a verb (15). This 
leaves only two examples of constituent orders other than SVO involving 
uncommon nouns (examples 8 and 9). It appears that these two speakers have 
developed individual non-SVO patterns which they use for special emphasis, 
but that they very rarely venture outside the confines of these stereotyped 
patterns. For completeness, it should be mentioned that there was one 
question involving (normal) OSV order: Kion vi opinias? 'What do you 
think?'. Here is a complete listing of all 15 non-SVO examples: 


Auld 
(1) tion oni povas bedaŭri       'one can regret that' 
(2) tion oni faras               'one does that' 
(3) tion mi jam diris            'I already said that' 
(4) ne tion ni bezonas           'we don't need that' 
(5) tion oni faras               'one does that' 
(6) tion oni povas fari          'one can do that' 
(7) vi tion faris                'you did that' 
(8) min emocias, min inspiras ĉiuj aspiroj 
        'all aspirations move me, inspire me'; said with 
        great deliberateness and emphasis 
(9) mi bibliografion faras       'I am making a bibliography' 

de Smedt 
(10) tion mi ne diris            'I didn't say that' 
(11) ĝin fari                    'to do it' 
(12) sin defendi                 'to defend oneself' 
(13) nin ĝojigi                  'to make us joyful' 
(14) multon lerni                'to learn much' 
(15) mi nenion plu aŭdis         'I heard nothing more' 
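
The listing above can be tallied by constituent order and by speaker. The Python sketch below is only an illustration. The order labels are my reading of each example (OV marks an object preceding an infinitive), and the attribution of examples 1-9 to Auld is an assumption based on the layout of the listing and on the statement that de Smedt departed from SVO six times.

```python
from collections import Counter

# (example number, speaker, constituent order); labels are assumptions
# read off the listing, not annotations from the original data.
examples = [
    (1, "Auld", "OSV"), (2, "Auld", "OSV"), (3, "Auld", "OSV"),
    (4, "Auld", "OSV"), (5, "Auld", "OSV"), (6, "Auld", "OSV"),
    (7, "Auld", "SOV"), (8, "Auld", "OVS"), (9, "Auld", "SOV"),
    (10, "de Smedt", "OSV"), (11, "de Smedt", "OV"), (12, "de Smedt", "OV"),
    (13, "de Smedt", "OV"), (14, "de Smedt", "OV"), (15, "de Smedt", "SOV"),
]

by_order = Counter(order for _, _, order in examples)
by_speaker = Counter(speaker for _, speaker, _ in examples)

print(dict(by_order))
print(dict(by_speaker))
```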

It is striking that except for aspiroj 'aspirations' and bibliografion 
'bibliography', the only words which have been moved out of SVO order are 
very common words. It should also be pointed out that among the 179 main 
clauses involving transitive verbs there are three examples where the 
object is the title of a book, without accusative endings, and in all of 
these examples the order is SVO. This latter observation may not have 
great significance, in that even in formal writing one sees considerable 
vacillation between declining or not declining proper nouns and book 

Another interesting aspect of this conversation is that while de Smedt 
six times exploited the freedom of constituent order which comes from the 
explicit accusative marker -n, he made errors just as often in his use of 
this marker. (Auld made no case errors.) On two occasions de Smedt failed 
to add the -n to a direct object and four times he attached an -n 
unnecessarily. In sentences 16 and 17 shown below, the -n has been dropped 
in simple SVO situations, perhaps due to the heavy dominance of SVO order 
which makes the explicit accusative marking rather redundant. In sentence 
18, the complement of the intransitive verb "become" has incorrectly been 
given the accusative endings, probably due to the superficial resemblance 
to an SVO situation. The explanation for case errors in sentences 19 and 
20 may simply be that the speaker started out intending to make an object- 
initial construction but got sidetracked by the long subordinate clauses, 
so that the noun eventually played the role of subject instead of object. 
In a somewhat similar manner, it is likely that in sentence 21 the speaker 
started out to say something like "which we like" and changed in midstream 
to "which can make us happy". The case errors shown in sentences 16, 17, 
and 18 may indicate imperfect command of the accusative in a non-native 
language, whereas the errors of 19, 20, and 21 appear more like typical 
hesitation phenomena seen even among native speakers of any language. 

(16) mi trovos tricent eroj(n)... 
     (...possibly I will find three hundred items...) 

(17) ...pretigas, jes, centpaĝan provkajeron, kiu enhavos do 
     parto(n)... 
     (...prepares, yes, a hundred-page trial booklet, which 
     will contain therefore a part...) 

(18) nepre fariĝus iun tre subjektivan literaturhistorion... 
     (necessarily would become a very subjective literary 
     history...) 

(19) Mi tamen konstatis ke ĝuste la poemon kiun vi citis en via 
     artikoleto pri tiu studo pri Hector Vermojten - ĝuste 
     temis... 
     (I nevertheless realized that precisely the poem which you 
     cited in your article about that study about Hector 
     Vermojten - precisely had to do with...) 

(20) jes, kaj iun parton el tiu enciklopedio, kiu estas treege 
     bezonata, tiu estas ekzemple la historio de... 
     (Oh yes, and one part of that encyclopedia, which is 
     greatly needed, that is for example the history of...) 

(21) Mi pensas ke tiu tamen estas fakto kiun povas nin ĝojigi. 
     (I think that that nevertheless is a fact which can make 
     us happy.) 

A partial analysis was made of one other conversation from the 
Portland conference to check in particular some of the results on 
constituent order. This conversation consisted of six-and-a-half minutes 
between the Pole Ada Fighiera-Sikorska and the Bulgarian Stojan Djoudjeff, 
followed by a conversation between Djoudjeff and the Briton Duncan 
Charters, of which eleven-and-a-half minutes were analyzed. Out of 84 main 
clauses involving transitive verbs, 80 (95%) had SVO order. The other four 
examples were these: 

(22) ...mi ankaŭ tion esperas 'I also hope that' (Fighiera-Sikorska) 

(23) ...ankaŭ tiun trajton de la kongreso mi rimarkis 
     'I also noticed that characteristic of the congress' (Charters) 

(24) verkado ilin kaptas 'Writing captivates them' (Djoudjeff) 

(25) ...oni povas tion atingi 'One can achieve that' (Charters) 

We see the same features noted in the Auld-de Smedt data. The SVO order is 
highly dominant, and three of the four non-SVO orders involve moving tion 
or the pronoun ilin. No errors in the use of the accusative were noted for 
these very skilled speakers, with the possible exception of a sentence by 
Fighiera-Sikorska: mi vizitis krome Tokio(n) 'In addition I visited 
Tokyo'. As mentioned earlier, there is some variation in the declining of 
proper nouns, although one would normally use the accusative case in this 
situation, because the city name has been completely Esperantized and ends 
in a typical -o. 
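
The SVO shares for the two conversations follow directly from the clause counts reported above (179 clauses with 15 non-SVO, and 84 clauses with 80 SVO). A minimal Python sketch of the arithmetic, with a hypothetical function name `svo_share`:

```python
def svo_share(svo_clauses, all_clauses):
    """Percentage of transitive main clauses that have SVO order."""
    return 100.0 * svo_clauses / all_clauses


auld_de_smedt = svo_share(179 - 15, 179)  # 15 of the 179 clauses were not SVO
second_conversation = svo_share(80, 84)

print(f"Auld-de Smedt: {auld_de_smedt:.0f}%")
print(f"Second conversation: {second_conversation:.0f}%")
```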

4. The problem of the accusative 

It is well known that the accusative is the biggest source of errors 
for beginning students of Esperanto, and there is some awareness that even 
skilled speakers occasionally make case errors. This seems to be rather 
independent of the person's native language and occurs even with speakers 
as experienced as de Smedt, who is a serious student of Esperanto 
literature. It would seem that for most speakers the fact that the 
accusative poses serious problems might be explained as follows. In 
speaking, the SVO order is so massively dominant (92% in the Auld-de Smedt 
data and 95% in the other conversation, with the bulk of the remainder 
quite special) that it is natural to drop the -n as redundant on the 
object, since order alone normally identifies the subject and object. 
Although I don't have firm data, I have the impression from my own speech 
and from casual observation of other speakers that -n is sometimes added to 
subjects and in prepositional phrases as a form of hypercorrection, out of 
the guilty knowledge that one is apt to forget -n on objects. 

The Esperanto accusative has often been attacked as being excess 
baggage and inappropriate in a language intended to be easy to learn and 
use. Apologists have countered these attacks with various arguments. One 
claim is that the free order benefits native speakers of languages which 
don't have SVO order, making it possible for them to use the constituent 
order which they find most natural. However, there is no real evidence 
that this in fact occurs: although formal studies have not been done for 
spoken Esperanto, in my own conversational experience I have noted that SVO 
is very much dominant for Hungarian and Japanese speakers (whose native 
order is not SVO). Moreover, there are very few languages in the world 
which normally place object before subject, so it is natural for all 
speakers to take the second noun phrase as the object, even without an 
accusative marking. Sometimes contrived sentences are exhibited which show 
how the accusative can remove ambiguities, ignoring the fact that speakers 
can and do use other available constructions to avoid such problems. An 
example of this type of argument involves the pair of sentences Mi trovis 
la vinon bonan 'I found the good wine' and Mi trovis la vinon bona 'I found 
the wine to be good'. One can artificially construct many examples of this 
kind, but in practice speakers use other structures to disambiguate, 
including context. 

Perhaps the most sophisticated defense is the one which points out that 
in communication between speakers from different cultural backgrounds, 

extra precision and redundancy are needed to compensate for the lack of 
shared assumptions and backgrounds. But if further studies confirm what 
has been found here, all Esperanto speakers share a highly dominant SVO 
order and a common tendency not to use the accusative correctly, which 
weakens the argument. It can be agreed, however, that the accusative has 
great value in writing, at least optionally, because sentence structures 
can be much more complex than in speech. For example, active OVS sentences 
with complicated subjects in Esperanto often must be translated as passives 
into English, due to the rigid English constituent order: La libron verkis 
juna fizikisto kaj sperta kemiisto 'The book was written by a young 
physicist and an experienced chemist' . 

It is probably fair to say that the spirited defense given for the 
clearly problem-ridden accusative really springs from social and political 
aspects of Esperanto language planning. Many historians of constructed 
languages have concluded that the 1905 social contract on the 
"untouchability" of the basic core of Esperanto was crucial in combining 
the necessary stability of the language with adequate capacity for 
evolution (Drezen 1931; Janton 1973; Golden 1977). Other constructed 
languages which lacked such a contract among the users tended to break up 
into dialects as reformers tinkered incessantly, seeking the holy grail of 
perfection, while the shelter of the principle of untouchability allowed 
the steady growth of a community of Esperanto speakers and a vital 
literature. A striking example is Ido, the 1907 offspring of Esperanto 
which was intended to remedy perceived failings of its parent, including 
the mandatory accusative. In Ido, the accusative ending was optional 
unless the object preceded the subject (Carlevaro 1978), a rule which can 
be observed to hold in practice for many people's spoken Esperanto. While 
this change was surely in the right direction on narrowly linguistic 
grounds, this and other changes opened up Pandora's box, leading to rapid 
instability as more and more of Ido was perceived to need "improvement". 
The leaders of the Ido movement soon found themselves having to impose 
artificially a period of no change, after having mocked the Esperanto 
"untouchability" as being a mere superstitious fetish. 

It is with this historical background that justifications of the 
Esperanto accusative must be viewed. Speakers are naturally reluctant to 
discuss possible changes in the core of the language. However, the price 
paid in this case for stability is rather high. Insistence on "correct" 
use of the accusative makes learning the language more difficult. Many, 
perhaps most, speakers are unable to eliminate case errors even after years 
of experience. For speakers of moderate skill, failure to use the 
accusative "correctly" can lead to self-consciousness and to condescension 
from those more skilled. These are social effects which are undesirable in 
an auxiliary language intended for easy use by ordinary people. 

There is a possible resolution of the dilemma of how to remove the 
problem without violating the important principle of untouchability. In 
1913 the forerunner of the Academy of Esperanto accepted the "principle of 
necessity and sufficiency" (Cherpillod 1979) proposed by René de Saussure, 
brother of the famous linguist Ferdinand de Saussure. This principle 
states that good style dictates using only those affixes really needed to 
fully define a word, and none that are obviated by the surrounding context. 
Thus desegnaĵo 'thing which is drawn' and desegnado 'continued act of 
drawing' can be and should be shortened to desegno whenever the context 
makes clear whether an object or an action is being described. 

Perhaps the Esperanto-speaking community would be willing to admit 
openly that this principle is being applied by speakers to the accusative. 
When one says Mi vidas kato instead of Mi vidas katon 'I see a cat', the 
principle of necessity and sufficiency is enough to indicate that it is the 
cat which is seen, given that 1) SVO order is strikingly dominant; 2) the 
vast majority of non-SVO orders in speech involve rather special, 
stereotyped forms; and 3) regarding this as an "error" flies in the face of 
the fact that many skilled speakers do say this (and they say it precisely 
because there is no real ambiguity). 
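
The Ido rule mentioned earlier (accusative ending optional unless the object precedes the subject) is simple enough to state as a predicate on constituent order. The sketch below is only an illustration of that rule as described in this paper; the function name and the three-letter order strings are hypothetical conventions, not part of any grammar of Ido or Esperanto.

```python
def accusative_required(order):
    """Ido-style rule: -n is required only when O precedes S.

    `order` is a permutation of the letters S, V, O, for example "SVO".
    """
    order = order.upper()
    if sorted(order) != ["O", "S", "V"]:
        raise ValueError("order must be a permutation of S, V and O")
    return order.index("O") < order.index("S")

for order in ("SVO", "SOV", "VSO", "OVS", "OSV", "VOS"):
    status = "required" if accusative_required(order) else "optional"
    print(order, status)
```

Under this rule Mi vidas kato (SVO) needs no ending, while an OVS or OSV sentence still does.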

It is interesting that a principle of necessity and sufficiency can be 
seen to control the accusative in terms of semantics. Because books don't 
see, if I say Libro mi vidas or Libro vidas mi, there is no doubt that it 
is the book that is seen, despite coming first in the sentence. Richard 
Wood kindly pointed out to me that in the carefully edited and proofread 
"Der Esperantist", published in East Germany, there occurred the following 
"error" (Thomas 1980): "La posttagmeza programo je 16:30 h. estas 
aŭdebla...kaj la vespera programo je 22:30 h. ni povas aŭdi..." (The 
afternoon program at 16:30 is audible...and the evening program at 22:30 we 
can hear; should be vesperan programon). It seems likely that this "error" 
on the part of author, editor, and proofreader (all speakers of German, 
which has an accusative) reflects their natural application of the 
principle of sufficiency. Semantically, only "evening program" can be the 
object in this OSV sentence, and this is the probable cause of the missing 
accusative ending. 

Wood has sent me a number of other examples of accusative errors 
observed in written Esperanto. Of particular interest are two OVS 
sentences in which the accusative ending is missing despite the inverted 
order. The first is from a book review, and the second from an 
announcement of a conference: 

(26) Estas nun instrue vidi, kio pensas profesia lingvisto... 
     It is now instructive to see what a professional 
     linguist thinks... (should be kion) 

(27) Tiu ĉi renkontiĝo organizas Esperanto-sekcio... 
     This meeting is organized by an Esperanto-section... 
     (should be tiun ĉi renkontiĝon) 

Again, because "professional linguist" and "Esperanto-section" are 
semantically the only possible subjects, the accusative ending was omitted 
on the object despite the fact that the object actually precedes the 
subject. 

David Gold (personal communication) has noted that even in writing a 
common error is to use the accusative in a nominative slot. This may be a 
kind of hyper-correction. I have noticed in letters I write and receive 
that both extra and missing accusative markers are common, and that typists 
often go back and pen in an -n or blot one out. 


An argument against extending the principle of necessity and 
sufficiency is that if carried to extremes one might argue that all 
redundancy be omitted, including the use of the grammatical endings. 
However, it is an observational fact that Esperanto speakers do not omit 
grammatical endings, nor do they seem to have much trouble with 
adjective-noun number agreement (no agreement errors were noted in the 
Auld-de Smedt conversation). In these areas description matches prescription. It is 
mainly in the use of the accusative that practice diverges significantly 
from theory. 

5. Conclusions 

Before statistically analyzing an Esperanto conversation it was 
thought that native-language influences might show up not just in phonetics 
but in syntax or word-formation. Such influences were not identified. It 
was found that constituent order is almost exclusively SVO, that other 
orders involve the movement almost solely of very simple objects, and that 
errors in the use of the accusative may be about as common as non-SVO 
orders, even among very skilled speakers. Although no firm data are 
available, my conversational experience with Japanese and Hungarian 
speakers indicates that these results hold true for non-SVO native 
languages as well, so these effects are apparently independent of native 
language. 

The difficulties of the accusative lead to undesirable discriminations 
between skilled and less-skilled speakers, no matter what their native 
language. It might be helpful if the Academy of Esperanto would 
acknowledge the fact that the accusative is observed to be optional. One 
way to do this would be through an extension of the principle of necessity 
and sufficiency. The reformist problems Ido and other constructed 
languages encountered are probably avoidable in modern Esperanto if one 
speaks only of extending the existing principle in a specific, narrow area. 
It may be that the community will recognize that its maturity and size 
permit it more safe leeway than was possible 75 years ago. On the other 
hand, de Saussure's principle was considered just a minor stylistic 
interpretation of Esperanto usage, and extending this principle to 
something as fundamental as the accusative would be considered a big leap 
by many speakers. It is possible that the perils of making a fundamental 
change in the core of the language might outweigh the pedagogical and 
social benefits of simplifying accusative usage de jure as well as de facto. 


I give special thanks to Chin-Chuan Cheng for support and 
encouragement. I thank Richard Wood for sending me corroborating data. I 
thank David Gold for helpful comments. I am grateful to the Department of 
Linguistics for hospitality during a period of interdisciplinary study 
(1979-80), and to the Computer-based Education Research Laboratory and 
Department of Physics for their generosity in making that study possible. 
This has led to a new and deeper association with the Department of 
Linguistics. This paper was presented at the 1980 Symposium on Native- 
Language Influence on Esperanto, held in Urbana. 


CARLEVARO, Tazio. 1976. Eseo pri la planlingvo Ido. La Chaux-de-Fonds: 
    Kultura Centro Esperantista. 
CHERPILLOD, André. 1979. Neceso kaj sufiĉo. Heroldo de Esperanto, 
    April 3. 
DREZEN, E. 1931. Historio de la Mondolingvo. Reprinted 1967, Oosaka: 
GOLDEN, Bernard. 1977. Political factors in the international language 
    movement. Eco-Logos 23:86.13-16. 
JANTON, Pierre. 1973. L'espéranto. ("Que sais-je" series.) Paris: 
    Presses Universitaires de France. 
SHERWOOD, Bruce. 1981a. Computer processing of Esperanto text. Studies in 
    Language Learning 3:1.145-155. 
SHERWOOD, Bruce. 1981b. Komputila traktado de Esperanta teksto. In C. 
    Bertin, ed., Unua Simpozio pri Komputiko, pp. 21-41. Rennes: the 
SZERDAHELYI, István. 1976. La semantika modelo de Esperanto. In Zlatko 
    Tišljar, ed.: Internacia Lingvistika Simpozio, pp. 85-152. Zagreb: 
    Internacia Kultura Servo. 
SZERDAHELYI, István. 1978. Vorto kaj vortelemento en Esperanto. 
    Literatura Foiro no. 50, 8-10. 
THOMAS, Paul. 1980. Esperanto en Radio Varsovio. Der Esperantist no. 99 
VAN THEMAAT, W. A. Verloren. 1977. Productive word-formation in four 
    natural and two constructed languages. Eco-logos 23:86.3-12. 
WELLS, John. 1969. Esperanto Dictionary (in the series "Teach Yourself 
    Books"). London: Hodder and Stoughton. New York: David McKay. 




Appendix 1 
Phoneme Frequencies 

de Smedt 

[Phoneme labels not recoverable. Surviving values, in their original order: 
11.5%, 1.0, .8, 1.0, 2.7, 10.9, 1.2, .6, .9, .00, 2.7, .04, 4.4, 3.5, 7.6, 
8.2, 3.0, 7.0, .25, 3.1, .3] 

Combined Frequencies for Both Speakers 

[Contents not recoverable.] 

Appendix 2 
Two-phoneme Frequencies 

[Diphone matrix not recoverable.] 

Number of diphones per 3000 phonemes in a 30-minute sample of 17120 phonemes. On the 
left is the first phoneme, across the top the second. For example, there were 18 instances 
of the sequence "ed" and 27 instances of the sequence "de" per 3000 phonemes. P stands for 
pause, W for word boundary. Blank indicates no occurrences, indicates less than 0.5. 
The sample contains a few proper names and foreign words. 
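
The normalization described in this caption (pair counts rescaled to a sample of 3000 phonemes) can be sketched as follows. This is an illustrative reconstruction, not the program used for the analysis: the function name is hypothetical, the toy sequence is invented, and it is an assumption that the pause (P) and word-boundary (W) symbols count toward the sample length.

```python
from collections import Counter

def diphones_per_3000(symbols):
    """Count adjacent pairs and rescale the counts to a 3000-symbol sample."""
    if len(symbols) < 2:
        return {}
    pair_counts = Counter(zip(symbols, symbols[1:]))
    scale = 3000 / len(symbols)  # assumption: P and W are counted as symbols
    return {pair: n * scale for pair, n in pair_counts.items()}

# Hypothetical toy sequence framed by pauses, with two word boundaries.
toy = ["P", "d", "e", "W", "d", "e", "d", "o", "W", "P"]
rates = diphones_per_3000(toy)
print(rates[("d", "e")], rates[("e", "d")])
```

Keys are (first, second) pairs, matching the row and column layout described in the caption.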

Appendix 3 
Grammatical Categories 

[Table only partly recoverable; most row labels are lost. Surviving entries, 
in their original order:] 

       6.8% (11.5) 
      13.7  (24.9) 
      12.0  ( 9.8) 
            ( 6.5) 
       7.5  (27.6) 
       6.3  ( 1.8) 
            ( 0.8) 
other 37.1   36.6 

The percentages in parentheses represent the fraction of 
each category that was in the accusative case. For example, 
of Auld's 1782 words, 122 or 6.8% of them were adjectives, 
and 11.5% of these 122 adjectives (14) were in the accusative 
case. The accusative case for adverbs indicates "direction 

Appendix 4 
Compound Words 

Auld 

136 compound words out of 1782 total words = (7.6±0.7)% 

(1782 words)/(15 min.) = 119 wpm 

De Smedt 

173 compound words out of 1948 total words 

(1948 words)/(15 min.) = 130 wpm 
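
The speech rates and the compound-word fraction in this appendix can be reproduced with a few lines of arithmetic. In the sketch below the uncertainty is taken to be a simple counting error, the square root of the number of compounds divided by the total; that is an assumption which happens to reproduce the ±0.7 quoted for Auld, and the function names are hypothetical.

```python
import math

def words_per_minute(words, minutes):
    """Average speech rate over the sample."""
    return words / minutes

def fraction_with_counting_error(count, total):
    """Return (percentage, error), with error assumed to be sqrt(count)/total."""
    return 100 * count / total, 100 * math.sqrt(count) / total

# Word and compound counts as given in the appendix.
for name, compounds, words in (("Auld", 136, 1782), ("De Smedt", 173, 1948)):
    pct, err = fraction_with_counting_error(compounds, words)
    wpm = words_per_minute(words, 15)
    print(f"{name}: {pct:.1f}% +/- {err:.1f} compounds, {wpm:.0f} wpm")
```

The percentage computed for De Smedt comes from this sketch only; the corresponding figure is not legible in the appendix.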




























































[Word lists not recoverable. Surviving entries: junuloj, malofte, 
Esperanto-revuoj, movadon] 













Appendix 5 
Inflected Words 

[Word lists not recoverable. Surviving entries: verkojn, unuaj. 
Under de Smedt: opiniojn, tradukojn, tradukoj, tutsimple] 








































Studies in the Linguistic Sciences 
Volume 12, Number 1, Spring 1982 

Bruce Arne Sherwood 

Questions are often raised about the mutual intelligibility 
of Esperanto spoken by people with different first languages, and 
about the likelihood of Esperanto splitting into mutually 
unintelligible dialects if it were used on a wide scale. An 
attempt is made to describe and explain the present situation, 
and speculations about the future are also made. Important 
factors to be taken into account are the nature of the Esperanto 
speech community; the ways in which vocabulary growth is 
controlled; pronunciation norms; and phonological and 
morphological aspects. 

1. Introduction 

Two related questions are often raised about varieties of Esperanto. 
One is whether the language is at present mutually intelligible between 
speakers of different first languages (e.g., can Japanese and American 
speakers understand each other), while the other asks whether Esperanto 
would fall into mutually unintelligible dialects if it were ever used on a 
vast scale (as everyone's second language). To a first approximation, the 
present situation is one of good intelligibility among all speakers, 
independent of first language, and to a large extent there has emerged an 
agreed-upon norm, despite the geographical dispersion of Esperanto 
speakers. I will attempt to explain how this has come about. I will also 
discuss factors likely to influence future evolution of the language. It 
will be shown that some assumptions which are valid and useful for studying 
first languages are not necessarily helpful in understanding a language 
which is spoken mainly as an auxiliary second language. 

2. Who speaks Esperanto? 

Judging from the 36,000 dues-paying members of the World Esperanto 
Association and its national affiliates (Esperanto 1982), recognizing that 
many speakers do not belong to one of these organizations, and considering 
the currency exchange difficulties of the large numbers of Esperantists in 
the communist world, there must be at least several hundred thousand 
speakers of Esperanto as a second language. Some standard reference books 
give numbers as high as several million, but these estimates may be quite 
arbitrary. There is a problem in obtaining accurate figures, in that the 
low densities of speakers and their mainly second-language use of Esperanto 
make it unlikely that a normal language census would identify Esperanto 
speakers. The highest density of speakers (as a fraction of the total 
population) is found in East European countries, but West Europe 

contributes the largest fraction of movement leadership. There are 
significant Esperanto activities in Asia, in Japan, China, South Korea, and 
Vietnam. Numerically small but active groups of Esperanto speakers are 
found in the Americas, particularly in Brazil, Canada, and the United 
States. There are few speakers in Africa or the Middle East, except for 
Iran and Israel. 

While most people learn Esperanto as a second language, there also 
exist native speakers of the language, often as the result of marriages 
between young people of differing nationalities who meet through Esperanto 
activities and who continue to speak the common language at home. Of 
course the child eventually learns the local national language (as does 
whichever spouse did not originally know it), thus becoming bilingual, but 
it is not uncommon for Esperanto to remain the language of the home. There 
are also cases of one parent deliberately addressing the child only in 
Esperanto while the other parent addresses the child in another language, 
in a conscious decision to make the child bilingual (Fischer 1981). A 
survey conducted by a newsletter for parents of native Esperanto speakers 
located 150 families where Esperanto was used extensively (Nemere 1968). 
Given the difficulties of carrying out such a survey, one might guess that 
there are between 1000 and 2000 native speakers. Additional evidence for 
such numbers comes from the observation that a few dozen international 
marriages among Esperanto speakers are reported in Esperanto periodicals 
each year, and this rate of family formation is about right to produce the 
estimated number of native speakers. Richard Wood (forthcoming) found in 
a poll at an Esperanto conference that one to two percent of the conference 
participants had learned Esperanto as their first language, which is in 
rough agreement with the estimate of one to two thousand native speakers 
among a total of a few hundred thousand speakers in all. 
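
The consistency check in the preceding paragraph amounts to applying a percentage range to a population range. The sketch below makes the arithmetic explicit; reading "a few hundred thousand" as 100,000 to 300,000 is an assumption introduced here, the function name is hypothetical, and a poll of conference participants need not be representative of all speakers.

```python
def native_speaker_range(total_low, total_high, pct_low, pct_high):
    """Apply a percentage range to a population range; return (min, max)."""
    return total_low * pct_low // 100, total_high * pct_high // 100

# Assumed reading of "a few hundred thousand": 100,000 to 300,000 speakers.
low, high = native_speaker_range(100_000, 300_000, 1, 2)
print(low, high)  # 1000 6000
```

The lower end matches the estimate of one to two thousand native speakers, while the upper end exceeds it, which is why the agreement can only be called rough.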

3. The Esperanto speech community 

The most important reason for mutual intelligibility among the 
varieties of Esperanto is that the speakers do form a genuine speech 
community, as is well described by Wood (1979, forthcoming). This 
community is unusual in being geographically dispersed and culturally 
diverse, yet sharing certain distinctive cultural values and a common 
literature. Why people learn Esperanto is a complex question, but since 
most learn it for purposes of international rather than local 
communication, the instrumental aspects of their use of the language make 
them strive for an international norm. This is reinforced by the 
integrative ties of a shared aspiration for a solution to the language 
problem. 

Esperanto was born as a literary language in modern, literate times, 
which may have prevented drastic changes in syntax, morphology, and 
semantics from occurring in different countries, given the world-wide 
distribution of Esperanto books and periodicals. Yet there has been 
significant evolution, especially along the lines of increasingly 
exploiting certain latent autonomous properties of the language instead of 
merely imitating forms borrowed from European languages. For example, the 
agglutinative properties of Esperanto have been utilized more and more in 
ways that are quite un-European in nature. But the development of 
Esperanto in China and Japan has not in general been different from that in 


Hungary and France (more about this in a moment). The language has evolved 
within an international community which has been in constant contact, both 
written (books, periodicals, letters) and spoken (tourism, conferences, 
shortwave radio broadcasts). The breakup into dialects of languages which 
evolved before literacy and/or before global communications has given rise 
to a set of assumptions about language evolution that may not necessarily 
be useful in understanding the different nature of Esperanto evolution and 
the properties of a dispersed community of second-language speakers. As 
Wood points out, there are some striking parallels to the early stages of 
the development of Modern Hebrew, but the territorial nature of the growth 
of Hebrew is quite different from the non-territorial development of 
Esperanto. 

A related point made to me by David Jordan (personal communication) is 
that contacts with others through Esperanto are, on a day-to-day basis, 
typically written (as is the case with other non-native languages we use in 
international communications) but from a wide diversity of cultures (unlike 
the situation with other second languages: e.g., communications in Italian 
or Russian typically involve Italy or the Soviet Union). This emphasis on 
the written word, with the internationalist character of the contacts, 
keeps Esperanto speakers aware of the needs for international 
intelligibility. This continually reinforces the understanding of the need 
for an international norm. 

Nevertheless, the careful study by Golden (1980, 1981) shows that 
within the rich environment of the Hungarian Esperanto movement there are 
significant identifiable Hungarianisms. Golden expresses the belief that 
much more care should go into teaching materials used in Hungary, to help 
learners avoid nationally-based errors. His work yields a basic inventory 
of major interference errors. 

4. Vocabulary 

For almost the entire ninety-year lifetime of the language, there has 
been a public debate, often acrimonious, about control of growth of the 
lexicon. Roughly speaking, the disagreements result from a desire on the 
one hand to keep the number of roots small to benefit the new learners and 
on the other a need felt by writers, especially poets, to enrich the 
vocabulary for literary purposes. While the debate has typically been 
conducted along the dimension utilitarian/literary, the arguments are often 
reminiscent of the question of purism in many national languages, which in 
Esperanto usually takes the form of contrasting "homey" compounds created 
from the internal, autonomous resources of the language, such as samtempa 
'same-time', with Latin loan words such as simultana 'simultaneous'. The 
social dimension of the debate has often pitted the needs of the 
linguistically unsophisticated European worker against the capabilities of 
the Latin-trained European elite polyglot. Especially within the European 
Esperanto movement there has been a tradition of working-class involvement, 
and Esperanto publicity has often emphasized that the needs of workers for 
international communications require an easily learnable auxiliary 
language. 

For Europeans, a compromise is possible by recognizing different 
styles in the language. However, it has sometimes been pointed out that an 

abundance of Latin loan-words causes special problems for non-European 
users of the language. The point has been made dramatically by Claude 
Piron (1977) in his essay provocatively titled "La okcidenta dialekto" 
("The occidental dialect"). Piron, a Swiss psychologist who has translated 
Chinese for the World Health Organization, contrasts an Esperanto passage 
swarming with Chinese loan-words with the same passage using an abundance 
of Latin loan-words, together with a rewritten passage in what he refers to 
as "global" Esperanto, in order to drive home two points: 1) it is 
scandalous that some Esperantists use a Latinate lexicon which is 
incomprehensible to non-Europeans (and to Europeans not in a narrow elite), 
and 2) with a real feeling for the word-building capabilities latent in the 
language, it is possible to write and speak richly expressive Esperanto 
without having to resort to Latinisms. Piron points out the similarities 
of a "global" style of Esperanto to the ways in which rich metaphors are 
created through similar techniques in Chinese. Chin-Chuan Cheng (1982) has 
demonstrated a tendency for the monthly El Popola Cinio, published in 
Beijing, to use compounds which are calques from Chinese. It is perhaps 
significant that I had not perceived these forms as unusual, not knowing 
the parent Chinese forms (and also being personally predisposed to use and 
approve of the "global" style urged by Piron). 

While the issues Piron raises are important, his observations do not 
really justify the identification of separate European and Asian dialects 
at present. It is noteworthy that these issues have typically been raised 
not by Asians themselves but by sensitive Europeans. Part of the reason 
for this may lie in the fact that almost all Japanese study English before 
they study Esperanto, and this may be the case with some Chinese 
Esperantists, too, with the result that their use of Esperanto may be 
colored by English (and the Latin vocabulary of English). A careful study 
should be made of the styles and lexicons of Esperanto literature from 
various countries to see what the present situation really is, but it may 
well be that Piron's exhortations should be directed at all speakers rather 
than exclusively at West Europeans. In any case, the continued strength of 
the Esperanto movement in Asia is likely to insure that the needs of non- 
European speakers will not be neglected, and that Asians will contribute to 
the evolution of a global style. 

Piron's essay has sparked consciousness-raising activities aimed at 
further internationalizing the language. This thrust is prompted not only 
by internal needs within the Esperanto community but also by the 
requirements of external publicity. One of the few truly well-founded 
criticisms of Esperanto is that it has a European rather than a global 
base, at least in the lexicon. Rather than emphasizing the international 
character of the European lexicon (given the spread of English and other 
European languages), Esperantists have typically responded to this 
criticism by trying to show that despite the European bias in the lexicon 
other properties of the language (agglutination in particular) make 
Esperanto not too European and therefore suitable for international 
communication. A style of Esperanto more like Chinese contributes to these 

The standard mono-lingual Esperanto dictionary (Waringhien 1970) has 
played a major role in the control of the lexicon, as did its predecessors 
(Grosjean-Maupin, Esselin, Grenkamp-Kornfeld, and Waringhien 1934 and 


1954). As is the case with many emergent national languages, Esperanto 
books and periodicals often gloss new or lesser-known words. (Often there 
are also explanations of national events or customs not likely to be widely 
known.) It is noteworthy that usually only those roots not found in the 
standard dictionary are glossed, indicating some consensus that the 
dictionary defines acceptable usage. Occasionally, however, one sees in 
these glosses explicit exception taken to the dictionary forms, which is an 
indication of some fluidity. There is an Academy of Esperanto, but it has 
historically played a very minor role in the development of the language. 
Even in lexical matters the Academy has limited itself to occasional 
listings of words which have been around for enough decades to seem 
"official". Major growth in the lexicon has occurred through decentralized 
individual suggestions and use. 

5. Phonetic aspects 

As might be expected, national accents are common among Esperanto 
speakers. In my experience the resulting problems of intelligibility are 
rarely as severe as the difficulties between American English and, say, 
some varieties of New Zealand English, and the problems are much less 
severe than the problem of understanding Japanese English. These are 
moreover individual rather than group problems, in the sense that most 
speakers achieve an adequate pronunciation, no matter what their first 
language, and failures can plausibly be blamed on the fact that most 
Esperanto speakers have not had formal school courses in the language, 
being either self-taught or having attended informal classes in a local 

The major credit for general success lies in the sound system of 
Esperanto. There are only five vowels, a fairly easy set of consonants 
(but more on this later), not too many difficult consonant clusters, 
syllable-timed rhythm (without vowel reduction), regular penultimate 
stress, and most words end in vowels, which probably helps hearers segment 
sentences into words. About 70% of the words in normal text end in a vowel 
or a diphthong, with another 20% adding an additional -n or -s. The 
remaining 10% of the words end in a vowel plus other consonant. There are 
no word-final consonant clusters except in the word post 'after'. Stress 
not only is regular but seems not to play a very critical role. 
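
The three classes of word ending described above (vowel or diphthong; vowel or diphthong plus -n or -s; vowel plus another consonant) can be tallied mechanically from written text. The following sketch works on spelling alone, which is a reasonable stand-in because Esperanto orthography is close to phonemic. The function names and the sample sentence are hypothetical, and short function words such as en are classified by spelling only.

```python
from collections import Counter

VOWELS = set("aeiou")
GLIDES = set("jŭ")  # j and ŭ close the falling diphthongs (-oj, -aj, -aŭ)

def ends_vocalic(s):
    """True if s ends in a vowel, or in a vowel followed by a glide."""
    if not s:
        return False
    if s[-1] in VOWELS:
        return True
    return len(s) >= 2 and s[-1] in GLIDES and s[-2] in VOWELS

def ending_class(word):
    """Classify a word as 'vowel', 'vowel+n/s' or 'other' by its ending."""
    w = word.lower().strip(".,;:!?\"'()")
    if ends_vocalic(w):
        return "vowel"
    if w and w[-1] in "ns" and ends_vocalic(w[:-1]):
        return "vowel+n/s"
    return "other"

# Invented sample sentence, used only to exercise the three classes.
sample = "La junaj verkistoj legas multajn librojn post la kongreso."
counts = Counter(ending_class(w) for w in sample.split())
for label in ("vowel", "vowel+n/s", "other"):
    print(label, counts[label])
```

Run over a large sample of running text, the same tally would give percentages comparable to the 70%, 20%, and 10% figures quoted above.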

Intelligibility for such a sound system is more resistant to 
destruction by national accents than is, say, English as spoken by 
foreigners. For example, a slight error in vowel height in English can 
change "beat" to "bit", whereas such an error in Esperanto must be much 
larger before timo 'fear' is confused with temo 'theme'. Similarly, 
incorrect stress in English seriously affects intelligibility, due to the 
effects on vowel quality and rhythm. 

In terms of normal linguistic description, it may seem strange to make 
value judgements about one language having a "better" sound system than 
another. Such a judgement is of course invalid for first languages, where 
the total linguistic system determines communication, not just the 
phonetics. But for a language intended for second-language use, it is 
important to have a simple sound system, since the purely acoustic part of 
the signal must carry a larger burden (due to cultural differences between 

the speakers) and because it is essential that adults be able to learn the 
sound system quickly and easily. 

It might be said that Esperanto has more consonants, and hence more 
subtle distinctions among these consonants, than one would like in a 
language intended for use by speakers of many different first languages. I 
had occasion to observe this in an unusual way in a new linguistics course 
(Sherwood and Cheng, 1980). In an experimental test of new computer-based 
teaching techniques (J. Sherwood 1981, B. Sherwood 1981, 1982a, Sherwood 
and Sherwood 1982), Esperanto speech synthesis (Sherwood 1978) was used for 
some audio stimuli. American students had difficulty making certain kinds 
of consonant distinctions due to imperfect synthesis, at least in the case 
of isolated words. The problems can be overcome with suitable teaching 
(and the extreme simplicity of other parts of the language leaves plenty of 
class time for working on consonants). But the voiced/unvoiced 
distinctions are difficult for the Chinese, the l-r distinction is
difficult for the Japanese, the initial ts sound is resisted by Americans, 
etc. There are also some consonant clusters which are difficult for many 

It may be impossible to get by with significantly fewer consonants in 
a language which borrows new internationalized words from many sources, 
context will usually make up for problems, and the system in any case is 
simpler than many national languages, but one could still wish that there 
were fewer consonants. In Novial, perhaps the only language project 
designed by a modern professional linguist, Jespersen (1928) emphasized the 
usefulness of reducing contrasts among consonants, and he included among 
the fricatives only /f/, /v/, and /s/, with no affricates. On the other 
hand, he found it necessary to include both voiced and voiceless stops, and 
both /l/ and /r/, as a result of incorporating a European-based lexicon.
Papers by Sapir, Bloomfield, Boas, Gerig, and Krapp (1925), and by 
Troubetzkoy (1939) both advocated further reduction in the consonant 
repertoire, eliminating all voicing contrasts and the l-r contrast (Sapir
et al. also proposed a three-vowel rather than a five-vowel system, and
Troubetzkoy eliminated HI). Similar suggestions have been made by White 
(1972). But these studies were not subjected to the crucial test Jespersen 
faced in actually constructing a language embodying these criteria. 
Moreover, Chin-Wu Kim (personal communication) has pointed out that their 
calculations of the number of possible polysyllabic words composed of CV 
syllables were simplistic: no language uses forms such as "tototo", and 
closely related forms such as kamata, katama, makata, mataka, etc., would
severely strain the memory. For these reasons the number of available 
words was very much overestimated. 
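The size of the overestimate can be illustrated with a small calculation. This is a sketch under stated assumptions, not a reconstruction of the calculations in the cited proposals: the inventory sizes are hypothetical, and the two filters (no syllable used twice in a word; words built from the same syllables counted only once) are one crude way of formalizing the two objections mentioned above.

```python
from math import comb, perm


def cv_word_counts(n_consonants, n_vowels, n_syllables):
    """Count words made of CV syllables under three increasingly strict views."""
    syllables = n_consonants * n_vowels  # number of distinct CV syllables
    return {
        # every string of CV syllables, including forms like "tototo"
        "all_strings": syllables ** n_syllables,
        # strings in which no syllable occurs more than once
        "no_repeated_syllable": perm(syllables, n_syllables),
        # as above, but kamata, katama, makata, mataka, ... count as one
        "distinct_syllable_sets": comb(syllables, n_syllables),
    }


if __name__ == "__main__":
    # Hypothetical inventory: 10 consonants, 3 vowels, three-syllable words.
    for label, value in cv_word_counts(10, 3, 3).items():
        print(label, value)
```

With these assumed figures the naive count is 27000, while the count of mutually dissimilar syllable sets is 4060, which gives some feeling for how far a simple combinatorial estimate can overshoot.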

A reformer might like to simplify the overly-rich consonant system of 
Esperanto. In this, as in other areas of "imperfection", such attempts 
have historically been blocked by a special social contract among Esperanto 
speakers not to change the basic structure of the language. This 
"untouchability" of the core was established at the first international 
Esperanto conference in 1905, to give the language the stability many other 
projects involving constructed languages never achieved, and it is thought 
to have been critically important in permitting the emergence of a 
community of speakers and of a significant literature (Drezen 1931; Janton 
1973; Golden 1977). This did not prevent further evolution of the
language, because the core contained enough unexploited potentialities to
accommodate new structures, but it did have the effect of drawing the
boundaries of the Esperanto community in such a way as to exclude would-be
reformers and their "deviant" creations.

Corresponding to "untouchability", there is a myth in the Esperanto
movement, originally articulated by Zamenhof, the creator of Esperanto,
that if a truly representative international organization should decide
that Esperanto would be used among the nations, that organization would be 
entitled to empower a panel of experts to make a one-time reform of the 
language. Given recent experience with national language planning 
activities, belief in such a procedure seems naive. Reform would have to 
be carried out over a long period of time, in small steps rather than all 
at once, if at all, unless the changes were completely trivial. This would
be necessary both because of the question of authority (how is it decided 
that a particular organization is entitled to authorize changes?) and 
because all changes, even seemingly small ones, can have complex 
ramifications throughout the system of the language. 

Is there such a thing as "good" Esperanto pronunciation? John Wells 
(1979) has given a convincing argument that there has evolved a communal 
consensus on this — that there is a norm for "good" pronunciation. He 
points out that one often hears Esperanto speakers say "She/he has a 
good/bad pronunciation", and he suggests that the basis for such 
statements may be found in several related criteria: practical, 
linguistic, geographical, and sociological. The practical requirement of 
intelligibility between speakers of different first languages is paramount. 
Good pronunciation also reflects the phonological character of Esperanto, 
distinguishing among all the phonemes, minimizing allophony, and conserving 
the strict relation between pronunciation and orthography (for example, a 
tendency for Spanish or Japanese speakers to fail to distinguish between
/b/ and /v/ not only would cause practical problems of communication but 
also goes against the linguistic structure of Esperanto). Good 
pronunciation is geographically neutral, not manifesting regional or 
national peculiarities and making it difficult to identify the speaker's
nationality (for example, French speakers should fight against a tendency 
to stress final rather than penultimate syllables). This does not imply 
that mild national accents are not tolerated or even enjoyed, but it 
appears that speakers do recognize and prize an international or non- 
national pronunciation style. The sociological criterion reflects the fact 
that, due to the development of a community of speakers, certain communal 
attitudes have emerged, including attitudes toward certain kinds of
pronunciation, which may be the only way to explain a general recognition 
that a tapped or trilled /r/ is preferred to other varieties. Wells 
summarizes by pointing out that while these norms are not absolutely 
uniform and certainly not observed by all speakers, it is an important
sociolinguistic fact that norms for pronunciation do exist. He also points 
out that it is particularly easy for a Serbo-Croatian speaker to attain the 
norm, because of the coincidental similarity of the sound system to that of 
Esperanto, but that it is possible for others to approach the norm, with 
good teaching and effort. 

This point of view is further illuminated by historical aspects of 
Esperanto pronunciation. Kalocsay (1931) stated that Ludwig Zamenhof, the
creator of Esperanto, did not lengthen stressed vowels, while most speakers
did and do. The only existing recording of Zamenhof was made in Barcelona 
in 1909. The poor quality of the recording makes analysis difficult. 
However, Richard Wood and I listened to this recording together, and we 
both feel that Zamenhof's pronunciation is just like good present-day
pronunciation, without much allophonic variation in vowel quality, and with 
lengthening of stressed vowels (Wood 1980). Wood's well-known 
expertise in shortwave listening qualifies him to analyze this noisy 
recording! In text-to-speech synthesis (Sherwood 1978, 1981, 1982a, 
Sherwood and Sherwood 1982) I find experimentally that making stressed 
vowels 50% longer than unstressed vowels yields natural-sounding Esperanto 

Some older treatments of Esperanto (McQuown, 1936; Kalocsay and 
Waringhien, reprinted 1980) spoke of "rules" that one should follow in 
producing "good" allophonic variation (these had to do in particular with 
the status of vowels in open and closed syllables). But now the norm which 
has emerged characterizes the vowel system as requiring little allophonic 
variation among the five vowels, as in Greek or Japanese. As Wells puts 
it, one should be tolerant of allophones, but one certainly should not 
require specific variation, since any particular rule will be unnatural for 
many speakers. The simplest rule, and the one which is most universal, is 
to not vary, and this is the standard which has in fact emerged. 

Another illuminating anecdote mentioned by Wood (forthcoming) is 
that when planning the 1907 international Esperanto congress in Cambridge, 
British Esperantists argued over whether they should be speaking Esperanto
with English pronunciation, some feeling that this was the only proper way 
to pronounce the language. In the long run a different philosophy won out, 
that Esperanto should be spoken in an international form appropriate to the 
nature of the language. The fact that such a question could even come up 
is astounding for speakers now, when it is assumed that an international 
norm is what everyone should aim at (although with allowance for national 
accents as long as intelligibility is maintained). 

Wood (forthcoming) reports on the speech of a handful of native 
speakers he has known. He found a range of accents, presumably derived 
from the national accents of the non-native parents or of the child's
playmates. It would be fruitful to compare the variation among Esperanto 
native speakers with that found among native speakers of modern Hebrew 
during its early development. David Gold (personal communication) reports 
his observations of native Hebrew speakers of various ages, all of Yiddish 
background. He says that the pronunciation of those born before about 1930 
is so strongly influenced by the Yiddish linguistic background of parents 
or grandparents that a naive observer might not take these people to be 
native speakers of Hebrew. Those born after 1930 but before about 1955 
clearly betray their Yiddish backgrounds but may pass for natives when 
judged by naive observers. Gold summarizes these observations by saying 
that just as in Esperanto, an indigenous norm is emerging. It is now 
possible to speak Hebrew without revealing one's linguistic, communal,
geographical, or other background. This possibility was absent in early 
Modern Hebrew. 


6. Phonology and morphology 

Another factor contributing to a unified pronunciation norm is the 
drastic simplicity of the phonology and morphology. Esperanto phonology is 
rather rudimentary. The underlying phonetic representations of the 
morphemes in the lexicon map directly without complex phonological 
derivations into the surface phonetic forms (and graphemes too, for that 
matter, since written Esperanto is practically a transcription of the 
spoken form, or, given the powerful influence of the written language among 
literate, dispersed, second-language speakers, perhaps it is accurate to 
say that the spoken form is a realization of the written form). It is true 
that many speakers follow some natural assimilation rules (e.g., with 
nasals), but these phenomena are marginal in the overall scheme of things. 
Within a morpheme many natural rules shared by many languages are already 
accounted for in the lexical forms, which come from national languages. 
Juncture is often heard in agglutinative compounds to avoid assimilation 
(and to mark the presence of a morpheme boundary). In many languages, 
assimilation across word or morpheme boundaries does not occur in formal 
speaking styles, and as a mainly second language Esperanto is usually
spoken in a relatively formal style, though it is said that Esperanto youth 
get-togethers have given rise to informal and rather special kinds of 

The morphology is also extremely simple. Almost any morpheme can take
a "grammatical ending" to form a noun, adjective, adverb, verb, or
participle: helpo 'help' (noun); helpa 'helpful'; helpe 'helpfully';
help-is/as/os/us/u/i (past, present, future, conditional, imperative, and
infinitive verb forms). The past, present, and future passive and active
participle endings are it/at/ot and int/ant/ont: helpita helpanto 'a
helped helper'. Adjectives and nouns agree in number (plural -j) and case
(accusative -n): helpaj(n) helpoj(n) 'helpful helps'. Terms of endearment
are formed by deforming the root and adding -ĉjo (masculine) or -njo
(feminine): Petro-Peĉjo, Maria-Manjo, paĉjo 'dad', panjo 'mom'. Except for
-io in country names (itala 'Italian', Italio 'Italy') and some suffixes
used in technical vocabulary, especially chemistry, these grammatical
endings and suffixes are the only bound morphemes in the language.
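Because the endings are invariant, the paradigm just described can be generated by plain string concatenation. The sketch below assumes nothing beyond the endings listed above; the dictionary keys and function names are illustrative labels, not established terminology.

```python
# Grammatical endings as listed in the text; the key names are assumed labels.
ENDINGS = {
    "noun": "o",
    "adjective": "a",
    "adverb": "e",
    "past": "is",
    "present": "as",
    "future": "os",
    "conditional": "us",
    "imperative": "u",
    "infinitive": "i",
}

PARTICIPLES = {
    ("passive", "past"): "it",
    ("passive", "present"): "at",
    ("passive", "future"): "ot",
    ("active", "past"): "int",
    ("active", "present"): "ant",
    ("active", "future"): "ont",
}


def inflect(root, category):
    """Attach a grammatical ending to an invariant root."""
    return root + ENDINGS[category]


def participle(root, voice, tense, ending="a"):
    """Form a participle; the final ending makes it an adjective, noun, or adverb."""
    return root + PARTICIPLES[(voice, tense)] + ending


def decline(word, plural=False, accusative=False):
    """Add plural -j and accusative -n, in that order, to a noun or adjective."""
    return word + ("j" if plural else "") + ("n" if accusative else "")


if __name__ == "__main__":
    print(inflect("help", "noun"), inflect("help", "present"))
    print(participle("help", "passive", "past"),
          participle("help", "active", "present", "o"))
    print(decline("helpa", plural=True, accusative=True),
          decline("helpo", plural=True, accusative=True))
```

No stem alternations or irregular forms need to be listed, which is the point of the passage: the whole inflectional system fits in two small tables.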

To be more precise, morphemes divide into two classes: content
morphemes which must have a "grammatical ending" (such as help-), and
function morphemes which may take such endings but need not (prepositions,
numerals, some "primitive" adverbs, etc.: tri 'three', tria 'third', trie
'thirdly'). Since the content morphemes need only a "grammatical ending",
it is simplest to consider them as free morphemes, unlike the small number
of truly bound morphemes listed above. All morphemes are strictly
invariant in form except in endearment terms. About the only thing
remaining for this vestigial morphology to do is to specify the allowed
segmental sequences within syllables, although there also appear to be some
phonotactic restrictions on the allowable forms of compound words: both
banoĉambro and banĉambro 'bathroom' are heard, but partopreni 'to take
part' is never partpreni. It should be mentioned that the highly
productive short Esperanto morphemes traditionally called "affixes" are in
fact themselves free morphemes, too: kato 'cat', katido 'kitten', hundo
'dog', hundido 'puppy', ido 'offspring'.

As a consequence of the invariance of its morphemes, Esperanto is
rigorously agglutinative in its word-building. When various universal 
measures are applied (morphemes per word, etc.), Esperanto scores closest 
to languages like Turkish (Brozović 1976). Yet its agglutination goes
beyond that of Turkish, since the Esperanto "affixes" are actually free 
morphemes (nor is there phonetic variation of affixes due to vowel harmony 
as there is in Turkish). Since there are hardly any bound morphemes, some 
aspects of the language are reminiscent of Chinese (although unlike 
Chinese, Esperanto morphemes are often polysyllabic). It seems to me that 
the characteristically productive agglutination of morphemes in Esperanto 
contributes to the lack of derivational rules in the rudimentary phonology, 
since almost any sounds may stand next to any other sounds as a result of 
agglutination, and about the only fully universal rules (of assimilation, 
etc.) valid for the speakers of many languages are simply to butt the 
sounds up against each other with little or no allophonic variation (one 
might say the phonology is "agglutinative"). This tendency is yet further 
reinforced by the power of the written language in a modern literate world 
to impose spelling pronunciations. On the power of the printed word in 
literate societies, see Levitt (1978) and Bentur (1978). The situations 
described by these authors are generalized in the case of Esperanto as a 
result of the minimal phonological and morphological rules, of the strictly 
phonetic character of the written form, and of the mainly second-language 
and written use of Esperanto. 
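A morphemes-per-word measure of the kind mentioned above is easy to compute when morphemes never change shape. The following sketch is only an illustration: the tiny lexicon is assembled from examples in this paper, the segmentation routine is a simple longest-match search with backtracking, and neither is claimed to be the method of the typological study cited.

```python
# Toy lexicon drawn from examples in the text; a real study would need a
# full morpheme list. All names here are illustrative assumptions.
LEXICON = {"kat", "hund", "id", "help", "ban", "ĉambr", "part", "pren",
           "o", "a", "e", "i"}


def segment(word, lexicon=LEXICON):
    """Split a word into invariant morphemes, or return None if impossible.

    Longer morphemes are tried first; the search backtracks on failure.
    """
    if word == "":
        return []
    for morpheme in sorted(lexicon, key=len, reverse=True):
        if word.startswith(morpheme):
            rest = segment(word[len(morpheme):], lexicon)
            if rest is not None:
                return [morpheme] + rest
    return None


def morphemes_per_word(words, lexicon=LEXICON):
    """Average number of morphemes per word over the words that can be segmented."""
    parses = [segment(w, lexicon) for w in words]
    counts = [len(p) for p in parses if p is not None]
    return sum(counts) / len(counts)


if __name__ == "__main__":
    for w in ["kato", "katido", "banĉambro", "banoĉambro", "partopreni"]:
        print(w, segment(w))
    print(morphemes_per_word(["kato", "katido", "hundido"]))
```

Because no allomorphy has to be undone before matching, plain prefix comparison suffices here; a language with vowel harmony or stem alternations would need an extra normalization step.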

7. Possible futures 

What if Esperanto were used everywhere as the normal means of 
communication between people of different first languages? Here the reader 
must either suspend disbelief in the development of this kind of 
bilingualism, for the sake of following the argument, or reflect carefully 
on such situations as the following. The European Community is committed 
by treaty to absolutely equal treatment of all its national languages. It 
has just added Greek, making a total of seven official languages, and it 
will soon add Spanish and Portuguese, with the likelihood of adding Turkish 
around 1990. The competing goals of equity and efficiency may drive the 
Community to adopt a politically neutral second language. 

It is often claimed that under widespread use Esperanto would break up 
into mutually incomprehensible dialects. This may be an invalid 
conclusion, since it is based on the way languages developed before there 
existed rich modes of global communication. A language breaks up into 
dialects when there is isolation, and under present conditions of mass 
literacy, global electronic communication, and mass global travel, 
isolation is increasingly uncommon. Moreover, if Esperanto were learned in 
school and used mainly for inter-national and inter-cultural rather than 
local communications, its patterns of usage would tend to block the normal 
processes of dialect differentiation. The specific properties of Esperanto 
which have contributed to unity at present (five-vowel system, phonetic 
orthography, etc.) would reinforce these tendencies. 

A closely related question is what kind of evolution of the (unified) 
language would be likely. Given the difficulties with the consonants, one 
might predict a neutralization of some distinctions which might
nevertheless remain in the written language (e.g., posta/poŝta
'later'/'postal'). The popular authority of writing would tend to combat
this otherwise natural tendency, although Gold (personal communication) 
points out that observers in many languages have noted that the average 
person tends to misuse or omit diacritics, so that the written distinction 
might be endangered, too. 

There might be changes in certain morphemes whose phonetic structure 
differs from that permitted in many languages, such as gvido (which might 
change to gwido) and in the morphemes beginning with eks- and the unnatural
ekz-, both of which collide with the productive prefixes eks- 'ex-' and ek-
'start', though these prefixes typically are followed by junctures not
present in the morphemes. It might be that spelling pronunciations would 
not shield these from change. Given the many errors speakers exhibit in 
the accusative marking of direct objects (Sherwood 1982b), and the 
pronounced tendency in the spoken language to use solely SVO order in main 
clauses, it is probable that the accusative would become optional de jure
as well as de facto in spoken and perhaps written SVO sentences.

Because standard Esperantist publicity consistently claims that 
Esperanto is intended solely as a second, not a first language, to solve 
problems of international communication and to shelter the national and 
regional languages from the linguistic imperialism of the big languages, 
the existence of native speakers is something of an embarrassment for some
Esperantists. The eminent Hungarian linguist and Esperantist Géza Bárczi
(1966) publicly condemned the "fanaticism" of parents who bring up their 
children speaking Esperanto. He stated flatly that such behavior is not 
acceptable. This view misses the point that as soon as people began 
communicating in Esperanto, some of these people married each other, and it 
was natural for them to continue speaking Esperanto. It was impossible to 
avoid the emergence of native speakers. On the other hand, many linguists 
place great weight on the existence of native speakers, without whom 
Esperanto would not be a "real" language, an attitude based on assumptions 
about the role of first languages which are not necessarily relevant for an 
auxiliary language. This attitude does however have validity to the extent 
that a lack of native speakers of Esperanto would be strong evidence 
against the claim that communication occurs and that a community exists, 
since these factors have indeed led inevitably to the birth of children who 
learn Esperanto as their first language. 

In these contexts the question naturally arises whether widespread use 
of Esperanto would ultimately obliterate all other languages, leading to a 
monolingual world. This is a tabu subject within the Esperanto movement, 
given the second-language orientation of Esperanto publicity. The opposing 
tendencies would seem to be on the one hand the fact that minority 
languages and dialects have so far proven to be much more resilient to the 
pressures of national languages than had been expected, on the other the 
fact that mass culture tends toward standardization. It is sobering to 
hear that a television program is not viable if it appeals "only" to a few 
million people. Given such orientations and economics, it seems likely 
that television and movie producers, novelists, and publishers would tend 
to use a second language known all over the world rather than any first 
language known "only" to a few hundred million people. Also, a minority 
language may resist successfully when the majority language belongs to a 
different, possibly disliked social group, whereas a neutral second
language belonging to no one would not present the same barriers to
acceptance (this conjecture was pointed out to me by Mikuláš Nevan, the
father of a native Esperanto speaker). But David Jordan points out that in 
the process of coming into widespread use, especially if assisted by 
political forces, Esperanto would likely acquire resistance and enemies by 
association with those groups favoring its use, thus raising barriers to 
acceptance. In the thirties, Hitler crushed the German Esperanto movement 
for being "communist", and Stalin simultaneously obliterated the Soviet 
Esperanto movement for being "cosmopolitan", which gives some perspective 
on the relevant political considerations. (It may be more accurate to say 
that Hitler and Stalin shared the same rationale: both movements were 
"internationalist" and may also have been perceived as "Jewish".) 

While there are obviously advantages to a monolingual world, a 
multilingual world also offers important benefits. The mutual 
unintelligibility of languages provides a useful "customs barrier" to 
protect national and ethnic experimentation in diversity. Could these 
customs barriers still function if everyone learned the same auxiliary 
language? Yar Slavutych of the University of Alberta (personal 
communication) reports that in non-Russian areas of the Soviet Union and in 
Eastern Europe it already happens occasionally that a university professor 
will ask the class whether everyone speaks the local language and, if there 
is even one Russian student who does not, the professor may lecture in 
Russian to be accommodating, on the basis that all the other students are 
presumed to know Russian as a second language. If it could be assumed that 
everyone knew Esperanto well, any gathering of people of mixed native 
languages would likely use Esperanto out of a sense of fair play. 

The existence of a common second language would in itself greatly 
promote the kinds of international exchanges which would produce more and 
more linguistically mixed gatherings, in all sorts of settings (university, 
business, government, tourism, etc.). No matter how strong a theoretical 
commitment to maintenance of the native language and to the reservation of 
Esperanto for inter-language situations, one might find oneself speaking 
Esperanto many times every day even in one's own home town. It might
require great linguistic sophistication on the part of the general populace 
to recognize both the cultural benefits of preserving the native language 
and the clear and present danger of its loss. One possibility is that 
those national tongues which are high literary languages not generally 
spoken in family life might be displaced (e.g., German and Italian for at 
least some speakers), while the local ethnic languages would be encouraged. 
There could be a stable diglossia involving the language of the home (often 
not a national language) and the auxiliary language. 

It is important not to misconstrue the nature of the danger. In the 
absence of the kind of language planning on an international scale which 
could lead to the wide-spread use of Esperanto, the world may drift in its 
urgent need for better international communication into an unplanned 
solution. In particular, English might continue to spread and eventually 
become the universally-known auxiliary language, with the same attendant 
dangers in the long run, but with great inequity in the short run, due to
the fact that it is many times more difficult to learn than Esperanto and 
because it gives enormous advantages to its native speakers. But some 
scholars believe that English has passed its peak (Starr 1978), and the
world may simply be headed for chaotic multi-lingualism and a different set
of difficulties and dangers.


*I give special thanks to Chin-Chuan Cheng for support and 
encouragement. I also thank David Gold, David Jordan, and Richard Wood for 
valuable comments. I am grateful to the Department of Linguistics for 
hospitality during a period of interdisciplinary study (1979-80), and to
the Computer-based Education Research Laboratory and Department of Physics 
for their generosity in making that study possible. This has led to a new 
and deeper association with the Department of Linguistics. This paper was 
presented at the 1980 Symposium on Native-Language Influence on Esperanto, 
held in Urbana. 


BÁRCZI, Géza. 1966. Világnyelvvé válhatik-e az eszperantó? Magyar Nemzet,
  March 27. Reprinted with Esperanto translation (Ĉu Esperanto fariĝos
  mondlingvo?) in Soós, Eva and Ehmann, Beata, ed., Esperantologiaj
  Kajeroj 3, University of Budapest (Eötvös Loránd), 1977, 265-272.
  This volume is a memorial to Bárczi.
BENTUR, Esther. 1978. Orthography and the formulation of
  phonological rules. Studies in the Linguistic Sciences 8:1.1-25.
BROZOVIĆ, Dalibor. 1976. Pri pozicio de Esperanto en lingva tipologio. In
  Zlatko Tišljar, ed., Internacia Lingvistika Simpozio, pp. 33-84.
  Zagreb: Internacia Kultura Servo.
CHENG, Chin-Chuan. 1982. The Esperanto of El Popola Cinio. Studies in the
  Linguistic Sciences (this volume).
DREZEN, E. 1931. Historio de la Mondolingvo. Reprinted 1967, Oosaka:
ESPERANTO. 1982. See table on p. 39 of the February issue of this official
  organ of the World Esperanto Association (Rotterdam).
FISCHER, Rudolf. 1981. Dulingva edukado per Esperanto kaj la gepatra
  lingvo. Eŭropa Dokumentaro 30:18-20.
GOLDEN, Bernard. 1977. Political factors in the international language
  movement. Eco-Logos 23:86.13-16.
GOLDEN, Bernard. 1980. An analysis of some characteristics of Hungarian
  Esperanto. Presented at the 1980 Symposium on Native-Language
  Influences on Esperanto, held in Urbana.
GOLDEN, Bernard. 1981. Kelkaj trajtoj pri la "hungara dialekto" de
  Esperanto. Planlingvistiko 1:0.11-15.
GROSJEAN-MAUPIN, E., Esselin, A., Grenkamp-Kornfeld, S., and Waringhien, G.
  1934. Plena Vortaro de Esperanto. Reprinted with technical
  supplement in 1954. Paris: Sennacieca Asocio Tutmonda.
JANTON, Pierre. 1973. L'espéranto. ("Que sais-je" series.) Paris:
  Presses Universitaires de France.
JESPERSEN, Otto. 1928. An International Language. London: George Allen &
  Unwin Ltd. 61-82.
KALOCSAY, K. 1931. Lingvo Stilo Formo. Budapest. 116-121.
  Reprinted in Blanke, Detlev, ed.: Esperanto, Lingvo, Movado,
  Instruado. Berlin: Zentraler Arbeitskreis Esperanto, 1977, 90-92.
KALOCSAY, K., and Waringhien, G. 1980. Plena Analiza Gramatiko de
  Esperanto. 4th, updated edition. Rotterdam: Universala Esperanto-
LEVITT, Jesse. 1978. The influence of orthography on phonology: a
  comparative study (English, French, Spanish, Italian, German).
  Linguistics 208:43-67.
MCQUOWN, N. A. 1936. A Comparative Study of Esperanto from the Standpoint
  of Modern German. University of Illinois master's thesis.
NEMERE, Stefano. 1968. 2-a renkontiĝo de Esperantistaj familioj. Gepatra
  Bulteno 8: (Oct.-Dec.) 12-13.
PIRON, Claude. 1977. La okcidenta dialekto. Esperanto no. 7-8 (July-
  August), 125-126.
SAPIR, E., Bloomfield, L., Boas, F., Gerig, J. L., and Krapp, G. P. 1925.
  Memorandum on the Problem of an International Auxiliary Language.
  Romanic Review 16:244-256.
SHERWOOD, B. A. 1978. Fast text-to-speech algorithms for Esperanto,
  Spanish, Italian, Russian, and English. International Journal of
  Man-Machine Studies 10:669-692.
SHERWOOD, B. A. 1981. Speech synthesis applied to language teaching.
  Studies in Language Learning 3:1.171-181.
SHERWOOD, B. A. 1982a. Spertoj pri sintezo de Esperanta parolado. In
  Helmar Frank, Yashovardhan, and Frank-Böhringer, ed.:
  Lingvokibernetiko, pp. 63-74. Tübingen: Gunter Narr Verlag.
SHERWOOD, B. A. 1982b. Statistical analysis of conversational Esperanto,
  with discussion of the accusative. Studies in the Linguistic Sciences
  (this volume).
SHERWOOD, B. A. and Cheng, Chin-Chuan. 1980. A linguistics course on
  international communication and constructed languages. Studies in the
  Linguistic Sciences 10:1.189-201. ERIC Clearinghouse document
  ED 182995.
SHERWOOD, J. 1981. PLATO Esperanto materials. Studies in Language Learning
SHERWOOD, J., and Sherwood, B. 1982. Computer voices and ears furnish
  novel teaching options. Speech Technology 1:3.46-51.
STARR, S. F. 1978. English Dethroned. Change Magazine, May, 26-31.
WARINGHIEN, G. 1970. Plena Ilustrita Vortaro de Esperanto. Paris:
  Sennacieca Asocio Tutmonda.
WELLS, John. 1979. Lingvistikaj aspektoj de Esperanto. Rotterdam: Centro
  por Esploro kaj Dokumentado de la Monda Lingvo-Problemo. 22-26.
WHITE, R. G. 1972. Towards the construction of a Lingua Humana. Current
  Anthropology 13, 113-123.
WOOD, Richard E. 1979. A voluntary non-ethnic and non-territorial speech
  community. In William Francis Mackey and Jacob Ornstein, eds.,
  Sociolinguistic Studies in Language Contact - Methods and Cases, pp.
  433-450. The Hague, Paris, and New York: Mouton Publishers.
WOOD, Richard E. 1980. The development of Esperanto pronunciation.
  Presented at the 1980 Symposium on Native-Language Influence on
  Esperanto, held in Urbana.
WOOD, Richard E. (forthcoming). A sociolinguistic approach to an
  unconventional language community. In Robert N. St. Clair and Moshe
  Nahir, eds., The Politics of Linguistic Accommodation. Lexington:
  University Press of Kentucky.

The following special issues are in preparation: 

Papers on Diachronic Syntax 
Editor: Hans Henrich Hock 

Studies in Language Variation: 

Nonwestern Case Studies 

Editor: Braj B. Kachru 



The following issues are available:

Spring 1974 Vol. 4, No. 1, Papers in General Linguistics

Spring 1975 Vol. 5, No. 1, Papers in General Linguistics

Fall 1976 Vol. 6, No. 2, Papers on African Linguistics
Editors: Eyamba G. Bokamba and
Charles W. Kisseberth

Spring 1977 Vol. 7, No. 1, Papers in General Linguistics

Fall 1977 Vol. 7, No. 2, Studies in East Asian Linguistics
Editors: Chin-chuan Cheng and
Chin-W. Kim

Spring 1978 Vol. 8, No. 1, Papers in General Linguistics

Fall 1978 Vol. 8, No. 2, Linguistics in the Seventies:
Directions and Prospects
Editor: Braj B. Kachru

Spring 1979 Vol. 9, No. 1, Papers in General Linguistics

Fall 1979 Vol. 9, No. 2, Relational Grammar and Semantics
Editor: Jerry L. Morgan

Spring 1980 Vol. 10, No. 1, Papers in General Linguistics

Fall 1980 Vol. 10, No. 2, Studies in Arabic Linguistics
Editor: Michael J. Kenstowicz

Spring 1981 Vol. 11, No. 1, Papers in General Linguistics

Fall 1981 Vol. 11, No. 2, Dimensions of South Asian
Editor: Yamuna Kachru

Spring 1982 Vol. 12, No. 1, Papers in General Linguistics








Orders should be sent to: 

SLS Subscriptions, Department of Linguistics 

4088 Foreign Languages Building 

University of Illinois 

Urbana, Illinois 61801 

Studies in
The Linguistic Sciences


Department of Linguistics 
University of Illinois 



EDITORS: Charles W. Kisseberth, Braj B. Kachru, Jerry L. Morgan 

REVIEW EDITORS: Chin-W. Kim and Ladislav Zgusta 

EDITORIAL BOARD: Eyamba G. Bokamba, Chin-chuan Cheng, Peter 
Cole, Alice Davison, Georgia M. Green, Hans Henrich Hock, Yamuna 
Kachru, Henry Kahane, Michael J. Kenstowicz and Howard Maclay. 

AIM: SLS is intended as a forum for the presentation of the latest original 
research by the faculty and especially students of the Department of 
Linguistics, University of Illinois, Urbana-Champaign. Especially invited 
papers by scholars not associated with the University of Illinois will also be included.

SPECIAL ISSUES: Since its inception SLS has devoted one issue each year to 
restricted, specialized topics. A complete list of such special issues is given on 
the back cover. The following special issue is under preparation: Studies in 
Language Variation: Nonwestern Case Studies, edited by Braj B. Kachru. 

BOOKS FOR REVIEW: Review copies of books (in duplicate) may be sent to 
the Review Editors, Studies in the Linguistic Sciences, Department of 
Linguistics, University of Illinois, Urbana, Illinois, 61801. 

SUBSCRIPTION: There will be two issues during the academic year. Requests 
for subscriptions should be addressed to SLS Subscriptions, Department of 
Linguistics, 4088 Foreign Languages Building, University of Illinois, Urbana, 
Illinois, 61801. 

Price: $5.00 (per issue) 



Edited by 
Hans Henrich Hock 

FALL, 1982 




Hans Henrich Hock: Clitic Verbs in PIE or Discourse-based Verb Fronting? 
Sanskrit sa hovaca gargyah and Congeners in Avestan and Homeric 
Greek 1 

Hans Henrich Hock: The Sanskrit Quotative: A Historical and Comparative 
Study 39 

Yamuna Kachru: Syntactic Variation and Language Change: Eastern and 

Western Hindi 87 

Rajeshwari Pandharipande : Counteracting Forces in Language Change: 

Convergence vs. Maintenance 97 

Elizabeth Pearce: Infinitival Complements in Old French and Diachronic 

Change 117 

William D. Wallace: The Evolution of Ergative Syntax in Nepali. ... 147 

The stepchild of historical linguistics traditionally has been
diachronic syntax. For other areas of language change (phonology,
morphology, language contact, etc.), historical linguistics has been
able to draw on an impressive number and variety of case studies
which have accumulated over the last two centuries, especially after
the neogrammarian "revolution". While the neogrammarians and their
early followers did devote some of their enormous energy also to
diachronic syntax (cf. e.g. Delbrück 1893-1900, Fourquet 1938, Richter
1903), the number of published studies and their depth (which generally
goes no farther than word order and morphosyntax) are far less impressive.
And mid-century structuralism ushered in a virtual standstill
in diachronic syntax.

It is only with the 1965 publications of Klima and Traugott that 
interest in diachronic syntax has been rekindled, as is shown by the 
appearance of volumes like Li 1975 and 1977, Lightfoot 1979, and 
Steever et al. 1976. At the same time, however, the "gap" in empir- 
ical understanding of diachronic syntax which has resulted from the 
relative dearth of earlier relevant case studies still persists. In 
part this is no doubt attributable to the fact that the empirical gap 
is too large to be filled in one generation. In part, however, it 
also results from an understandable impatience, a desire to "jump" 
the gap, so as to work for instance with language families whose 
earlier history is not well known or to formulate theories of syntactic 
change which in terms of generality (and ingenuity) can compete with 
those proposed for phonological etc. change. As a result, much of 
current diachronic syntactic work does not provide the in-depth case
studies which are so urgently required. (Thus, out of 14 contributions 
to Li 1977, only four give case studies from languages or language 
families with well-attested histories.) 

The present volume is intended as a modest contribution toward 
bridging the noted empirical gap by providing six case studies of 
syntactic change, all from Indo-European languages, i.e. from members of 
a language family whose history is relatively well attested. In scope, 
they extend from traditional historical/comparative studies to investi- 
gations of syntactic change in convergence areas. In subject matter, 
they range from word order phenomena to questions of ergative vs. non- 
ergative syntax. They reflect current work in the University of Illinois 
Department of Linguistics. 


DELBRÜCK, Berthold. 1893-1900. Vergleichende Syntax der indogermanischen
Sprachen. (= Vols. 3-5 of Brugmann and Delbrück's Grundriss
der vergleichenden Grammatik der indogermanischen Sprachen.)
Strassburg: Trübner.
FOURQUET, J. 1938. L'ordre des éléments de la phrase en germanique
ancien. (Publications de la Faculté des Lettres de l'Université
de Strasbourg, 86.)
KLIMA, E.S. 1965. Studies in diachronic transformational syntax.
Harvard University Ph.D. dissertation.
LI, Charles N., ed. 1975. Word order and word order change. Austin:
University of Texas Press.
LI, Charles N., ed. 1977. Mechanisms of syntactic change. Austin:
University of Texas Press.
LIGHTFOOT, David. 1979. Principles of diachronic syntax. Cambridge:
University Press.
RICHTER, Elise. 1903. Zur Entwicklung der romanischen Wortstellung
aus der Lateinischen. Halle: Niemeyer.
STEEVER, S.B., et al., eds. 1976. Parasession on diachronic syntax.
Chicago: CLS.
TRAUGOTT, E.C. 1965. Diachronic syntax and generative grammar.
Language 41.402-15.

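Studies in the Linguistic Sciences
Volume 12, Number 2, Fall 1982

Clitic Verbs in PIE or Discourse-based Verb Fronting?
Sanskrit sa hovaca gargyah and Congeners in Avestan and Homeric Greek

Hans Henrich Hock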

In 1892, Wackernagel proposed the hypothesis that PIE main-
clause verbs were unaccented, therefore clitic, and thus moved to
clause-second position. This view has been accepted by a number
of linguists and has been used by Friedrich (1975, 1976, 1977) as
an argument against the reconstruction of PIE as having SOV order.
The most crucial evidence for Wackernagel's hypothesis consists in
the Vedic-Prose formula of the type sah ha uvaca gargyah '(then)
Gargya said', with unaccented uvaca 'spoke' occurring after the
initial, accented sah and before the accented words of the rest of
the sentence.

This paper shows that a different analysis of this construction
is preferable, namely that uvaca has been fronted and that
its specific position results from a conflict between the fronting
of deictic sah and of "cataphoric" uvaca. Evidence from Avestan
and Homeric Greek shows this to be an old phenomenon, probably
inherited from PIE. The conclusion is that Wackernagel's hypothesis
cannot be maintained and therefore cannot be used as an argument
in the reconstruction of PIE word order.

1: In something like an appendix (pp. 425-34) to his famous 1892 paper
on the ordering of clitics in Proto-Indo-European (PIE), Wackernagel
tentatively proposed that in PIE main clauses (MC), the verb originally was
clitic and, being clitic, occurred in clause-second position. His explicit and
implicit arguments in favor of this claim can be summarized as follows:

(a) Early Sanskrit made an accentual difference between MC verbs, which
ordinarily were unaccented, and dependent-clause (DC) verbs, which were
accented; cf. (1) vs. (2) below (my examples). This pattern had been connected
with the so-called recessive accent of Greek verbs under the hypothesis
that Greek had generalized the unaccented nature of the MC verb and had
secondarily imposed on this verb a phonologically predictable new accentuation.
The combined evidence of Greek and Sanskrit then was taken to indicate that
the early Sanskrit accentual differentiation between MC and DC verbs was a
feature also of PIE.

(1) indrah vrtram ahan
    'I.' 'V.' 'slew'
    'Indra slew Vrtra'

(2) yad indrah vrtram ahan
    'when Indra slew Vrtra'

(b) This accentual difference between MC and DC verbs should in Wacker- 
nagel's view have been accompanied by a difference in word order: Like other 
clitics, unaccented, clitic MC verbs would have to occur in clause-second po- 
sition, while accented DC verbs would be clause-final. Wackernagel felt that 
this word order distinction was in fact preserved in German, and in early
Germanic in general. On the other hand, the prevailing verb-final (or SOV)
pattern of Sanskrit, Latin, and [early] Lithuanian could be explained as a 
generalization of the DC pattern. This, to be sure, left a certain unex- 
plained residue in the relatively free word order of Greek and the verb-ini- 
tial order of [insular] Celtic. 

(c) What is especially significant in Wackernagel's view is that Sans- 
krit, Latin, [early] Lithuanian, as well as ancient Greek show what he be- 
lieved to be traces of an earlier clitic, clause-second position of the verb. 
For [early] Lithuanian, this consisted in the fact that the verb 'to be' 
frequently is second in its clause. Also for Latin, Wackernagel pointed to 
the frequent occurrence of 'be' in clause-second position. As for Greek, he 
noted a recurring formula in dedicatory and artisan's inscriptions with the 
order Subject + Verb + other constituents (including appositional (etc.) ele- 
ments of the subject NP); cf. the following examples. 

(3) Alkibios anetheken kitharoidos ...
    'A.' 'placed' 'kithara-player'
    'Alkibios, the kithara-player, placed [me/this]'

(4) Phanes me anetheke topollon[i
    'Ph.' 'me' 'placed' 'to the A.'
    'Phanes placed me for Apollo'

(5) Kuniskos me anetheke hortamos wergon dekatan
    'K.' 'me' 'placed' 'the b.' 'works' 'tithe'
    'Kuniskos, the butcher, placed me (as) a tithe of his
    business'

Finally, for Sanskrit Wackernagel referred to the common formula of the
Brahmanas with initial pronoun or other deictic element, followed by optional
clitic (ha, (u) ha sma, etc.), plus the verb, which in turn is followed by
the other elements of the clause, as in (6)-(8):

(6) sah ha uvaca gargyah
    'he' 'spoke' 'G.'
    'Now, Gargya said ...'

(7) te ha ete ucuh devah adityah
    'they' 'they' 'said' 'gods' 'A.'
    'Now, these Aditya gods said ...'

(8) tad u ha sma aha arunih
    'that' 'said' 'A.'
    'On that issue, Aruni used to say ...'

(d) In spite of this evidence, Wackernagel was clearly troubled by one
thing: Clitics normally consist of two syllables or less; PIE verbs, however,
can be considerably longer, thus furnishing rather unlikely candidates for
clitics. Wackernagel therefore felt constrained to suggest that there may
have been a limitation, in terms of word length, on the extent to which MC
verbs were clitics in PIE. However, he was unable to state the precise
nature of these limitations.

2: Although only tentatively proposed, Wackernagel's hypothesis has
been adopted by a variety of scholars, ranging from Behaghel (1929), who
attempted to advance evidence outside of Germanic for a distinction between MC
and DC word order, to Watkins (1963 and 1964), who added Mycenaean Gk.
ho-agrese 'and he took' (W.'s transcription and interpretation) to the evidence
while rejecting all of Wackernagel's data outside those from Sanskrit,³ to
Friedrich (1975:32, cf. also 1976, 1977), who without further discussion
draws on Wackernagel's hypothesis as one of his arguments against the
reconstruction of PIE as having SOV word order. Also Dressler (1969:8) seems to
accept Wackernagel's hypothesis.

This fairly widespread acceptance of Wackernagel's claim and the
repercussions which it has for the reconstruction of PIE syntax clearly are
significant enough to warrant a closer examination of the evidence on which it
is based.

3: The purpose of this paper is relatively modest, namely to reexamine
the evidence for Wackernagel's hypothesis provided by the Sanskrit formula
documented in (6)-(8) above. In this context, however, it will become
necessary to draw on similar constructions in Avestan and Homeric Greek, the
evidence of which is, I believe, crucial for a proper historical understanding of
the Sanskrit formula.

The proposed limitation of the discussion is justified on several grounds.

First of all, the clause-second, clitic positioning of the verb 'to be' 
in early Lithuanian, cited by Wackernagel as supporting his hypothesis, must 
be understood in a much larger context, namely that of the shift (discussed in 
Hock 1982) from SOV to SVO in most of the continental European Indo-European 
languages. This shift, though triggered by the movement of clitic AUX to 
clause-second position, is not a feature of PIE, but must be viewed as a re- 
latively recent innovation, with parallels in Kashmiri and in certain West 
African languages. Also Wackernagel's Germanic and Behaghel's additional
non-Germanic evidence for a difference between MC and DC word order can be
explained in terms of a shift from SOV to SVO, a fact noted quite commonly and
specifically demonstrated for Romance, Germanic, and Kashmiri in Hock 1982. 

Secondly, the alleged early evidence for clitic, clause-second 'be' in
Latin is far from cogent: As demonstrated by Marouzeau (1908), the
appearance of 'be' in clause-second position is attributable not to clisis, but to
the fact that 'be' and its attribute (A) functioned as a syntactic unit in
early Latin, whose internal order was determined as follows: 'be' follows
A under normal circumstances (as in (9) below); it precedes A under emphatic
circumstances, where it often may be accompanied by an asseverative particle,
such as hercle 'by Hercules' (cf. (10) and (11)). This internally determined
ordering may put the verb 'be' in clause-second position, if some other
constituent (such as the subject) precedes, as in (10). However, if no such
constituent precedes, 'be' may appear clause-initially (cf. (11)), i.e. in an
environment in which clitics are not permissible. Finally, according to
Marouzeau, genuine clitic 'be', with phonological clitic-reduction, always follows
A in the early Latin comedies of Terence and Plautus; hence (12) is
permissible, but (13) is not. That is, genuine clitic 'be' does not move to
clause-second position.

(9) seruos bonus est
    'servant' 'good' 'is'
    'The servant is good'

(10) seruos est bonus
     'the servant is good (indeed)'

(11) ... Est hercle inepta
     'by H.' 'inept'
     'she is inept (indeed), by Hercules'

(12) seruos bonust (= bonus-(s)t)
     clit. 'is'
     'the servant is good'

(13) *seruos-(s)t bonus

As for the Greek inscriptional evidence illustrated in (3)-(5) above,
Wackernagel himself noted that there are numerous counterexamples to this
pattern in Attic dedicatory inscriptions and that the earliest inscriptions
likewise do not exhibit the pattern in a regular fashion. In fact, the
earliest Greek inscriptions offer verb-initial patterns (as in (14)), verb-final
ones (as in (15)), as well as Wackernagel's type (as in (16)). The latter
construction, however, almost always contains a clitic m(e) before the verb.
This type is still fairly common in the later inscriptions cited by
Wackernagel (cf. (4) above). The other later types, especially the very common
pattern exemplified in (3) above, may therefore be safely explained as
secondary reinterpretations and extensions of the archaic type (16). This
latter type, however, does not provide any cogent evidence for a clitic,
clause-second verb. Rather, it can be quite convincingly explained as an
'amplified sentence' à la Gonda 1959, with a complete (SOV) sentence followed
by extraposed non-essential material.

(14) anethEke tOi pohoidani NikOn ...
     'placed' 'the' 'P.' 'N.'
     'He placed (into manumission) for Poseidon, N. ...'
     (GDI 4591, Lacon., 5th/4th c. B.C.; sim. GDI 4592, same date)

(15) hiaron tO puthiO wiswodiqos anethEke
     'sacred' 'P.' 'W.' 'placed'
     'Wiswodiqos placed the sacred object for the Pythian'
     (Arkh. Eph. 1900:107, Theban, 6th c. B.C.; sim. GDI 4247,
     Rhodian, 6th c. B.C.)

(16) simiOn m anethEke potEdawon[i ...
     'S.' 'me' 'placed' 'to P.'
     'Simion placed me for Poseidon'

Thus, of the evidence cited by Wackernagel in favor of his clitic MC
hypothesis, only the early Sanskrit formula of (6)-(8) remains. As will be
seen in the following discussion, however, this construction has parallels in
Avestan and Homeric Greek. Moreover, if Watkins's interpretation of Myc. Gk.
ho-agrese is correct, also this evidence must be added. Unfortunately, the
Mycenaean evidence does not at this point seem amenable to the kind of
detailed investigation here applied to the evidence of early Sanskrit, Avestan,
and Homeric Greek. It will therefore be left out of consideration. Note
however that if the interpretation of the Sanskrit, Avestan and Homeric
patterns proposed in this paper is correct, it may provide an
explanation-in-principle also for Myc. (h)o-agrese etc., as resulting from two competing
fronting rules, one affecting deictics/pronominals, the other, verbs.

4: Wackernagel's view that the Sanskrit evidence points to an earlier
clause-second ordering of clitic MC verbs is by no means the only possible
interpretation. Fourteen years earlier, Delbrück (1878:51-4) had noted that
the pattern exemplified in (6)-(8) occurred almost exclusively with verbs of
speaking (SPEAK), in the context of lively discussions or altercations (cf.
(6) and (7)), or in the quotation of the opinion voiced by famous authorities
on particular points (cf. (8)). Delbrück viewed these constructions as
resulting from what we would now call extraposition of the subject, motivated
by the fact that the subject is known and therefore weakly accented.⁵ To
some extent, this may in his view have been further aided by the fact that
these subjects are heavy noun phrases. Under similar conditions, the
accusative-marked addressees of verbs of speaking could in Delbrück's opinion be
occasionally extraposed, as in (17).

(17) te ha devah ucuh brhaspatim angirasam (SB
     'gods' 'SPEAK' 'B.' 'A.'
     'Now, these gods spoke to Brhaspati Angirasa'

While in 1878, Delbrück explicitly ruled out (p. 54) the possibility of
accounting for structures like these by a process of verb fronting (rather
than NP extraposition), in 1900 he instead proposed to consider them a variant
of the verb-initial constructions frequently encountered in the Indo-European
languages with verbs of speaking (pp. 61-2 with 65). To paraphrase his
explanation: The verb was to be fronted because of its importance. The occurrence
of an initial connective, however, prevented the verb from being fronted to
absolute initial position. As a consequence it went into the position in
which it is actually found (cf. (6)-(8) above). A similar argument, but
without direct reference to our constructions, is found in Brugmann's 1904
summarization (p. 683), where it is said that since anaphoric pronouns and other
sentence connectors have to occur initially, the initial positioning of verbs
[of speaking] had to be modified in Sanskrit.

Unfortunately, however, the exact details of this hypothesis were not
worked out, nor was any explanation provided for examples like (17) above,
which can be accounted for in terms of nominal extraposition, but for which
a modified verb-initial explanation would be difficult. (On this latter
count, Delbrück should perhaps not be faulted; for the construction of (17)
probably does result from extraposition, a pattern coexisting with the
structures which are the topic of this paper. Cf. type (VI) in section 6.)

5: In the remainder of this paper I will attempt to show that in spite
of its shortcomings, Delbrück's later 'verb-fronting' hypothesis is the
explanation which best accounts for the data, even though nominal extraposition
may have been a contributing factor, especially as far as later
reinterpretations are concerned.

To do so I will first examine in some detail the attestations of our
construction and its variants in the Sanskrit texts in which it occurs, viz.
the prose texts of the post-Atharva-Veda Samhitas and the Brahmana/Aranyaka
literature. (Section 6.) I will then discuss the synchronic patterns and
processes in terms of which the construction(s) might be explained.
(Sections 7-9.) Combined with the chronology of attestations, this synchronic
evidence strongly suggests a verb-fronting hypothesis for this stage of the
language. (Section 10.) Next I will examine the earlier Rig-Vedic (and
Atharvanic) evidence indirectly relevant to the interpretation of the construction
and assess the implications of this evidence for the verb-fronting hypothesis.
(Sections 11-13.) Finally, I will show that Avestan and Homeric Greek offer
evidence which provides further, comparative support for the proposed
hypothesis. (Sections 14-16.) This leads to the conclusion that the
verb-fronting hypothesis must be extended to PIE and that therefore our structures do
not provide evidence for or against the reconstruction of PIE as SOV.
(Sections 17-20.)

6: The prose texts of the post-Atharvanic Samhitas and of the Brahmanas
and Aranyakas (hereafter referred to as Vedic Prose) offer the following
non-verb-final types of constructions involving SPEAK. Of these, types (I)-(V)
typically occur in the contexts described by Delbrück as either involving a
discussion or argument (i.e. a 'verbal exchange') or the views of a famous
authority. This characterization is largely correct. In fact, the discourse
context of discussions and arguments will turn out to be of considerable
interest once the comparative evidence of Greek and Avestan is considered. (Cf.
sections 15 and 16.) Synchronically, however, it is possible to give an even
more general and, I believe, more accurate characterization of these
constructions: Under normal circumstances, the structures in question are associated
with a special discourse feature, namely that of attributing a certain
authoritativeness to the participants in the act of speaking. Moreover,
constructions (I)-(V) do not ordinarily occur in other contexts, even in lively
discussions and arguments, where instead, verb-final order is the norm. That is,
verb-final order and the types (I)-(V) are in quasi-complementary
distribution. (Note however that there are some exceptions on either side, such as
(19) below, a type (I) in a non-authority context.) Before going on to a
detailed exemplification of these types, it might perhaps be interesting to give
an example of how a switch in characterization of a speaker leads to a shift
in construction, from 'ordinary' verb-final order to 'authoritative' type (I):

(18) te ha adityah ucuh ...
     D  P  S       SPEAK
     angirasah agnaye anvagatya cukrudhuh iva
     S         X      absolutive 'SPEAK'  pcle.
     ... sah ha uvaca ...
     tasmad u ha sma ahuh angirasah ... (SB 3.5-1
     D      P        SPEAK

     'The Adityas said QUOTE. The Angirases, having approached, were
     angry (= inveighed) against Agni QUOTE. He (= Agni) said QUOTE.
     Therefore the Angirases said QUOTE (= the moral of the story, told
     by the Angirases as authorities).'

(I) The type D (P) (P) SPEAK S : 

This is by far the most common sub-type of the construction in question.
Thus in SB 14.6.5-8, of 44 non-verb-final occurrences of uvaca/ucuh, 39
follow this pattern. To illustrate the variety of different possible variants
of D, P, P, and their optional presence or absence, a large number of
examples are given for this construction. Mutatis mutandis, similar variations
may be found for the other types.

(19) prajapatih vai idam agre ekah eva asa / sah aiksata ...
     S          P   D    X    V              D   "SPEAK"
     sah aiksata prajapatih ... (SB 2.2.4.1-3)
     D   "SPEAK" S

     'Prajapati was all alone here before. He reflected QUOTE ...
     Now, Prajapati reflected QUOTE' (P., in this context, is
     clearly not conceived of as an authority)

(20) sah ha uvaca vaideghah mathavah (SB
     'Now, (this) M.V. said QUOTE'

(21) etad ha vai uvaca vasisthah (KS 34.17)
     'on this issue, indeed, V. said QUOTE'

(22) atha ha uvaca gotamah rahuganah (SB 1.4.1.18)
     D    P  SPEAK S
     'then G. R. said QUOTE'

(23) iti ahuh brahmavadinah (KS 22.8)
     'QUOTE (Thus) say the theologians'

(24) iti ha sma aha asurih (SB
     'QUOTE (Thus) A. used to say'

(25) api ha uvaca yajnavalkyah (SB 14.2.1.7)
     'Also Y. said QUOTE'

(II) The type D (P) (P) SPEAK O:

Though by no means as common as (I), this type is the second-most common
pattern in Vedic Prose. Thus in the SB 14.6.5-8 sample mentioned earlier,
four of the remaining five occurrences of non-final uvaca/ucuh follow this
pattern.

(26) sah ha uvaca dustaritum paumsayanam (SB
     'he said to D. P. QUOTE'

(III) The type D (P) (P) SPEAK S O:

Though quite rare, this pattern is attested at all stages of Vedic Prose.

(27) etad ha vai uvaca sankah kausyah putram ... (KS 22.6)
     D    P  P   SPEAK S              O
     'on this issue, S.K. said to (his) son ... QUOTE'
     (Sim. KS 22.7, KKS ltl.7, JB, AB 2.25.2, SB 14.8.13.2)

(IV) The type D (P) (P) D SPEAK S/O/etc.:

This pattern, exemplified by (28), is exceedingly rare. It is not found
in the earliest, Samhita stage of Vedic Prose. Moreover, Delbrück (1888:23)
claims that this type actually is disfavored, the ordering D (P) (P) D S/O/etc.
SPEAK (my formulation) being preferred instead; cf. (29). While as (28)
shows, this avoidance of type (IV) is not absolute, it does seem to hold true
as a general tendency, at least for the Satapatha Brahmana, where the ratio
between type (28) and (29) constructions in 'authoritative' contexts is about
1:5. Finally, in the Aitareya Brahmana, where the majority of type (IV)
constructions are attested, most of these are found in the relatively later
books. That is, it appears that type (IV) may well be an innovation and that
Delbrück's constraint against this construction at one time was absolute.


(28) te ha ete ucuh devah adityah (SB 3.1.3.4)
     D  P  D   SPEAK S
     'These Aditya gods said QUOTE'
     (Sim. SB; AB 4.27.9, 5.33.3, 7.34.7,8, etc.)

(29) tad ha sma etad arunih aha (SB
     'on this issue, A. used to say QUOTE'
     (Sim. ibid., SB ..., 4.5.7.9 (2x), ..., 10.4.1.11, ...;
     cf. also KS 28.4, JB 1.175-8, AB 4.27.9)

(V) The type SPEAK (P) S:

While relatively rare, this pattern occurs in all the early Samhita prose
texts, where many of the other constructions are quite rare. Thus in the
Kathaka Samhita, the ratios of (I)-(V) are as follows:

(I) (II) (III) (IV) (V) 
8 2 2 

Among the later prose texts examined by me, only the Satapatha Brahmana, the
longest text, offers any examples. Moreover, it is perhaps noteworthy that
all of the SB attestations, save, occur in direct discourse. Could
it be that the relative rarity of this construction in Vedic Prose is a
stylistic feature associated with the technical nature of these texts, while in
reported speech the construction is more common?

(30) uvaca ha visvamitrah (TS 5.4.2.2)
     (Sim. TS = KS 20.9, KKS 31.11; KS 19.1 = KKS 31.3;
     SB ..., 14.6.3.2, 14.6.10.2,5, etc.)

(VI) Extraposed constructions:

Other non-verb-final constructions involving SPEAK can occasionally
be found, such as (31) and (32) below; cf. also (17) above. Note however
that these constructions do not seem to be specially marked for
'authoritativeness', although authorities may occasionally figure in them (as in (31)).
Moreover, these constructions do not seem to differ in terms of their
frequency of attestation and of their connotations from other patterns with
extraposed constituents but not involving SPEAK; cf. (33) and (34). They
therefore seem to be of no particular interest for the present discussion.

(31) arunah ha sma aha aupavesih (TS
     'Ar. Au. said QUOTE'

(32) te devah abruvan gayatrim (AB 3.26.1)
     'these gods said to the gayatri meter QUOTE'

(33) na antara pasusirsani vyaveyad adhvaryuh (KS 20.8)
     D  X                  V        S
     'the adhvaryu should not go in between the cattle heads'

(34) sah tatah eva pran dahan abhiyaya imam prthivim (SB 1.4.1.14)
     D   D         adv. pple. V
     'burning from there to the east he (= Agni) crossed this
     earth'

Finally, it may be noted that verbs other than SPEAK may occasionally
appear in constructions of the type (I)-(III), and with a similar 'flavor'
of authoritativeness; cf. (35)-(37). Examples of this sort seem to be
confined mainly to the latest texts, such as the Brhad Aranyaka portion of the
Satapatha Brahmana from which examples (35) and (36) are drawn. Moreover,
they do not appear to occur in the earlier Samhita prose texts. Finally,
they tend to be found in contexts where type (I)-(III) constructions with
SPEAK abound. The suspicion therefore arises that these are analogical
extensions of the type (I)-(III) SPEAK constructions.

(35) 'Type (I)' sah ajagama gautamah (SB 14.9.1.7)
     D V S
     'Now, Gautama (an 'authority') came'

(36) 'Type (II)' sah ajagama jaivalam (SB 14.9.1.1)
     D V
     'He came to Jaivala (an 'authority')'

(37) 'Type (III)' tam ajagama supla sarnjayah brahmacaryam (SB 2.4.4.4)
     D V S
     'to him came S. S. (an 'authority') for studying'

7: The evidence of the preceding section suggests that there are just
four constructions with SPEAK which are commonly associated with the special
discourse feature of 'authoritativeness'. These constructions are (I)-(III),
with the verb placed right after an initial string of accented deictics plus
optional unaccented sentence particle plus optional accented sentence
particle; and (V), with the verb in initial position and therefore accented
according to the general rule that clause-initial verbs are accented, even in
MCs. On the other hand, there is reason to believe that construction (IV),
with another accented deictic intervening between the initial string D(P)(P)
and the verb, is an innovation and that at an earlier stage this construction
was actively avoided. This suggests a certain complementarity between SPEAK
and 'post-particle' accented deictics in the constructions under discussion.
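
The typology of section 6 can be restated compactly as patterns over strings of constituent labels. What follows is only a minimal sketch in Python, offered as an illustration and not as part of the original analysis. Several points are assumptions: the label O for the addressee is inferred from examples (26) and (27); 'S/O/etc.' in type (IV) is narrowed to S or O for simplicity; and the names PATTERNS and classify are hypothetical.

```python
import re

# Labels follow the paper's interlinear glosses:
#   D = accented deictic, P = sentence particle, SPEAK = verb of speaking,
#   S = subject, O = addressee/object (O is an assumed label, see lead-in).
# Each pattern is a regular expression over a space-separated label string.
PATTERNS = [
    ("I",   r"D( P){0,2} SPEAK S"),
    ("II",  r"D( P){0,2} SPEAK O"),
    ("III", r"D( P){0,2} SPEAK S O"),
    # Assumption: 'S/O/etc.' is simplified to one or more S or O labels.
    ("IV",  r"D( P){0,2} D SPEAK( [SO])+"),
    ("V",   r"SPEAK( P)? S"),
]


def classify(labels):
    """Return the type name (I-V) whose pattern matches the whole
    label sequence, or None if no marked type matches."""
    sequence = " ".join(labels)
    for name, pattern in PATTERNS:
        if re.fullmatch(pattern, sequence):
            return name
    return None


if __name__ == "__main__":
    print(classify(["D", "P", "SPEAK", "S"]))   # label string of example (20)
    print(classify(["D", "P", "S", "SPEAK"]))   # verb-final order, as in (18)
```

Under these assumptions, the label string D P SPEAK S of example (20) is classified as type (I), while the verb-final string D P S SPEAK found at the start of (18) matches none of the marked types.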

Of the constructions thus likely to be inherited, namely (I)-(III) and
(V), types (I) and (II) could well be interpreted as resulting from
extraposition, comparable to what we find in type (VI). However, this
interpretation is quite unlikely for (III), since 'multiple' extraposition, i.e.
extraposition of more than one major constituent, is otherwise virtually
unheard-of at this stage of the language. Moreover, if we were dealing with simple
extraposition, there would be no explanation for the early avoidance of
construction (IV). In short, the totality of the evidence makes it unlikely
that extraposition is responsible for the specially marked constructions
under discussion. What is possible, however, is that the existence of the
extraposed type (VI), combined with the possibility of interpreting (I) and (II)
as extraposed, led to a later reinterpretation of our constructions as in
fact resulting from extraposition. (What is important in this context is the
fact that the reinterpretable type (I) is the most common sub-type of the
constructions under discussion.) And this reinterpretation may then have led to
the creation of type (IV), as well as the relaxation of the earlier constraint
against this construction.

While extraposition thus is not likely to be the original motivation for
constructions (I)-(III), the fact that the clearly fronted, verb-initial type
(V) shares with (I)-(III) the special discourse feature of authoritativeness
suggests that Delbrück's fronting hypothesis may be on the right track. At
the same time, this possible affiliation, as well as the sheer existence of
type (V), casts doubt on Wackernagel's clitic-verb hypothesis.

What still needs to be done, however, is to show if and how, within the
grammar of Vedic Prose, types (I)-(III) and type (V) can be accounted for as
structurally related. To do so, it will be necessary to take a brief look
at the structure of clause-initial particle and deictic strings in the
language of Vedic Prose.

8: The ordering of elements in the clause-initial deictic/particle
strings of Vedic Prose is frequently characterized only in a very general
fashion, to the extent that accented pronouns (of the type (sa/)ta-, (esa/)eta-)
tend to occur initially and that other elements tend to occur immediately
after the first accented word; cf. e.g. Delbrück 1878:47-8 (on pronominal
clitics), 1888:22-23 (on pronouns and particles, with some more detailed
observations on the relative ordering of some of these words on pp. 471-546);
Wackernagel 1892 (passim on clitics); etc. However, more specific rules on
the relative ordering of these words are hard to find. The closest thing to
such a statement which I have come across is Delbrück's remark (1900:51) that
in Sanskrit, as well as in Greek, clitic particles precede clitic pronouns.

In fact, however, strings of initial deictics and particles are so common
in Vedic Prose that it is quite easy to establish a set of general
principles, exceptions to which are rare at this stage of the language.¹⁰ Using
the symbols already introduced, this set of principles can be given the
following taxonomic form. (Note that one principle which can be discerned
immediately is the fact that if all positions are filled by just one word each,
the accent falls on every alternate position.)


{X́ / D́} (P) (Ṕ) (D) (D́) X

That is, the clause begins with an accented word. This word may either be an
accented deictic pronoun (most notably ta- 'that, this' or eta- 'this, that')
or pronominal adverb (such as iti (quote), atha 'then'), or it may be any
word which is placed initially by reason of focus, emphasis, etc. Frequently
such non-deictic initial words are marked by the following sentence particle
vai or by the phrase- or word-bound emphatic particle eva. (This latter
particle always follows the word which it emphasizes and is not otherwise
restricted to any particular location in the clause; in the examples which
follow it may therefore occasionally intervene between other, relevant elements.)

What is important for our further discussion is that, with a few, neg- 
ligible exceptions (cf. note 10), this initial position does not permit 'doub- 
ling'; i.e., only one word may occur in this initial position. 

The second position, if filled, is held by unaccented sentence particles,
most notably u 'and, but, now', ha (weakly emphasizing), sma (emphasizing and/
or indicating habitual past action). The second-position particles may double
up; and, if u, ha, and sma thus occur in combination with each other, their
relative order is the one in which they are listed.

The third position is held by accented sentence particles, most notably
vai (emphasis, topic), and also aha (emphasizer), hi 'for, because', nu 'now,
but', tu 'but', khalu 'indeed'. These particles likewise may double up,
although this is not a common phenomenon. If doubling takes place, the normal
order seems to be as follows:


(aha) {tu / nu / hi} (khalu) (vai)


That is, tu, nu, hi seem to be mutually exclusive; as a set, they may be
followed by khalu, which in turn may precede vai. Note however that I have not
noticed any sequences of the sort nu khalu vai; only nu khalu etc., nu vai
etc., khalu vai seem to be attested. Moreover, in the Satapatha Brahmana
some examples with vai khalu are found (e.g. SB ...). As for aha, it is
found before nu or vai. I have not noted any interaction of aha with other
members of this class.
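The ordering statements for the second and third positions can be made explicit in a small sketch. What follows is only an illustration under stated assumptions, not part of the analysis itself: the function name `fits` and the encoding of the two orderings as regular expressions are hypothetical, and the particles are given in the same diacritic-free spelling as in the text. Note that the third-position pattern, like the schema above, also admits nu khalu vai, which has not been noticed in the texts.

```python
import re

# Second position: unaccented particles in the listed order u, ha, sma.
SECOND = re.compile(r"^(u )?(ha )?(sma )?$")
# Third position: (aha) {tu | nu | hi} (khalu) (vai); every slot is optional.
THIRD = re.compile(r"^(aha )?((tu|nu|hi) )?(khalu )?(vai )?$")

def fits(pattern, particles):
    """Return True if the particle sequence matches the ordering pattern."""
    return bool(pattern.match("".join(p + " " for p in particles)))

print(fits(SECOND, ["ha", "sma"]))     # True
print(fits(THIRD, ["nu", "khalu"]))    # True
print(fits(THIRD, ["vai", "khalu"]))   # False: the deviant order noted for SB
print(fits(THIRD, ["tu", "hi"]))       # False: tu, nu, hi exclude each other
```

A check of this kind would flag sequences such as vai khalu as departures from the normal order rather than as impossible strings.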

The fourth position is taken up by unaccented, clitic pronouns, such as
ma 'me', me '(to/for/of) me', asya 'his', enām 'her'. I have not noted any
passages in which these can double up. If doubling does occur, it would seem
quite rare. (Note, however, that there is some Rig-Vedic evidence for
doubling in this position; cf. note 19 below.)

The final position of the string seems to be occupied by those accented 
deictics which because of the constraint against doubling in initial position 
could not be accommodated in that place. Doubling is clearly permitted in 
this position. I am not however aware of any internal ordering principles. 
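The five positions just described can be summarized as a simple slot-filling procedure. The sketch below is an illustration only: the tag names (`X'` and `D'` for accented non-deictics and deictics, `P` and `P'` for unaccented and accented particles, `D` for clitic pronouns, `X` for any word outside the string) and the function `split_initial_string` are assumptions made for this example. The intervening word-bound particle eva is not modeled, and the limit of one clitic pronoun reflects only the absence of observed doubling in that position.

```python
def split_initial_string(tags):
    """Split word-class tags into the five string positions plus the remainder."""
    slots = {"initial": [], "P": [], "P'": [], "D": [], "D'": []}
    if not tags or tags[0] not in ("X'", "D'"):
        raise ValueError("clause must begin with an accented word")
    # Position 1 holds exactly one accented word (no doubling).
    slots["initial"].append(tags[0])
    i = 1
    # Positions 2-5 in order; None means doubling is allowed.
    for label, limit in (("P", None), ("P'", None), ("D", 1), ("D'", None)):
        while (i < len(tags) and tags[i] == label
               and (limit is None or len(slots[label]) < limit)):
            slots[label].append(tags[i])
            i += 1
    return slots, tags[i:]

# Hypothetical tagging of a clause of the shape: fronted word, ha, vai, clitic, rest
slots, rest = split_initial_string(["X'", "P", "P'", "D", "X", "X"])
print(slots)
print(rest)
```

Anything that cannot be placed in the current or a later slot ends the string, so a second accented deictic appearing directly after the initial word is assigned to the final position, not to the first.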

The following examples, in combination with the earlier Vedic-Prose
citations, may suffice to provide some illustration of the above ordering
principles.

(38) adanti ha sma vai etasya purannam (KS 23.2) 

I P ? 5~^ X 

'They eat his earlier food' 

(39) dvisantam ha asya tad bhratrvyam abhyatiricyate (SB

f p g" g Y~^ 

'that remains over for his hateful enemy' 

(40) pra ha vai enam pasavo visanti (MS 1.8.2)
X P P D X 
'the cattle turn toward him' 

(41) tatha u eva esa etena . . . hanti (SB 4.4.2.13)
B P D S X 
'thus he slays with that one' 

(42) iti tu eva esa etat karoti (SB 5.4.3.2)
^ ? D 5 X 
'thus he does this ' 

(43) tam tu nah agatam pratiprabrutad (SB
~5 ? D X 
'announce her to us as having come' 

(44) agnih hi vai dhuh (SB
XP ? X^ 
'for Agni (is) the yoke' 

(45) manusyah id nvai (= nu vai) upastirnam icchanti (TS
X"^ ¥~ P P X 

'human beings desire something strewn out' 

(46) prajapatih khalu vai tasya veda . . . (TS
X ¥ f D^ X 

'Prajapati indeed knows of that . . . ' 

(47) tvam nu khalu nah ... brahmisthah asi (SB 14.6.1.4)
~~X f f D X 

'you now are indeed the most learned of us ' 

(48) na aha nu eva etasya tatha prajah varunah grhnati (SB 2.5.2.4)
T'J ? t 5 X 

'Varuna does not thus seize his offspring'

What is important for our discussion is that all of the positions in
these initial strings are syntactically 'arbitrary' and may in fact lead to
the separation of syntactically closely related words. Thus in (39), both
dvisantam 'hateful' and asya 'his' are separated from their syntactic head
noun bhratrvyam 'enemy'. Similarly, in (48) etasya 'his' has been separated
from its head noun prajah 'offspring'. It is thus clear that even the last
position, that for non-initial D, forms an integral part of this
syntactically arbitrary initial string.

9: As far as the syntactic rule system is concerned which will account
for the initial-string ordering just outlined, the placement of sentence
particles causes no great difficulties. All we need to do is specify that
sentence-scope particles (which can be lexically defined as a set) occur in
post-first-position, with unaccented particles taking precedence over accented
ones, etc.

In the case of deictic elements, however, as well as in the case of
fronted non-deictics, it seems to be necessary to invoke movement rules,
since, as we have seen, the placement of these elements into the positions in
which they occur separates them from other elements with which they are
closely related syntactically (in terms of agreement rules, etc.).

For non-deictics, that rule would be fairly straightforward, specifying 
that under certain discourse conditions, such as focus, emphasis, etc., in- 
dividual words may be fronted. We may refer to this as Discourse Fronting. 
(The only difficulty would lie in the fact that it is necessary to state that 
no more than one word may be fronted, even if a given constituent consists of 
more than one word.-*-^) 

For deictics, however, it will be necessary (a) to account for the clitic 
pronouns and their behavior, (b) to state that fronting is (quasi-)obligatory 
for accented deictics, and (c) that such deictics will go to string-final po- 
sition if the initial position is occupied by some other element (whether 
deictic or non-deictic). Let us refer to this as Syntactic Fronting. 

We must further state that in case of conflict, Discourse Fronting always
takes precedence over Syntactic Fronting, forcing syntactically fronted
deictics into string-final position.
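The interaction of the two fronting rules can be sketched as a small procedure. This is an illustration under assumptions, not a formal statement of the rules: the function name `place_fronted` is hypothetical, and when no discourse-fronted word is present the sketch simply takes the first deictic in the list as the string-initial one.

```python
def place_fronted(discourse_fronted, deictics):
    """Return (string-initial word, list of string-final deictics).

    discourse_fronted: a non-deictic word fronted for focus/emphasis, or None.
    deictics: accented deictics subject to Syntactic Fronting.
    """
    deictics = list(deictics)
    if discourse_fronted is not None:
        # Discourse Fronting wins: every deictic goes to string-final position.
        return discourse_fronted, deictics
    if deictics:
        # No conflict: one deictic fills the single initial slot,
        # any further deictics double up string-finally.
        return deictics[0], deictics[1:]
    return None, []

print(place_fronted("dvisantam", ["tad"]))              # pattern of example (39)
print(place_fronted(None, ["tatha", "esa", "etena"]))   # pattern of example (41)
```

The single initial slot is what produces the conflict in the first place; the string-final position merely absorbs whatever accented deictics are left over.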

Moreover, there seems to be a tendency governing the positioning of some
of the deictics in string-initial vs. string-final location: Of the two
pronouns ta- and eta-, ta- more commonly occurs initially, eta- more commonly
string-finally. Thus in Taittiriya Samhita 5.2, the ratios for ta- vs. eta-
in the two positions are as follows:









Moreover, in TS 5.1-2, there are 14 occurrences of initial ta- followed by
coreferential, doubling, string-final eta-, but none with the obverse order.

This difference in ordering parallels a difference in behavior between
the two pronouns when they are used as correlative pronouns coreferential
with the relative pronoun of an adjoined relative clause (RC). Thus in TS
5.6-7 I have found the following patterns. (No distinction is made here
between string-initial and string-final placement of these pronouns. That
patterning is entirely comparable to the one described above.)

MC ... RC 


... MC 






That is, eta- is used as a correlative only in MCs preceding their RCs, whereas
ta- is usually employed in MCs which follow their RCs.

This difference in behavior seems best accounted for pragmatically: ta- 
appears to function as a general-'-^ deictic referring to earlier information. 
This accounts for its tendency to occur clause-initially (closest to the ear- 
lier information which it refers to), as well as for its preference for post- 
RC position (where again it refers backward, to the earlier information of 
the RC). eta- , on the other hand, seems to function as a general-'-^ deictic 
introducing new information. This accounts for its tendency to occur string- 
finally (closest to the new information provided by the rest of the sentence), 
as well as for its pre-RC position in contexts where the relative clause 
serves to more clearly define the new information introduced by eta- , as in 
the following example. 

(49) sadadi vai esah dadati yah agnihotram juhoti (MS 1.5.12)
f P D X 
'He/that person gives continuously who offers a fire-sacrifice'

10: Given this background, we are now in a position to account for the
ordering of SPEAK in the formulas of type (I)-(III), to relate these
constructions to type (V), and to explain the early avoidance of (IV).

As has been observed by many linguists, the verb-initial order of (V) is
quite common for SPEAK in the Indo-European languages; cf. e.g. Delbrück
1900:61-2, Kroll 1918, Dressler 1969. Delbrück explained this ordering as a kind
of emphatic fronting. Dressler, however, more plausibly tried to account for
it as a text-linguistic (or discourse) phenomenon: In his view, PIE
verb-initial position had two discourse functions, one being 'anaphoric'
(referring to, or linking up with, earlier information), the other, less
common one, being 'cataphoric' (referring to later information). It is this
latter, cataphoric function which he sees in clause-initial SPEAK, as well as
in clause-initial 'be' at the beginnings of narrations or in the meaning
'there is, there are...' (3, 10-11, and elsewhere).

Let us now extend this interpretation to the initial uvaca of type (V)
by saying that this verb has been fronted not because it is emphatic (or
anaphoric), but because it is cataphoric. Being fronted for a different reason,
it may therefore also behave differently from other non-deictics. Now, as we
have seen in section 8, the normal manner in which a conflict between
Syntactic (deictic) Fronting and the Discourse Fronting of non-deictics is
resolved is by the latter taking precedence, forcing fronted deictics into
string-final position.

I propose that types (I)-(III) can be accounted for under the assumption
that in cases of conflict between deictic fronting and cataphoric fronting of
SPEAK, deictic fronting takes precedence, forcing cataphoric uvaca etc. into
the same string-final position which houses fronted deictics in other cases
of conflict. (Note that types (I)-(III) all have a string-initial deictic.)
Under the assumption that in this string-final position, doubling is
permitted only for members of the deictic class, this hypothesis will readily
account also for the early complementarity between deictics and uvaca in
string-final position and the consequent early avoidance of type (IV):
Constructions of type (IV) would be in violation of such a constraint against
doubling of deictics with non-deictics in string-final position. In addition, of
course, under this (modified) fronting hypothesis, constructions of type (III)
which, as we have seen, are difficult to account for as resulting from
extraposition, pose no difficulties whatsoever. The same process(es) which
account(s) for (I)-(II), as well as (V), will also yield type (III).
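The proposed conflict resolution for cataphoric SPEAK can likewise be sketched procedurally. Again this is only an illustration under assumptions: the function name `resolve_speak`, the Boolean flag for the early constraint, and the use of tam and etad as sample deictics are all hypothetical choices made for the example.

```python
def resolve_speak(deictics, speak="uvaca"):
    """Return (initial, string_final, allowed_early) for a clause with cataphoric SPEAK.

    deictics: accented deictics competing for the string-initial slot.
    allowed_early: False if the result violates the assumed early constraint
    against doubling a non-deictic with a deictic in string-final position.
    """
    deictics = list(deictics)
    if not deictics:
        # No competing deictic: SPEAK itself is clause-initial, as in type (V).
        return speak, [], True
    # Deictic fronting wins: SPEAK is pushed into string-final position.
    initial, leftover = deictics[0], deictics[1:]
    string_final = leftover + [speak]
    # A leftover deictic next to SPEAK is the avoided type (IV).
    return initial, string_final, not leftover

print(resolve_speak([]))                 # verb-initial, type (V)
print(resolve_speak(["tam"]))            # deictic + SPEAK, types (I)-(III)
print(resolve_speak(["tam", "etad"]))    # deictic intervenes before SPEAK, type (IV)
```

On this view type (IV) is not excluded by a separate stipulation; it falls out of the same doubling restriction that governs the string-final position in general.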

Finally, there is evidence which suggests that the postulated different
behavior of cataphoric SPEAK (as compared to other fronted non-deictics) is
not entirely ad hoc. This evidence consists of the fact (noted in section 9)
that the equally cataphoric, deictic eta- likewise tends to go into
string-final position, in case of conflict with other, anaphoric deictics. That
is, we can state a general tendency, namely that in case of conflict, cataphoric
elements more readily go into string-final position than anaphoric ones.
Note in this respect that although cataphoric eta- may occasionally occur
string-initially in type (I)-(III) constructions (cf. (21) and (2?) above),
this is quite rare; normally it is anaphoric ta- or other deictics which
occur in this position (cf. the rest of the examples). Moreover, the fact that
cataphoric eta- and cataphoric SPEAK both tend to go into the same,
string-final position, combined with the putative constraint against doubling
of non-deictics with deictics in that position, provides added motivation for
the early avoidance of type (IV).

In short, the evidence of Vedic Prose, combined with general Indo-European
evidence for cataphoric fronted SPEAK, suggests that structures of the
type (I)-(III) result from a conflict between deictic fronting and cataphoric
non-deictic fronting, which is resolved in favor of the deictic, forcing
cataphoric SPEAK into the string-final position which also otherwise houses
fronted elements which cannot be accommodated string-initially.

What is not clear, however, is whether this particular conflict resolution
is an innovation of Vedic Prose or is inherited. The fact noted earlier,
that type (V) in later Vedic Prose is more common in reported speech than in
the normal, technical prose of the texts, might perhaps suggest that types
(I)-(III) are an innovation, peculiar to Vedic Prose. However, only an
examination of the Rig-Vedic and Atharvanic evidence and, ultimately, of similar
constructions in Avestan and Homeric Greek can possibly provide any degree of
certainty.

11: As it turns out, the evidence of the Rig- and Atharva-Veda seems to
confirm the view that constructions (I)-(III) must be an innovation of Vedic
Prose. For first of all, there seem to be no occurrences of constructions
with the special 'authoritative' flavor which we find in Vedic Prose.
Secondly, the long strings of deictics and particles + SPEAK, so characteristic
of the Vedic-Prose constructions, are conspicuously absent. All that can be
found, beside the common verb-final structures, are constructions of the
following types.

(A) Verb-initial:

(a) Imperative: (50) prchata id u tad ... (RV 10.81.4)
'ask ye that (+ ind. disc.)' 
(Sim. AV l.T.i*) 

(b) Emphatic: (51) vidma hi te yatha manah (RV 1.170.3)

"SPfiAK" P D~^ 
'for we know how your mind (is disposed)' 
(Sim. 8.92.18; AV 7.76.5) 

(c) Cataphoric: (52) uvaca me varunah medhiraya (RV 7.87.4)

SPgAK D S*"^ 

'Varuna said to me, to the wise one QUOTE'

(53) isyami vah ... yudhyata ajau (RV 8.96.14)

'I order you ... "fight in the battle"' 
(Sim. RV 1.164.34 (2x), AV 3.8.2)

(B) 'Modified' initial position : 

(a) Imperative: (54) uta bruvantu nah nidah (RV 1.4.5)

"5 SPEAK (D)15 S 

'and may our accusers say QUOTE' 

(Sim., without D, in RV 1.7.3) 

(b) Cataphoric: (55) atha abravit vrtram indrah hanisyan (RV 4.18.11)


'then Indra said (to Visnu), about to slay
Vrtra, QUOTE'

(56) iti braviti vaktari raranah (RV 10.61.12)


'thus says the giving speaker QUOTE' 

(57) pra nu vocam cikituse janaya (RV 8.10.15)
X/5-'-6 p SPEAK 

'I will now proclaim to the intelligent people 

(c) Anaphoric(?): (58) iti susruma vayam (AV 8.9.18)

~^ "SPEAK" S 

'QUOTE (thus) we heard' 

(Sim.ib. 12. U. 1+8, 13.1+.'t7, 50-5^) 

(C) Extraposed (?): (59) nasatyau abruvan devah (RV 10.24.5)

'the gods said to the Nasatyas ' 

(60) uta enam ahuh samithe viyantah (RV 4.38.9)

'and they say of him, (as they are) going in 
different directions, QUOTE' 

(61) tena mam abravid bhagah (AV 6.82.2)
~1> X SPEAK S 

' therefore Bhaga said to me QUOTE' 
(Sim. ib. 6.it8.l) 

(62) iti tva upastutasya vandate vrsa vak (RV 10.115.8)
1> P X SPEAK S

'QUOTE (thus) the mighty voice of Upastuta 
praises you' 
(Sim. passim) 

As can be seen from these examples, the relative brevity of the sentences
and the absence of long strings of deictics and particles in many cases
make it difficult to distinguish between extraposed and modified-initial
structures. In fact, unambiguous judgments are possible only for structures
like (62). On the other hand, (54)-(60) are structurally ambiguous. It is
only in terms of their discourse functions that it is possible to try to
distinguish between modified-initial and extraposed structures. Thus, the
imperatival nature of (54) makes imperative fronting at least possible. In
(55)-(57) (as well as in (54)), the verb SPEAK is, within the discourse, used
cataphorically, directing the listener's attention to what follows. On the
other hand, for (59) and (60) the context is less conducive to any 'marked'
interpretation of SPEAK. Finally, (58) shares with the other constructions
under (B) a certain sententiousness. However, since direct discourse
precedes, it can hardly be interpreted as containing cataphoric SPEAK. Clearly
then, even in terms of discourse context, there can be some disagreements
concerning the interpretation of these constructions.

The best that can be said, then, is that the older language has clear
evidence for clause-initial SPEAK constructions comparable to the later type
(V) and possible evidence for a 'modified' initial construction comparable,
but not demonstrably identical, to the later types (I)-(III). None of these
however has the special authoritative flavor of the Vedic-Prose constructions.

12: There is also some evidence in the structure of Rig-Vedic initial 
strings which may be taken to suggest that the Vedic-Prose initial-string 
structure shows innovations in certain aspects which may be crucial for our 

As in the case of SPEAK constructions, so also for initial-string
structures, the nature of the Rig-Vedic texts makes for certain difficulties. In
the present case these consist in the fact that initial strings are by no
means as conspicuously (and voluminously) attested as they are in Vedic Prose.
Moreover, the much greater freedom in word order, a degree of scrambling
unheard-of in Vedic Prose, seems to affect not only major constituents and
non-deictic fully accented words, but also deictics and even clitic pronouns.
Thus out of 286 occurrences of clitic pronouns in RV 8.1-21, a full 56 (or
about 20%) occur outside of initial strings, in constructions like (63).
Frequently, the clitic pronoun of such deviant structures is found attached
to its head noun, as in (63).

(63) aveh indra pra nah dhiyah (RV 8.21.12)
~V lit D 
'May you, Indra, aid our intentions' 

Still, examination of a relatively small sample of Rig-Vedic text (8.1- 
21) quickly reveals that also Rig-Vedic Sanskrit had clearly-structured ini- 
tial strings, even though their make-up may have differed in certain aspects 
from their Vedic-Prose counterparts. 1" 

The major pattern which emerges is of the structure

{X́ / D́} (P) (Ṕ) (D) (D́)

which differs from the Vedic-Prose pattern only as follows (disregarding the
virtual absence of doubling in any of the positions¹⁹):

(a) It seems to be necessary to include in the string-initial set D́ the
adverbial/preverb elements a, ni, pra, etc. which in the later language
normally are univerbated with their verbs (cf. note 11). In Vedic Prose, their
initial occurrence is quite unusual and is associated with emphasis, focus,
etc. (cf. ibid.), suggesting that they have undergone Discourse Fronting, just
like other non-deictics. In the Rig-Veda, however, clause-initial ordering
is normal for these elements, at least in MCs, suggesting that they are
fronted by the same Syntactic process which fronts deictics in Vedic Prose.²⁰

(b) Of more direct concern is the fact that the Rig-Vedic evidence for
a string-final D́ position is quite meagre: Only 22 out of a total of 309
relevant constructions (i.e. about 7%) have an accented pronominal deictic in
this position. Taken by itself, this is perhaps not too significant, since
also the (clitic) P position is filled relatively rarely (Ui out of 309, or
about \^%). However, there is no strong evidence suggesting that the
position of "string-final" accented deictics is syntactically arbitrary, as we
found it to be in Vedic Prose. Rather, it seems always possible to account
for the placement of these deictics as syntactically motivated, as in (64)
below.²¹

The following examples may suffice as illustrations: 

(64) apa u su nah iyam saruh . . . etu (RV 8.67.15)
D P P D D S V~ 
'may this arrow go away from us' 

(65) sakrt su te mahata . . . mudimahi (RV 8.1.14)
X ? D X V 
'may we once again be happy with your greatness ' 
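The two proportions cited for the RV 8.1-21 sample can be checked with a line of arithmetic each. The snippet is only a convenience for verifying the rounding; the helper name `percent` is an assumption of this sketch, and the counts are those given in the text.

```python
def percent(part, total):
    """Return part/total as a percentage rounded to one decimal place."""
    return round(100 * part / total, 1)

# Clitic pronouns occurring outside initial strings in RV 8.1-21
print(percent(56, 286))   # 19.6, i.e. about 20%
# Accented pronominal deictics in string-final position
print(percent(22, 309))   # 7.1, i.e. about 7%
```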

Confirmation that the Vedic-Prose string-final position of deictics may
be an innovation comes from a rather unexpected quarter: As for instance
Watkins (1963:29-30) noted, the relative pronoun (RP) may in Rig-Vedic Sanskrit
be found either clause-initially (in which case it seems to behave like all
other string-initial elements; cf. (66) below), or it may appear in what
Watkins refers to as clitic position, as in (67)-(70). While Watkins is
certainly correct in noting that relative pronouns may occur in two different
initial-string positions, his statement can be improved upon: First of all, no
matter where it is placed,²² the relative pronoun ya- is accented.²³ Secondly,
there is evidence which suggests that non-initial accented ya- occurs in
essentially the same accented, third position of initial strings as the
accented sentence particles; cf. the possibility of unaccented sentence particles
occurring before ya- (as in (67) and (68)) and the quite common occurrence of
clitic pronouns after ya- (as in (67) and (69)). Occasionally, this seems to
lead to doubling with accented particles, as in (70); however, the phenomenon
is sufficiently rare to make it uncertain as to whether this is normal
behavior or should be classified among the more deviant patterns of the Rig-Veda.

(66) yad cid hi tva janah ime ... havante (RV 8.1.3)
RP f D S V 

'though these people invoke you' 

(67) sa^3 gha yah te dadasati (RV 3.10.3) 

6 P RP D V 
'who worships you ..." 

(68) para ha yad . . . hatha (RV 1.39.3)

'when you smite away' 

(69) dhiyah yah nah pracodayat (RV 3.62.10)

'who may incite our thoughts ' 

(70) pra nu yad esam mahina cikitre (RV 1.186.9)

'when their greatnesses have become visible' 

What is important for our discussion is that here we have an accented 
pronoun which like the deictics tends toward string-initial position and which 
like the deictics may in the Rig-Vedic language be prevented from going into 
that position by the appearance of another fronted element in string-initial 
position. (Note that in all cases of non-initial ya-, either a Syntactically 
Fronted or a Discourse-Fronted element occurs string-initially. ) Unlike the 
deictics of Vedic Prose, however, ya- goes into the (accented) third position 
of the string, not into a "fifth", final, accented position. 

This fact, combined with the rather weak evidence for "fifth", string- 
final position of accented deictics, may suggest that the Vedic-Prose pattern 
is in fact an innovation and that, therefore, the placement of SPEAK into this 
same position must perforce be an innovation. 

13: Note, however, that the Rig-Vedic (and Atharvanic) evidence for
Vedic-Prose innovation discussed in the two preceding sections is not
necessarily cogent. First of all, the nature of these ('Vedic Poetry') texts
differs considerably from that of Vedic Prose, both in terms of literary medium
(verse vs. prose) and in terms of subject matter and style (poetic, reflective,
devotional hymns vs. technical discussions of a ritual and, in the later
portions, of a theological nature). It is therefore possible that some of the
differences may be a matter of literary style (such as the authoritative use
of constructions (I)-(III) and (V) in Vedic Prose vs. the absence of any such
connotations in Vedic Poetry). Moreover, long and variegated initial strings
and relatively lengthy formulas like (I)-(III) are much more likely to occur
in prose, where there are no clear limitations on sentence length, than in
metrical poetry, where such limitations do exist, especially in Vedic Poetry
where lines ordinarily are no longer than eight to twelve syllables and where
run-on lines are quite rare. That is, the difference may be stylistic, not

Moreover, it may be argued that relative pronouns are different from 
deictic pronouns and cataphoric verbs and therefore may behave quite differ- 
ently in their syntax as well. 

Finally, and perhaps most importantly, it must be borne in mind that
even if the "fifth", string-final position of Vedic Prose may be an
innovation, it is hardly likely that it was created ex nihilo. Rather, it is more
probable that it results from a reinterpretation of earlier existing
structures. What may have been instrumental in this development is that the
placement of relative pronouns into third position has effectively been abandoned
in Vedic Prose, thus eliminating this structure (and the rules accounting for
it) as a possible model for the placement of accented deictics. If, then,
Vedic Poetry structures like (64) are reinterpreted as having deictic iyam
not in syntactically motivated, pre-nominal position, but in syntactically
arbitrary, string-final position, then this would easily bring about the
patterning of Vedic Prose. The Vedic-Prose positioning of SPEAK into
string-final position could be similarly accounted for by a reinterpretation of
'modified' initial structures like (55)-(57) as having SPEAK in string-final,
rather than clause-medial position.

In short, while the evidence of Vedic Poetry may be suggestive of a
Vedic-Prose innovation, it is not sufficiently strong to establish it. For 
it may well be taken to be attributable to stylistic difference. Even if there 
should have been an innovation, this may have come about as a reinterpretation 
and extension of patterns which already existed at the stage of Vedic Poetry. 

14: It is because of this uncertainty of the Vedic-Poetry evidence that
it is necessary to look beyond the internal evidence of Sanskrit, to the com- 
parative evidence of other Indo-European languages. 

There is quite general evidence that beside 'marked' VSO structures with 
the verb in absolute initial position (indicating emphasis or Dressler's ana- 
phoric or cataphoric discourse functions), PIE had equally marked VSO struc- 
tures in which the initial verb was preceded by a non-major-constituent ele- 
ment, especially by sentence-connecting adverbs and other similar elements. 
This is the so-called 'modified' initial position encountered in Vedic Poetry. 

Unlike Vedic-Prose types (I)-(III), these structures were not restricted
to SPEAK. There are, however, two ancient Indo-European languages outside
Sanskrit which have constructions strikingly similar to Vedic-Prose (I)-(III)
and (V) which likewise are limited to SPEAK and which are used under very
similar, specialized discourse conditions. These languages are Homeric Greek
and Avestan, the latter being especially interesting and significant because
of its geographic and genetic proximity to Sanskrit.

Already Delbrück (1900:62) referred to one of the Greek types as a
parallel to the Vedic-Prose patterns (I)-(III). Unfortunately, however, his
discussion is extremely brief and general (cf. section 4 above). Moreover, his
suggestion does not seem to have been followed up by other linguists.
Finally, the evidence of Avestan has to my knowledge not been introduced into any
discussions of the topic at hand.

The evidence of these two languages will be examined in the next two sections.

15: In Homeric Greek, direct discourse may be introduced by various
different orderings of SPEAK: verb-medial (71), verb-final (72), or even
verb-initial (73). Of these, the types (71) and especially (72) are quite
common, their relation being roughly the same as that of verb-final and
verb-medial structures in other contexts. (Frequently, but not necessarily
always, the verb-medial structures can be accounted for as 'amplified
sentences' à la Gonda 1959, i.e. with extraposition of non-essential material; cf.
e.g. (71).) On the other hand, the (anaphoric) fronted type (73) is
exceedingly rare. (I have encountered only one example.) What is common to all
these structures is that they are restricted to introducing isolated tokens
of direct discourse or the initial direct discourse of a series of exchanges
in a conversation, discussion, or argument.

(71) polla de metri philei eresato kheiras oregnus (1.351)


'and he prayed much to his dear mother, with outstretched 
hands QUOTE' 

(72) kalkhanta protista kak' ossomenos proseeipe (1.105)


'he, looking evil, spoke first to Calchas QUOTE' 

(73) (messoi d' amphoteron skeptra skhethon) eipe te muthon kerux
Idaios ... (7.227-8)

'(the two (messengers) held their staffs between the two 
groups) (Of them) Idaeus said (the following) word QUOTE' 

A special formula, however, is found as a kind of link between quotes
which are part of an extended verbal exchange. Note that this Linking Formu- 
la is virtually de rigueur, exceptions being exceedingly rare. 

The most characteristic and constant element of this formula is the
appearance of an initial anaphoric pronoun (in the accusative case) which refers
back to the preceding speaker. Beyond that, there are a number of different
variants; cf. (74)-(77). Several of these, however, can be explained as
analogical to the non-formulaic patterns of (71) and (72); cf. (76) and (77).
(Note that the verb-final type is quite rare.^^) Moreover, the more common 
of these structures, the verb-medial type (76) can be additionally motivated 
as an extension of patterns like (75), reinterpreted as having a member of the 
constituent S (i.e. of a major syntactic constituent) occurring in post-string 
position, rather than a participial form of SPEAK. (As a consequence of this 
reinterpretation, then, any number of other constituents may occur in this 
position.) It is only structures of the type (74) and (75) which are 
synchronically 'unmotivated', i.e. which cannot be accounted for as 
reinterpretations and/or as analogical on the model of the non-formulaic constructions. 
It is these structures, then, which are most likely to be archaic. 

(74) ton d' ēmeibet' epeita podarkēs dios Akhilleus (1.121) 
~5 f SPEAK X S 

'(QUOTE) to him answered in return fleet-footed, divine 
Achilles QUOTE' 

(75) tēn d' apameibomenos prosephē podas ōkus Akhilleus (1.215) 
~5 ? SPEAK (pple . ) SPEAK S 

or S 
'(QUOTE) her, answering, fleet-footed Achilles said QUOTE' 

(76) ton d' Helenē muthoisin ameibeto dia gunaikōn (3.171) 

'(QUOTE) to him answered Helen with (these) words, divine 
among women QUOTE' 

(77) tēn d' aut' Antēnōr pepnumenos antion ēuda (3.203) 
~5 f ]5(?) S X SPEAK 

'(QUOTE) to her Antenor, the wise, said in reply QUOTE' 

A special formula is used also at the conclusion of individual quotes or 
at the end of a verbal exchange (i.e. , after the final quote of a conversation 
or argument). Also this Final Formula is virtually de rigueur. (Out of 60 
relevant contexts in books 1-3, I have found only five not showing this form- 
ula. ) 

Unlike the Linking Formula, this Final Formula, however, comes in two, 
equally archaic (i.e. synchronically unmotivated) shapes, one being verb-ini- 
tial (P), the other, 'modified' verb-initial (Q). Of these, the (P) variant 
is most fixed. Most commonly it consists of a single linguistic form, the 
synchronically highly aberrant and 'defective' verbal form ē. There are, 
however, structures like (78) and, more rarely, (79) in which the verb is 
followed by other elements. And it is these initial-string elements which 
conclusively show that ē is in fact clause- and string-initial.26 

(78) ē rha (= ara) kai es diphron arnas theto ... (3.310) 

'QUOTE; he spoke (and placed a sheep into the chariot ...)' 

(79) ē toi ho g' (hōs eipōn kat' ar' hezeto ...) (1.69; sim. ib. 101) 
SPEAK D 6 P 6 SPEAK 6 P V 

'QUOTE; he spoke to him (thus speaking he sat down ...)' 

The Q-type, characterized by the occurrence of hōs 'thus' in initial 
position,27 shows greater variability; cf. (80)-(85). However, with the verb 
phē/pha- 'speak' the order is quite fixed, as in (80)-(82). (Thus, all 35 
instances of Q with finite phē/pha- in books 1-3 follow this pattern.) Most 
deviations from this pattern show a participle of SPEAK in string-final position. 
They thus do not represent any serious counterevidence. Structures like (85), 
with the verb in clause-final position, are quite uncommon and can be 
explained as occasional reshapings on the model of the non-formulaic type (72). 
What is important is that structures like (81) and the admittedly rarer (82) 
and (83) show that SPEAK is indeed fronted (before the subject and other 
elements of the sentence) and string-final (occurring after the accented 
deictic). The functional parallelism with the clearly fronted type P provides 
additional evidence in favor of a fronting hypothesis. 

(80) hōs ephat', eddeisen d' ho gerōn (1.33) 

'QUOTE; thus he spoke; and the old man got frightened' 

(81) hōs phato Pēleidēs, poti de skēptron bale gaiēi (1.245) 

'QUOTE; thus spoke the son of Peleus, and he threw the staff 
down to earth' 

(82) hōs phat' apo ptolios deinos theos (4.514) 

'QUOTE; thus spoke from the city the terrible god' 

(83) hōs ara tis eipesken Akhaiōn te Trōōn te (4.85) 

'QUOTE; thus would someone (= many a one) speak of the Achaeans 
and Trojans' 

(84) hōs eipōn proiei ... (1.326) 

'QUOTE; thus speaking he sent (them) forth ...' 

(85) hōs hoi men toiauta pros allēlous agoreuon (5.274) 
1) D P ]5(?)^8 X SPEAK 

'QUOTE; thus they spoke to each other in this manner' 

The evidence of Homeric Greek can be summarized as follows: Beside 
non-formulaic structures which more or less follow the general word order 
principles of the language, there exist two formulas which occur under specific 
discourse conditions: (a) a Linking Formula found in verbal exchanges, and 
(b) a Final Formula which marks the end of individual quotes or of verbal 
exchanges. Structurally, the synchronically most unmotivated and therefore 
most likely archaic exemplars of the Linking Formula and variant Q of the 
Final Formula are characterized by the appearance of an accented deictic in 
string-initial position and by the movement of SPEAK into string-final (or 
post-string?) position, preceding its subject and other elements of the 
clause. On the other hand, the P variant of the Final Formula is character- 
ized by the fronting of SPEAK into string-initial position. 

The structure and function of these constructions are quite reminiscent 
of what we found in Vedic Prose. This is most clearly the case with the 
Linking Formula, which is structurally and in its discourse function almost 
identical to the use of Vedic-Prose (I) and (III) in discussions and arguments; 
cf. section 6 above. The only difference is that in Vedic Prose an additional, 
'authoritative' flavor attaches to the use of these formulas. 

The variant P of the Final Formula is structurally comparable to 
Vedic-Prose (V) (as well as to Vedic-Poetry (A)(c), examples (52) and (53)). 
Functionally, however, there is a difference: Vedic-Prose (V) is cataphoric and 
authoritative, while Homeric P is anaphoric and otherwise unmarked. 

Finally, the Q variant of the Final Formula is structurally comparable 
to Vedic-Prose (I)-(III) in so far as they have iti as string-initial 
element (cf. examples (23) and (24), as well as Vedic-Poetry (56) and (58)). 
Functionally, however, even if the special authoritative flavor of Vedic 
Prose is ignored, there is only a certain overlap with the Vedic-Prose 
formulas, in so far as Homeric Q occurs at the end of extended verbal exchanges. 
Note however that Q also occurs at the end of single direct-discourse 
utterances. 

16: Just as in Homeric Greek and in (Vedic-Prose) Sanskrit, so also in 
Avestan29 the ordering of SPEAK in the introduction of ordinary direct 
discourse is more or less the same as that for other verbs. Most commonly it 
is clause-final, as in (86); but extraposed structures, as in (87), can be 
found as well. 

(86) gaos zaotaram zavaiti (Y 11.1) 
'The cow curses the Zaotar QUOTE' 

(87) a6a imam vaco framruya varsgraynis (Yt. 13.20) 
"? (7) SPEAK S 

'then you should proclaim to her (this) speech, (you) being 
victorious QUOTE' 

However, in discussions concerning important issues (mainly of a reli- 
gious/theological nature) and usually involving important personages (Ahura 
Mazda, Zarathushtra, etc.), a different construction is found. As in the 
case of the Greek Final Formula and of Vedic-Prose (V) vs. (I)-(III), there 
are two variants of this construction, one (X) verb-initial, the other (Y) 
with the verb in string-final (or post-string?) position. 

The variant X is found in structures like the following, which clearly 
show an order VSO: 

(88) perasat zaraθustro ahuram mazdąm (Yt. 14.1) 

'Z. asked A. M. QUOTE' 

(89) mraot ahuro mazda spitamai zaraθustrai (Y 5.1) 

'A. M. said to Z., the Spitamid, QUOTE' 

The variant Y appears instead of X in cases where the initial position is 
occupied by another fronted element. This may either be a preverb (as in (90)) 
or a deictic sentence-connective adverb (as in (91)). In both of these cases, 
a single, subject NP may follow the verb, or a sequence of subject NP and 
object NP; cf. (90)/(91) beside (92). Moreover, as (90) and (92) show, clitic 
pronouns are placed between the initial element and the verb, producing 
(mini-)initial strings. 

(90) a dim paresat zaraθustro (Y 9.1) 
prev. D SPEAK S 
'Z. asked him QUOTE' 

(91) aat mraot ahuro mazda (Yt. 14.2) 
'then A. M. spoke QUOTE' 

(92) paiti dim parasat zaraθustro arədvim ... (Y 5.90) 
prev. D SPEAK S 
'Z. answered her, (namely) A. ... QUOTE' 

What is especially instructive and confirms the structural and functional 
relatedness of X and Y is the fact that the two constructional types may 
alternate with each other in the same text, under the same discourse conditions. 
Thus (88) and (91) introduce two successive speakers (and their speeches) in 
two successive paragraphs of the same text. (This recurs ibid. 42 and 43, 
and similar patterns are found elsewhere, passim.) In addition, while type X 
may perhaps be more common at the very beginning of discussions, the 
preverb-initial variant of Y may occur in the same environment (cf. e.g. (90)). 
Clearly, then, both types of structure require a cataphoric verb-fronting 
hypothesis, with some mechanism (comparable to that found in Vedic Prose) which in 
cases of conflict places cataphoric SPEAK in string-final (or post-string?) 
position. 

There is, however, some evidence which suggests that the mechanism by 
which fronted cataphoric SPEAK is put into string-final (or post-string?) 
position may differ in its details from what we find in Vedic Prose. First of 
all, construction (93), recurring four more times in the same text (and else- 
where), seems to offer a string-final accented deictic preceding (fronted) 
SPEAK. (At least, in Sanskrit this pronoun would be accented; there is no 
direct evidence on the accentuation of Avestan. ) This may suggest either that 
the early Sanskrit constraint against string-final doubling of accented deic- 
tics with SPEAK did not hold in Avestan, or that SPEAK was placed in post- 
string position. A decision on this point would require a great deal more re- 
search on the structure of Avestan initial strings than is feasible in the 
present context. However, given the relative lateness of the Avestan texts, 
it is entirely possible that structures like (93) result from a relaxation of 
an earlier string-final constraint comparable to the late Vedic-Prose relax- 
ation which made structures of type (IV) possible. 

(93) aat me aem paityaoxta haomo (Y 9.2) 
~5 D 5 SPEAK S 
'then Haoma replied to me QUOTE' 

In addition, (94) below may suggest that in Avestan (unlike in Sanskrit) 
entire NPs, not just single words, can be put in initial position through 
Discourse Fronting. (Note that the initial NP seems to be fronted for contrast: 
Subsequent paragraphs detail the second and third wailings of A.) Moreover, 
this construction suggests that the fronting of any element (whether deictic 
or non-deictic) into first position leads to cataphoric SPEAK being placed in 
string-final (or post-string) position. 

(94) paoiryąm gareząm gerazaeta asis ... haca apuθro-zanyai 
jahikayai (Yt. 17.57) 

'the first wailing A. ... wails about the sterile courtezan 

There are also some questions about structures containing the Avestan 
equivalent of Skt. iti, Hom. Gk. hōs. While in expressions like (95), which 
are inserted into QUOTE, these seem to have a structure comparable to that of 
Y above (as well as to iti-initial Vedic-Prose (I)-(III) and variant Q of the 
Homeric Final Formula), elsewhere uiti tends to be non-initial, occurring di- 
rectly before SPEAK or separated from it by a noun of speaking; cf. (96)-(98). 
Moreover, as (98) shows, such a structure may be preceded by a sentence-initial 
deictic, suggesting that uiti does not function as a sentence-initial 
element. It is possible to account for this situation by the following 
hypothesis. Structure (95) occurs in a relatively frozen, formulaic context, 
showing little or no variation. On the other hand, (96)-(98) occur much more 
freely. Moreover, it is possible to see in the proximity of uiti to SPEAK an 
incipient univerbation, a drifting of quotative uiti to the semantically 
related SPEAK. This interpretation receives support from the fact that uiti 
frequently appears as uity before aog/aoj- 'speak', i.e. in a sandhi form. 
Sandhi forms of this sort, however, are in Avestan found only in fixed 
collocations which act as single phonological words. It may therefore be claimed 
that uiti-initial (95) represents an archaism, showing an early, 
synchronically unmotivated structure of the type Y and thus entirely comparable to the 
Sanskrit and Homeric structures under discussion. Note however that while 
this hypothesis may be possible, it is not very firmly established, since 
structures like (95) could easily be taken to represent nothing more than 
type X structures of (univerbated) uiti + SPEAK. This would of course still 
make it possible to relate (95) to the verb-initial P variant of the Homeric 
Final Formula; but it would preclude a direct equation with iti-initial 
Sanskrit type (I)-(III) formulas. 

(95) usta ahmai naire mainyai / uiti mraot ahuro mazda / ai asaum 
zaraθustra (Yt. 10.137, sim. ib. 138, Yt. 19.53, V 18.1)30 

'"Hail to the authoritative man", (so) said A. M., "O truthful 
Z. ..."' 

(96) yo baδa ustanazasto yarazaiti ahurai mazdai uityaojano (Yt. 10.73) 
(6)SPEAK 
'who indeed complains to A.M. with outstretched hands, speaking 
(as follows) QUOTE' 

(97) ... uiti vacabis aojana (Yt. 17.17, sim. ib. 22) 
(6) 'words' SPEAK 
'(thus) speaking with (these) words QUOTE' 

(98) a9at uiti fravasata asis (Yt. 17.21) 
6 (6) SPEAK S 
'then A. ... began to speak (thus) QUOTE' 

Finally, it must be noted that the structures X and Y described above 
are found only in the later, 'Younger' Avestan texts. The older, 'Gatha' 
language on the other hand offers no evidence in favor of a systematic use of 
these constructions to introduce QUOTES in discussions between important 
persons.31 True, we find occasional constructions such as (99), which look very 
much like Younger Avestan Y structures. However, in the same hymn we also 
find (100) and (101), with verb-final or extraposed ordering. It is also 
true, however, that though the context is something like a colloquy, the 
structure of that colloquy differs from that of the Younger Avestan discus- 
sions: In Yasna 29, the cow asks the Ahuras for its creator and for help. 
Then that creator "answers" by asking Asa 'Truth' about ways in which the 
cow might be helped. Finally, Ahura Mazda gives an answer to the cow. 
Unsatisfied, the cow at the end breaks out in renewed wailing. This differs 
markedly from the Younger Avestan discussions between two important 
individuals at a time. Such discussions simply do not occur in the Gathas. 

(99) at i vaocat ahuro mazda (Y 29.6) 
5 5 SPEAK S 
'then A. M. spoke QUOTE' 

(100) xsmaibya gius urva garazda (Y 29.1; sim. ib. 9) 
'to you wails the voice of the cow QUOTE' 

(101) ada tasa gaus parasat asam (Y 29.2) 

'then the creator of the cow asks Truth QUOTE' 

This difference between Gatha and Younger Avestan is strikingly similar 
to that between Vedic Poetry and Vedic Prose. Moreover, in the present case 
it is quite clear that the chronological difference really reflects a stylis- 
tic difference, such that the discourse contexts in which the constructions 
in question would be motivated happened not to be included in the subject mat- 
ter of the older literature. 

More important, however, are the quite striking structural and discourse 
similarities between Younger Avestan X and Y on one hand, and Vedic-Prose (V) 
and (l)-(lll) on the other. In both cases we find a coexistence of structures 
in which cataphoric SPEAK has been fronted to initial position with patterns 
in which the verb has been fronted to a later, string-final (or in Avestan 
perhaps post-string) position. These constructions are used in very similar 
discourse contexts, namely in discussions between important personages. True, 
the Avestan constructions do not have quite the same authoritative flavor, but 
this seems to be a relatively minor difference. (In fact, there are isolated 
occurrences of structures like (89) outside verbal exchanges, where the only 
thing motivating the use of the construction seems to be the fact that an im- 
portant person is speaking. This is for instance the case at the beginning of 
Yasht 10. Perhaps these cases constitute the beginning of a reinterpretation 
comparable to what must have happened in Sanskrit. ) 

17: Surveying now the evidence of (Vedic-Prose) Sanskrit, Avestan, and 
Homeric Greek, we find agreement on a number of important points. 

First, all three languages agree on a special formulaic expression 
employed in verbal exchanges. (Let us refer to this as the Verbal-Exchange 
Formula.) There are, to be sure, certain differences in the discourse 
conditions under which the formula is used. In Greek it is used in all verbal 
exchanges, whether these are conversations, discussions, or arguments; whe- 
ther the participants are human or gods, male or female, important or rela- 
tively unimportant (such as Thersites). A more specialized use is that in 
Avestan, where the formula is used only in discussions between important per- 
sonages (with some evidence for incipient generalization as an 'important- 
person' construction). Finally, in Sanskrit the construction has a special 
authoritative flavor and may be used in reference to the sayings of authori- 
ties outside of verbal exchanges, simply as an indication that the person who 
says or said something on a particular issue is an authority. 

Although it is difficult to be absolutely sure on these matters, because 
reinterpretations easily can go one way or the other, it does seem likely 
that if the formula is inherited, the Greek usage is the most original, that 
the more restricted use in discussions between important personages was an 
Indo-Iranian innovation, and that the Vedic-Prose usage is the result of a 
further restriction to authoritative figures, plus a reinterpretation and 
generalization of this authoritative use as the primary function of this form- 
ula. (For this latter development compare the possible parallel in Avestan.) 

In addition, there is suggestive, but less cogent evidence for a quotat- 
ive formula. This formula is fairly well established for Greek and Sanskrit. 
In Avestan, however, the evidence for such a formula with fronted SPEAK is 
considerably weaker. Moreover, Sanskrit has, beside its iti-initial variants 
of Vedic-Prose (I)-(III), various other constructions with iti, including the 
type SPEAK + QUOTE + iti. Further, these Sanskrit constructions have 
developed a range of further special uses. (These issues are discussed in greater 
detail in my other contribution to this volume.) Finally, note that Greek 
occasionally shows a variant of its Final Formula in pre-discourse environment; 
cf. note 27. Because of these various difficulties it seems advisable to 
disregard this formula in the subsequent discussion, except to the extent that 
it may provide ancillary evidence for the Verbal-Exchange Formula. 

Having thus restricted our scope, let us take a look at the structural 
properties of the Verbal-Exchange Formula in Sanskrit, Avestan, and Greek: 

(a) All three languages agree on a structure with string-final (or 
post-string?32) placement of SPEAK in constructions with initial deictics. In 
Greek, these deictics always are pronominals (ton, tēn). But note adverbial 
hōs in the structurally related Final Formula. In Avestan they are always 
adverbial (aat), including preverbal paiti etc. In Vedic-Prose Sanskrit we 
find both pronominal (saḥ, tam, etc.) and adverbial (atha etc.) initial 
deictics. Overall, then, the evidence would seem to point to both pronominal and 
adverbial deictics being possible in formula-initial position, with different 
languages following different routes toward specialization. 

(b) Avestan further offers a pattern of initial non-deictics in 
constructions with SPEAK in string-final position; cf. example (94). 

(c) Moreover, Sanskrit and Avestan agree on having an alternative, 
SPEAK-initial ordering in those cases where there is no other string-initial 
element. Though Greek does not have any direct evidence for this ordering, 
it shows such structures as alternatives to string-final SPEAK in its Final 
Formula variant Q. (Note also the cross-linguistic Indo-European evidence 
for cataphoric fronting of SPEAK. ) 

18: These structural and functional similarities certainly are striking. 
The question therefore must arise as to whether the formula characterized by 
these similarities should be reconstructed for PIE. (Strictly speaking, it 
might be possible that the relevant ancestral stage is not PIE, but some in- 
termediate proto-language. This issue clearly is beyond the scope of this 
paper and will therefore be ignored. ) To decide this question it will be 
necessary to show not simply that we can reconstruct (any shared similarities 
between two related languages "can" be reconstructed, given enough ingenuity), 
but whether we must reconstruct. That is, are the similarities such that 
they cannot be attributed to chance, borrowing, or independent innovation? 

In our present case, the similarities are, I believe, too striking and 
too idiosyncratic to be attributable to chance. And the chronological stage 
at which borrowing would become a possible alternative explanation would be 
so close to the ancestral language that it would be difficult to meaningfully 
distinguish between borrowing and common inheritance, especially if dialectal 
borrowing within PIE might be involved. 

There are however some possible arguments for independent innovation. 
One of these consists in the fact that as noted, our formula is not attested 
in the earliest stages of Sanskrit and Avestan, but only in Vedic Prose and 
in Younger Avestan. This may be taken to be prima-facie evidence for indepen- 
dent innovation. As we have seen, however, there is good reason to believe 
that the difference between Gatha and Younger Avestan is one of style and 
subject matter, not of chronology. Similar, but more circumstantial arguments 
were made for Sanskrit; cf. section 13. In light of the Avestan situation, 
these arguments are considerably strengthened. 

Another possible argument for Sanskrit innovation, namely the early front- 
ing of the relative pronoun ya- into string-medial, rather than string-final 
position (cf. section 12) likewise is not particularly cogent: As argued in 
section 13, (cataphoric) SPEAK and the relative pronoun do not necessarily 
have to behave in the same fashion. Moreover, the comparative evidence of 
Avestan and Greek is firmly in favor of the Vedic-Prose string-final ordering 
of non-initial fronted SPEAK. 

It may still be argued, however, that instead of the Verbal-Exchange 
Formula, Sanskrit, Avestan, and Greek inherited certain syntactic patterns 
and processes which, combined with the discourse features of SPEAK, could 
lead to independent developments of our Formula: As is well known, PIE had 
two variants for verb-initial (or VSO) ordering, both of which could occur 
with any fronted verb. (Examples are most commonly found for imperatives 
and other modal constructions used in imperatival function. Most of the 
examples quoted below therefore are of this 'imperatival' nature.) One of these 
had the verb in absolute-initial position, the other, so-called 'modified' 


initial order had another element preceding the verb. (Cf. also sections 10 
and 14 above.) It is the latter order which is of particular interest, for 
as examples like (102) show, non-SPEAK verbs occurring in such modified-initial 
structures may, just like the SPEAK of our Formula, be preceded by 
unaccented clitic pronouns and thus occur in string-final position: 

(102) ned eva mā yunajan atra devāḥ (RV 10.51.4) 
1, D Vi'i 

'lest the gods employ me here' 

Similar structures may result from the fronting of preverb + verb 
constructions, where examples like (103) show that clitics go between the 
preverb and the verb: 

(103) pra vām aśnotu suṣṭutiḥ (RV 1.17.9) 
D D V S 
'may the praise reach you' 

Moreover, in constructions with preverbs in initial position and relative 
pronoun in second position, fronted verbs would in effect wind up in 
string-final position, as in (104). (I do not have in my file any relevant examples 
with clitic particles and/or pronouns. But that may be a matter of accident.) 

(104) pra ye minanti varuṇasya dhāma (RV 4.5.4) 

'who diminish Varuṇa's laws' 

Under these conditions, one may argue, would it not be possible for some 
languages to independently specialize such string-final verbal constructions 
with verbs of speaking, especially considering the cataphoric nature of SPEAK? 
(Compare the arguments in section 10, concerning Vedic Prose.) 

While at first sight this hypothesis appears quite attractive, a closer 
look reveals a number of difficulties: First, with verbs other than SPEAK, 
only a few types of initial elements can bring about such "string-final" struc- 
tures, namely adverbials (including preverbs and adverbial NPs) and/or relative 
pronouns. As we have seen, however, the Verbal-Exchange Formula may have 
non-adverbial deictic pronominals in string-initial position. (In Avestan we even 
find full NPs; but this might be an innovation.) 

Secondly, there is evidence suggesting that structures like (102)-(104) 
were not the only possible outcomes of a conflict between verb fronting and 
the fronting of other elements. Thus in (105) below, the fronted verb is 
accented, i.e. treated as initial in its MC, even though it is preceded by a 
deictic adverbial.33 In (106), a deictic adverbial and a preverb seem to 
double up in initial position, being followed by clitic elements.34 Finally, 
in examples like (107) the conflict is resolved by inverting the order of 
preverb and verb. 

(105) uta bhaveḥ āpiḥ naḥ antamaḥ (RV 8.45.18) 
-5 1/ ' ■ 

'and may you be our deepest friend' 

(106) atha ā naḥ vardhaya giraḥ (RV 3.29.10) 
1) 6 D V 

'then make our praises grow' 

(107) jayema sam yudhi spṛdhaḥ (RV 1.8.3) 

■? 5 X 
'may we totally defeat the enemies in battle' 

Compared to this variety of patterns which could result from the conflict 
between general verb fronting and the fronting of other elements, it is re- 
markable that Sanskrit, Avestan, and Greek agree on a single, string-final 
ordering of fronted SPEAK in the Verbal-Exchange Formula. Surely, had these 
languages engaged in independent innovations, one might have expected their 
generalizations to have been more divergent. 

Finally, it is difficult to see how independent innovations could have 
led to the remarkable similarities in discourse function which we have ob- 
served. Even without the other, structural arguments, this consideration 
alone would strongly argue against independent innovation. 

19: In the absence of credible alternatives, then, it seems necessary 
to reconstruct our Verbal-Exchange Formula for the proto-language. Such a 
reconstruction of course further entails that in this formula (and perhaps 
occasionally also elsewhere?), the conflict between verb fronting and the 
fronting of other elements was, in contradistinction to other conflict 
situations, resolved by the cataphoric verb of speaking going into string-final 
position. 

Under this hypothesis, the Vedic-Prose types (I)-(III), as well as their 
Avestan and Homeric congeners, thus are in their essential outlines inherited 
from PIE. 

20: Moreover, given that the string-final SPEAK of these constructions 
results from a conflict of fronting processes, these structures cannot be con- 
sidered to support what was the starting point for this discussion, namely 
Wackernagel's hypothesis that PIE had clitic verbs in main clauses which, 
being clitic, moved into clause-second position. (Note that even if the proposed 
reconstruction of the Verbal-Exchange Formula is not accepted, the arguments 
presented here for the positioning of SPEAK as resulting from a conflict 
between fronting rules would remain unaffected.) 

This, then, eliminates the last piece of evidence for Wackernagel's claim 
and in so doing eliminates Wackernagel's hypothesis from the arguments which 
can be adduced against the reconstruction of SOV as the major-constituent or- 
der of PIE. 


Research on this paper has been in part supported by 1979-80 and 1982- 
83 grants from the University of Illinois Research Board. — For perspicui- 
ty's sake, Sanskrit examples will from now on be given in their pre-pausal 
form, not in their actually attested sandhi forms. 

(Most) Greek finite verbs retract the accent as far to the left as per- 
missible within the restrictions of the 'law of 3 moras'. Contrast this with 
the fact that with the exception of clitics, all other formal categories of 
Greek have 'lexical' or contrastive accents. (For details, see e.g. Schwyzer 

In his 
(p. 22), only to dismiss it in an addendum on p. 49. Moreover, in his 1964 
paper, Watkins raised the possibility that since Greek and Sanskrit are the 
only languages with verb clisis and with a distinction between accented and 
unaccented verbs, the clause-second patterns of these languages may be due 
to independent innovations (1042). The evidence to be presented in this 
paper will, I hope, show that also this view is not acceptable. 

The major innovation of the later Greek inscriptions seems to lie in the 
reinterpretation of me as redundant and therefore omissible. This made 
possible the very common constructional type (3), as well as occasional examples 
of the type praxiteles anetheke surakosios tod' agalma 'Praxiteles the Syracu- 
sian placed this statue', with appositive plus direct object following the 
clause-second verb. 

Hermann (1895:502-4), in a paper whose general argument in favor of PIE 
as having had no dependent clauses is hardly acceptable, proposed that the 
extraposed NPs should be considered heavily, contrastively accented. 

Note the following abbreviations: 

6 = stressed deictic, including pronominal sa/ta-, eṣa/eta-, as well as 
quotative iti, connective atha 'then', and api 'also' 

D = unstressed pronominal (both deictic and personal) 

P = stressed sentence-scope particle, such as hi 'for, because', vai 
(emphasis, topic) 

P = unstressed sentence particle, such as u 'and, but, now', ha (slight- 
ly emphatic), sma (emphatic and/or indicating habitual action in 
the past ) . 

SPEAK = verb of speaking (and other verbs governing QUOTE) 

QUOTE = direct discourse 

S = subject 

O = object, including accusative-marked addressee 

V = verb (other than SPEAK) 

X = other or structurally arbitrary constituent 

Because the distinction between unaccented and accented forms will be- 
come important in the subsequent discussion, I have tried to give examples 
from accented texts wherever possible. In some cases, however, relevant at- 
testations come from a text whose accentuation has not been preserved or has 
been incompletely preserved. (This is the case for KKS, AB, JB, and many por- 
tions of KS.) In such cases, the symbols 6 vs. D, P vs. P may serve as a 
guide to the accentuations which would be expected, had the texts preserved 

Note that clause-initial verbs are always accented in Sanskrit, even in 

Verb-initial constructions superficially similar to (V), on the other 
hand, are extremely frequent (even if marked, compared to verb-final order). 
However, these constructions lack the special 'flavor' of types (I)-(III) with 
verbs of speaking. 

One set of exceptions is constituted by the fact that certain
sentence-connective, quasi-conjunctional accented adverbs, most notably atho (= atha +
clitic particle u) may occur in initial position without behaving like initial
accented elements: As Delbrück (1888:36 and kdk) observed, verbs following
atho may be accented as if they were themselves clause-initial (as in TS atho
punati eva enam 'then he cleans him'); and/or other constituents may intervene
between atho and the accented sentence particle vai (as in atho manasa
vai prajapatih yajnam atanuta (TS 'then, with his mind Prajapati
spread out the sacrifice'). Passages with such quasi-conjunctional elements,
which may 'double up' in the initial position, have been ignored in the
following discussion.

This is an
the positioning of a preverb away from its verb (and in initial position).
This raises the interesting question as to how the fronting of preverb-verb
combinations interacts with the placement of deictics and sentence particles.
It is well known that under conditions where uncompounded verbs simply are
fronted (as in example (38)), preverb-verb compounds may either simply front
the preverb (as in (40)), or they may front as a unit, as in (a) below. In
addition, however, we also find examples like (b), in which the preverb is
fronted and the verb appears in the position where non-initial D́ would be
expected. Moreover, if that position is filled, then, it appears, the verb
cannot be fronted at all; cf. (c). Unfortunately, constructions of this sort,
with a sufficiently large number of relevant words and constituents to
permit an unambiguous interpretation, are not very common, making it difficult
to be certain that constructions of the type (d) are not permissible. If it
should in fact turn out that also here, verb-fronting to non-initial position
is blocked if the second D́ position is filled, this would provide valuable
confirmation for the hypothesis proposed in this paper.

(a) vi bhajante ha vai imam asurah prthivim (SB

P^^"^ ^ p p 6 s 

'the Asuras divide up this world' 


(b) vi vai te mathisyamahe imah prajah (SB 2.5.1.12)

~^ ^ D V 


'we will tear up these creatures of yours' 

(c) a ha vai asmin svah ca nistyah ca samsante (SB 1.6.4.17)
^ P P 5 S V 

'in him his own people and strangers trust' 

(d) vi vai etad mathisyamahe prajah (?)
"1 ? 5 V 


'we will now/here tear apart the creatures ' 

pounds; cf. the preceding note. 

'that one yonder' and ayam/idam 'this one here'. These do not seem to show 
any placement preferences comparable to those of ta- and eta- . 

Already Delbrück (1900:59) had proposed an explanation along these
lines for clause- (and story-)initial 'be'. It is interesting to note
that one of the early Sanskrit words for legendary stories is itihasa-,
attested as early as the Atharva-Veda. This word is clearly derived from a
clausal structure of the following form:

iti ha asa 

"^ P V 

'thus it was . . . ' 

Could this have been an (unattested) variant on the typical asid raja 'there
was a king ...' of later stories, comparable to iti ha uvaca ... beside
uvaca ha ...?

As will be seen later, the presence of clitic pronominal nah after
bruvantu does not necessarily indicate anything about the status of SPEAK in 
Rig-Vedic initial strings: Clitic pronouns not uncommonly occur outside ini- 
tial strings, frequently next to their syntactic head nouns. 

For the fact that at this stage of the language, string-initial pre- 
verbs act like members of the set D, cf. section 12. 

I have not made a separate study of the Atharvanic evidence.
Impressionistically, there do not seem to be any major differences

Disregarding structures with the relative pronoun in accented second
position, the ratio of 'lawful' behavior to occurrences outside of strings
is quite impressive for a textual tradition characterized by heavy scrambling:


44 : 1 (= 98% :
6I4 : 2 (= 91% :
230 : 56 (= 80% :


ha svid (P + P) in 8.21.11 and, interestingly, because of its rarity or even
absence in Vedic Prose, im enam (D + D) in 8.1.17.


There are, to be sure, certain exceptions and limitations to the Syntactic
Fronting of these elements. For some details cf. Delbrück 1888:144-49.

minor complications are the following: (a) the accented
sentence particle hi seems to be able to double up with string-initial
elements and may then be followed by unaccented particles, yielding sequences of
the type sah hi sma (8.21.10). In the case of doubling with initial na 'not',
this seems to lead to univerbation, indicated by the single accent in nahi
'for not'. (b) Similarly, the sentence particle vai may be followed, rather
than preceded, by the clitic sentence particle u. Cf. e.g. at 1.126.11.
Only a complete study of the Rig-Vedic evidence can show whether these are
significant deviations or can be explained as occasional scrambling phenomena.

ratha 'like, as', whose optional 
lack of accent seems to be modeled on unaccented NP-scope iva 'like, as'; cf. 
e.g. Delbrück 1888:26.

its early positioning in initial strings)
ya- resembles the PIE particles nu, su, and to, which likewise may appear
either string-initially or in a later position of initial strings, both in
Sanskrit and in various other Indo-European languages. Watkins (1963:16-17)
refers also to these particles as clitic, if they occur in non-initial
position. However, here again it must be noted that to the extent that Sanskrit
has preserved them in relevant positions, these particles always appear
accented. In Greek, to be sure, Watkins finds unaccented outcomes of nu in
Arcado-Cypriot ho-nu, ho-ni. However, note first of all that these forms are
not attested in accented texts. (Their accentuation is postulated on the
analogy of the functionally parallel Attic ho-de.) Moreover, these are
univerbated expressions which may have undergone similar accent reductions as Skt.
*nahi > nahi. At the same time, it is certainly true that (some of) these
particles and perhaps also a cognate of the relative pronoun *yo- appear in
the complicated initial strings of Anatolian (cf. Watkins, ibid.); and the
common wisdom is that the non-initial elements of these strings are
unaccented, clitic. It is therefore possible that the Sanskrit accentuation is
secondary, perhaps introduced by a phonological rule accenting even-numbered
clitics (from left to right) in initial strings. Note however that such an
account of the Sanskrit situation is not without its own difficulties. For
as noted, the string-initial element may be followed by more than one
unaccented sentence particle, without the second particle receiving an accent.
Moreover, the appearance of accented sentence particles in Greek (cf. e.g.
Delbrück 1900:5^), the only other ancient Indo-European language with
distinctive accent, should give pause. Finally, we know nothing about the
accentuation of Anatolian, including its initial strings. Given these facts, it seems
appropriate to limit the discussion to the actually attested evidence of Sans-


All references are to the Iliad. My discussion is based on the data
mainly from the first three books of the Iliad, which I have studied
systematically for relevant evidence. I have cross-checked my findings against the
remainder of the first twelve books, observing no significant deviations.


It is interesting that the three sj 
6.381, and 11.822) all have the structure 

ton/t^n dCe) autCa) S X/0 SPEAK 
S P 6(?) 

with what looks like a D in string-final position. Could we here be dealing 
with the same string-final constraint against doubling of SPEAK with deictics 
as the one which in early Vedic Prose led to the avoidance of type (IV)? The 
similarity is tantalizing. However, to make a good case for this comparison, 
a more in-depth study of Homeric initial strings would be required. 

Note that (79) may suggest that in Greek the fronting of SPEAK may lead 
to the placement of accented deictics ( ho ) into string-medial position (compar- 
able to the placement of ya- in Vedic Poetry). However, note that the order 
clitic pronoun + accented pronoun + clitic sentence particle in (79) violates 
the normal relative ordering of (clitic) sentence particles before (clitic) 
pronouns, suggesting that this structure may perhaps result from scrambling. 
Structures of this sort are not common enough to determine what, if anything, 
would be their normal order. 

or something very similar to it, may appear
also before QUOTE, as in the example below. The few examples of this
construction which I have noted are all very similar, introducing something like
a "generic" quote, i.e. a more or less fictitious quote which sums up the tenor
of what people might have been saying about a particular person or event. (It
is at the end of one such quote that the formula of example (83) is found.)

hode de tis eipesken Akhaion te Troon te (3.207, sim. ib. 319)

'thus would someone (= many a one) speak of the Achaeans and 
Trojans QUOTE' 

Here again we have a verb-final pattern with what looks like an accented 
deictic in string-final position. Could this be further evidence for an avoid- 
ance of doubling of SPEAK with accented deictics in string-final position? 
(Cf. note 25 above.)

For Avestan I have worked through the Gathas and the Hymn to Mithra (Yt.
10), in the Romanized editions by Humbach (1959) and Gershevitch (1967).
Beyond these I have relied on the evidence of the Romanized selections in
Reichelt 1909 and 1911.

Cf. Bartholomae 1904: s.v. uiti.


it normally does in Younger Avestan, this uiti occurs directly before SPEAK. 
(Cf. Y 45.2.)

But note that Avestan (93) may be innovated, just like late Vedic-Prose
(IV); cf. section 16. In addition, Greek may offer some evidence for
the same constraint against type (IV) structures as early Vedic Prose; cf.
notes 26 and 28. For these reasons, as well as for ease of exposition, the
distinction between string-final and post-string positions will be abandoned
in the following discussion, in favor of the term 'string-final'.

other accentuation rules of the language. 


Cf. note 10 above for similar patterns in Vedic Prose. 


Avestan: Y = Yasna, Yt. = Yasht, V = Videvdat.

Greek: Arkh.Eph. = Arkhaiologike Ephemeris (Athens, 1837- ); GDI = Sammlung
griechischer Dialektinschriften (Göttingen, 1884-1915).

Sanskrit: AB = Aitareya Brahmana, AV = Atharva Veda, JB = Jaiminiya Brahmana
(Caland's selections), KKS = Kapisthala Katha Samhita,
KS = Kathaka Samhita, MS = Maitrayani Samhita (non vidi), RV =
Rig-Veda, SB = Satapatha Brahmana, TS = Taittiriya Samhita.


BEHAGHEL, Otto. 1929. Zur Stellung des Verbs im Germanischen und
    Indogermanischen. KZ 56.276-81.
BARTHOLOMAE, Christian. 1904. Altiranisches Wörterbuch. Strassburg: Trübner.
DELBRÜCK, Berthold. 1878. Die altindische Wortfolge aus dem Satapathabrahmana
    dargestellt. Halle: Waisenhaus.
----. 1888. Altindische Syntax. Halle: Waisenhaus.
----. 1900. Vergleichende Syntax der indogermanischen Sprachen. (= vol. 5
    of K. Brugmann and B. Delbrück's Grundriss der vergleichenden Grammatik
    der indogermanischen Sprachen.) Strassburg: Trübner.
DRESSLER, Wolfgang. 1969. Eine textsyntaktische Regel der idg. Wortstellung.
    KZ 83.1-25.
FRIEDRICH, Paul. 1975. Proto-Indo-European syntax. (Journal of Indo-European
    Studies, Monograph 1.) Butte, MT: JIES.
----. 1976. Ad Hock. JIES 4.207-20.
----. 1977. The devil's case: PIE as type II. Linguistic studies offered
    to Joseph Greenberg on the occasion of his 60th birthday, ed. by A.
    Juilland, 3.463-80. Saratoga, CA: Anma Libri.
GERSHEVITCH, Ilya. 1969. The Avestan hymn to Mithra. Cambridge: University
    Press.
GONDA, Jan. 1959. On amplified sentences and similar structures in the Veda.
    Four studies in the language of the Veda, 1-70. The Hague: Mouton.
HERMANN, Eduard. 1895. Gab es im Indogermanischen Nebensätze? KZ 33.479-535.
HOCK, Hans Henrich. 1982. AUX-cliticization as a motivation for word order
    change. SLS 12:1.91-101.
HUMBACH, Helmut. 1959. Die Gathas des Zarathustra. Heidelberg: Winter.
KROLL, W. 1918. Anfangsstellung des Verbums im Lateinischen. Glotta 9.112-23.
MAROUZEAU, J. 1908. Sur l'enclise du verbe "être" en latin. MSL 15.230-6.
REICHELT, Hans. 1909. Elementarbuch. Heidelberg: Winter.
----. 1911. Avesta Reader. Strassburg: Trübner.
SCHWYZER, Eduard. 1939. Griechische Grammatik, 1. München: Beck.
WACKERNAGEL, Jacob. 1892. Über ein Gesetz der indogermanischen Wortstellung.
    IF 1.333-436. (Repr. in Kleine Schriften. Göttingen: Vandenhoeck +
    Ruprecht, 1953.)
WATKINS, Calvert. 1963. Preliminaries to a historical and comparative
    analysis of the Old Irish verb. Celtica 6.1-49.
----. 1964. Preliminaries to the reconstruction of Indo-European sentence
    structure. Proceedings of the 9th International Congress of Linguists,
    ed. by H. G. Lunt, 1035-45. The Hague: Mouton.

Studies in the Linguistic Sciences
Volume 12, Number 2, Fall 1982 

Hans Henrich Hock 

In 1967, Kuiper proposed that the Sanskrit quotative, marked
by iti, owes its origin to Dravidian influence. This claim is now
generally accepted as an argument for early substratum influence
of Dravidian on Sanskrit. Unfortunately, arguments for this hypothesis,
as well as the counterargument in Hock 1975, are based on
very cursory examinations of synchronic and diachronic evidence,
both in Sanskrit (and other Indo-European languages) and in Dravidian
(and other relevant non-Indo-European languages). This paper
attempts to provide a fuller account of the history of the Sanskrit
quotative, of its possible Indo-European antecedents, of the
parallels in the earliest attested relevant non-Indo-European
languages (Sumerian, Accadian, Elamite), and of the evidence provided
by the non-Indo-European languages of South Asia (Dravidian,
Munda, Tibeto-Burman). For some of these, especially for much of
Dravidian, for Munda, and for Tibeto-Burman, the available evid- 
ence is quite limited, making it difficult to come to conclusions 
about prehistoric stages. Combined with the fact that all the 
other ancient Indo-European languages (Hittite, Homeric Greek, 
Latin, and Avestan), as well as the ancient Near Eastern langua- 
ges, have quotatival formations, this situation makes it difficult 
to maintain Dravidian influence for the structure and development 
of the Sanskrit quotative. While this conclusion may perhaps not 
be accepted by ardent advocates of early Dravidian influence on 
Sanskrit, it is hoped that the linguistic observations on which it 
is based, especially those for Sanskrit, will be useful and inter- 
esting to all linguists. 

1: Ever since Kuiper (1967) introduced the construction into the discussion,²
the Sanskrit quotative has figured prominently in papers arguing for
early, pre-Rig-Vedic influence of Dravidian on Sanskrit. Cf. e.g. Emeneau
1969 and 1971 (both reprinted in Emeneau 1980, thus apparently still
reflecting his views), as well as Hamp 1976 (without reference to Kuiper). The only
dissenting voice seems to have been that of Hock 1975.

Unfortunately, only three of these papers engage in any fuller discussions
of syntactic evidence,³ namely Kuiper 1967, Hock 1975, and Hamp 1976. Even
these, however, do not offer a sufficiently detailed syntactic study of the
Sanskrit quotative, of its possible Indo-European cognates, or of its possible
non-Indo-European sources. True, Kuiper attempted to detail the different
contexts in which the quotative particle iti is used in (Rig-Vedic) Sanskrit.
However, his discussion was geared toward making comparisons with Iranian,
Dravidian, and Munda, rather than toward providing a full account of the
Sanskrit evidence. Moreover, his discussion of Munda and especially of Dravidian
is excessively cursory. Hock's dissenting 1975 account of the Rig-Vedic
evidence and of relevant constructions in outside Indo-European languages, as well
as of non-Indo-European evidence, suffers from similar defects. Finally,
Hamp's paper was concerned mainly with the word order of iti, not with other
aspects of its syntax.

It would be a mistake, however, to attribute the defects of these papers 
solely to the narrow, immediate concerns of their authors. Rather, the major 
reason lies in a veritable dearth of earlier work on the Sanskrit quotative 
and its potentially related constructions in other languages. And this dearth 
is attributable to the fact that until quite recently, quotatives did not cre- 
ate much interest among linguists. (Recent work, such as Kachru's (1979) 
study of the quotative in selected South Asian languages, must therefore be 
highly welcomed, even if it may not cover the whole chronological and geo- 
graphical range.) 

For Sanskrit we at least have the treatments of Delbrück (1888:529-34)
and Speijer (1886:380-88). The latter provides a quite adequate picture of
the post-Vedic, Classical period, to which we can now add the discussion in
Kachru 1979. Delbrück's account of the Rig-Vedic situation likewise is good,
but his description of the later Vedic situation is too cursory. Moreover,
being chapters or paragraphs in much more general treatments of Sanskrit
syntax, both accounts are quite condensed.

For two of the other early Indo-European languages, Hittite and Latin, 
the standard handbooks and dictionaries provide at least some useful inform- 
ation. But for languages like Avestan and Homeric Greek there seems to be no 
adequate coverage. Outside of Indo-European the situation is even more
desperate. Thus, as Hamp (1976, n. 31) aptly observed, even Dravidian has not
yet received adequate descriptive and comparative treatment. True, the
literary languages of the South and their quotative constructions have been
described fairly well. However, for the other, "tribal" languages it is much more
difficult to find adequate descriptions. It is probably because of these
lacunae that Masica (1976:189) claimed that the quotative is not found in 
the Central and Northern Dravidian languages. For other language families, 
we depend on stray remarks in the grammars of individual languages. 

2: The major purpose of this paper is to initiate a fuller study of the 
Sanskrit quotative and of possibly related constructions in other languages. 
The major focus will be on the Sanskrit quotative and its development in ob- 
servable history. This will be followed by a briefer survey of the evidence 
of other ancient Indo-European languages. Next I will attempt to characterize 
similar constructions in relevant non-Indo-European languages. Finally, I will 
draw on the evidence thus amassed to assess the hypothesis that the Sanskrit 
quotative reflects Dravidian influence. While this latter assessment may per- 
haps not sway many of the scholars committed to the 'Dravidian' hypothesis, I 
hope that the rest of the paper will be interesting and useful to all linguists, 
no matter what their stand on the Dravidian substratum issue. 

3: One of the difficulties in dealing with a topic like 'the quotative' 
is one of definition: Presumably a quotative construction consists of direct 
discourse characterized by a special lexical or morphological marker. But must
that marker be obligatory, or may it be optional? And if so, how "optional"
may it be? Is it sufficient to have such marked constructions next to verbs
of speaking, or should they be found more generally, such as with verbs of
thinking, or without any overt governing verb? Etc., etc.

Rather than getting tangled up in a definitional morass, I will restrict
myself to the minimal definition that there must be at least some degree of
syntactic standardization, such that the marker is not just an occasional
phenomenon, and that there be a relatively small number of possible variants
for the marker. (Without such a minimal definition, we would probably be
forced to find "quotatives" in all languages.)

Beyond that, I will try to characterize the various quotatival construc- 
tions in terms of the following parameters. This, I feel, has the advantage 
of describing all the various quotatives within the same framework, thus mak- 
ing comparison easier. Moreover, it makes it easier to describe historical 
changes in given quotative constructions. At the same time, however, for 
many languages this method of description points out the appalling lack of 
detailed information available at this point. Clearly, all that can be done 
in such situations is to list those features for which I have information and 
to leave the blanks as challenges for further research. 

3.1: The first parameter is that of relative "obligatoriness". In some
cases (Sanskrit, Greek, Avestan), this parameter can be established
statistically. In others, some impressionistic judgments are possible. For some,
however, I am unable to give any indications.

3.2: The second parameter concerns the morphosyntax of the quotative: 
What are the lexical items/morphemes employed as a marker? If these are verb- 
al, are they finite or non-finite? What is their ordering relative to direct 
discourse (QUOTE)? What is the position of QUOTE relative to the governing 
verb (SPEAK)? (Note that the term SPEAK will here be used in a technical
sense, covering all the verbs under (i)-(v) below, if appropriate, i.e. if 
they govern QUOTE.) 

3.3: The third parameter addresses more clearly syntactic (and pragmat- 
ic) questions, namely the kinds of verbs which govern the quotative construc- 
tion, as well as the use of quotatives in other contexts, i.e. without SPEAK. 
In this discussion I have benefited greatly from the thorough analysis in 
Kachru 1979, although the nature of the data has made it necessary to make
certain modifications. One of these is that I do not set up a separate
category for verbs of non-oral communication (such as 'write'), since with the
exception of the ancient Near Eastern languages, this category is not relevant
at the early time depth of the Vedas, the Avestan texts, etc. The syntactic
categories which I distinguish are the following: 

(i) SAY: verbs of oral communication. (Examples of quotatives with 
'write' etc. found in the ancient Near Eastern languages will be classified 
in this category. ) 

(ii) THINK: verbs of thinking which cross-linguistically may be
construed like SAY, with a QUOTE of the thought, but also (like verbs of
believing) with factive complementizers.

(iii) KNOW: verbs of cognition and believing which commonly are
construed as factives.

(iv) HEAR: verbs of oral perception which may be used with the QUOTE of
what is heard, but which more frequently are used in other constructions.

(v) SEE: verbs of visual perception which are semantically affiliated
with HEAR (as perception verbs), but which a priori are not expected with
QUOTE.

(vi) Ø, i.e. the absence of any SPEAK. In and by itself this category
is not particularly remarkable, since languages without quotatives may have
QUOTE without any overt SPEAK. What makes this category interesting is the
fact that languages with quotatives seem to have a tendency toward
specialized uses of this Ø-construction. Some of these are detailed below.

(vii) CAUSE: The use of a Ø-quotative to indicate that QUOTE states
the cause or purpose for the action referred to in the "main clause", as in
(1) below. The starting point for such a use probably lies in constructions
of the type (2), where an originally intended reading (a) is reinterpretable
as (b).

(1) vaidesikah asmi iti prcchami (Class. Skt.)

'Since I am a stranger, I ask (you) ...'

(2) ... varunah akarot iti tu eva esah etat karoti (SB 5.4.3.2)
(a) '"Varuna did it" (so thinking) he also does it'

(b) 'Because Varuna did it, (therefore) he also does it'

(viii) NAME: The use of the quotative construction to name or label 
persons or things. 

(ix) QU: The quotative marker with question words, presumably a spe- 
cial development of (viii). 

(x) EMPH: The use of the quotative for emphasizing an NP; probably a 
specialization of (viii). 

(xi) ONOM: The use of the quotative marker with onomatopoeia. 

(xii) OTHER: Other special developments in the use of the quotative. 


4: The discussion of the Sanskrit quotative is complicated by the existence
of competing constructions which at different times interact with the
quotative. These competing constructions can be briefly characterized as fol- 
lows, with illustrations from the Rig-Veda. 

(a) A PARTICIPIAL structure of the type (3), in which the verb of the
lower, QUOTE clause is participialized and, with its subject, is assigned
case in the higher clause according to the following rules: The case is
nominative if the subject of the lower clause is coreferential with the higher
subject; elsewhere it is accusative (which in the passive, of course, turns
into a nominative). In its full form, as just described, this construction is
quite rare in the Rig-Veda. However, it is supported by parallel constructions
with verbs of sensory perception, including HEAR which shows signs of being a
SPEAK verb (cf. the fact that in (4) it is the message, not the action
described, which is heard); cf. (4) and (5). Where the corresponding finite structure
would have the copula, the participial construction always seems to delete the
copula. (Note that also elsewhere 'be' is quite commonly deleted.) For
synchronic SPEAK, this is the most common variant of the construction; cf. (6).
Although in many cases it is difficult to distinguish this construction from
simple 'naming' structures (as in (7)), there are again parallel structures
with sensory-perception verbs (whether functioning as SPEAK or not), as well
as with vid- 'know' (cf. (8)), which show that the account proposed here must
be on the right track. Because of the extensive structural differences
between the participial quoting structures and the corresponding finite-verb
quotes, they can only be considered indirect quote constructions.⁵

(3) ... mamsai nivacanani samsan (10.113.10)
        SPEAK             pple.

'I may think (myself to be) speaking speeches'
= 'I may think that I am making speeches'

(4) ... tvam rtutha yatantam ... srnomi (5.32.12)
                    pple.        SPEAK
'I hear you requiting in due order' = 'I hear that you requite
in due order' (Sim. ibid. 11; with man- 'think', 10.73.10)

(5) arunah ma ... vrkah ... yantam dadarsa (1.105.18)
                            pple.  'saw'
'a yellow wolf saw me going'

(6) ... sayujam hamsam ahuh (10.124.9)

'they say a swan (to be/is) the friend ...'

(7) uta kanvam nrsadah putram ahuh (10.31.11)

'and they say K. (to be) N's son'/'they call K. N's son'

(8) revantam ... tva srnomi (8.2.11)

'I hear you (to be) rich' = 'I hear that you are rich'
(Sim., with vid- 'know', 1.10.10)

(b) A construction marked by the relative pronoun YA- or, more rarely,
by the interrogative pronoun KA-; cf. (9)-(14). (The latter, KA-construction
occurs freely with prch- 'ask'; but in that case, the structure is
indistinguishable from direct discourse. Only structures with vid- 'know' and SAY
are relevant to the present discussion.) Because of the interrogative-pronoun
variants it is tempting to consider these to be indirect questions. Note
however that structures like (12), which have no probable direct-question
counterparts, cause difficulties. Moreover, the 'modal shift' so common in
other Indo-European languages (from indicative to optative or subjunctive) is
exceedingly rare; cf. Debrunner 1948. Example (13) is one of a few Rig-Vedic
examples.⁶ Even so, it seems preferable not to include these structures among
the direct discourse constructions.

(9) prcchami yatra bhuvanasya nabhih (1.164.3)

'I ask where the navel of the world is'

(10) pra bruhi ... yah idam krnoti (10.87.8)
'Proclaim who does this'

(11) vidma ... te yatha manah (1.170.3)
     'know'       YA-
'We know how your mind (is disposed)'

(12) yah vrtrasya sinam ... abharisyat pra tam ... uvaca (2.30.2)
     YA-                    conditional            SPEAK

'she proclaimed (him) who would bring revenge on Vrtra'
(Direct discourse would have the future tense.)

(14) kah im veda ... kad vayah dadhe (8.33.7)
            'know'   KA-
'who knows of him what strength he puts on'

(c) Also UNMARKED quote structures may occasionally be instances of
indirect discourse, such as (15) below, with shift in person (from first to
third). However, as Debrunner 1948 correctly noted, these structures are
exceedingly rare.⁷ Normally, these constructions exhibit no shift in person or
mood and must therefore be considered UNMARKED DIRECT DISCOURSE, as in (16)
and (17).

(15) sunahsepah ahvat ... adityam ava enam ... varunah sasrjyad
                SPEAK                                  sg.3

'S. called out to A. (that) V. should release him (= S)'

(16) ... tam ... somah aha tava aham sakhye nyokah (5.44.14)

'to him Soma said "I am at home in your friendship".'

(17) uta enam ahuh ... para dadhikra asarat ... (4.38.9)

'and they say of him "D. has gone off ..."'

It is these unmarked constructions, then, which most directly are
relevant to the discussion of the Sanskrit quotative.
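
The case-assignment rule stated for the participial construction under (a) above can be restated as a small decision procedure. The following Python sketch is only an illustration under stated assumptions: the function name and its two boolean parameters are introduced here for exposition and are not part of the analysis itself.

```python
def participial_case(coreferential_with_higher_subject: bool,
                     higher_clause_passive: bool = False) -> str:
    """Case of the participialized lower-clause subject (hypothetical helper).

    Restates the rule in the text: nominative if the lower subject is
    coreferential with the higher subject; elsewhere accusative, which
    turns into a nominative when the higher clause is passive.
    """
    if coreferential_with_higher_subject:
        return "nominative"
    return "nominative" if higher_clause_passive else "accusative"


# RV 10.113.10, 'I may think (myself to be) speaking': coreferential subject
print(participial_case(True))
# RV 5.32.12, 'I hear you requiting': non-coreferential subject
print(participial_case(False))
```

The passive branch is kept as a separate parameter because the text treats it as a surface adjustment of the accusative case, not as a third rule.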

5: The Rig-Veda 

5.1: The Rig-Vedic use of the quotative may be common, but not
obligatory. Thus in book 10, the ratio between QUOTE marked by iti and unmarked
QUOTE is 17 : 24.⁸ This ratio seems to hold good also for the rest of the
Rig-Veda. The actual numbers, however, may vary. Thus it seems that QUOTES,
whether quotative or unmarked, occur much more frequently in the later portions
of the Rig-Veda. (Cf. 5.5 below.)
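
To see what the book 10 ratio amounts to in proportional terms, the counts can be converted into a share of all QUOTEs. The Python sketch below is a simple arithmetic check; the function name is an assumption made for illustration, and the counts (17 marked by iti, 24 unmarked) are read off the ratio given above.

```python
from fractions import Fraction


def marked_share(marked: int, unmarked: int) -> Fraction:
    """Share of QUOTEs carrying the quotative marker, as an exact fraction."""
    return Fraction(marked, marked + unmarked)


# Book 10 counts as read from the ratio in the text: 17 marked, 24 unmarked.
share = marked_share(17, 24)
print(f"{share} = {float(share):.1%}")  # prints "17/41 = 41.5%"
```

On these counts roughly two out of five QUOTEs in book 10 carry iti, which is what "common, but not obligatory" amounts to numerically.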

5.2: As elsewhere in Sanskrit, the quotative marker is iti, a word found
in Sanskrit also in independent use, meaning 'thus'. In the Rig-Veda it is
difficult to find unambiguous instances of this independent use. All possible
instances can also be given a quotative interpretation, as shown by the various
translations by different scholars; cf. e.g. (18). However, the multiplicity
of different readings suggests that none of the quotative interpretations is
cogent. (Such uncertain passages will be ignored in the subsequent discussion.⁹)
In the Brahmanas, however, clear examples can be found, such as (19) below.

(18) iti cid hi tva dhana jayantam (/) made-made anumadanti viprah /
ojiyah dhrsnah sthiram a tanusva (/) ma tva dabhan yatudhanah
durevah // (10.120.4)

'for thus the inspired ones jubilate to you, the victor of
booty, in every intoxication. Even stronger, bold one, extend
the bow; the ill-intentioned warlocks shall not outwit you.'
(Reference of iti? Possibilities: (a) to verse 2: navanta
... madesu 'shout in their intoxications' (cf. the made-made
anumadanti of this verse; i.e. play on the word mad-); (b) to
verse 3, addressed to the 'you' of this verse; (c) to the second
half of this verse, which then would be the QUOTE of anumadanti
'jubilate'; (d) no such reference, but simply the meaning
'thus')

(19) iti agre krsati atha iti atha iti atha iti (SB 7.2.2.12)

'he first plows thus/in this manner, then thus, then thus, then
thus' (In the oral tradition of the text this was accompanied
by appropriate gestures)

5.3: In terms of the relative position of iti, SPEAK, and QUOTE, we may
distinguish the following sub-types: an 'iti-initial' construction, with both
iti and SPEAK (in either order) preceding QUOTE, as in (20); a 'SPEAK-final'
construction, with iti + SPEAK after QUOTE, as in (21); and an 'Embracing'
construction, with SPEAK before QUOTE and iti after, as in (22).

(20) iti braviti vaktari raranah / vasoh vasutva karavah anehah (10.61.12)
         SPEAK

'(thus) says the giving speaker "Through the goodness of the
good, the singers are guiltless"'

(21) yah indraya sunavama iti aha (5.37.1)

'who says to Indra "We shall press"'

(22) nakih vakta na dat iti (8.33.15)
           SPEAK
'no one is about to say "He shall not give"'
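
The three sub-types just illustrated differ only in the linear order of iti, SPEAK, and QUOTE, so the classification can be sketched as a function over that order. This Python fragment is a minimal illustration, not a parser: the function name, the token labels, and the fallback label "other" (for orders the three definitions do not cover) are assumptions introduced here.

```python
def classify_quotative(order: list) -> str:
    """Classify a clause by the order of 'iti', 'SPEAK', and 'QUOTE'.

    `order` lists the three elements as they occur, left to right.
    Anything not matching one of the three sub-types is labeled 'other'.
    """
    if sorted(order) != ["QUOTE", "SPEAK", "iti"]:
        return "other"
    i, s, q = (order.index(x) for x in ("iti", "SPEAK", "QUOTE"))
    if i < q and s < q:          # iti and SPEAK (either order) before QUOTE
        return "iti-initial"
    if q < i < s:                # QUOTE, then iti + SPEAK
        return "SPEAK-final"
    if s < q < i:                # SPEAK before QUOTE, iti after
        return "Embracing"
    return "other"


print(classify_quotative(["iti", "SPEAK", "QUOTE"]))   # pattern of 10.61.12
print(classify_quotative(["QUOTE", "iti", "SPEAK"]))   # pattern of 5.37.1
print(classify_quotative(["SPEAK", "QUOTE", "iti"]))   # pattern of 8.33.15
```

Structures with SPEAK or iti inserted into QUOTE, or lacking one of the three elements, fall outside this three-way scheme and come out as "other" in the sketch.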

The frequency of these constructions relative to each other and to the
corresponding unmarked QUOTE constructions can be preliminarily illustrated
by means of the following table. (Working with various indexes for iti, I
believe I have been able to give a complete picture for the quotative. For
the unmarked construction, my collection outside book 10 cannot claim to be
exhaustive. However, the relationship between pre- and post-posed SPEAK
should not be seriously affected by this.)




                 [+ iti]    [- iti]    Total
iti-initial :
SPEAK-final :    22



To these figures must be added a few examples of SPEAK and/or iti insert- 
ed into QUOTE, as in (23). 

(23) idam udakam pibata iti abravītana (/) idam va gha pibata
nejanam (1.161.8)
'"drink this water" you said, "or drink this rinsewater"'

In these structures we find two instances of SPEAK + QUOTE + iti + QUOTE,
five of QUOTE + iti + SPEAK + QUOTE, and three of (iti-less) QUOTE + SPEAK +
QUOTE. In addition, RV 10.34.12 has a complex structure with a SPEAK-like
oath-taking expression surrounding QUOTE and then followed by SPEAK. (This
construction will be ignored.)

5.4: QUOTE may also occur without SPEAK, with or without iti. Constructions
marked by iti, such as (24)-(26), are easily located. On the other
hand, for unmarked constructions the absence of any unambiguous clues makes
the situation more difficult. I have tried to include only the most obvious
examples in my count, such as (27)-(29).¹⁰ My figures for this construction
therefore may be a little conservative.

With these caveats, the ratio between unmarked and marked SPEAK-less
constructions can be given as 9 : 6.

(24) pra vaya apa vaya iti āsate tate (10.130.1)

'they sit at the spread-out (sacrifice) (saying/thinking) 
"weave hither, weave thither"' 

(25) namaḥ namaḥ iti ūrdhvāsaḥ anakṣan (10.115.9)

'they have approached (with the words) "honor, honor"'
(Sim. ibid.; but note that the first half of the verse has
QUOTE followed by iti ... SPEAK, and so does the preceding
verse. That is, we could be dealing with 'carried-over' SPEAK.)

(26) tvaṣṭā duhitre vahatum kṛṇoti (/) iti idam viśvam bhuvanam sam
eti (10.17.1)

'"Tvaṣṭṛ is arranging for the marriage of (his) daughter",
(hearing, thinking, saying this) this whole world assembles' 
(There may be some question as to which verb of speaking should 
be supplied here. The metrical break before iti , however, sug- 
gests that the verb should be compatible with what follows.) 

(27) uta mātā mahiṣam anu avenat (/) amī tvā jahati putra devāḥ
'and the mother looked after the buffalo (saying) "My son,
these gods are leaving you"' (4.18.11)

(28) parāyatīm mātaram anu acaṣṭa (/) na na anu gāni anu nu gamāni
'He looked after his departing mother (thinking) "I will not
not go (= I will not remain), I will go"' (4.18.3)

(29) irāvatī ... bhūtam ... vi astabhnāḥ rodasī viṣṇo ete (7.99.3)
'You, Viṣṇu, stemmed apart these two worlds (with the words/
so that) "You shall be full of sustenance"'

Finally, as a matter of curiosity, it might be mentioned that there is
one Rig-Vedic verse in which iti occurs multiply, in a fashion which makes it
difficult to be certain which instance of iti is "the" quotative particle; cf.
(30). (The evidence of this verse is disregarded in the present discussion.)

(30) iti vai iti me manaḥ (/) gām aśvam sanuyām iti / kuvit somasya
                 "SPEAK"

apām iti (10.119.1)

'Thus (?), thus (?) indeed (is) my mind "I would win cow (and)
horse" (thus (?)), "perhaps I have drunk soma" (thus (?))'

5.5: The data summarized in 5.3-4 can be interpreted in several ways.
However, for the present discussion the relationship between the sub-types of
the quotative and the manner in which they are embedded in the chronology of
the Rig-Veda¹¹ are the most significant.

Chronologically, the three sub-types of the quotative are distributed in
the Rig-Veda as follows:















At first sight, the most striking phenomenon might be the overall increase
of quotative attestations in the Late period. However, it is questionable
whether that increase is meaningful. For QUOTES in general, whether marked by
iti or not, seem to occur more frequently in the late portions of the Rig-Veda.
Thus my (incomplete) count for corresponding iti-less constructions jumps from
22 in the Early and Middle portion to 34 in the Late Rig-Veda. The ratio between
marked and unmarked constructions, however, seems to remain fairly constant
at all stages of the Rig-Veda. Thus the ratio in book 10, a collection
mainly of Late hymns, is roughly the same as for all of the Rig-Veda:




Book 10      17 : 24     1 : 1.4

All of RV    37 : 56     1 : 1.5

Significant differences can however be observed if the relations between
the three sub-types of the quotative are considered:

(a) The iti-initial construction definitely is in the minority compared
to those in which iti follows QUOTE (i.e., the SPEAK-final, Embracing, and
SPEAK-less constructions). The total ratio is one of 5 : 38, disregarding
structures with iti inserted into QUOTE. Moreover, in later Sanskrit, iti-initial
constructions become exceedingly rare.

(b) The SPEAK-final type is considerably more vigorous. In fact, the
figures above suggest a 100% increase in its use from the Early¹² and Middle¹³
periods to the Late Rig-Veda.¹⁴ However, given the noted general increase of
QUOTES in the Late portions, it is difficult to judge whether that increase is
meaningful.

(c) The case is quite different for the Embracing construction. Though
the numbers are small, there does seem to be a significant increase in the
Late Rig-Veda, from twice each in the two preceding stages¹⁵ to six times in
the Late period. Moreover, as will be seen in subsequent sections, this
increase marks only the beginning of what ultimately turns out to be the most
productive quotative pattern.

Of these three patterns the most likely archaism is type (a). The greater
popularity of SPEAK-final (b) might perhaps suggest an innovation. However,
it can also be explained in terms of a polarization with the unmarked
construction: Since the latter clearly prefers SPEAK before QUOTE (by a ratio of
52 : 4), the iti-quotative comes to prefer the mirror-image order QUOTE + SPEAK
(by a ratio of 22 : 5, disregarding the Embracing construction). Given this
alternative explanation, it is possible that both (a) and (b) are inherited.
Because of its marginal use (with a total of only 5 attestations for all of
the Rig-Veda), the inserted iti + SPEAK pattern (cf. 5.3 above) probably
likewise is an archaism. (On the other hand, the two instances of SPEAK + QUOTE
+ iti may be considered influenced by, or comparable to, the Embracing
construction.)
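
The arithmetic behind the 5 : 38 ratio and the polarization argument can be checked directly from the counts given in 5.3-5.5. The following is only a sketch for verification: the dictionary keys and the helper `share` are illustrative names, and the Embracing total is taken as the sum of the per-period figures (2 + 2 + 6) reported under (c).

```python
# Counts of iti-marked constructions, as reported in sections 5.3-5.5.
marked = {
    "iti-initial": 5,
    "SPEAK-final": 22,
    "Embracing": 2 + 2 + 6,   # Early + Middle + Late
    "SPEAK-less": 6,          # marked member of the 9 : 6 ratio in 5.4
}

# All constructions in which iti follows QUOTE.
iti_after_quote = (
    marked["SPEAK-final"] + marked["Embracing"] + marked["SPEAK-less"]
)


def share(part, rest):
    """Proportion of `part` in the total `part + rest`."""
    return part / (part + rest)


# Unmarked construction: SPEAK before QUOTE vs. after (52 : 4).
unmarked_speak_first = share(52, 4)
# iti-quotative: QUOTE + SPEAK vs. iti-initial (22 : 5).
marked_quote_first = share(22, 5)

print(f"iti-initial : iti after QUOTE = {marked['iti-initial']} : {iti_after_quote}")
print(f"unmarked, SPEAK first: {unmarked_speak_first:.2f}")
print(f"marked, QUOTE first:   {marked_quote_first:.2f}")
```

On these figures the two preferences run in opposite directions and are both strong, which is what the polarization account requires.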

The most clearly innovated pattern is the Embracing type (c). Moreover, 
in light of the facts just noted, this construction can easily be explained 
as a Rig-Vedic innovation, namely as a compromise between the order SPEAK + 
QUOTE of the preferred unmarked construction and the order QUOTE + iti of the 
(heretofore) preferred quotative. This process may have been aided by the 
fact that in SPEAK-less QUOTE constructions, iti always follows QUOTE. If
this construction is accounted for as resulting from the deletion of SPEAK,
this latter order is not surprising, since as we have noted, the type QUOTE +
iti + SPEAK was more productive than the iti-initial construction. After
deletion, however, a construction QUOTE + iti can be reinterpreted as having the
syntactic structure (31), rather than (32). That is, iti changes from being
a member of the SPEAK clause to being one of QUOTE. As a consequence it would 
now no longer be necessary for iti and SPEAK to be clause mates. 

(31) [ [ QUOTE iti ] (SPEAK) ] (innovated construction)

(32) [ [ QUOTE ] iti (SPEAK) ] (earlier construction)

5.6: There is evidence that such a syntactic reassignment of iti has in
fact taken place: In the iti-initial and SPEAK-final constructions, iti could
act as the initial element of the clause containing SPEAK. For the iti-initial
type this is shown by the line- and clause-initial position of iti in (20)
above (similarly in 10.95.18 and, with preceding "extrasentential" vocative,
in 10.97.4). Notice that line breaks ordinarily coincide with clause
boundaries. For the SPEAK-final type, note line- and clause-initial iti in (26)
above and (33) below, as well as (34)-(36), which show iti as the first element
of clause-initial strings.


(33) tvām stoṣāma ... // iti tvā agne ... ṛṣayaḥ avocan (10.115.8-9)

'"We shall praise you ..." (thus) the ṛṣis said to you, Agni'

(34) ... iti ca bravat (6.54.2)
             SPEAK
'and QUOTE he shall say'


(35) ... iti ced avocan (10.109.3)

'if QUOTE they said'

(36) ... iti yad vadanti (10.37.10)

'when QUOTE they say'

On the other hand, excepting two (ambiguous) instances where iti occurs
in the middle of a line/clause, all other (i.e. 8) cases of the Embracing
construction have iti clause- or line-finally, as in (37).

(37) ye īm āhuḥ surabhiḥ niḥ hara iti / (1.162.12)
'who say of it (the battle horse) "(it is) good-smelling, take 
it away" ' 

5.7: The syntactic/pragmatic contexts in which the quotative construction
(and QUOTE in general) can be used in the Rig-Veda are as follows:¹⁹

(a) With SAY (cf. e.g. (20), and (33)-(37)). This includes not only
verbs meaning 'say, speak, tell', but also nu- 'shout' (0, 8.96.14), rap-
'whisper' (0, 10.10.11, 10.61.11), iṣ- 'order' (0, 8.96.14), nādh- 'implore'
(iti, 1.109.3), śikṣ- 'instruct' (0, 10.95.17), as well as ghoṣaḥ āsīt 'there
was a noise/shouting' (iti, 10.33.1). For simple 'say, speak', there is also
a rival construction with (quasi-)participialization, of the type exemplified
in (6) and (7) above.

(b) A special sub-type of SAY is prach- 'ask': Though permitting QUOTE
(as in 2.12.5, 8.77.1 with iti and 1.164.6, 8.45.4, etc. with 0), this verb
quite commonly occurs in the 'indirect-question' construction discussed in
section 4 above; cf. e.g. example (9).

(c) With THINK; cf. (38) and (39), the latter with a noun of thinking.
Other examples occur at 10.146.4 (iti) and 10.34.5 (0, ā-dhī- 'reflect').

(38) yad ... na marai iti manyase (8.93.5) 

'when you think "I will not die"' 

(39) uta syS nati ... matit (/) aditib tJtyS i. gamat 

'and this (is) our thought "May Aditi come with succor"' 

With THINK, however, the more commonly found pattern is the participial
construction discussed in section 4; cf. e.g. example (3).

(d) With HEAR: I have found only one example of this structure, without
iti, namely (40) below. Elsewhere, HEAR is found in the participial
construction as in (4) and (8).

(40) uta tvam ... śṛṇu (/) yaḥ te vaṣṭi vavakṣi tat (8.45.6)
'and hear/listen you: "If someone wants something from you, that
you order ..."'

(e) On the other hand, with KNOW and SEE, no QUOTE constructions are
found. For KNOW, there are a few examples of the participial construction
(as in 1.10.10); but the normal pattern is the 'indirect-question' type
exemplified in (11), (12), (14). For SEE, I have found only participial
constructions, as in (5).

5.8: As the earlier discussion has shown (cf. also examples (24)-(29)),
there are quite a number of SPEAK-less, or 0-examples, both with and without
iti. Most of these are of no great interest, except to the extent that they
may have helped bring about the developments sketched in 5.5.

There is however one example which deserves closer examination. This is
example (26) which Kuiper (1967) considered to be an instance of the CAUSE
construction of later Sanskrit (for which cf. section 3, examples (1) and
(2)). While this is no doubt a possible interpretation, it is by no means the
only possible one. For as the glosses to (26) show, there are a number of
other possible readings. Similar ambiguities can moreover be occasionally
found with iti-less constructions, as in (29). However, none of these
constructions provides incontrovertible evidence for the CAUSE pattern in the
Rig-Veda. At best, they show the ambiguities from which the later CAUSE type
may have arisen by reinterpretation.

5.9: Of greater interest are the following constructions which, as (46)
shows, may be found with 0-SPEAK. These constructions might perhaps indicate
the existence of the NAME construction. This would especially be the case in

(41) tam āhuḥ suprajāḥ iti (9.114.1)
'him' SPEAK sg.N/V
'they say of him "(He is) rich in progeny"'
OR: 'they say to him "(O you,) rich in progeny"'
OR: 'they call him "rich in progeny"'

(42) yaḥ enam ādideśati (/) karambhāt iti pūṣaṇam (6.56.1)
'him' SPEAK sg.N/V
'who says of him, of Pūṣan "(He is) a porridge-eater"'
OR: 'who says to him, Pūṣan "(O you,) porridge-eater"'
OR: 'who calls him, Pūṣan, "porridge-eater"'

(43) uta gha nemaḥ astutaḥ (/) pumān iti bruve paṇiḥ (5.61.8)

'and many an unpraised niggard is talked about "(He is) a man"'
OR: 'and many an unpraised niggard is called "a man"'

(44) ... sanaśrutam (/) indraḥ iti bravītana (8.92.2)
sg.A sg.N SPEAK

'say of the one of ancient fame "(He is) Indra"'
OR: 'call the one of ancient fame "Indra"'

(45) yaḥ mā mogham yātudhāna iti āha (7.104.15; sim. ibid. 16)
'me' sg.V SPEAK
'who falsely says to me "O warlock"'

(46) indo indraḥ iti kṣara (9.6.2)

'O juice, flow (thinking) "(I am) Indra"'
OR: 'O juice, flow (as/called) "Indra"'

Constructions like these are used frequently in the later language for
the purpose of naming things or persons. A characteristic of these later
constructions is the fact that they look like the quasi-participial naming
constructions discussed in section 4 (and illustrated in example (7)), in that
the person or thing named appears in the accusative case (except in the
passive, where the nominative is used instead). The name, however, is introduced
in the nominative case, as a quasi-QUOTE marked by iti.

There are however several difficulties with the interpretation of the
Rig-Vedic examples. First of all, the case marking of the quoted NP is ambiguous
in (41)-(42): Both nouns could either be nominative or vocative, the latter
being a case not permitted in the naming construction of the later language.
Moreover, (45) offers a clear case of a vocative. At the same time, however,
(43)/(44) show that also nominatives can occur in this context.

Secondly, contextually parallel structures make it possible to interpret
the above examples as genuine QUOTES. Thus, example (37) contains a plain
nominative as the first "clause" of its QUOTE. And the context makes it clear
that this is not a naming construction, but a construction with omitted copula
(surabhiḥ (asti) '(it is) good-smelling'). Moreover, this example, as well
as many others (such as (16) and (17)), shows that the accusative preceding
such a reduced clause and coreferential with its subject need not be a person
'named' by means of the QUOTE, but can simply be the person to whom or about
whom the QUOTE is uttered. For (45) there is the parallel structure (47),
found in the same hymn and in the same verse as the second occurrence of (45).
And this structure can be interpreted only as a genuine QUOTE. For (46),
there is the parallel (48), in which a copula-less direct-quote interpretation
seems to be the only possible analysis. Given this evidence, then, the NAME
interpretation is not the only possible analysis for (41)-(46); but all the
readings given in the glosses are a priori equally possible. We thus have no
certain evidence for the NAME construction in the Rig-Veda.

(47) yaḥ mā ayātum yātudhāna iti āha (/) yaḥ vā rakṣāḥ śuciḥ asmi iti āha
     'me'                        SPEAK                          SPEAK

'who says to me, the one not being a warlock, "O warlock", or who,
being a rakṣas, says "I am pure"...' (7.104.16)

(48) induḥ indraḥ iti bruvan (9.63.9)
'saying "The juice (is) Indra"'

As a matter of fact, it may well be argued that the NAME construction
secondarily resulted from a reinterpretation of structures like (41)-(46) as
somehow akin to the participial naming construction. What may have helped in
this development is the quasi-passive type (43): Because of the passive-like
nature of bruve 'is called/talked to, about', the quoted NP would have to
appear in the nominative both in an iti-less genuine QUOTE construction and in
the participial construction; cf. (49). The resulting ambiguity could then
be extended to the iti-quotative, as in (50). (Both (49) and (50) are
unattested as such; but structures of this sort would be possible in the Rig-Veda.)

(49) paṇiḥ pumān bruve

sg.N sg.N SPEAK

(a) 'the niggard is talked about "(He is) a man"'

(b) 'the niggard is called a man'

(50) paṇiḥ pumān iti bruve

sg.N sg.N SPEAK

(a) 'the niggard is talked about "(He is) a man"'
(b)

5.10: The evidence of the Rig-Veda, the earliest stage of Sanskrit, then 
can be summarized as follows. 

Rig-Vedic Sanskrit had a quotatival structure marked by iti 'thus' which
coexisted with an iti-less construction and thus was only optional. Both
constructions could occur with SAY (including prach- 'ask', which however
preferred other, indirect constructions), as well as THINK and HEAR. (The latter
two however show strong competition from indirect constructions.) In addition,
both the quotative and the iti-less construction can occur without any overt
SPEAK, in which case a CAUSE reading is occasionally possible for either
construction. There is however no evidence for this being an established use of
the quotative. There are also ambiguous structures which indicate the
potential for reinterpretations leading to NAME-quotatives. Again, however, there
is no unambiguous evidence that such constructions have already arisen. (In
addition, there is as yet no evidence for the use of the quotative with KNOW
and SEE which, instead, use indirect constructions.)

The Rig-Veda does however offer evidence for the development of a new 
constructional type, in so far as the morphosyntax of the quotative is con- 
cerned. Where early on, Rig-Vedic Sanskrit seems to have had three major 
variants of the quotative, one iti- initial, a second SPEAK-final, and a third 
with iti + SPEAK inserted into QUOTE, a new, Embracing construction is seen to
be coming in, in which SPEAK precedes and iti follows the QUOTE. 

The Atharvanic quotative shows a very marked development vis-a-vis even 
the late Rig-Vedic stage. This manifests itself in all areas: in the extent 
to which the quotative has become obligatory, in the morphosyntax of the con- 
struction, and in the syntactic/pragmatic uses of the structure. 

6.1: In terms of frequency, an examination of books 1-8 shows a ratio of
12 : 5 between SPEAK + QUOTE structures with and without iti. If SPEAK-less
constructions are included, the ratio is 14 : 5. (In book 10 of the Rig-Veda
the ratio was 17 : 24!) Moreover, while the verse sections of the
Atharva-Veda contain about 15 examples of SPEAK-less iti-constructions, I have found
no comparable constructions without iti. In short, then, the marked quotative
is well on its way toward becoming quasi-obligatory.

6.2: As far as its morphosyntax is concerned, the quotative no longer 
seems to be attested in its iti- initial variety. And the ratio between SPEAK- 
final and Embracing quotatives shows a marked development toward predominance 
of the latter construction, as can be seen from a comparison of Late Rig-Vedic, 
Atharva Verse, and Atharva Prose. (Note that it is generally acknowledged that 
the Prose sections are relatively late in the Atharva-Veda. In the Prose sec-
tions I ignore repetitions of the same collocation within a given "hymn".) 

Late RV        AV Verse        AV Prose








6.3: Perhaps the most striking and interesting changes can be observed
in the syntax/pragmatics of the quotative:

(a) Impressionistically, it seems that indirect constructions are very
much on the wane, for all relevant verbs, except SEE which does not show any
quotative constructions as yet. Still, occasional indirect constructions may
be found, such as (51).

(51) vidma vai ... yataḥ ... jāyase (AV 7.76.5)

'We know whence you are born' 

(b) In addition to a greater incidence of quotatives with THINK, we now
also observe quotative constructions with HEAR (while in the Rig-Veda we only
found one example of an iti-less QUOTE), as well as with vid- 'KNOW', a
category not yet taking QUOTE in the Rig-Veda; cf. (52) and (53). This latter
extension can be taken as resulting from the reinterpretation of THINK as
'believe (to be true)', hence 'KNOW (to be true)'.

(52) ... saptagṛdhrāḥ iti śuśruma vayam (AV 8.9.18)

'"... (They are) seven-vultured" (so) we have heard'

(53) bhūmiḥ iti tvā abhipramanvate janāḥ (/)

nirṛtiḥ iti tvā aham pari veda sarvataḥ (AV 6.84.1)

'People think of you (as) "earth", I know you completely (as)
"Nirṛti" (= "perdition")'

(c) As the (translation of the) last example shows, there is good reason
to believe that at this stage a NAME variant of the quotative has developed.
This is indicated first of all by a larger number of relevant constructions
than were found in the Rig-Veda. In the Rig-Veda, constructions which might
possibly qualify as NAME quotatives amount to only 6 out of a total of 46
iti-constructions; i.e. the ratio is about 1 : 8. In Atharva-Veda verse, 11
out of 40 iti-quotatives are interpretable as NAME constructions; i.e. the
ratio is about 1 : 4. More important, however, is the evidence of (54), where
nāmadheyam 'name' is explicitly specified, and of (56) where an iti-less NP in
a parallel construction strongly suggests that iti is inserted without
recourse to a (deleted) SPEAK, but simply as a naming device. Note that in a
Rig-Vedic passage comparable to (54), no iti is found; cf. (55).

(54) saṃvasavaḥ iti vaḥ nāmadheyam (AV 7.109.6)
'"Saṃvasus" (is) your name'

(55) ghṛtasya nāma ... yad asti (/) jihvā devānām ... (RV 4.58.1)
'which is the name of ghee: "tongue of the gods ..."'

(56) udanvatī dyauḥ avamā (/) pīlumatī iti madhyamā /
tṛtīyā ha pradyauḥ iti (AV 18.2.48)

'watery is the lowest heaven, "full of pīlu" the middle one,
the third (is) the "foreheaven" ...'
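
The approximate ratios quoted under (c) follow from the raw counts by simple rounding. A minimal sketch of that step (the helper name `one_in` is an illustrative assumption; the counts are those given in the text):

```python
def one_in(part, total):
    """Express `part` out of `total` as the n of an approximate '1 : n' ratio."""
    return round(total / part)


# Possible NAME quotatives among iti-constructions.
rig_veda = one_in(6, 46)        # 46 / 6  = 7.67
atharva_verse = one_in(11, 40)  # 40 / 11 = 3.64

print(f"Rig-Veda:           about 1 : {rig_veda}")
print(f"Atharva-Veda verse: about 1 : {atharva_verse}")
```

On these counts the proportion of NAME-interpretable constructions roughly doubles between the two collections, though with totals this small the difference is suggestive rather than conclusive, which is why the text gives more weight to (54) and (56).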

This new NAME construction was to acquire a considerable degree of
popularity in the later language, including in grammatical literature. Its
attractiveness seems to have lain in the fact that it made it possible to "integrate"
lexical items into a syntactic context in their citation (nominative or stem)
form, without further adjusting that form in accordance with its grammatical
status within the sentence. (For the probable origin of this construction,
cf. section 5.9 above.)

6.4: In addition, there is evidence that the Atharva-Veda is in the
process of developing a CAUSE variety of the quotative, viz. a use of the
quotative to indicate purpose. Disregarding infinitival constructions, the
Rig-Vedic device for marking purpose clauses was a structure with yathā 'so that'
+ subjunctive, as in (57). Similar constructions continue in the
Atharva-Veda; cf. (58). Beside these, however, we find constructions like (59) and
(60), without yathā, but with subjunctive, and with the particle iti.

(57) gṛhān gacha gṛhapatnī yathā asaḥ (RV 10.85.26)

'go home so that you be lady of the house'

(58) huve devīm aditim ... sajātānām madhyameṣṭhāḥ yathā asāni

'I invoke divine Aditi so that I be the midmost of my fellows'
(AV 3.8.2)

(59) sarvāḥ samahvi oṣadhīḥ (/) itaḥ naḥ pāraya[n] iti (AV 4.17.2)

subj.
'I have called together all the herbs (thinking) "May they save
us from this"'
OR: 'I have called together all the herbs so that they may save us
from this'

(60) kaḥ asya bāhū samabharad (/) vīryam karavād iti (AV 10.2.5)

subj. accented
'who brought his arms together so that (?) he do something
heroic' (Sim. ibid. 17, 6.128.1)

What is especially interesting is that in a number of examples (cf. (60)
vs. (59)), the verb of such iti + subjunctive clauses is accented, indicating
that the clause functions as a dependent clause, just as does a yathā
construction. (Elsewhere, however, main-clause verbs within a QUOTE normally
are unaccented.)

Moreover, there is other evidence suggesting an (incipient) equivalence
between yathā clause and iti construction. One consists of their apparent
interchangeability in (61). The other, in the occurrence of an apparent blend
between the two constructions; cf. (62).

(61) asau me smaratād iti (/) priyaḥ me smaratād iti /
devāḥ pra hiṇuta smaram (/) asau mām anu śocatu //

yathā mama smarād asau (/) na amuṣya aham ... /
devāḥ pra hiṇuta smaram (/) ... (AV 6.130.2-3)

'so that yonder (man) love me, so that the dear one love me,
gods, send love, may yonder (man) burn after me.

'so that yonder (man) love me, not I him ..., gods ...'

(62) tvaṣṭā tam asyāḥ ā badhnād yathā putram janād iti (AV 6.81.3)

subj.
'Tvaṣṭṛ shall bind that on her so that she may give birth to
a son'

In terms of internal Sanskrit evidence, this new construction can be
explained as the result of reinterpretation of potentially ambiguous
constructions such as Rig-Vedic, iti-less (26) and its iti-quotative counterparts.

6.5: Other innovations include the first instance of a pattern which
becomes prominent in the Vedic Prose of the later Saṃhitās and the Brāhmaṇas and
which might be referred to as 'Ritual Quotative', i.e. a sacred formula quoted
during a ritual act and marked by iti, usually without an accompanying SPEAK;
cf. (63).

(63) ... piśācān sarvān darśaya (/) iti tvā rabhe oṣadhe (AV 4.20.6)
'"... make (me) see all the Piśācas" (with these words) I take
you, herb'

Another fore-runner of a construction quite common in Vedic Prose, but
not found elsewhere in the early language, is that given in (64), in a passage
from Atharvanic Prose. This is the use of the quotative with FEAR.

(64) tasyāḥ jātāyāḥ sarvam abibhed iyam eva idam bhaviṣyati iti
'of her, when she was born, everything was afraid (thinking)
"this one will indeed become this world"' (AV 8.10.1)

6.6: The most striking innovation of the Atharva-Veda, however, is the
use of quotative iti with ONOMATOPOEIA, cf. (65), (66), and (67).

(65) pṛthivyām te niṣecanam bahiḥ te astu bāl iti (AV 1.3.1-9; refrain)
                                          ONOM.
'on the earth be your outpouring, outside of you, "splash"'

(66) ajena kṛṇvantaḥ śītam (/) varṣeṇa ukṣantu bāl iti (AV 18.2.22)

'making you cool with the goat, let them sprinkle you with rain, 


(67) bhug iti abhigataḥ (/) śal iti apakrāntaḥ (/) phal iti abhiṣṭhitaḥ
'"bounce", he has come; "whist", it is gone; "bang", it has
trodden'²² (AV 20.135.1)

For Kuiper (1967) and Emeneau (1969), these structures were clearly due to
Dravidian influence. Kuiper, to be sure, did note something of a Rig-Vedic
antecedent, the expression baḷ itthā (= baḍ itthā) 'indeed, truly, etc.', which
contains an interjection vaguely reminiscent of the above bāl, bhug etc., plus
a cognate of iti; cf. e.g. (68) below. Now, in many of its attestations,
itthā may be looked upon as a simple emphasizer. Occasionally, however, it is
used in the meaning 'thus' and may, like iti, be used even with SPEAK; cf.

Kuiper does not pursue this matter. As it turns out, however, Avestan
has evidence for similar uses of its cognate iθā (YAv. iδa/iθa), as well as
for the quotatival use of that particle; cf. 12.5-6 below. While this does
not prove that the itthā of baḍ itthā was quotatival and thus a more or less
direct ancestor for the iti of (65)-(67) above, the parallel is tantalizing.
Still, given that RV baḍ is not an onomatopoetic interjection, the way of
caution would advise against such a direct connection.

(6?) bad ittht mahimS vim ... panigthah ... (RV 6.59.2) 

'truly, your greatness is praised most ...' 
OR: 'thus indeed (it is): Your greatness is praised most' (?) 
(Sim. 1.141.1, 5.67.1, 5.84.1)

(68) satyam itthā vṛṣā id asi (RV 8.33.10)
'truly thus (it is): You are the bull' 

(69) apalj indralj ... turajSt / ittht_ sjrjanalj ... artham ... vivigu^i 
'Indra, conquering the might (released) the waters; thus re- 
leased, they pursued their duty' (RV 6.32.5) 

(70) ... bhava mṛḷīkaḥ / itthā gṛṇantaḥ ... syāma ... goṣatamāḥ
'" ... Be merciful," (thus) praising (you) may we be the most 
(RV 6.33.5; Sim., with vad- 'speak', 6.18.5) 

The normal pattern for onomatopoeia in the Rig-Veda, disregarding derived
nominals, seems to have consisted of a choice of the following:

(a) The onomatopoeia is turned into a verb-stem and then inflected as a
verb, such as probably in heṣati 'whinnies', prothati 'snorts', as well as in
participial jajhjhatīḥ (RV 5.52.6) 'laughing' or 'hissing', jañjatī (RV 1.168.7)
'blazing, flaring (of fire)'.

(b) The onomatopoeia is extended by the verb kṛ- 'do, make', as in ciścā
kṛṇoti 'makes a whizzing sound (of an arrow)' (RV 6.75.5), hiṅ-kṛ- 'make the
sound hiṅ (of a cow)' (RV 1.164.27, 28), kikirā-kṛ- 'scratch' (RV 6.53.7, 8),
akhkhalī-kṛtyā 'jubilating' (RV 7.103.3); cf. also phaṭ karikrati 'they keep
making "crash"' (AV 4.18.3).

(c) The onomatopoeia is extended by bhū- 'be, become', as in
alalā-bhavantīḥ 'rustling (of water)' (RV 4.18.6), jañjaṇā-bhavan 'blazing, flaring
(of fire)' (RV 8.43.8).

What is common to all of these processes is an attempt not to use an
onomatopoetic expression by itself, but to "integrate" such words into the
ordinary vocabulary (and the syntax) of the language by turning them into a
recognizable (and syntactically usable) category, namely into verbs. (In fact, the
coexistence of jañjatī and jañjaṇā-bhavan suggests that for 'spur-of-the-moment'
expressions, any of these processes could equally well be used, i.e. that
they all were "equal" in implementing a conspiracy against using plain,
unextended onomatopoeia.)

Given this background, it is perhaps not surprising that once the NAME 
construction with iti had been introduced into the language, as a device to 
"integrate" names etc. into the rest of the sentence without further syntactic 
adjustment (cf. 6.3 above), it could be used as an additional device for "inte- 
grating" onomatopoeia into the rest of the sentence, coexisting with the other 
devices throughout the remainder of the (Vedic) language. 

That there may have been a time lag between the development of the NAME
construction and the special ONOM use of the quotative is suggested by the
following considerations. The NAME construction is found throughout all the
various chronological layers of the Atharva-Veda. ONOM, however, appears only
in contexts which look like late additions: The hymn in which (66) occurs
was not included in the more conservative Paippalada recension of the
Atharva-Veda. And though some of the material of the hymn from which (65) is taken
is found in the Paippalada, the quoted passage itself is not, suggesting that
it is a later addition to a pre-existing hymn. As for (67), it occurs in the
very problematic 'Kuntapa hymns' which had not yet been included in the
Atharva-Veda at the time that the grammatical analysis reflected in the pada-pāṭha
was undertaken. Bloomfield (1899: 100-1) very aptly describes the changes in
the ritual which must have led to the late inclusion of these hymns into the
Vedas. At the same time, however, variants of (65) and (66) appear in the
latest Vedic hymn collections (in KS 13.9, TS ...), and a variant of
(67) is found in the non-canonical and frequently quite late Rig-Vedic 'khilas'
(5.18). It is therefore probable that the construction had come into
existence by the end of the Vedic-Poetry period, and before the Vedic-Prose
stage which will be discussed next.

6.7: The evidence of the Atharva-Veda thus suggests the following devel- 
opments: The quotative is well on its way to becoming quasi-obligatory, both 
compared to unmarked QUOTE and to the indirect constructions. Of its three 
major Rig-Vedic variants, the iti- initial structure is too rare to even be 
attested, and the Embracing pattern is well on its way toward predominating 
over the SPEAK-final one. HEAR and KNOW are now attested with quotatives. 
A NAME variant of the quotative has developed which in turn may have furnished
the basis for an ONOM construction. In addition, a purpose variant of the 
CAUSE construction, a 'Ritual Quotative', and the use of the quotative with 
FEAR can be observed to be developing. 

7: Vedic Prose

The language of the prose texts of the post-Atharvanic Samhitas, as well
as of the Brahmanas and Aranyakas, shows the quotative construction almost
fully developed to its state in the Classical language.

7.1: Compared to other, indirect or direct quote constructions, the quotative
is now virtually de rigueur. Thus in two samples from the Satapatha-Brahmana,
selected because of their different subject matter and style, iti
constructions outnumber other constructions by 31 : 1 and 27 : 2 respectively.
(The figures are even more impressive if the (mostly SPEAK-less) quotes from
the Vajasaneyi-Samhita and the explanatory restatements and paraphrases are
included: 55 : 1 and 31 : 2, respectively.)
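
As a quick arithmetic check on these ratios, the following minimal sketch converts each reported iti : other ratio into the share of quote constructions that carry iti. The counts are the ones given in the text; the sample labels and the helper name iti_share are illustrative assumptions, not part of the original study.

```python
# Counts of iti constructions vs. other quote constructions, as reported in 7.1.
# The labels are illustrative; only the numbers come from the text.
samples = {
    "sample 1": (31, 1),
    "sample 2": (27, 2),
    "sample 1, incl. VS quotes and paraphrases": (55, 1),
    "sample 2, incl. VS quotes and paraphrases": (31, 2),
}


def iti_share(iti, other):
    """Return the proportion of quote constructions marked by iti."""
    return iti / (iti + other)


for label, (iti, other) in samples.items():
    print(f"{label}: {iti} : {other} -> {iti_share(iti, other):.1%}")
```

On these figures iti marks well over nine tenths of the quote constructions in every count, which is what the phrase "virtually de rigueur" summarizes.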

This is not to say, however, that iti-less QUOTE and indirect constructions
are entirely wanting. Thus in the two samples there are one example of
a participial construction with KNOW and two examples of 'indirect questions'
with THINK, respectively. Elsewhere, occasional examples of iti-less QUOTE
can be found, as in (71). (Cf. also 7.4 below.) In general, it can be stated
that SAY may occasionally be used with the participial construction (cf. e.g.
(75), inside QUOTE) and unmarked QUOTE; THINK and KNOW with the participial
construction and 'indirect questions'; and SAY may also occur with 'indirect
questions' where genuine questions are being asked, as in (72).

(71) yadi it tu anye vadanti kas tat samdham upeyat (SB 2.4.3.10)
'if now others say "who would incur this combination (of
mistakes)?" ...'

(72) bruhi yatah khinema (SB
'say where we should dig'

This competition between different constructions may perhaps be responsible
for the occasional appearance of syntactic blends, such as (73) with
'indirect-discourse' marker yatha 'that' and quotative iti. Moreover, it may
account for the fact that where SPEAK interrupts a QUOTE, iti may occasionally
be placed only at the end of one of the QUOTE fragments (cf. (74)), although
the normal pattern has iti at the end of all fragments (cf. (75)).

(73) sah rtam abravit yatha sarvasu eva samavad vasani iti (MS 2.2.7)
'he swore an oath that "I will live with all of them equally"'

(74) idam hi ahuh raksamsi yositam anusacante tad uta raksamsi
eva retah adadhati iti (SB'+

'for "here (on earth)," they say, "the raksases pursue young
women and then the raksases put their seed in".'

(75) atra u sah kamah upaptah iti ha sma aha mahitthih yam carakah
prajapatye pasau ahuh iti (SB

'"Therein that wish was obtained," (so) Mahitthi once said,
"which, the carakas say (to be) in the Prajapati-victim".'

As noted earlier, Vedic Prose also offers examples of non-quotative iti
meaning 'thus', cf. (19) above. (In the first of the two Satapatha-Brahmana
samples referred to earlier, there happen to be five such examples. Overall,
however, this use is found much more rarely.)

7.2: The tendency, observed in the Atharva-Veda, toward predominance of
the Embracing construction over against the SPEAK-final variety of the
quotative can be observed even more fully in Vedic prose. In the two
Satapatha-Brahmana samples studied in detail, the ratios between the two
constructions are 19 : 1 and 18 : 5, respectively.

7.3: An innovation in the area of morphosyntax, occasioned no doubt by
the increasing number of uses for the quotative, is the fact that at this
stage of the language we find the first examples with 'nesting' of
iti-quotatives within iti-quotatives, as in (76).

(76) hiranyayi iti vai abhyukta iti (SB 6.3.1.42)
'(saying) "it is said (to be) 'golden'."' 

There is, however, as yet no evidence for a possible 'pile-up' of itis at
the end of a QUOTE, as can be found in the later, Classical language.
Rather, such a 'pile-up' seems to be actively avoided, as in (77), where
instead of expected ONOM-iti plus QUOTE-final iti, only a single iti is found.
(In the Classical language, this would come out as (78), with double iti.)

(77) ... tam juhuyad devamso yasmai tva ide tat satyam uparipruta
bhangena hatah asau phat iti (SB 4.1.1.26)

'he should sacrifice with that (saying) "O divine sprig, for
what I pray to you (let) that (be) true; (let) this man (be)
struck by destruction-from-above, 'crash'."'

(78) ... asau phat iti iti

7.4: The area of syntax/pragmatics likewise exhibits innovations in the
use of the quotative. 

(a) One of these is the fact that the quotative may now be used also 
with SEE; cf. (79). This innovation no doubt is attributable to generaliza- 
tion from HEAR to other verbs of sensory perception. 

(79) sa ha etad eva dadarsa anasanataya vai me prajah parabhavanti iti
'see' (SB 

'he then saw "These creations of mine are perishing of hunger"' 

(b) The NAME construction now appears in a new function, namely that of
characterizing technical terms (80) and of serving as italics, to characterize
quoted forms in discussions of a technical, philological nature; cf. (81).

(80) te vai ete paripasavye iti ahuti (SB 3.8.1.16)
'these two libations are "paripasavyas"'

(81) vak iti ekam aksaram aksaram iti tryaksaram (SB 6.3.1.43)
'vak (is) one syllable, aksaram (is) trisyllabic'

(c) A further extension of the NAME construction, a structure marking
EMPHASIS, has developed by this time; cf. (82). (The accusative case marking
in (82) might perhaps suggest that this is unrelated to the NAME quotatives.
However, as (83) shows, the NAME construction, too, may occasionally retain
the accusative of the unmarked construction, rather than switching it into the
nominative.)

(82) dvau trin iti eva pitamahan somapan vindanti (SB 5.4.5.4)
du.A pl.A pl.A

'they find only two or three (not more) soma-drinking
forefathers'

(83) tatah asurah rauhinam iti agnim cikyire (SB
sg.A sg.A
'then the Asuras built themselves the "rauhina" Agni/fire'

(d) The 'Ritual Quotative', the beginnings of which were noted in the
Atharva-Veda, now is fully established. It is frequently found followed by
a restatement or paraphrase. While the Ritual Quotative almost invariably is
unaccompanied by any overt SPEAK, but is always followed by iti, the
subsequent restatement may or may not be followed by SPEAK and/or iti. Example
(84) may serve as an illustration of some of the patterns which can be found.

(84) devasya savituh save iti (/) devena savitra prasutah iti etat (/)
svargyaya saktya iti (/) yatha etena karmana svargam lokam
iyad evam etat aha (SB 6.3.1.1*4)

'... "at the impulse of divine Savitr" (= VS 11.3b); that (is)
"impelled by god Savitr" (= the explanation/paraphrase); "with
power to the heavenly (world)" (= VS 11.3c); "so that by this
act one might go to the heavenly world" (= explanation/
paraphrase), that he says'

7.5: Finally, in addition to further instances of FEAR with iti-quotative
and the Purpose variety of CAUSE with QUOTE + iti (cf. sections 5.4 and 6.5),
Vedic Prose also offers the first attestations of a truly 'causal' CAUSE
construction. And while the other two constructions just mentioned retain
certain characteristics (in terms of subjunctive mood and optional accentuation
of the verb), the causal construction has no such overt features of subordinate
structure; cf. (85). However, the frequent occurrence of the causal correlative
tasmad 'therefore' after such causal quotative constructions clearly
suggests a dependent-clause interpretation.

(85) yajnam ... tanavai iti tasmad adityam carum ... nirvapati 

'(Because/thinking) "I will ... spread the sacrifice", therefore
he prepares the aditya pap ...' (SB

At the same time, however, at this stage of the language it still seems to
be always possible to supply an expression like 'thinking', as in the gloss
above. Where such a reading would not be possible, i.e. where the causal
relation between dependent and main clause is conceived of as an objective one,
existing independently from the thinking of the agent of the main clause,
different structures are found, as in (86) and (87).

(86) yad dasadasa ekaikam camasam anuprasrptah bhavanti
tasmad u eva dasapeyam (SB 5.4.5.3)

'because each time ten (men) creep after the cup, therefore it
is called the dasapeya (= the one to be drunk by ten)'

(87) yad esam rajanah rajasuyayajinah asuh tad ha sma tad abhyahuh
'Because their kings were performers of the rajasuya, therefore
they used to say this' (SB

This restriction on the use of the causal construction clearly indicates 
the origin of the structure, namely as a reinterpretation of quotatives with
deleted THINK. 

7.6: The major innovations of Vedic Prose, then, lie in the development
of 'nesting' iti-quotatives (but with a constraint against iti 'pile-up'), the
use of the quotative with SEE, the extension of the NAME quotative to technical
terminology, its use as an equivalent of italics in technical discussions
and to indicate emphasis, and the development of a Causal variety of the CAUSE
construction (limited to causes existing in the mind of the main-clause agent).
In addition, Vedic Prose shows further extensions of the Embracing quotative at
the expense of other competing constructions, as well as fuller use of the
'Ritual Quotative'. At the same time, however, older, rival constructions
persist (leading to occasional blends between indirect and quotative
constructions). Moreover, we find occasional instances of archaic iti 'thus',
used non-quotatively.

8: The Classical Language 

The post-Vedic language described by Speijer (1886:379-88) does not differ 
markedly from the Vedic-Prose situation just described. (Even syntactic blends 
between indirect and quotative constructions continue to be found; cf. ibid. 
382-3.) The main differences can be briefly characterized as follows: 

(a) The occasional appearance of iti- initial quotatives, as in (88) be- 
low, seems to suggest that though moribund and not appearing in the post-Rig- 
Vedic earlier language, this construction never was completely lost. 

(88) iti ca enam uvaca duhkhita / suhrdah pasya ...

'and (thus) she, distressed, said to him "See the friends ..."' 

(b) iti may appear after QU(estion words), as in kim iti 'why' (lit.
'saying what').

(c) The quotative may be used to state 'objective' CAUSE, not just a
causal relationship existing in the mind of the main-clause agent; cf. example
(1) above.

9: Sanskrit summary 

Surveying the evidence of Sanskrit we find a constantly expanding use of
the quotative construction, especially that of the Embracing variant. This
expansion can be diagrammed as follows. (The inserted quotative is ignored.)
Given this increasing expansion and reshaping of the construction, from very
modest, and morphosyntactically quite different beginnings in the Rig-Veda, to
the full panoply of attestations in the Classical Language, it is not difficult
to see in the quotative a Sanskrit innovation, just barely in its beginning
stages in the earliest, Rig-Vedic language. At the same time, however, it is
also possible to argue that in the shape in which it appears in the Early
Rig-Veda, the quotative may be essentially inherited and that the innovations
which have taken place lie in the gradual reinterpretation, reshaping, and
expansion of the construction.

To more meaningfully decide between these two competing interpretations,
it will be necessary to look at outside, comparative evidence.




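[Table: attestation of the variants and functions of the quotative by stage
of the language. Surviving row labels: Early RV, Late RV, Late AV, Ved. Pr.;
cells are marked + (attested) or - (not attested). The column headings and
the alignment of the cells did not survive extraction.]

Notes to the table: 'Only iti-less construction'; 'Ritual Quotative' (or, in
the Classical Language, quotation of authorities, etc.); 'With FEAR';
'Not "objective"'; 'Also "technical" uses'.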


10: Latin and Hittite (Anatolian) 

The only other two ancient Indo-European languages which are generally 
acknowledged to have a quotative construction are Latin and Hittite (and other 
ancient Anatolian languages related to Hittite). 

10.1: In Latin, the quotative construction is marked by the finite- 
verbal form inquit , inquam 'says, say', usually (but not necessarily always) 
inserted into QUOTE after the first word or constituent of the quotation; cf. 
the examples below. In general, this quotative construction requires the pre- 
sence of SAY or of an easily recoverable SAY. However, some special uses can 
be discerned. One is found in the quotation of scriptural authority, where 
however a verb of speaking is easily supplied; cf. (89). Similarly, the use 
as a 'definitory' construction, as in (90), is not too difficult to derive 
from a literal interpretation of inquam as 'I say'. The most specialized use 
seems to be that found in (91), where inquit marks the objections of a hypo- 
thetical opponent in what is hypostasized as a 'real' argument. 

(89) furem ... luce occidi vetant XII tabulae: 'cum . . . hostem . . , 
teneas, nisi se telo defendit ' inquit , 'etiamsi ..., non oc- 
cides . . . ' 

'the 12 Tablets prohibit killing a thief by daylight: "When 
you should hold an enemy, unless he defends himself with a 
weapon" "even if ..., you should not slay ..."' 

(90) has compedes, fasces , inquam , hos laureatos 

'these fetters, "these laureled powers of authority" (I say) 
= 'these fetters, i.e. these laureled powers of authority' 

(91) 'parva' inquit 'est res'; at magna culpa

'(One might say) "the case is (of) small (significance)",
but the guilt (is) great'

Note that though the Latin quotative construction is not excessively rare,
it is dwarfed in much of Classical literature by the indirect accusative-cum- 
infinitive construction. 

10.2: Unlike Latin inquit/inquam, the Hittite and general Anatolian quotative
marker wa(r) is quite commonly used. True, there may be occasional exceptions,
especially in the mythological texts and in short verbal exchanges;
cf. Friedrich 1967 rll+S-SO. But ordinarily the particle is used; cf. (92)
beside (93).

Ever since Götze and Pedersen (1934:74) proposed it, the generally accepted
derivation of wa(r) has been from the verb Hitt. wer-iya- 'call, invoke'.
To Götze and Pedersen's mind, such a derivation would have parallels in the
[clitic-shortening] development of quotatival Russ. de, OPol. dzie, Czech pry
from earlier full verbs of saying. [These earlier full forms are *dejati
'put, say' for Russian and Polish, pravy 'said' for Czech.] Recently, however,
Joseph (1981, 1982) has proposed a different source, namely Hitt. iwar 'like,
as', for which Joseph finds parallels in the development of like into a
quotatival particle in certain American English dialects, an apparently similar
development in Neomelanesian, and the use of particles meaning 'like, thus'
in Buang (New Guinea) and in Tibeto-Burman Lahu.28 Given that both 'thus'
and SPEAK can frequently function as quotative markers, Joseph's hypothesis
may well constitute a credible alternative. (I would feel more comfortable,
however, if it could be shown how Hittite non-deictic iwar 'thus, like' could
acquire the deictic meaning 'thus' normally found with such quotative markers.)

Be that as it may, the morphosyntax of the particle is quite simple: To 
the extent that it is used at all, wa(r) occurs in the characteristic initial 
strings of the Anatolian languages, following the first (presumably accented) 
element of each quoted sentence. 

Ordinarily the quotative is governed by SAY. However, 'name', 'inscribe' 
may also be found. In a number of cases with omitted SPEAK, it is also poss- 
ible to supply a verb like THINK, but SAY cannot be ruled out. 

Frequently, however, the preceding SPEAK may be further accompanied by 
deictic kissan 'thus'; cf. e.g. (92) below. (Additional examples may be found
in Friedrich 1967 and elsewhere (passim).) Note that this introductory formula
may also occur where no quotative particle occurs in the QUOTE; cf. (93).

(92) nu man kis(s)an kuiski memai annisan-war-an LUGAL-iznanni kuwat
tittanut (/) kinunma-wa-ssi kurur kuwat hatrieskisi (/)
man-war-as-mu-kan sulliyat kuwapi U-UL

'Now, if someone speaks as follows "Why did you formerly place
him on the throne? And why are you now declaring war on him?"
(in answer, I say) "If he had never started hostilities with
me ..."'
(Apology of Hattusilis 3.73-77)

(93) [nu ki]ssan memahhi kiez mahhan [nijngir zig-0-az KAL
^Sjtursas (Ritual of Anniwiyanis k .2-3)

'Now I speak thus "As these have drunk, so drink you, KAL of
the Shield"' (Sim. ib.3.35-^'t; but 1.28-29 has wa.)

The last sentence of example (92) also shows that the quotative marker may 
characterize a QUOTE not accompanied by an overt SPEAK. In (92), it is easy 
to recover an 'I answer'. However, there are contexts where such an analysis 
would be more difficult. The most striking construction of this sort which I 
have found is (94), in which the most likely interpretation seems to be that
QUOTE specifies the reason or CAUSE for the fact that there is no recompense. 

(94) takku SAL-an kui[sk]i pittenuzzi (/) EGIR-andama[sm]a[s]a
[s]ardiyas paizzi (/) takku 2 LU.MES nasma 3 LU.MES akkanzi
sarnikzi[l] NU.GAL [z]ik-wa UR.BAR.RA kisat

'If anyone elopes with a woman, and a rescuer goes after them, 
if two men or three men die, (there is) no recompense "You 
have become a wolf"' (Selections from the Code, 2.29-30) 

Other special uses of the 0-SPEAK quotative seem to be the appearance of 
quotative -wa- in Hieroglyphic Hittite, in what Dressler (1970:38?) plausibly
refers to as 'talking' inscriptions (of the type "I am the monument of ..."),
and perhaps also the Palaic example (95) below (cf. Carruba 1972:16 and 20).

(95) [nuku] pashullasas ti[ya]z tabarni LUGAL-i papazku-war ti
[anna]zku-war ti ... (KUB XXXV.165 vs. 21-22)
'And now, sungod of the gods (?), for Tabarna, the king, you 
(are) "father", you (are) "mother" ...' 

Unfortunately, the interpretation of this inscription is made difficult by the 
presence of several hapax legomena, as well as the uncertain value of the ku 
preceding war . Still, it is possible that we have here something akin to the 
NAME variant of the Sanskrit quotative. 

Hittite and the other Anatolian languages thus offer clear evidence for a
quasi-obligatory quotative particle -wa(r)- which normally is incorporated into
the initial string following the first word of each clause of the QUOTE. Beside
with overt SPEAK (= SAY), it may also be used with 0-SPEAK. And this
construction shows some probable evidence for extended, secondary or
specialized uses (as in (94) and in the 'talking' inscriptions of Hieroglyphic
Hittite), and some possible evidence in (95). In addition to, and sometimes
instead of, the quotative particle -wa(r)-, Hittite quite frequently shows
kissan 'thus' preceding

11: Homeric Greek

As noted in my other contribution to this volume, Homeric Greek has a
Final Formula which ordinarily indicates the end of a single-speaker direct
quote or of an (extended) verbal exchange between several speakers. This Final
Formula comes in two basic variants, one consisting of the defective verb e
'(he) said', the other of hos 'thus' plus a verb of speaking, most usually a
finite form of phe/pha- 'speak'; cf. (96) and (97). In the first three books
of the Iliad, out of 60 cases where this Final Formula could occur, only 5
do not show it. That is, in Homeric Greek, this construction appears to be
quasi-obligatory.

(96) Peleides d' ... proseeipe ... / ... / hos phato Peleides ...
'But the son of Peleus spoke ... QUOTE ... (Thus spoke the
son of Peleus) and ...' (Il. 1.223-45)

(97) ten d'apameibomenos prosephe ... Akhilleus / ... / e kai

'to her, answering, spoke Achilles QUOTE (He spoke) and ...'

This Formula usually occurs after a QUOTE introduced by a preceding SAY,
as in the above examples. Sometimes a related noun may appear instead of the
verb. Exceptions to this pattern are exceedingly rare. I have noted only the
types exemplified in (98) and (99). Note however that in (98) there is a noun
of speaking next to finite HEAR; and in (99) a verb (or noun) of speaking is
easily supplied. In both cases, the Final Formula is used, even though in
(99) no explicit SAY is found in the structure preceding QUOTE, and in (98)
the finite verb is HEAR.

(98) ... ameilikton d' op' akousan / ... / e kai ... (Il. 11.137-43)
'voice' HEAR
'but they heard an ungentle voice QUOTE (He spoke) and ...'

(99) aipsa d' ep' Aianta proiei keruka Thooten / ... / hos ephat' ...
'Forthwith he sent to Aias the herald Thootes (with the words)
QUOTE (Thus he spoke) ...' (Il. 12.342-51)

In terms of its quasi-obligatoriness and the relatively few variants which 
it permits, the Homeric Greek Final Formula clearly qualifies as a quotative. 
However, it is remarkable that there is no strong evidence for extended uses 
of the construction, with or without 0-SPEAK. 

In concluding this section it might be mentioned that in addition to the
Final Formula, a variant of one of its sub-types may occasionally occur
preceding QUOTE, in a 'generic-quote' construction; cf. (100). (Cf. also note
27 of my other contribution to this volume.)

(100) hode de tis eipesken Akhaion te Troon te / ... / hos ephan ...
'and thus would say one or another of the Achaeans or the
Trojans QUOTE (Thus they spoke) ...'

12: Avestan

As noted by Kuiper (1967), Avestan has a construction with uiti 'thus'
which in many ways resembles the early Rig-Vedic iti-construction, but which
also differs from it in its morphosyntax. In the following I will take a closer
look at the Avestan evidence, including a construction overlooked by Kuiper.

12.1: In addition to indirect constructions similar to those of Sanskrit,
Avestan also has two direct quote constructions, one employing the particle
uiti 'thus', the other having no special particle. Both of these can be
used with SAY and THINK; the unmarked construction additionally can occur with
HEAR, FEAR, and 0-SPEAK; cf. the examples below.

(101) mraot ahuro mazda spitamai zaraθustrai ... (Yt.10.1)

'A.M. said to Z., the Spitamide, QUOTE'

(102) uity aojana miθrai vouru.gaoyaoitae ... (Yt.10.1)

'Thus speaking (they cry to/address) M.V. QUOTE'

(103) iθa mainyete dusxvarana / noit imat vispam duzvarstam (Yt.10.105)
'"Thus", thinks the ill-fated, "(it is) not all this illdoing

OR: 'Thus thinks the ill-fated ...' (?)

(104) aδat frasa ham.razayata atars ... uiti avaθa maŋhano (Yt.19.47)

'thus' SPEAK
'then A. stood up, thinking thus QUOTE'

(105) sraotu ... gusahya tu ahura / ki airyama aphat (Y 49.7)

'let him hear, listen you, A. "What Aryaman shall be ...?"'

(106) yahmat ... fraterasenti ... moi tu iθra ahurahe ... vaeyai

jasaema (Yt.10.68-9)

'wherefore they are frightened ... "May we not meet here with
the charge of the ... lord"'

(107) srira daδaiti daemana ... ko mam yazaite ... (Yt.10.107-8)
'he looks around (lit. he places/gives beautiful eyes)
(thinking/saying) "Who will worship me? ..."'

12.2: Except perhaps for (103), all the above examples have SPEAK (± uiti)
before QUOTE; and that is in fact the most common pattern. However, a minor
pattern is that found in (108), with (uiti +) SPEAK inserted into QUOTE.

(108) usta ahmai naire mainyai / uiti mraot ahuro mazda / ai asaum

zaraθustra (Yt.10.137; sim. ib. 138, Yt.19.53)
'"Hail to the authoritative man" said A.M. "O truth-owning Z."'

12.3: The relative frequency of the uiti-construction over against the
unmarked structure is subject to considerable fluctuation. Thus in the Gathas,
the ratio of uiti to Ø is 1 : 10 (counting as one single instance the 9
repetitions of the formula tat θwa pərəsa ... 'that I ask you QUOTE' in Y 44).
In the hymn to Mithra it is 5 : 16. In the total Romanized selection of
Reichelt 1911, the ratio is 17 : 100. However, that ratio is skewed by two
factors: One is the frequent use of the Verbal Exchange Formula (cf. (109) and
the discussion in section 16 of my other contribution to this volume); and that
formula never occurs with uiti. The other consists of 20 instances of the
formula exemplified in (110), in which yazata/yazanta 'worshipped' is followed
by jaiδyat/jaiδyən 'prayed' which, with the subsequent QUOTE, specifies the
'content' of the worship. In its structure, this formula is parallel to what we
find in (111), where uiti + SPEAK takes the place of jaiδyat (in the same hymn).
If we exclude these formulaic expressions, the ratio will be more like that in
the hymn to Mithra, namely 17 : 35. (If only the Verbal Exchange Formula is
excluded, the ratio will be 17 : 55.) Even with these adjustments, however,
the uiti-construction must be said to be used quite sparingly.

(109) a dim pərəsat zaraθustro (Y 9.1) (Sim. 44 x elsewhere)

'Z. asked him QUOTE'

(110) tam yazata ... aat him jaiδyat (Yt.5.17-18) (Sim. ib. 19 x)
'her he worshiped ..., and to her he prayed QUOTE'

(111) tam yazata ... paitivacaŋhat uiti vacebis aojano (Yt.5.76)
'with speech' 'with words' SPEAK
'He worshiped her with speech, thus speaking with words QUOTE'
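
The adjustments described in 12.3 can be retraced with a few lines of arithmetic. The sketch below is only a check on the figures in the text, under one stated assumption: the Verbal Exchange Formula is counted as 45 instances, i.e. example (109) plus the similar instances noted beside it, read here as 44. The constant and function names are illustrative.

```python
# Counts taken from section 12.3 (Romanized selection of Reichelt 1911).
UITI = 17             # quotes marked by uiti + SPEAK
UNMARKED = 100        # quotes without uiti
VERBAL_EXCHANGE = 45  # assumption: example (109) plus 44 similar instances
WORSHIP_FORMULA = 20  # instances of the formula exemplified in (110)


def adjusted_unmarked(total, *excluded):
    """Return the unmarked count after removing formulaic instances."""
    return total - sum(excluded)


only_exchange_removed = adjusted_unmarked(UNMARKED, VERBAL_EXCHANGE)
both_removed = adjusted_unmarked(UNMARKED, VERBAL_EXCHANGE, WORSHIP_FORMULA)

print(f"raw ratio:                  {UITI} : {UNMARKED}")
print(f"without Verbal Exchange:    {UITI} : {only_exchange_removed}")
print(f"without both formulas:      {UITI} : {both_removed}")
print(f"uiti share after exclusion: {UITI / (UITI + both_removed):.1%}")
print(f"uiti share, hymn to Mithra: {5 / (5 + 16):.1%}")
```

Under that reading the two adjusted ratios come out as 17 : 55 and 17 : 35, matching the figures given in the text; even after both exclusions, uiti marks only a minority of the quotes.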

12.4: What is especially interesting is that uiti almost invariably occurs
next to SPEAK (cf. e.g. (102) and (104)), at best separated from it by a
noun of speaking (cf. (111)). More than that, when placed next to aoj-
'speak', uiti quite frequently appears in its sandhi form uity- (as in (102)).
Considering that sandhi across word boundary, in Avestan, is limited to words
which form a single phonological unit (mainly to compounds), this suggests
that there has been an (incipient) univerbation of uiti with SPEAK.

Examples (102), (104), and (111) further show uiti occurring with a participle
of SPEAK. This is no accident, for of the 17 instances of uiti + SPEAK
in Reichelt 1911, fully 11 have a participle of SPEAK. Moreover, this uiti +
SPEAK-participle construction may be used either with a 'higher', finite-verb
SPEAK (as in (111)), or with a non-SPEAK higher verb (cf. (104)), or with no
higher verb at all (as in (102)). Considering that present participles are
not normally used by themselves or with non-Aux. verbs, the use of participles
of SPEAK in constructions like (102) and (104) suggests the need for a special
explanation. The most probable explanation seems to be that uiti + participle
of SPEAK has become a synchronically productive quotative marker. (Structures
like (108), with finite SPEAK, then might be archaisms.)

While this interpretation of the participial uiti + SPEAK construction as
a synchronically productive quotative marker may be somewhat speculative, it is
I believe safe to state that the general uiti + SPEAK (or 0-SPEAK) construction
is comparable in its range of uses to Homeric Greek and comparable to Latin
in terms of the frequency with which it is employed.

12.5: There is evidence that in addition to this quotative construction,
Avestan developed another quotative marker. As apparently first noted by
Geldner (1885:246-7), a couple of very late texts, whose function vis-a-vis
the earlier hymns is comparable to that of Vedic Prose in relation to the Vedic
hymns, offer iδa 'here; (thus)', once also iθa 'thus; (here)', indicating
'Ritual Quotes' as in (112) and (113). (Note however that this marker is not
obligatory.) Unlike the uiti + SPEAK construction, this iδa/iθa regularly occurs
after the quoted passage, although string-initial elements (such as para im
in (112)) may intervene between QUOTE and the marker.

(112) dazda manaŋho para im iδa manaŋhe cinasti (Y 19.13)

'"dazda manaŋho" (a quotation from Y 27 on which Y 19 is a
commentary) teaches/means "for the thought/for thinking"'

(113) yat dim damabyo cinasti mazda iθa tem yat ahmai daman (ib. 14)
'"mazda" (= Y 27.13b mazdai) teaches/means that he (exists)
for the creatures (and) that the creatures (are) for him'

Etymologically, the iδa of these constructions creates certain difficulties,
since it seems to reflect earlier ida 'here' (cf. Skt. iha 'here'), an
unlikely quotative marker. However, the one-time occurrence of quotative iθa,
combined with other considerations, provides a clue toward a more satisfactory
explanation, identifying the iδa of these constructions as a descendant of
earlier iθa 'thus', a cognate of Skt. ittha: First of all, there is independent
evidence for a merger of θ and δ (< d) in the spoken language of late Avestan;
cf. e.g. Skt. veda, GAv. vaeda : YAv. vaeδa beside vaeθa 'knows', Skt. padyate
'falls, goes' : YAv. paiδyaite 'he shall fall' beside paiθyeiti 'goes',
etc. Secondly, that such a merger led to the interchangeability of earlier
ida 'here' and iθa 'thus' is suggested by occasional uses of iθa in the meaning
'here' and of iδa as meaning 'thus'; cf. (114) and (115). In addition, the use
of iδa in (116) is strikingly similar to that of ittha in RV bad ittha (cf.
(67) above). Moreover, the occasional use of deictic relatives of iθa, viz.
aθa and avaθa 'thus', in reference to a following QUOTE (cf. (117) and (118),
as well as (104) with uiti + avaθa) suggests that iθa likewise must have been
usable to refer to QUOTE. Finally, note that conversely, the ordinary quotative
marker uiti shows occasional attestation in the meaning 'thus'; cf. (119).

(114) ma avi zam ni.urvise iθa me tum ham.caraŋuha antara.araδam
nmanahe (Yt.17.60)

'do not go down to the earth. Here wander around in the
interior of my house ...'

(115) noit zi im za sa ya daraya akarsta saeta ... iδa caraiti
huraoδa ya darava apuθra aeiti (V 3.24)

'for the earth (is) not happy which lies unplowed (for) long,
thus/likewise/just as a beautiful woman who goes childless
(for) long'

(116) baδa iδa afrasane daŋhubyo baδa iδa aeni bərəθi ... (V 3.27)

'truly (thus (it is)), I will go to the countries, truly (thus
(it is)), I will go on to give birth ...'

(Sim. ibid. 29, except the second baδa occurs without iδa)

(117) yo avaθa vyaxmanyata (Yt.19.43)
'thus' SPEAK
'who spoke thus at the meeting QUOTE' (Cf. (104), ibid. 47)

(118) aθa mraot ahuro mazda (Afrinakan 4.3)
'Thus spoke A.M. QUOTE'

(119) yoi vaŋhəus a manaŋho syeinti yasca uiti (Y 39.3)

'who (masc.) hold on to the Good Thought and who (fem.) thus/

12.6: The late Avestan 'Ritual Quotative' construction with iδa/iθa thus
can be identified as an earlier iθa 'thus' construction, and thus as ultimately
related to the quotative uiti 'thus' construction: Apparently uiti and (*)iθa
represent different specializations of constructions in which a deictic adverb
meaning 'thus' was used to focus the listener's attention on a particular QUOTE.
While uiti was almost entirely specialized in this new function (the type (119)
seems to be limited to three examples), iθa, like aθa and avaθa, largely
retained its original deictic function, becoming quotatival only in the 'Ritual
Quotative'.

12.7: In conclusion it might be noted that the normal use of uiti + SPEAK 
before QUOTE, the rarer insertion into QUOTE, and the positioning of * i9a after 
QUOTE indicate an original freedom of occurrence comparable to that of iti + 
SPEAK in early Rig-Vedic (cf. 5.3-5). This impression of comparability is fur-
ther supported by the fact that just as in the Rig-Veda (cf. section 5.3),
the Avestan order of the quotative particle and SPEAK may in a few rare cases 
be reversed; cf. (120). 

(120) a3ae-ca uiti (V 4.47)
'and I say (thus) QUOTE' 

The dynamics of the Avestan constructions, however, differ from what we 
find in Sanskrit: There is no evidence in Avestan for the complex developments 
found in the late Rig-Veda and especially in the post-Rig-Vedic language. 
Moreover, unlike the Sanskrit quotative, the Avestan constructions remain quite 
optional throughout the attested history of the language. 

13: Other Indo-European languages 

Attested considerably later than the languages so far discussed, the 
other Indo-European languages do not seem to offer in their earliest stages 
any unambiguous evidence for quotative constructions. In some of the languages,
however, some such constructions did develop. The case is most clear for
Slavic, where as noted in 10.2, Russian, Old Polish, and Czech have a particle
de, dzie, pry respectively, which can be traced to earlier SPEAK. To these
might be added the similar Russian (slang) mol. Constructions marked by these
particles (which usually take the second position within QUOTE) may or may not
be preceded by 'independent' SPEAK. The constructions are used in various con-
texts, similar but perhaps not identical to the use of the German subjunctive
in reported speech. These may range from quoting someone without taking res-
ponsibility for the accuracy of what is being quoted, to just a simple repeti-
tion of what the speaker has said earlier. Unlike the German subjunctive con-
struction, however, these Slavic particles are always used with direct QUOTE.

A quasi-quotatival construction is found in the quotha of earlier Modern
English, as in (121) below. However, this construction is limited to very spe-
cial (ironic, etc.) pragmatic settings.

(121) The fickle moon, quotha, I wish my friends were half as con-

A more recent development is that noted by Joseph (1981) for (it's) like 
in colloquial Ohio English, as a marker of '"internal" quotation: an approxim-
ate representation in the form of reported speech of what someone had in mind
but did not express.' In some ways, of course, the use of a construction with
like, rather than thus, is quite unusual. However, one may conjecture that
this regional development (a) is parasitic on the more general use of like in
colloquial American English and (b) may have proceeded from a structure of the
sort (it's) like this.

Developments of this sort are interesting in that they show that quotat- 
ival constructions may arise at various times, through independent developments. 
Moreover, they show that similar elements (verbs of speaking and adverbs mean- 
ing 'thus') may be drawn on in such independent developments. At the same time, 
however, it is interesting how rare such developments seem to be in the more 
modern Indo-European languages of Europe. This makes the appearance of quot-
atival constructions in all the early Indo-European languages so much more

14: Summary of the Indo-European evidence

have some kind of quotative construction. The morphosyntax of these constructions may
differ considerably, as indicated in the following table. Moreover, even to 
the extent that languages might agree on using SPEAK, 'thus', or a combination 
of these as quotative marker, the actual morphemes employed differ (as between
Skt. iti, Av. uiti/iθa, Gk. hōs 'thus'). Also the degree of obligatoriness may
differ, with Hittite and Homeric Greek having the construction most consistent- 
ly, Avestan and Latin showing it much more sparingly, and Rig-Vedic Sanskrit 
holding an intermediate position. All of the languages, however, agree on per- 
mitting the construction only under quite limited syntactic/pragmatic conditions: 
mainly with SAY and to some extent also with THINK; with HEAR the construction 
occurs seldom at best. Hittite and Latin, however, also show evidence for some 
specialized uses of the construction; and so does Avestan with its (late) 'Ri- 
tual Quotative'. (None of these, however, are comparable to the full panoply 
of uses found in Classical Sanskrit.)

In spite of these differences, however, it is — as noted — remarkable that all 
of these languages should have quotatival formations. Moreover, disregarding 
the differences in morphosyntax and specialized uses which can easily be attrib- 
uted to independent innovations, the languages show a remarkable agreement in 
the syntactic/pragmatic contexts in which they permit their respective quotat- 
ives. It is, I believe, hardly likely that this situation should be due to 
chance. It therefore seems more attractive to attribute the construction to 
the proto-language. 

True, this does cause certain difficulties as far as the morphosyntax is 
concerned. But these are not insurmountable. Thus the appearance of the quot- 
ative particle in clause-second position (within the QUOTE) in Hittite and Latin 
can be attributed (a) to the pattern with quotative marker inserted into QUOTE 
and (b) to the fact that the marker may well have become clitic and thus (syn-
chronically functioning as sentence clitic for QUOTE) would have gone into
clause-second position in accordance with Wackernagel's Law.

[Table: choice of major quotative marker ('thus', SPEAK) and its position (before QUOTE, after QUOTE, other) in the early Indo-European languages, beginning with Early RV; the body of the table is not recoverable.]

Notes: Embracing construction; If univerbation of uiti + SPEAK
is accepted; In the rarely attested 'Ritual Quotative';
Frequently preceding QUOTE, even without quotative -wa(r)-;
But note Joseph's connection with iwar 'as, like'; Both
'thus' + SPEAK and plain SPEAK are used; Only in the rare
'generic quote' pattern.

Noting now the prominent role played by words meaning 'thus' in Sanskrit,
Avestan, and Greek, and the optional use of 'thus' in Hittite, as well as
the role of SPEAK in Greek and Latin (and also perhaps in Hittite), it is
possible to reconstruct a syntactic pattern with 'thus' + SPEAK as a quotativ-
al construction for Proto-Indo-European and to permit this structure to occur
before, after, and inserted into QUOTE: All we need to allow for is the pos-
sibility that just as in independent uses, 'thus' and SPEAK were subject to con-
stant morphological and lexical remakings (cf. Skt. itthā/ittham, iti, tathā,
Av. uiti, iθa, aθa, avaθa, Hitt. kissan, Gk. hōs, hōde, etc., Lat. ita, sic,
all meaning 'thus, so'), so also in their quotatival uses they could undergo
some remaking, especially as long as the etymological meaning/function of the
construction was still quite transparent. Where through reinterpretation, how-
ever, one or the other of the two markers becomes the major quotative marker
and where the position of that marker gets to be relatively fixed, at that
point the construction would tend to become frozen, permitting little or no
further change.

In all fairness, however, it must be admitted that a different, 'areal' 
explanation is conceivable, namely that the appearance of quotatival construc- 
tions in these ancient Indo-European languages was due to influence from the 
ancient Near Eastern prestige languages which, as we shall see presently, had 
quotatival constructions of similar structure. What may be attractive about 
this explanation is the fact that as the prestige of these ancient Near Eas-
tern languages and their cultures declined, so apparently did the use of quot-
atives in the Indo-European languages (except for Sanskrit which by this time,
however, can be assumed to have been safely located in another quotative area,
that of South Asia). For note that there does not seem to be any evidence for
a survival of the Avestan, Homeric Greek, and Classical Latin quotatives in
the later (quasi-)descendant languages. (Note that though later Greek may oc-
casionally show constructions reminiscent of the Homeric patterns, these lack
the obligatoriness and the relative standardization of the Homeric structures.)

Attractive as this alternative analysis may appear, however, I am bothered 
by the assumption that the Near Eastern influence reached as far west as Latin. 
Moreover, it may be the disappearance of quotatival constructions which is an
areal phenomenon, just like the change from SOV to SVO syntax in (most of) con-
tinental Europe (cf. Hock 1982). In fact, this disappearance of the quotative
may geographically be more limited than would appear at first sight. For
later Greek and Iranian (Persian), as well as Armenian show direct-discourse
structures (without change in person or mood) introduced by a new set of mar-
kers: Gk. (h)óti, MPers. ku, NPers. ki, Arm. (e)the, bam (etc.); cf. Hock
1975:107 and Friedrich 1943. And as Friedrich (ibid.) shows, constructions
of this sort are found also in Georgian (with postposed -o) and Turkish (with
diye 'having said').

Whether we attribute the early Indo-European quotatival constructions to 
inheritance or to areal influence, however, the conclusion seems inescapable 
that quotatival constructions remarkably similar in their morphosyntax and syn- 
tactic/pragmatic uses to what we find in early Rig-Vedic are found also in the 
other early Indo-European languages and that this remarkable similarity can 
hardly be attributed to independent developments. 


15: Ancient Near Eastern languages 

15.1: The earliest attested language, Sumerian, is reported to have had
a quotatival construction marked by -e-se, perhaps an 'emphatic' form of a verb
es- 'say'. This construction, however, seems to have been used quite rarely.
Moreover, it could apparently be used independently, in non-quotative contexts.
The syntactic position of this form was post-QUOTE. (Note that Sumerian was
an SOV language.) Cf. e.g. Jestin 1946:331-5.

15.2: Accadian (likewise an SOV language) also has a quotatival construc-
tion which, however, seems to be used more commonly. (Even so, other construc-
tions were available, such as unmarked QUOTE (von Soden 1952:208), or dependent
clauses introduced by kima 'that' (ibid. 233).) The Accadian quotative construc-
tion either was introduced by preposed enma (later umma) 'thus' or marked by
inserted mi or me (a shortened form of enma) which frequently, but not neces-
sarily, occurs after the first element of QUOTE. (Cf. von Soden 1952:176, 178.)
Examples would be the following. Note that (122) shows that the quotative
construction may be used without overt SPEAK. I have, however, not found any
evidence for specialized uses of the quotative.


'Thus (says/writes) I.D. to L. QUOTE' 

(123) apunama guitumma-me eqlam ula a'rus a taqbi

'Do not under any circumstances say "The Gutaeans (are here,
therefore) I did not cultivate the field"'

Given that Accadian SOV is commonly attributed to Sumerian influence
(cf. e.g. Riemschneider 1969:16), it is tempting to see Sumerian substratum
also in this construction. However, as noted earlier, the Sumerian quotative 
construction is quite rare. Moreover, its morphosyntax (postposed SAY) is 
rather different from the preposed or inserted 'thus' of Accadian. 

Similarly, one might perhaps be tempted to see Accadian influence in the 
Hittite quotative. In this case, the morphosyntax would in fact be much more 
similar, especially if preposed Hitt. kissan 'thus' is taken into considera-
tion and if -wa(r)- is derived from iwar via a meaning 'thus'. However, as
we have seen, the Hittite pattern has parallels also in the other ancient Indo- 
European languages. 

15.3: Also Elamite had a quotative construction, marked by something like
an old, clitically shortened absolutive of a verb SAY which is placed after
QUOTE; cf. Friedrich 1943. In addition, however, the examples in Friedrich
suggest that QUOTE often is preceded by structures of the sort 'He spoke thus'
or even longer expressions; cf. e.g. (124), where na-an-ri preceding QUOTE is
the synchronically productive absolutive of a verb of speaking.

(124) hi 5i-la ap ti-ri-is na-an-ri QUOTE ma-ra
'thus' 'spoke' 'saying' 
'He spoke thus, saying QUOTE' 

Apparently this construction could be employed also with THINK. I have not 
seen any evidence for specialized uses of the construction. 

This "exuberant" type of construction, with multiple instances of SPEAK 
as well as of 'thus', looks rather different from the Sumerian and Accadian 
constructions, but may compare well with some of the early Indo-European con- 
structions, as well as with Classical Tibetan (cf. below). 

Here again, direct influence from Sumerian or Accadian may be difficult to
justify. At the same time, however, there does now seem to be sufficient evid-
ence to suggest the existence of a quotative linguistic area in the ancient
Near East, an area with which perhaps also Proto-Indo-European or at least pre-
historic Indo-Iranian, Greek, Anatolian, and Latin may have been affiliated.

16: The languages of South Asia 

The interpretation of the evidence furnished by the various non-Indo-Euro-
pean languages of South Asia is made difficult by several factors. Perhaps the
most important of these is that none of the languages is attested anywhere as
early as Rig-Vedic Sanskrit. Many are attested only since the last century, or
even later. Even under the best of circumstances we are therefore required to
go back beyond the actually attested data, (closer) to the reconstructed proto-
stage, before we can meaningfully compare these languages with early Rig-Vedic.

This is further complicated by the fact that except for the great liter- 
ary languages (Tibetan; Tamil, Telugu, Kannada, and Malayalam), thorough gram- 
matical descriptions either do not yet exist or are hard to get at for the non- 
specialist. Even where descriptions do exist, however, they often do not go 
beyond the morphology and/or morphosyntax of quotative constructions. 

Moreover, just as a number of modern Indo-Aryan languages have lost the
old quotative (replacing it with the Persian ki-construction or similar struc-
tures), so also a number of non-Indo-European languages seem to lack quotative
constructions. And just as some Indo-Aryan languages (e.g. Nepali, Bengali,
Oriya, Dakhini Hindi/Urdu, and Marathi) have quotative constructions but do not
agree with each other (or with Sanskrit) on the marker of the constructions, so
also we find patterns of disagreement in many of the non-Indo-European langua-
ges of South Asia.

As a matter of area linguistics we may say that there is on one hand a 
Southern group of Dravidian languages, comprising the old literary languages, 
but also many of the neighboring "tribal" languages, in which postposed absolu-
tives of a Proto-Dravidian an/en/in- (hereafter: an-) 'say (so)' are used to
mark quotatives. To the North of this there is a 'Central' area in which quot-
atives seem to be found in most of the languages (whether Dravidian, Munda, or
Indo-Aryan), but in which there is less agreement on the choice of quotative 
marker and on its morphosyntax. Intruding into this area is the large group 
of (North-Central and) Northwestern languages which lacks comparable construc- 
tions. This group comprises, among others, Hindi/Urdu, Punjabi, Kashmiri on 
the Indo-Aryan side, Brahui on the Dravidian side, and Korku and Kharia on the 
Munda/Austro-Asiatic side. To the East of this area, however, we find two 
quotative areas: Bengali and Oriya on one hand, Nepali on the other. (Are 
these two areas linked with each other, or does the 'Northwestern' area ex- 
tend between them?) And to the North and East we find in Tibeto-Burman a fur- 
ther group of quotative languages. Like the 'Central' group, these languages 
show a great degree of variation in quotative markers. 

The greatest difficulty lies in interpreting these patterns. Kuiper (1967),
attributing the 'Southern' an-absolutives to Proto-Dravidian, evidently felt
that it was this Dravidian pattern which spread to the Indo-Aryan and Munda lan-
guages with quotatives, and that the differences in marking observed in the non-
Dravidian languages result from different directions taken in calquing the
Dravidian construction. On the other hand, Masica (1976:189) apparently took
essentially the same pattern of distribution as indicating a need for caution 
in this matter. Note however that his belief that North and Central Dravidian 
had no quotatives must have been based on insufficient evidence (cf. below). 
Before trying to tackle this difficult issue of interpretation, it would seem 
best to take a closer look at the evidence. 

17: Dravidian

17.1: The four literary languages of the South clearly have a quotative
marked by an absolutive of the verb an- which is postposed to QUOTE. This in
turn normally seems to be followed by SPEAK, although given other evidence for
extraposition in Dravidian, I would not be surprised to find occasional exam-
ples of extraposed QUOTE + quotative marker which would thus resemble the Em-
bracing construction of Sanskrit. Unfortunately, however, information on pat-
terns of this sort is virtually impossible to come by, using standard reference

In terms of their syntactic/pragmatic uses of the quotative, these langu-
ages show patterns strikingly similar to Sanskrit; cf. Kachru 1979. However,
the use of quotatives with QU does not seem to be attested for either Kannada
or Tamil, the two Dravidian languages studied by Kachru. And Tamil shows no
quotatives with either HEAR or SEE. On the other hand, Indo-Aryan Marathi has
virtually all of the Sanskrit uses, except those with ONOM and SEE. And Nepali,
likewise Indo-Aryan, has all the Sanskrit uses outside of NAME, EMPH, QU, and
ONOM. In this respect, then, the differences between modern Dravidian and
Indo-Aryan are not overwhelming. What is remarkable, though, is that none of
them seems to have the full panoply of uses found in Classical Sanskrit.

It is also interesting to note that the morphology of the quotative mar- 
ker shows variation, within a given language, across different languages, and 
through history. Thus as Kachru notes, Tamil and Kannada have two different 
absolutive formations each. Moreover, as Kuiper showed, the modern Tamil 
enru seems to be a replacement of an earlier ena, which outranks enru in Old
Tamil by a ratio of 200 : 26. Finally, as Kuiper notes, Old Tamil enru is,
with two exceptions, always used 'in its full lexical meaning' (1967, note 41).

17.2: Moving further to the North, we find some kind of quotative con- 
struction in apparently all the Dravidian languages other than Brahui. How- 
ever, the further North we go (roughly speaking), the greater the differences 
from the Southern pattern. 

Thus Pengo has two quotative markers, inji and injele, but unmarked QUOTES
frequently occur instead of quotatives; cf. Text 1.8, 9; 6.1, 12 vs. 6.3, 7-9,
10, 11 in Burrow and Bhattacharya 1970. The postposed quotative markers
li'^e (etc.), injihi (etc.) of Kuvi often are accompanied by ele 'thus'. QUOTE
may in addition frequently be preceded by ele icesi 'said thus'. That is,
unlike the Southern languages, Kuvi frequently uses structures similar to the
Sanskrit Embracing quotative, as well as structures involving an element 'thus'.
Finally, finite (ele) icesi may occur after QUOTE instead of the non-finite
quotative markers. (Cf. the texts in Israel 1979.)

No information has been accessible to me concerning the syntax/pragmatics 
of quotatives in this area. 

17.3: Yet further North we find Malto with a possibly archaic, synchron-
ically unmotivated quotative particle a^, but also with unmarked QUOTE, as well
as with extraposed structures in which QUOTE is followed by absolutive-like,
'conditional' anko/ankah 'saying, speaking', which always seems to be a part
of the following, independent main clause. That is, in these structures, the
absolutive-like form of SPEAK does not seem to be part of the preceding QUOTE,
but seems to be functioning as a link with the following clause, an element
which in terms of surface structure belongs to the following sentence. In
addition, tan, je, and ki 'that' may be used after SAY, THINK, and SEE. Cf.
Mahapatra 1979:197, 199, and text.

Kurukh uses a 'conjunctive participle' of one of its verbs for SAY to mark
direct discourse, employing this construction also to mark Purpose; cf. Hahn
1911. However, the verb employed is bac-, not a cognate of an-. Moreover, the
'conjunctive participle' is simply the finite verb agreeing in person and num-
ber with the main verb and optionally linked with it by ki or dara. Finally,
note that in Hahn's Kurukh version of the Prodigal Son, all direct discourse is
unmarked and that a similar situation is found in the examples of Vesper (1971).

Brahui, finally, apparently has no traces of a comparable quotative. 

17.4: This evidence can be interpreted in several different ways. On
one hand one might claim that the lack of a quotative in some of the languages
and the disagreement in the choice of marker and in morphosyntax between many
of the languages, as well as the chronological differences between, say, Old
and Modern Tamil, indicate that Proto-Dravidian lacked a quotative construc-
tion. (It is on the grounds of such arguments that Kuiper (1967) claimed that
the quotative constructions found in many of the Munda languages cannot be in-
herited but must be borrowed from Dravidian.) A necessary corollary to this
claim would have to be the assumption that the remarkable degree of agreement
in the choice of an- as the basis for the quotative marker of most of the Dra-
vidian languages is attributable to cross-linguistic diffusion, presumably
from (one of) the Southern literary languages. Toward the Northern periphery
of this diffusion area, then, the change would have slowly lost momentum, lead-
ing to the noted irregularities and aberrancies in the languages of the transi-
tion area.

This claim might be countered by pointing to the synchronically unmotiv-
ated quotative marker a^ of Malto, which can be taken to suggest that quotat-
ive constructions, even if now no longer de rigueur, have a long prehistory
even in this language. This argument would be strengthened if it could be
shown that a^ can be plausibly derived from an earlier form of an-. It might
therefore be argued that the quotative is in fact inherited in Dravidian, and
that it was originally built on the verb an- 'say (so)'. This argument, too,
would require certain corollary assumptions: First, one would have to argue
that whatever the morphology of the original construction, it could undergo
morphological renewal (as in OTa. ena vs. Mod.Ta. enru; cf. also Kuvi finite
icesi (?)). Moreover, one might have to claim that Kurukh bacas (ki/dara)
shows that even the verbal root could undergo such a renewal. As for the fact
that unmarked QUOTES are more common in the Northern area and that there is no
inherited quotative at all in Brahui, this would have to be attributed to the
influence of Munda and/or (regional) Indo-Aryan.

Some variant of this second analysis may well be correct. Still, one would
feel more comfortable if for instance Malto a^ could be shown to go back to an
appropriate form of an-; or if relics (in 'frozen' onomatopoeia, perhaps) of
the old quotative could be found in Kurukh and/or Brahui; or if the optional
ele 'thus' of Kuvi could be plausibly accounted for; etc.

Even more difficult is the question of the morphosyntax of the original
quotative construction. Should we assume that the quotative marker syntactic-
ally belonged to QUOTE (as it certainly seems to do in the Southern languages)
or that it was a linking element, connecting QUOTE to the following sentence
(as it seems to be in Malto)? Similarly, should we assume that the fairly rigid
QUOTE + quotative marker + SPEAK structure of the Southern Dravidian languages
is inherited or that the extraposed, Embracing structures found for instance in
Kuvi are more original?

The most difficult issue, however, is that of the original syntax/pragmat-
ics of the quotative. Should we attribute the patterns found in the Southern
languages to Proto-Dravidian? Note that one would feel more comfortable about
doing so if the relevant facts in the other Dravidian languages were better
known. Even then, however, the difficulty arises as to whether we should re-
construct the more fully developed pattern of Kannada or the more restricted
structures of Tamil. (Given the general conservatism of Tamil, the decision
should perhaps be made in favor of this language (?).) Moreover, we have to
contend with the fact that a number of Modern Indo-Aryan languages have compar-
able patterns and that Classical Sanskrit shows the most fully developed system.

Under these circumstances it would be difficult to argue for or against 
any of the following propositions: 

(a) The extended syntax/pragmatics of the quotative is entirely Dravidian
in origin;

(b) The extended syntax/pragmatics of the quotative is entirely Indo-Aryan
in origin;

(c) The extended syntax/pragmatics of the quotative originated in a third
language group;

(d) The extended syntax/pragmatics of the quotative results from converg-
ent and mutually reinforcing developments in Indo-Aryan and Dravidian (as well
as, perhaps, in other languages of the area).

18: Munda/Austro-Asiatic 

As Kuiper (1967, with ample references) pointed out, a number of the Munda
languages have quotative constructions, marked by forms of verbs of speaking, 
although the verb selected as a marker and its morphological make-up may differ. 
Combined with the apparent absence of a quotative in Korku and Kharia, this fact 
is interpreted by Kuiper as showing 'that this construction has been introduced 
in relatively recent times,' presumably under Dravidian influence. 

However, as noted earlier, if we applied the same kind of reasoning to 
Dravidian, we might have to claim that also in that group of languages the quot-
ative cannot be inherited. Moreover, we have just seen that if we do reconstruct
a quotative for Proto-Dravidian, then we must allow for morphological and lexic-
al renewal or even loss in some of the individual languages. Surely, what is
acceptable practice for Dravidian must be acceptable also for Munda. Finally,
as I have pointed out elsewhere (Hock 1975:90), quotative markers derived from
different verbs of saying are found also in the non-Indian languages Mon, Khmer,
and Nicobarese, which belong to the same, larger, 'Austro-Asiatic' family as
Munda. Here as elsewhere, therefore, the possibility of inheritance cannot be 
ruled out. 

Note that in the case of Munda, our knowledge of extended uses of the quot- 
ative is even more restricted than for the "tribal" Dravidian languages, except 
that Kuiper makes references to the use of the quotative with ONOM in some of 
the Munda languages. 

19: Tibetan and Tibeto-Burman

As noted by Hamp (1976:361 with note 33), Hock (1975:90), and Joseph (1982),
quotative constructions are found also in (Modern) Tibetan, Gurung, Lahu, Lushai,
and Burmese. In many cases the quotative particles are synchronically opaque;
but note Mod. Tib. sg^ (quot.) beside sea (quot./SAY); cf. Goldstein and Kashi
1973:114-15. Note also the (Northeast India) Kokborok quotative particle hinoy,
whose -oy looks suspiciously like the verbal absolutive marker; cf. Karapurkar
1976:99. And in Lahu the marker seems to mean 'thus, so'.

The earliest attested language of this group, Classical Tibetan, shows even
more interesting constructions, similar in their morphosyntactic "exuberance" to
ancient Elamite, involving preposed SPEAK plus preposed di 'this' and postposed
de 'that', elements such as skad(a) 'speech', pre- and postposed ces(a) 'thus',
as well as pre- and postposed absolutival forms of SAY, such as (ba)sgoo 'say-
ing'; cf. Jäschke 1883:84-5, as well as pp. 38 and 108. Interestingly, the
sentence dividers in Jäschke's text sample suggest that the postposed combina-
tion of ces(a) 'thus' + absolutive of SAY belongs with QUOTE, not with the fol-
lowing sentence.

Perhaps, then, some quotatival construction is native also to Tibeto-Burman.
Unfortunately, however, it is again difficult to get any information on the syn-
tactic/pragmatic uses of the construction.

20: The larger area 

As can be seen from the discussion in sections 15-19, quotatival construc-
tions are found over a vast territory, stretching from the ancient Near East,
through South Asia, and even beyond, to the Far East (cf. Hamp 1976:361 with
note 33). Recurrent features of the quotative constructions found in these lan-
guages are (a) some, usually non-finite form of SAY and/or (b) a particle mean-
ing 'thus'.

This 'areal' aspect of the quotative opens up the possibility that any of
the languages or language families historically attested with a quotative may
owe the construction at least in part to convergent developments, rather than to
straight inheritance. However, given the uneven chronological attestations
(ranging from the 5000-year old record of the Ancient Near East to the present- 
day evidence of some of the "tribal" languages), given the large number of lan- 
guages and language families involved, and given the lack of reliable informa- 
tion on the (pre-)history of most of these, it must at this point be considered 
impossible to establish a single source for the quotative and to trace the pro- 
cesses through which the construction spread through the area. 


21: The findings of the preceding sections and the evidence for quotatival 
constructions in all of the early Indo-European languages have important reper- 
cussions for an assessment of the claim that the Sanskrit quotative resulted from 
Dravidian influence: 

The early Rig-Vedic morphosyntax and syntax/pragmatics of the iti- quotative 
do not seem to differ in any appreciable manner from the various patterns found 
in the other ancient Indo-European languages or in the non-Indo-European langua-
ges of the ancient Near East. Specifically, the morphosyntax and syntax/prag-
matics of early Rig-Vedic are remarkably similar to what we find in Avestan (ex-
cept that Avestan has two constructions in complementary distribution, one mar-
ked by uiti 'thus', the other by *iθa 'thus').

The Embracing construction of Late Rig-Vedic and especially of the later
language, to be sure, differs appreciably from what we find in any of these other
ancient languages. True, as we have seen in 5.5, it is possible to motivate
this innovated construction in terms of the synchronic structure of Rig-Vedic
Sanskrit. Still, the absence of similar developments in other Indo-European
languages and the fact that in the non-Indo-European languages of South Asia,
structures of this sort are possible (as in South Dravidian) or even common
(as in some of the "tribal" Dravidian languages, as well as in Classical Tibe-
tan), suggest that the development may have been due to areal pressures. It
does not follow, however, that these pressures must have come from Dravidian.
For as noted earlier, it is by no means clear whether Embracing constructions
(with extraposition of QUOTE plus quotative marker) should be reconstructed as
a common phenomenon of Proto-Dravidian, or whether the stricter pattern QUOTE 
+ quotative marker + SPEAK of the Southern Dravidian languages should be recon- 
structed. If the latter should be the case, then of course the Embracing con- 
struction of Sanskrit, with its extraposition of QUOTE + iti , would be quite 
un-Dravidian. Moreover, given that extraposition is an eminently Indo-European 
phenomenon, it might be possible that the Embracing quotative of Sanskrit and 
the rebracketing of the quotative marker with the preceding QUOTE likewise is 
an essentially Indo-European development, and constitutes one of the elements 
which Sanskrit contributed to the South Asian convergence area. 

A much more promising area would be that of the syntax/pragmatics of the
quotative. For in the other ancient Indo-European languages, as well as in the
ancient Near Eastern languages, that syntax/pragmatics was rather "shallow",
with only SAY and THINK (occasionally also HEAR), as well as Ø, governing the
quotative, and with very few specialized uses of the quotative. If it should
turn out that the impressive array of uses found in Classical Sanskrit and, in
somewhat diminished form, in Modern Tamil, Kannada, Bengali, Oriya, Nepali, 
Marathi, and Dakhini Hindi/Urdu, is limited to South Asia, then the increasing 
development of Sanskrit toward such a complex quotative syntax may constitute 
a component of the "Indianization" of Sanskrit. 

Even here, however, it seems necessary to exercise some caution. For in 
our present state of knowledge we cannot be sure (a) whether the extended 
quotative syntax is an exclusively South Asian feature and (b) to what extent 
that syntax may be attributable to Sanskrit, to Dravidian, to other languages 
of the area, or to convergent and mutually reinforcing developments in all of 
these languages. Note that, as we have seen, all the Sanskrit uses of the
quotative can be explained in terms of purely internal developments, involving
reinterpretations and generalizations. In fact, the more fully developed range
of uses found in Classical Sanskrit (as compared to Modern Tamil and Kannada)
makes it somewhat difficult to attribute the total pattern to Dravidian influence.

The best that can be said, then, at our current state of knowledge, is that 
the development of the Embracing construction and of various special syntactic/ 
pragmatic uses of the quotative in later Sanskrit may constitute part of the 
"Indianization" of Sanskrit. It is not, however, possible to state with any 
degree of certainty the extent to which these developments are attributable to 
internal Sanskrit developments, to outside influence, or to a convergent com- 
bination of the two. Nor does our current state of knowledge permit the claim 
that if there was outside influence, that influence can have come only from 

Clearly, what would be needed to come to more informed judgments in this
matter is a significant increase in our understanding of the structure and
history of the various non-Indo-European languages and language families of South
Asia. It is my fervent hope that this challenge will be met, especially by
scholars who would like to argue for outside, non-Indo-Aryan influence on
Sanskrit.

*Research on this paper has been in part supported by 1979-80 and 1982-83
grants from the University of Illinois Research Board. I have also benefited
from discussions and correspondence with the following scholars: M. B.
Emeneau, F. B. J. Kuiper, C. Masica, E. Polomé, F. Southworth, S. N. Sridhar.
Needless to say, these scholars would not necessarily agree with all the
conclusions reached in this paper. For perspicuity's sake, Sanskrit examples
will be given in their pre-pausal form, not in their attested sandhi form.
Quotative particles and related linguistic forms are characterized by double
underlining; quoted material, by single underlining.

Bloch (1934:325-8) and Mayrhofer (1953:355) anticipated Kuiper. However,
Bloch had certain reservations about claiming Dravidian influence, and
Mayrhofer felt that there might have been a pre-Dravidian and pre-Sanskrit
substratum from which both Sanskrit and Dravidian got their quotatives.

Emeneau's 1969 paper expands on Kuiper's discussion of onomatopoeia + iti
in post-Rig-Vedic Sanskrit.

Classical Sanskrit examples quoted in this paper are from Speijer 1886.

Note however that Debrunner 1948 prefers not to consider this a type of
indirect discourse (or of direct discourse).

śru- 'hear' is attested once in the Rig-Veda with direct discourse; cf.
5.6, example (40) below.

Possible additional Rig-Vedic examples of such more 'orthodox' indirect
discourse constructions, not listed in Debrunner, are found at 4.18.6, 5.27.4
(with preceding iti), 5.30.2, 5.48.5, 10.52.1 (2x).

^qUOTE + itd^at 10. 17.1, 2U.5. 33.1, 3h.6, 61.12, 73.10, 95-18, 97.^4, 109-3, 
115.8-9 C^tJc), 119.1 C2x), 130-1, li+6.1;. Unmarked QUOTE at 10.9-6, 10.11, 18.I, 
22.6, 23-2, 27-18; 3^.14,5,12,13; Uo.5,11; 52.1, 61.18, 79-^, 82.2, 88.17, 95-17; 
97.17,22; 1Q9.U, 120.9, 129-6, 16U.I. 

Rig-Vedic passages with such uncertain interpretation of the function of
iti are: 1.138.3, 4.1.1, 5.7.10, 5.27.4 (followed by indirect discourse),
5.41.17, 5.53.3, 6.62.7, 8.30.2, 10.27.3, 10.61.26, 10.120.4. In addition there
are considerable difficulties in interpreting the occurrences of iti in 1.191.1
and 5.52.11; cf. Hock 1975, note 22.

.2U.30, 10.18.9, 10.23.2, IO.52.U, 10. 

Arnold's (1905) division of the Rig-Veda into five strata: Archaic (A),
Strophic (S), Normal (N), Cretic (C), and Popular (P). For ease of exposition
and so as to have sufficiently large numbers for statistical comparison, I have
combined the first two and the last two of these and, with some renaming,
divided the Rig-Veda into the following three chronological strata: Early (= A
+ S), Middle (= N), Late (= C + P). I am fully aware that there are a number
of problems with Arnold's criteria for determining chronological affiliation. 
However, I don't know of any other full chronologicization which could satisfac- 
torily replace it. Moreover, some comfort can be derived from the fact that 
the quotative was not one of the criteria used by Arnold in determining his 

,92.2, 8.93.5, 9.101.5, 10.73.10. 

"""1.109.3, 1.161.9 (2x), 6.5^*.!, l.hl.2; 7.10i*.15,l6 C2x); 10. 33.1, 10.109. 

3, IO.1U6.I*. 

"""^1.162.12, I.I6U.15, 2.12.5 C2x), 6.56.1, 9.1li+.l. 


For definition and discussion of this term, cf. my other contribution to
this volume. Note that ca and ced (< ca + id) never can be clause-initial, and
that ced must be second in its clause.

Here, indicates non-quotative; iti , quotative. 


A great deal of Atharvanic material has been taken over verbatim from 

the Rig-Veda. This material is ignored in the following discussion. 

cxxLt passage to be a riddle, the answers being: 'the dog', 'the leaf, 'the hoof 
of an ox' . 

^^These passages are (a) SB 8.1.1, 8.1.3-^*,,12-18, and (b) 11. 5. 
1. (a) contains (in 8.1.1 and 8.2.1) sections heavily quoting from the ritual
texts of the Vājasaneyi-Saṃhitā, with brief explanatory restatements or
paraphrases and (in 8.1.3-4) less 'text-bound' explanations of the ritual. (b)
contains the story of Urvaśī and Purūravas, with the text of RV 10.95 used as the
direct quotations of the two protagonists. Though containing a few explanatory
restatements or paraphrases of that text, this selection represents a much less
'technical', much more 'literary' variety of Vedic Prose.


A similar passage, with iti 'omitted' after the second, final fragment of 

QUOTE, is found at JB 2.128-30. Conversely, there are a few cases where iti may 

appear after each sentence of a longer QUOTE, even if there is no intervening 

SPEAK; cf. the following example: 

yam . . . kamayeta ksodhuka ayad iti tgam . . . idi iti (MS 3.2.5) 
'of which he should desire "May it be hungry"; "I have eaten its
strength ..."'

I have found only one possible exception, namely (ii) below. However,
the context is such that this passage can be explained as a case of dittology:
The preceding paragraph contains (i) which, following the general rules of Vedic
Prose, gives an 'internal', 'subjective' reason for an action. Both (ii) and
(iii), on the other hand, state 'external', 'objective' reasons, where it would
be impossible to insert or supply something like 'with this thought'. In (iii)
this reason is stated by means of a dependent-clause structure, marked by hi
'for, because', following what appears to be the normal practice of Vedic Prose.
The deviation from that practice in (ii) seems most naturally explained as due
to the influence of (i) in the immediately preceding paragraph. (It is of course
possible that 'dittological' structures of this sort formed the basis for the
post-Vedic extension of causal iti to 'external', 'objective' contexts.)

(i) ... tām ha sma tām purā brāhmaṇāḥ na taranti anatidagdhā agninā
vaiśvānareṇa iti (SB 1.4.1.14)

'that (river) the earlier brahmins did not use to cross
(thinking/because) "A.V. has not burned it over"'

(ii) ... tad ha akṣetrataram iva āsa ... asvaditam agninā vaiśvānareṇa
iti (ibid. 15)

'at that time it (= the area near the river) was quite
uncultivated, because A.V. had not tasted it'

(iii) ... sā api ... sam iva eva kopayati tāvat śītā anatidagdhā hi
agninā vaiśvānareṇa (ibid. 16)

'that (river) roars through (the area), as it were, so cold (is
it), because A.V. has not burned it over'

3. In addition, note that R = 
rare, C = common, F = frequent. Also, I = iti- initial, F = SPEAK-final, E = 
Embracing quotative; G = general frequency (for all quotative structures). (For 
G, the frequency rating is made in comparison to competing constructions; for 
I, F, and E, it is between these three constructions only.) Finally, the names 
of the various sub-types of SPEAK are given only in terms of their first three 


The data are taken from the Thesaurus, s.v. inquam . 

I am not, however, convinced of the usefulness of the Sanskrit evidence
cited by Joseph. As far as I can see, iva 'like, as' never has any meaningful
quotative value, comparable to that of iti or other quoted-speech markers.

Unless otherwise indicated, examples are taken from the selections in
Sturtevant 1935. References to these are by descriptive title, followed by
section and line number. For ease of exposition I give a quasi-phonetic
interpretation of the syllabic transcription, without any vowel length
indications. And to more clearly set off QUOTES, I make no distinction in
underlining between Sumerograms and other portions of the text.


For Avestan I rely on the evidence of the Romanized portions of Reichelt's
(1909 and 1911) selections. In addition I have worked through the Gāthās and
the Hymn to Mithra in their entirety. For these I have used the editions of
Humbach (1959) and Gershevitch (1967). To save space, I have in many cases
indicated the location of QUOTE merely in the glosses.

For other references, cf. Bartholomae 1904, s.v. uiti.

The interpretation of this passage seems to be difficult.

In fact, RV baṭ (once baḍā) has been connected with Av. bāṯ, bāδa; cf.
e.g. Debrunner 1957:92 with references. Note however that Bartholomae (1904,
s.v.) points out that bāṯ is a hapax legomenon, the usual form being bā.
Moreover, on the Sanskrit side, one would need to account for the retroflex, not
dental stops. Presumably, however, this could be done in terms of contamination
from the ritual interjections vaṣaṭ, śrauṣaṭ, for which see Wackernagel
1896:41, 172, etc.

Except for the ambiguous (103) above, I have not noted any such examples
with iθa 'thus'. The closest thing would be passages like iθa at yazamaide
ahuram (Y 37.1, sim. Y 39.1,3) 'thus we worship A.', without QUOTE (or any
other obvious referent for iθa).


I am grateful to my colleague, Frank Gladney, for providing information 

on the use of the Slavic constructions. 


Except for Old Persian which, however, is attested only in royal
proclamations, with very little opportunity for the use of quotatival constructions.

Also Latin occasionally has ita 'thus' with SPEAK. However, the examples
in the Thesaurus (s.v.) seem to be generally followed by indirect (infinitival
or dependent-clause) structures, as in ita laudabunt: bonum agricolam (acc.)
'they will praise him thus, (as being) a good farmer ...'

These examples are taken from Riemschneider 1969:162-3.

Friedrich's presentation does not make it possible to be absolutely
certain as to which of the three initial words means 'thus, in this way'.


I apologize for the perhaps unconventional transliterations of Jäschke's
Tibetan-script examples.

An appropriate conclusion to this paper might consist of the revival of
an obsolete, quasi-quotatival English expression, found in books of the 16th
century: Finis, quoth Hans Henrich Hock.


Avestan: V = Videvdat; Y = Yasna; Yt. = Yasht.

Sanskrit: AV = Atharva-Veda; JB = Jaiminīya Brāhmaṇa (Caland's selections);
KS = Kāṭhaka Saṃhitā; MS = Maitrāyaṇī Saṃhitā (non vidi); RV = Rig-Veda;
SB = Śatapatha Brāhmaṇa; TS = Taittirīya Saṃhitā; VS = Vājasaneyī Saṃhitā.


ARNOLD, E. Vernon. 1905. Vedic metre in its historical development. Cambridge:
University Press. (Reprinted 1967, Delhi: Motilal Banarsidass.)

BARTHOLOMAE, Christian. 1904. Altiranisches Wörterbuch. Strassburg: Trübner.

BLOCH, Jules. 1934. L'indo-aryen du veda aux temps modernes. Paris: Adrien-
Maisonneuve. (Engl. transl. by A. Master, with retention of 1934 pagination,
ibid. 1965.)

BLOOMFIELD, M. 1899. The Atharvaveda. Strassburg: Trübner.

BURROW, Thomas, and S. Bhattacharya. 1970. The Pengo language. Oxford:

CARRUBA, Onofrio. 1972. Beiträge zum Palaischen. Istanbul: Nederlands
Historisch-Archaeologisch Instituut in het Nabije Oosten.

DEBRUNNER, A. 1948. Indirekte Rede im Altindischen. Acta Orientalia 20.120-

. 1957. Nachträge zu Band I of Wackernagel 1896 (1957).

DELBRÜCK, Berthold. 1888. Altindische Syntax. Halle: Waisenhaus.

DRESSLER, Wolfgang. 1970. Grundsätzliches zur Funktion der altanatolischen
Satzpartikeln. Archiv Orientální 38.385-90.

EMENEAU, Murray B. 1969. Onomatopoetics in the Indian linguistic area.
Language 45.274-99. (Reprinted in Emeneau 1980.)

. 1971. Dravidian and Indo-Aryan: the Indian linguistic area revisited.
Symposium on Dravidian Civilization, ed. by A. F. Sjoberg, 33-68.
(Reprinted in Emeneau 1980.)

. 1980. Language and linguistic area: essays selected and introduced by
Anwar S. Dil. Stanford: University Press.

FRIEDRICH, Johannes. 1943. Die Partikeln der zitierten Rede im Achämenidisch-
Elamischen. Orientalia 12.23-30.

. 1967. Hethitisches Elementarbuch, 1. 2nd ed. Heidelberg: Winter.

GELDNER, Karl. 1885. Miscellen aus dem Avesta. KZ 27.225-60.

GERSHEVITCH, Ilya. 1969. The Avestan hymn to Mithra. Cambridge: University

GOLDSTEIN, Melvyn C., and Tsering Dorje Kashi. 1973. Modern Literary Tibetan.
Urbana: University of Illinois Center for Asian Studies.

GÖTZE, Albrecht, and Holger Pedersen. 1934. Muršilis Sprachlähmung. (Det Kgl.
Danske Videnskabernes Selskab, Hist.-fil. Meddelelser, 21:1.) Copenhagen:

HAHN, Ferd. 1911. Kurukh grammar. Calcutta: Bengal Secretariat.

HAMP, Eric P. 1976. Why syntax needs phonology. Papers from the Parasession
on Diachronic Syntax, 348-64. Chicago: CLS.

HOCK, Hans Henrich. 1975. Substratum influence on (Rig-Vedic) Sanskrit?
Studies in the Linguistic Sciences 5:2.76-125.

. 1982. AUX-cliticization as a motivation for word order change. Studies
in the Linguistic Sciences 12:1.91-101.

HUMBACH, Helmut. 1959. Die Gathas des Zarathustra. Heidelberg: Winter.

ISRAEL, M. 1979. A grammar of the Kuvi language. Trivandrum: Dravidian
Linguistics Association.

JÄSCHKE, H. A. 1883. Tibetan grammar. 2nd ed. London: Trübner.

JESTIN, Raymond. 1946. Le verbe sumérien. Paris: Boccard.

JOSEPH, Brian. 1981. Hittite iwar, wa(r) and Sanskrit iva. KZ 95.93-8.

. 1982. More on (i)-wa(r). KZ 96. (Prepublication copy received from
the author.)

KACHRU, Yamuna. 1979. The quotative in South Asian languages. South Asian
Languages Analysis 1.63-77.

KARAPURKAR, Pushpa Pai. 1976. Kokborok grammar. Mysore: CIIL.

KUIPER, F. B. J. 1967. The genesis of a linguistic area. Indo-Iranian
Journal 10.81-102. (Repr. 1974, in International Journal of Dravidian
Linguistics 3.135-53.)

MAHAPATRA, B. P. 1979. Malto: an ethnosemantic study. Mysore: CIIL.

MASICA, Colin P. 1976. Defining a linguistic area: South Asia. Chicago:
University Press.

MAYRHOFER, Manfred. 1953. Die Substrattheorien und das Indische. Germanisch-
Romanische Monatsschrift 34.230-42.

OED = "The Oxford English Dictionary", i.e. A new English dictionary on
historical principles, 1897-1928 + Addenda. (Micrographic reprint of reissued
version, 1971.) Glasgow et alibi: Oxford University Press.

REICHELT, Hans. 1909. Awestisches Elementarbuch. Heidelberg: Winter.

. 1911. Avesta reader. Strassburg: Trübner.

von SODEN, Wolfram. 1952. Grundriss der akkadischen Grammatik. Rome:
Pontificium Institutum Biblicum.

SPEIJER, J. S. 1886. Sanskrit syntax. (1973 reprint, Delhi: Motilal
Banarsidass.)

STURTEVANT, Edgar, and George Bechtel. 1935. A Hittite chrestomathy.
Philadelphia: Linguistic Society of America.

THESAURUS linguae latinae. 1900-76. Edited by 5 German Academies. Leipzig:
Teubner.

VESPER, Don R. 1971. Kurukh syntax with special reference to the verbal
system. University of Chicago Ph.D. dissertation in Linguistics.

WACKERNAGEL, Jakob. 1896. Altindische Grammatik, I. (Reprinted 1957, with a
revised Introduction générale by L. Renou and Nachträge by A. Debrunner.
Both of these have separate pagination.) Göttingen: Vandenhoeck und Rup-

Studies in the Linguistic Sciences 
Volume 12, Number 2. Fall 1982 

Yamuna Kachru 

This paper presents a brief report on two selected 
topics from on-going research investigating syntactic variation 
in Modern Standard Hindi (hereafter, Hindi) as spoken in
the Eastern and Western regions of the Hindi-speaking area. 
The two constructions under focus are the possessive and the 
ergative. The Eastern and the Western varieties show a marked 
difference in both these constructions. These differences 
can be traced to contact with other languages/dialects of 
the area. Furthermore, they seem to be indicators of syntactic 
change in progress. The observed patterns of variation 
raise several important questions with regard to the issues 
of standardization and medium of instruction in the Hindi 

0. Introduction 

Geographical variation in language has always been one of the main
interests of linguists. Very little reliable information, however, is
available on the varieties of Hindi. The research discussed below,
initiated in 1982, is the first systematic attempt to determine the
range of variation in selected aspects of the syntax of Eastern and
Western Hindi. I will limit myself here to a discussion of two syntactic
topics only: the possessive and the ergative constructions. There are
two main reasons for this. First, both of these have been described in
detail in traditional grammars as well as modern linguistic analyses
(e.g., in Guru 1920, McGregor 1972, Sharma 1958, Kachru 1969, 1980,
Pandharipande 1981a, and Pandharipande and Kachru 1977). Second,
traditional grammars as well as modern linguistic descriptions exhibit
awareness of regional variations in these two constructions (e.g.,
Vajpeyi 1958, Kachru 1980, among others).

The main purpose of this paper is to demonstrate that regional 
variation in Hindi has arisen due to the influence of the various mother 
tongues that Hindi is in contact with in these regions. Furthermore, 
this variation is ushering in a process of broader syntactic change 
in Hindi. 

The paper is organized as follows. Section 1 outlines the grammatical 
description of the possessive construction, presents the findings of my 
research and discusses the implications of these findings. Section 2 
does the same for the ergative construction. The concluding section 
suggests further directions of research on variation in Hindi. 

The data for this study consists of questionnaires filled out by 
46 graduate students of Jawaharlal Nehru University, Delhi. The regional 
distribution of the respondents is as follows: 18 from Eastern Uttar 
Pradesh and Bihar, 28 from Western Uttar Pradesh, Delhi, and contiguous 
areas. In addition, ten oral interviews were conducted by me in Patna 
and Dhanbad, Bihar. The subjects interviewed were doctors (3), engineers
(2), college/university professors of social sciences (3), and lawyers
(2). All were born, raised, and educated in Bihar.

1. The Possessive Construction 

According to the information available from grammars and more recent 
descriptions of the construction, it has the following characteristics 
(Kachru 1969, Pandharipande 1981): 

I. Possession is expressed in Hindi by the use of the genitive,
locative, or dative postposition with the possessor noun and the verb
hona 'be, become'. The possessed noun controls agreement in the
sentence. For example, consider the following sentences.

1. raam kii do betiyaa thii
Ram of two daughters were
Ram had two daughters.

2. raam ke do bete the
Ram of two sons were
Ram had two sons.

3. raam ke paas do kaare hai/thii
Ram near two cars are/were
Ram has/had two cars.

4. raam ke paas do makaan hai/the
Ram near two houses are/were
Ram has/had two houses.

5. raam ke paas ek kaar/makaan hai
Ram near a car/a house is
Ram has a car/house.

The possessor raam is followed by the genitive postposition kAA
in sentences 1-2 and by the locative postposition ke paas in sentences
3-5. The verb hona 'be' or 'become' is in the past tense in 1-2, and
both the genitive postposition and the verb agree with the possessed
nouns in terms of number and gender. In 3-5, the postposition is invariable
but the verb shows similar agreement features (only number in the present,
both number and gender in the past).

II. The distribution of the genitive (kAA or invariant ke 'of'),
locative (me, ke paas 'in, near') and dative (ko 'to') postpositions
in the possessive construction is as follows:

a. If the possessor is animate and the possessed is either animate
or denotes kinship or body parts, the postposition is ke (kAA is also
acceptable);

b. If the possessor is animate and the possessed is concrete,
the postposition is ke paas 'near' (kAA is also acceptable);

c. If the possessor is animate and the possessed is abstract, the
postpositions are me 'in' or ko 'to': me if the possessed denotes a permanent
characteristic or property, ko if it is transitory.
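The distribution stated in (a)-(c) amounts to a small decision procedure, which can be sketched in Python. This is only an illustration under stated assumptions: the function name and the category labels ("animate", "kinship", "body_part", "concrete", "abstract") are introduced here for convenience and are not part of the grammatical descriptions cited; an animate possessor is assumed throughout, as in the rules themselves.

```python
def possessive_postposition(possessed, permanent=None):
    """Postposition(s) predicted by rules (a)-(c) for an animate possessor.

    `possessed` is a hypothetical label for the class of the possessed noun;
    `permanent` is only consulted for abstract possessed nouns.
    The first item in the returned list is the form the rule names first.
    """
    if possessed in {"animate", "kinship", "body_part"}:
        # Rule (a): invariant ke, with agreeing kAA also acceptable.
        return ["ke", "kAA"]
    if possessed == "concrete":
        # Rule (b): locative ke paas, with kAA also acceptable.
        return ["ke paas", "kAA"]
    if possessed == "abstract":
        # Rule (c): me for permanent properties, ko for transitory ones.
        if permanent is None:
            return ["me", "ko"]
        return ["me"] if permanent else ["ko"]
    raise ValueError(f"unknown class of possessed noun: {possessed!r}")


if __name__ == "__main__":
    for args in [("kinship",), ("concrete",), ("abstract", True), ("abstract", False)]:
        print(args, "->", possessive_postposition(*args))
```

Note that the dative ko appears in this standard description only under rule (c); its use for kinship is exactly the point on which the regional varieties discussed below diverge.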

The questionnaire that I used to investigate variation in Hindi was 
designed to determine the range of the following postpositions in Eastern 
vs. Western Hindi: the genitive kAA , the invariant ke and the dative 
ko to indicate kinship. I concentrated on these postpositions for the 
following reasons. First, even in written standard Hindi, both ke and
ko are used to denote kinship in literary works from the Eastern region, 
e.g., consider the following sentences from Shukla 1903. 

6. is putr ke sivaay unhe koii aur santaan na thii
this son except him to any other offspring not was
He had no other offspring except this son.

[unhe = unko 'him to']

7. bangaalii mahaashay ke ek putr thaa
Bengali gentleman of one son was
The Bengali gentleman had one son.

The dative ko and the invariant ke are used in 6 and 7 respectively
to denote the same relationship.

Second, grammarians such as Vajpeyi have very emphatically
characterized this use of the dative postposition as ungrammatical and
inappropriate (1958, pp. 151-152, 228-229, and 374-375). Since Hindi is the
medium of instruction in schools and colleges and is becoming increasingly
so even at the university level, it would be reasonable to assume that
most speakers would at least agree in their judgments about the form
characterized as the grammatical and appropriate form. The sentences on
the questionnaire to determine the occurrence of these postpositions
were as follows.

8a. raam baabuu ke ek hii betaa hai
Ram Babu of one only son is

b. raam baabuu ko ek hii betaa hai
Ram Babu to one only son is

c. raam baabuu kaa ek hii betaa hai
Ram Babu of one only son is
Ram Babu has only one son.

9a. mittaljii ke ek hii betii hai jiskaa naam ushaa hai
Mittalji of one only daughter is whose name Usha is

b. mittaljii ko ek hii betii hai jiskaa naam ushaa hai
Mittalji to one only daughter is whose name Usha is

c. mittaljii kii ek hii betii hai jiskaa naam ushaa hai
Mittalji of one only daughter is whose name Usha is
Mr. Mittal has only one daughter whose name is Usha.

10a. kal hii Shriimatii Shriivaastav ke ek betii huii
yesterday only Mrs. Srivastav of one daughter came to be

b. kal hii Shriimatii Shriivaastav ko ek betii huii
yesterday only Mrs. Srivastav to one daughter came to be

c. kal hii Shriimatii Shriivaastav kii ek betii huii
yesterday only Mrs. Srivastav of one daughter came to be
A daughter was born to Mrs. Srivastav only yesterday.

Based on the grammatical descriptions, one would expect the following
patterns of occurrence in the two regions.

11. East: ko (alternant kAA)
    West: ke (alternant kAA)

Instead, the following surprising patterns emerge from the data I have 







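[Table: patterns of acceptance (surviving column labels: "8 and 10b", "8 and 9c") by number of respondents who use them; the cell values are not recoverable.]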






Three respondents did not fit any of the above patterns: one 
respondent marked 8a, 9c, and 10b grammatical and the rest ungrammatical 
and the other two respondents marked 8c, 9c and 10a grammatical. 

The b sentences were identified as belonging to the variety spoken
by the people from Bihar and/or Eastern Uttar Pradesh by the following
number of respondents.

Sentence             8b   9b
No. of respondents   13   12


Out of the 18 speakers from Bihar/Eastern Uttar Pradesh, 5 did not 
identify with any group of speakers the forms they did not use themselves, 
4 identified all b sentences as occurring in Hindi spoken in Bihar and 
2 identified only 8b and 9b as such. Out of the 27 respondents from other 
parts of the Hindi-speaking area, 8 did not identify the forms they themselves 
did not use and 6 identified all b sentences as occurring in the variety 
spoken in Bihar. 

The total range of data indicates the following: respondents from
Delhi and Western Uttar Pradesh prefer the genitive postposition kAA;
respondents from Eastern UP and contiguous areas of Bihar (Western)
prefer 8c, 9c, and 10b; and respondents from other parts of Bihar prefer
8b, 10b, and 9c. Note that this results in two subvarieties: one in
which ko is used only for non-stative (sentence 10), the other in which
the genitive showing agreement features is preferred for denoting kinship
with a female (sentence 9). The two differ only with respect to denoting
kinship with a male: one subvariety uses the genitive, the other the
dative. For the non-stative (sentence 10), both prefer the dative.
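The comparison of the regional preferences can be laid out as a small table and checked mechanically. The following Python sketch simply restates the preferences reported above; the dictionary layout, the region labels, and the function name are assumptions made for illustration, as is the reading that the Delhi/Western UP respondents prefer kAA in all three items.

```python
# Preferred postposition per questionnaire item (8 = son, 9 = daughter,
# 10 = non-stative 'was born'), as summarized in the text.
# "kAA" stands for the agreeing genitive (the c sentences), "ko" for the
# dative (the b sentences).
PREFERRED = {
    "Delhi / Western UP": {8: "kAA", 9: "kAA", 10: "kAA"},  # assumed for all items
    "Eastern UP / Western Bihar": {8: "kAA", 9: "kAA", 10: "ko"},
    "Other parts of Bihar": {8: "ko", 9: "kAA", 10: "ko"},
}


def differing_items(region_a, region_b):
    """Return the questionnaire items on which two regions differ."""
    a, b = PREFERRED[region_a], PREFERRED[region_b]
    return sorted(item for item in a if a[item] != b[item])


if __name__ == "__main__":
    print(differing_items("Eastern UP / Western Bihar", "Other parts of Bihar"))
    print(differing_items("Delhi / Western UP", "Eastern UP / Western Bihar"))
```

Under these assumptions the two Eastern subvarieties differ on item 8 alone, which is the kinship-with-a-male contrast noted in the text.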

What is clear from the above discussion is that the postposition
ko is used with relatively greater consistency by speakers from Bihar
and Eastern UP as compared to the invariant ke by speakers from the
Western region. The low use of ke and the greater use of the genitive
kAA seem to signal a syntactic change in progress whereby the distinction
between the so-called inalienable vs. alienable possession is further
weakening in the entire Hindi-speaking region.

One plausible explanation for this pattern of usage is as follows.
The languages/dialects spoken as mother tongues in Eastern Uttar Pradesh
and Bihar do not make a distinction between the genitive and the dative
consistently. Several of them have a form ke which is used both as
a genitive and as a dative postposition. Thus, the distinction between
genitive and dative may not seem so crucial to speakers from these
areas. In addition, all these dialects/languages as well as Hindi have
a dative subject construction (see Verma 1976 for details of this
construction in a number of South Asian languages). The use of the
dative in the possessive construction thus leads to a greater unity of
construction types in that most non-volitional constructions (whether
stative or inchoative) end up with a dative subject (Kachru 1981,
Pandharipande 1981). The languages to the east, such as Bengali, use the regular
genitive forms in -r for expressing possession.

The distinction between the ṣaṣṭhii vibhakti and the sambandha
pratyaya that characterized the Sanskrit constructions was lost in the
Eastern NIA languages so that both 'x has a son' and 'x's son' have the
same genitive marker. In view of this, the fact that the Eastern variety
of Hindi does not use the invariant ke which is claimed to be analogous
to the ṣaṣṭhii vibhakti of Sanskrit (Vajpeyi 1958) is not surprising.
The impending loss of the distinction between ke and kAA in the Western
variety, however, seems to signal a syntactic change that needs further

2. The Ergative Construction 

The ergative construction has been described in detail in Pandharipande 
and Kachru 1977. The major characteristics of the construction are 
as follows. 


I. The agent-subject of a transitive verb in the perfective is followed
by the ergative postposition ne and the verb ceases to agree with it;

II. if the DO is unmarked, the verb agrees with it;

III. if the DO is marked with the postposition ko, the verb is in
neutral agreement, i.e., it is in the third person masculine singular
form; and

IV. there are some intransitive verbs that govern the ergative
and there are some transitive verbs that do not; in addition, some
transitive verbs can occur in constructions with or without the
agent-subject in the ergative.
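Characteristics I-III can be sketched as a short procedure. The Python below is a minimal illustration, not a full account: the function and field names are hypothetical, and the lexical exceptions of IV (intransitives that take ne, transitives that do not) are deliberately left out.

```python
def ergative_pattern(transitive, perfective, object_marked=False):
    """Subject marking and verb agreement predicted by I-III.

    Ignores the lexically governed exceptions described in IV.
    `object_marked` says whether the direct object carries a postposition.
    """
    if transitive and perfective:
        # I: the agent-subject takes ne and no longer controls agreement.
        if object_marked:
            # III: neutral agreement (third person masculine singular).
            return {"subject_marker": "ne", "verb_agrees_with": "nothing (3.m.sg)"}
        # II: an unmarked direct object controls agreement.
        return {"subject_marker": "ne", "verb_agrees_with": "direct object"}
    # Elsewhere the subject is unmarked and controls agreement.
    return {"subject_marker": None, "verb_agrees_with": "subject"}


if __name__ == "__main__":
    print(ergative_pattern(transitive=True, perfective=True))
    print(ergative_pattern(transitive=True, perfective=True, object_marked=True))
    print(ergative_pattern(transitive=True, perfective=False))
```

The Eastern b sentences in the questionnaire items correspond to the last branch being used even for transitive perfectives: no ne on the subject, and agreement not controlled by the object.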

In order to determine the use of the postposition ne, the following 
items were included in the questionnaire. 

12a. ham ne sab kitaabe parh lii hai, tum le jaao
we all books read took have you take go

b. ham sab kitaab parh liye hai, tum le jaao
we all book read took have you take go
We have read all the books, you take them.

13a. raajuu ne mujhe koii kitaab nahii dii
Raju me any book not gave

b. raajuu hamko koii kitaab nahii diyaa
Raju me any book not gave
Raju did not give me any book.

14a. hamne usse caar kitaabe maagii, usne ek bhii nahii dii
we him four books asked he one only not gave

b. ham usse caar tho kitaab maage, u eko nahii dihis
we him four book asked he one not gave
We asked him for four books, he didn't give us any.

39 respondents marked the a sentences 
grammatical, 3 the b sentences; 4 gave a mixed response. Out of the 
18 respondents from Bihar/Eastern Uttar Pradesh, 10 favored the a 
sentences, 1 respondent the b sentences and 7 gave a mixed response. 
Out of the 10 who favored the a sentences, 8 identified the b sentences 
as used in Bihar. Out of the non-Bihari respondents, 23 identified 
the b sentences with Bihar, 2 with Bihar/Uttar Pradesh and 8 with 
Uttar Pradesh. It is clear that the b sentences are overwhelmingly 
identified with Bihar by both the Biharis and others. 
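The proportions behind these counts are easy to make explicit. The Python sketch below uses only the figures reported above (39, 3, and 4 for all respondents; 10, 1, and 7 for the Bihar/Eastern Uttar Pradesh group); the variable and function names are illustrative assumptions.

```python
# Grammaticality judgments on items 12-14, as reported in the text.
ALL_RESPONDENTS = {"a sentences": 39, "b sentences": 3, "mixed": 4}
EASTERN_RESPONDENTS = {"a sentences": 10, "b sentences": 1, "mixed": 7}


def shares(counts):
    """Return each response type as a percentage of the group total."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


if __name__ == "__main__":
    print(sum(ALL_RESPONDENTS.values()), shares(ALL_RESPONDENTS))
    print(sum(EASTERN_RESPONDENTS.values()), shares(EASTERN_RESPONDENTS))
```

The totals match the 46 questionnaires and the 18 Eastern respondents. The share of mixed responses is markedly higher in the Eastern group (7 of 18) than overall (4 of 46), in line with the observation that the use of ne in the Eastern variety is unstable.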

Whereas the use of ne in the Eastern variety is unstable, it is 
expanding in the Western variety. Item 15 was designed to test this. 
The obligative construction exemplified by 15 requires a dative subject. 

15a. kal hamne sinemaa jaanaa thaa, par nahii jaa sake 
yesterday we movie had to go but not go could 

b. kal hame sinemaa jaanaa thaa, par nahii jaa sake 
yesterday we movie had to go but not go could 
Yesterday we intended to go to the movies but we couldn't go. 

Although the respondents overwhelmingly marked the a version 
ungrammatical, 27 identified it with Panjabi speakers. 10 What is interesting 
is that 2 respondents marked both a and b versions grammatical and 3 
marked only a grammatical. All the five respondents who marked 15a 
grammatical came from Delhi and surrounding areas. Thus, the ergative 
marker ne is spreading to the obligative construction exemplified by 15 
in at least one part of the Western region, the region in direct 
contact with Panjabi. 

Note that the acc/dat postposition is nu in Panjabi, ne in Kauravi 
(spoken northeast of Delhi) and nai in Hariyanwi. It is, therefore,
reasonable to assume that the spread of ne to the obligative construction 
in Hindi is due to contact with these languages and dialects. Furthermore, 
according to Newton 1896, the instrumental nai was used with the subject 
of an infinitive to denote necessity, obligation, purpose or wish. 
It is quite likely that this construction still exists in some varieties 
of Panjabi. The apparent phonological similarity of nai and ne may also 
play a role in the spread of ne to the obligative construction in 
Hindi as currently spoken in this region. 

3. Conclusion 

The questionnaire used in this research was broad in scope in 
that it included several topics from Hindi grammar. In order to arrive 
at firm conclusions with regard to the syntactic change in progress 
tentatively suggested above, much more detailed work needs to be done. 
First, the entire range of possessive constructions, including all 
possible combinations of possessor-possessed and postpositions needs 
to be investigated. Secondly, many more locations and types of populations 
need to be surveyed. According to Pandharipande (personal communication), 
the use of the invariant ke to indicate kinship is restricted to the 
older generation in Nagpuri Hindi and Hindi as spoken in Madhya Pradesh; 
and the younger generation consistently uses either kaa or ko. There
are no mixed patterns in the varieties investigated by her. It would
be interesting to determine what the isoglosses would be for the patterns
in 12. The same is true of the extension of ne to the one obligative
construction included in the questionnaire. As far as I am aware,
ne is not used with the obligative parnaa or caahiye in any variety
of Hindi. Even so, all these issues need to be investigated further.

The marked deviation from the standard in the Eastern variety has 
larger implications for the issues of standardization and medium of 
instruction. All the interviewees from Bihar were consistent in rejecting
the standard forms in their own speech. Their reason for doing so was
that it sounded artificial. The question naturally arises whether it is fair
to expect school children to write standard Hindi when they grow up 
hearing and speaking the Eastern variety. Note that the ergative construc- 
tion affects a wide domain in that it determines verbal agreement patterns. 
It would be interesting to investigate the rate of failure in Hindi in 
the schools and colleges of Bihar and Eastern Uttar Pradesh to determine 
the educational implications of variation in Hindi. 


The Hindi area extends from the east of the river Yamuna in the 
west to the border of West Bengal in the east and from the Himalayan 
foothills in Western Uttar Pradesh to the borders of Gujarat, Maharashtra, 
and Orissa in the south. The areas where Awadhi, Bhojpuri, Magahi, 
Maithili, Chattisgarhi and Bagheli are spoken are usually included in 
the Eastern Hindi region. For the purposes of my research, I have 
excluded Madhya Pradesh and adjoining areas of Maharashtra, as 
Pandharipande has been studying Hindi spoken in these regions. 

I am grateful to the American Institute of Indian Studies for a 
short-term grant which enabled me to do the fieldwork, to the Research 
Board of the Graduate College of the University of Illinois for a grant 
which enabled me to analyze the data, and to Professor Anvita Abbi 
of the Jawaharlal Nehru University without whose help I could not have
gotten the questionnaire filled. 

There is considerable information available on the Dakkhini 
(Southern) variety of Hindi-Urdu (see Kachru 1980a for references).
Pandharipande (1980, 1981, 1982a, 1982b) contain valuable information 
on Nagpuri Hindi. Sinha 1979 describes some features of Bihari Hindi. 

I am including the invariant ke here among the genitive postpositions.

This account of the postpositions does not take into account 
certain factors such as status or emotional distance that determine 
the use of ke vs. ke paas if the possessed is human. See Kachru 1969 
and Pandharipande 1981 for a detailed discussion of these. 

In this paper and on the questionnaire 
do not correspond. 

The numbers in this and subsequent tables and discussions refer 
to the respondents to the questionnaire only. The subjects interviewed 
consistently used the dative ke in all the possessive constructions
(i.e., 8b, 9b, and 10b). 

The shared postpositions for accusative, dative and genitive are 
as follows: 

Awadhi: ka/kaa
Magahi: ke
Bhojpuri: ke/ke

In addition to these, there are other postpositions that mark the above 
case functions in these languages. Their distribution is determined 
by complex factors. 

There is a great deal of syncretism in the genitive and dative markers 
in the entire course of the historical development of the NIA languages. 
For details, see Chatterjee 1970, pp. 751-762. 

The use of ham 'we' for mai 'I', and the forms eko and dihis
instead of ek bhii 'even one' and diyaa 'gave' respectively are not 
relevant to this discussion. 

The numbers refer to the respondents to the questionnaire only. 
The interviewees all preferred the b versions, even though they 
characterized the a versions as standard. 

The numbers refer to the respondents to the questionnaire
only. The interviewees all characterized the a version as ungrammatical,
and the b version as used by Panjabis/speakers from Delhi.


CHATTERJEE, Suniti Kumar. 1970. The origin and development of the 
Bengali Language. London: George Allen and Unwin. (First 
published in 1926 by the Calcutta University Press). 

GURU, Kamta Prasad. 1920. Hindi vyakaran. Varanasi: Kashi Nagari
Pracharini Sabha. 

KACHRU, Yamuna. 1969. A note on possessive construction in Hindi- 
Urdu. Journal of Linguistics 6.37-45. 

_____. 1980. Aspects of Hindi grammar. New Delhi: Manohar Publications.

_____. 1980a. The syntax of Dakkhini: a study in language variation
and language change. Invited paper for a volume on South Asia
as a Linguistic Area, Osmania University Publications in Linguistics.
(in press).

MCGREGOR, R. S. 1972. Outline of Hindi grammar. Oxford: Oxford 
University Press. 

NEWTON, E. P. 1896. Punjabi manuals and grammars. Reprinted in 1961. 
Patiala: Director General of Languages, Punjab.

PANDHARIPANDE, Rajeshwari. 1980. Language contact and language variation:
Nagpuri Marathi. Proceedings of Second International Conference
on South Asian Languages and Linguistics.

_____. 1981. Two faces of language change. Hindi and Marathi
in central India. Paper presented at the Language Contact
Symposium held at the University of Wisconsin, Milwaukee.

_____. 1981a. Interface of lexicon and grammar: some problems in
Hindi grammar. SLS 11:2.77-100.

_____. 1982a. Language contact and language change: studies in
Indian multilingualism. (unpublished manuscript). University
of Illinois.

_____. 1982b. Processes of creativity: Marathi-Hindi in central
India. Paper presented at the 4th South Asian Languages Analysis
held at Syracuse.

_____ and Y. Kachru. 1977. Relational grammar, ergativity and Hindi-
Urdu. Lingua 41.217-38.

SHARMA, Aryendra. 1958. A basic grammar of modern Hindi. New Delhi:
Directorate of Hindi, Ministry of Education and Social Welfare, 
Government of India. 

SHUKLA, Ramchandra. 1903. Gyarah varsh ka samay. Reprinted in Hindi
ki pahli kahani, ed. Devesh Thakur. New Delhi: Meenakshi
Prakashan. 1977.

SINHA, Prabhakar. 1979. Some linguistic features of Bihari Hindi.
IL 40:4.301-311.

VAJPEYI, Kishoridas. 1958. Hindi shabdanushasan. Varanasi: Kashi
Nagari Pracharini Sabha.

VERMA, Manindra K. 1976. The notion subject in South Asian languages.
Madison, University of Wisconsin: Department of South Asian

Studies in the Linguistic Sciences 
Volume 12, Number 2, Fall 1982 

Rajeshwari Pandharipande 

This paper focuses on the language contact situation 
in Central India where centuries of stable Marathi-Hindi 
bilingualism has resulted in the convergence of the languages 
in contact. The aim of this paper is (i) to discuss mutual
borrowings in Hindi and Marathi, (ii) to show that the process
of borrowing abides by certain linguistic constraints,
and (iii) to argue that these constraints can be viewed
as devices or forces which prevent a merger of the languages 
in contact and thereby permit them to maintain their independent 
linguistic identity. It is suggested that the sociolinguistic 
context of the situation in central India further supports 
the above hypothesis. 

1. Introduction. It is a well-known fact that bilingualism
provides a language contact situation which results in the convergence
of languages spoken by bilinguals. Gumperz (1964, 1968), Haugen (1953),
Kachru (1978, 1980), Clyne (1972), Pandharipande (1978, 1980, 1981, 1982)
among others, have discussed a variety of questions related to the form 
and function of bilingualism and implications for the languages spoken 
by bilinguals. It is evident from the above studies that questions 
such as which of two languages would borrow more linguistic material, 
and which of the two would maintain its independent identity are deter- 
mined by the function(s) of the languages in society. For example, 
a relatively less prestigious language would borrow more than the other.
Similarly, a language shared by the whole community would have better 
chances of maintaining its identity than one shared by only a small 
fraction of the society. 

If the above hypothesis is correct, then it follows that: (a) 
when both languages spoken by bilinguals enjoy similar prestige in the 
society, (b) when each has a definite function/functions in the society, 
and (c) when both languages are shared by a whole community (i.e., when 
the whole community is bilingual), then we would expect (i) both languages 
to equally borrow from each other, (ii) both languages to undergo change, 
and (iii) both languages to maintain their independent identity. 

The aim of this paper is to discuss mutual borrowings in Hindi 
and Marathi and to point out that these borrowings support the hypothesis 
of the linguistic convergence of languages in contact, to show that the
process of borrowing abides by certain linguistic constraints, and 
to argue that these constraints can be viewed as devices or forces which 
prevent a merger of the languages in contact and thereby permit them 
to maintain their independent linguistic identity. 

Before I discuss the language change phenomenon, a note on Hindi-
Marathi bilingualism in Central India (Nagpur area) is in order. A majority
of the people in the Nagpur area are bilingual. Due to centuries of mutual
contact between Hindi and Marathi in this area, linguistic convergence of
Hindi and Marathi has taken place. Consequently, the varieties spoken
in this area are known as Nagpuri Hindi (NH) and Nagpuri Marathi (NM).
While NH shows the influence of Marathi, NM shows the influence of Hindi. 
Linguistic features borrowed from Marathi into Hindi mark NH separately 
from Hindi spoken elsewhere, i.e., Bihar, Uttar Pradesh, etc. Similarly, 
the linguistic features borrowed from Hindi into Marathi mark NM separately 
from Marathi spoken elsewhere, i.e., Khandesh, Pune, etc. In addition to 
marking "regional" varieties in Central India, NH and NM represent 
cultural as well as emotional identity of the people in the Nagpur area.

Although the linguistic convergence of Hindi and Marathi is seen 
at various linguistic levels such as phonology, morphology, syntax, 
semantics, etc., this paper focuses primarily on the
convergence phenomenon in the area of (morpho)syntax.

2. Language Change: Nagpuri Marathi. In earlier papers (1980,
1981) I have discussed the linguistic changes that Hindi and Marathi
have undergone in the Nagpur area. I will briefly recapitulate these
changes here. Examples 1-21 point out borrowings from Hindi into
Marathi at different linguistic levels, while examples 22-55 illustrate
borrowings from Marathi into Hindi.

As mentioned above, the borrowings from Hindi into NM are not 
shared by other varieties of Marathi. In the following discussion 
I will compare NM with one of the major varieties of Marathi, i.e., 
Puneri Marathi (PM), a variety of Marathi spoken near Pune. The reason
for choosing PM is that (i) Marathi-Hindi bilingualism is not widespread
in the area (around Pune) where PM is spoken and (ii) PM has traditionally
been considered as "standard Marathi". 1 Let us briefly consider the bor-
rowings from Hindi into Marathi.

2.1 Vocabulary. Notice examples 1-4. As can be readily seen,
Nagpuri Marathi (NM) shares vocabulary items with Hindi, while Puneri 
Marathi (PM) does not. 



























[Examples 1-4: table of PM, NM and Hindi vocabulary items not recoverable; surviving glosses: 'empty', 'to fear', 'at all'.]

2.2 Compound/Conjunct Verbs. Examples 5 and 6 show that NM forms
compound verbs in the same manner as Hindi. In PM, however, certain
combinations such as gheun ghene (take-take) 'to take for oneself'
or mhanun dene (say-give) 'to say for someone else' are blocked. The
conclusion is that NM has borrowed this pattern from Hindi.


[Examples 5 and 6, partly recoverable:]

5. gheun ghene (take take); Hindi le ... 'to take for oneself'

6. mhanun dene 'to say out'

Example 7 shows that NM forms certain conjunct verbs such as
ghussa karne (anger-do) 'to get angry' just like Hindi, while PM has
very different formations.

7. a. PM: ragavne
b. NM: ghussa karne (anger do)
Hindi: gussa karna (anger do)
'to get angry'

2.3 Adverb Formation. Now let us consider examples 8 and 9.
Both in Hindi and Marathi an adverb can be derived by adding an instrumental
suffix (i.e., se 'by, with' (Hindi) or ne 'by' (Marathi)) to a noun.
However, example 8a shows that not all adverbs are derived by this process.













As examples 8 and 9 show, NM takes a middle position between PM
and Hindi by using the PM adverbial suffix -ne in combination with the 
Hindi nouns. However, in 8 NM differs from PM by having a suffixal 

2.4 Progressive construction. Now consider examples 10-12. NM
has borrowed the progressive construction in 11 from Hindi (example 12).

10. PM: ti gana mhante ahe
she song sing-prog is
She is singing a song.

11. NM: ti gana mhanun rahili ahe
she song sing prog. is
She is singing a song.

12. Hindi: vah gana ga rahi hai
she song sing prog. is
She is singing a song.

Notice that in NM (example 11) and Hindi (12) the progressive 
aspect is expressed by the perfective form of the verb rahne 'to be'
(Marathi) and rahna 'to be' (Hindi). This construction is not employed 
in PM. Instead, PM expresses the progressive aspect of a verb by 
adding /t/ and a vowel /e/ to the verb stem as is evident from example 10. 

2.5 Negation in future tense. Now let us consider the negative
construction in NM. Consider examples 13-16.

13. PM/NM: amhi udya mumbaila zau
we tomorrow Bombay-to will go
We will go to Bombay tomorrow.

14. Negation
PM: amhi udya mumbaila zanar nahi
we tomorrow Bombay-to will go not
We will not go to Bombay tomorrow.

15. NM: amhi udya mumbaila nahi zau
we tomorrow Bombay-to not will go
We will not go to Bombay tomorrow.

16. Hindi: ham kal bambai nahi jaege
we tomorrow Bombay not will go
We will not go to Bombay tomorrow.
A sentence in the future tense is negated in PM by changing the
form of the verb (i.e., zau (13) -> zanar (14)), which is then followed
by the negative verb nahi 'not' (Damle 1911:696); compare examples
13 and 14. The future negative construction illustrated in example 15
is not acceptable in PM. NM has borrowed this construction from Hindi
(16). Notice that, unlike 14, both NM and Hindi place the negative words
nahi (M) and nahi (H) before the verb, and the main verb does not change
its form.

2.6 The pahije 'want' construction. Examples 17-21 show another
pattern of negation, involving the verb pahije 'to want'.

17. PM/NM: mala caha pahije
I-to tea want
I want tea.

18. PM: mala caha nako
I-to tea do not want
I do not want tea.

19. NM: mala caha nahi pahije
I-to tea not want
I do not want tea.

20. Hindi: mujhe cay cahiye
I-to tea want
I want tea.

21. Neg: mujhe cay nahi cahiye
I-to tea not want
I do not want tea.

PM negates the verb by substituting the negative verb nako 'do not
want' (example 18) for the verb pahije 'to want' (17). In contrast
to this, NM retains the verb and places the negative word nahi 'not'
before the verb, a pattern identical to the one in Hindi (cf. examples
20 and 21).

3.0 Borrowings from Marathi into Hindi. In earlier papers (1980,
1981) I have discussed the linguistic material borrowed from Marathi
into Hindi. These borrowings are observed at various linguistic levels.
The major reasons for treating these features as borrowings from Marathi
into NH are as follows: (a) NH is the only variety of Hindi which has
these features, (b) these features are typically shared by Marathi,
and (c) the widespread bilingualism in the Nagpur area has provided
a language contact situation between Marathi and Hindi.

3.1 Emphatic particle /c/. Let us consider examples 22-24,
which illustrate the borrowing of the emphatic particle /c/ 'only'
from Marathi into Hindi. Hindi (both SH and NH) has an emphatic particle
hi 'only' which emphasizes the immediately preceding word/phrase, etc.
Consider example 22.

22. SH/NH: vah larka hi aya tha
that boy only came aux.
That boy (emphatic) had come.

23. NH: vah larkac aya tha
that boy-emph came aux.
That boy (emphatic) had come.

24. M: to mulgac ala hota
that boy-emph came aux.
That boy (emphatic) had come.

In addition to hi 'only' NH has another emphatic particle, i.e.,
/c/, which is borrowed from Marathi (example 23). Notice that /c/ is
the emphatic particle in Marathi (24).

3.2 Conditional construction. Marathi has a conditional construction
(example 27) where the auxiliary verb asne 'to be, to remain' is used.
SH does not allow the use of the auxiliary rahna 'to be, to remain',
which is the semantic counterpart of the Marathi verb asne 'to be'
(example 26). However, NH allows both types of conditional constructions,
i.e., the one with the auxiliary verb rahna 'to be, to remain' (example 25)
and the one without it (26).

25. NH: agar vah aya rahta to mai us se mila rahta
if he came be aux. then I him to met to be aux.
If he would come I would meet him.

26. SH: agar vah ata to mai us se milta
if he come then I him to meet
If he would come I would meet him.

27. M: to ala asta tar mi tyala bhetlo asto
he came be aux. then I him-to met to be aux.
If he would come, I would meet him.

3.3 Pronoun apan 'we'. The pronoun apan 'we' of NH is functionally
identical to the Marathi pronoun apan 'we'. On the other hand, apan
'we' is absent in SH. SH uses the pronoun ham 'we' instead. As a result
of this borrowing, NH has two pronouns, i.e., apan and ham 'we', which
convey the same meaning. Consider examples 28-32.

28. NH: didi, apan kal bazar jaege
sister we tomorrow market will go
Sister, we will go to the market tomorrow.

29. NH: apan ko aisi bate zara bhi pasand nahi
we to such things a little also like not
We do not like such things at all.

30. M: tai, apan udya bazarat zau
sister we tomorrow market-in will go
Sister, tomorrow we will go to the market.

31. M: aplyala asa gosti ajibat avadat nahit
us-to such things at all like not
We do not like such things at all.

32. SH: didi, kal ham bazar jaege
sister tomorrow we market will go
Sister, we will go to the market tomorrow.

Notice that NH uses apan 'we' (28 and 29) where Marathi uses apan
'we' (30 and 31), while SH uses ham 'we' (32). The borrowing of apan
'we' into NH has affected the syntactic function of the pronoun ham
'we' in NH.


The borrowed pronoun apan 'we' does not replace ham 'we' in all
the contexts. Consider the following examples, which show that when
the possessive postpositions ka, ke, and ki follow the pronoun ham
'we', the use of ham 'we' is obligatory in both SH and NH.

33. NH/SH: hamare bacce kal ghar aege
*apanke
our children tomorrow home will come
Our children will come home tomorrow.

34. NH/SH: hamare liye kuch am bhijva dijiye
*apne liye
*apan ke liye
us for some mangoes have sent
Please have some mangoes sent for us.

Thus in NH ham 'we' is used in contexts such as 33 and 34 
and apan 'we' is used elsewhere, while SH uses ham 'we' in all contexts 
such as in examples 28-34. 

3.4 Coercive causatives. Another construction borrowed from Marathi
into NH is the coercive causative construction exemplified in 35. SH
has only one morphological causative construction, as in 37, and lacks
the coercive causative construction illustrated in 35.

35. NH: mai ne us ko kam karneko lagaya
I ag. him to work do-dat. made
I made him (forced him to) do the work.

36. M: mi tyala kam karayla lavla
I him-to work do-dat. made
I made him (forced him to) do the work.

37. SH/NH: mai ne us se kam karwaya
I ag. him by work do-caus-past
I made him do the work.

NH has both the coercive causative construction (35) and
the regular non-coercive construction (37). Marathi has two types of
causatives, i.e., morphological and periphrastic. 36 is an example of
the periphrastic (coercive) causative in Marathi. Now consider the
morphological causative construction.

38. NM/PM: mi tyacya kadun kam karavto
I him by work do-cause-to
I make him do the work.

Notice that 38 is similar to 37 in that both express a non-coercive
causative construction. Also notice that the coercive (periphrastic)
causative construction in NH (35) is a calque of the equivalent construction
in Marathi (36). The dative suffix of Marathi (la) is replaced by
the Hindi suffix (ko), and the auxiliary verb lavne 'to attach' is
replaced by its semantic equivalent in Hindi, i.e., lagana 'to attach'.

3.5 Quotative construction. SH does not have a particular morpheme
or word which can be labeled as a quotative marker; instead, it uses
a formal complementizer ki 'that'.

39. SH: vah kah raha tha ki usko kavitae parhna pasand hai
he say prog. was that to him poems to read like aux.
He was saying that he liked to read poems.

Kachru (1979:76) points out that in colloquial Hindi preposing of
the subordinate clause to indicate a quote is common. Kachru (1979)
further claims that the Dakhini variety of Hindi-Urdu uses a quotative
marker bolke (literally 'having said'). Kachru (1979:74-75) points out
that in SH the linking of two sentences by the complementizer ki yields
reason or purpose interpretations. Consider the following sentences.

40. SH: vah bahar nahi nikla ki baris ho rahi thi
he out not came that rain happening was
As it was raining, he did not come out.

41. SH: mai ruka tha ki apse mulakat karta calu
I stopped remained that you meeting doing leave
I remained here in order to see you before leaving.

The situation in NH is quite different from SH. NH does not
use the quotative marker bolke as in Dakhini. Also, it does not use
the complementizer ki to mark a quote. Instead NH uses the quotative
marker karke (literally 'having done') to carry out the functions of
ki in SH (examples 39-41). Consider the following examples.

42. NH: mai auga karke bol raha tha
I will come quot. say prog. was
He was saying that he would come.
(Literally, "I will come", thus he said.)

43. NH: baris ho rahi thi karke vah bahar nahi nikla
rain happening was quot. he out not came
It was raining, therefore he did not come out.

44. NH: apse mulakat karta calu karke mai ruka tha
you meeting doing leave quot. I stopped remained
I remained here in order to see you before leaving.

The quotative marker karke (literally 'having done') in NH is
borrowed from Marathi, which uses the semantically similar quotative
karun (literally 'having done').

45. M: yein yein karun to ala nahi
I-will-come I-will-come quot. he come not
After saying "I will come, I will come", he did not come.

The use of the verb karne 'to do' as a quotative marker is restricted
to a context such as in 45, where the quote involves repetition
of the verb. Also, karne 'to do' is used as a quotative in an idiomatic
expression such as in 46.

46. M: hoy na karta
yes no doing
After a little bit of uncertainty (literally, saying yes
and no).

However, the quotative most commonly used in Marathi is mhanun
'having said'. It can replace the verb karne 'to do' in 45 and 46
and is used as a regular quotative in sentences such as 47 and 47a.

47. M: aica patra ala ahe mhanun to sangat hota
*karun
mother's letter come aux. quot. he say -ing
He was saying that mother's letter had arrived.

47a. lavkar parat ye mhanun mi tyala mhatle
soon come back quot. I to him said
I said to him "come back soon".

The use of karne 'to do' is blocked in 47 and 48. Mhanun is also
used to indicate a clause of reason or purpose (Damle 1911:935),
which is similar to karke used in NH. Consider examples 48 and 49,
the Marathi counterparts of 43 and 44 in NH, respectively.

48. M: paus padat hota mhanun mi baher nighalo nahi
rain fall -ing quot. I out came out not
I did not come out because it was raining.

49. M: tula bhetun zava mhanun mi thamblo
you having met go quot. I waited
I waited in order to meet you.

Notice that the use of the verb karne is blocked in 48. It is
interesting to note that the quotative karke in NH is functionally
similar to the Marathi quotative mhanun and the complementizer ki
in SH. The use of ki is totally blocked in NH. Karke (NH) is semantically
similar to the Marathi quotative karun.


3.6 Obligational construction. Examples 50-52 illustrate the
obligational construction borrowed from Marathi into Nagpuri Hindi.
Notice that both NH and Marathi use the infinitive + possessive
marker, as in NH janeka (50) and M zayca (51), respectively.
However, SH uses only the infinitive form of the verb, as in 52. While
SH does not allow constructions like 50, NH allows both 50 and 52.

50. NH: mujhe bambai janeka hai
me-to Bombay go-poss. aux.
I have to go to Bombay.

51. M: mala mumbaila zayca ahe
I-to Bombay to go-poss. aux.
I have to go to Bombay.

52. SH/NH: mujhe bambai jana hai
me-to Bombay go aux.
I have to go to Bombay.

3.7 Abilitative construction. The abilitative construction borrowed
from Marathi into Nagpuri Hindi is illustrated in examples 53-55.

53. NH: mujhe angrezi parhneko ata hai
me-to English read-dat. comes aux.
I can read English.
(Literally, Reading English comes to me).

54. M: mala ingraji vacayla yeta
I-to English read-dat. comes
I can read English.

55. SH: mujhe angrezi parhna ata hai
me-to English read comes aux.
I can read English.

While SH uses the infinitive form of the verb, as in parhna
'to read' of 55, NH and Marathi use the infinitive followed by the
dative case marker, ko in 53 (NH) and la in 54 (M).

4. Maintenance of Linguistic identity. The discussion so far
has shown that both Marathi and Hindi spoken in the Nagpur area have 
borrowed from each other and as a result, both languages have undergone 
change. The mutual borrowings in Marathi and Hindi support the hypo- 
thesis of linguistic convergence of languages in contact. Two major 
questions need to be discussed in this context. First, whether there 
are any constraints on borrowing or whether the mutual borrowings are 
random and irregular. Second, if there are constraints on borrowing, 
what is the function (linguistic/extralinguistic) of these constraints. 
In what follows, I will discuss the phenomena which can be viewed as 
constraints on language change in the above context. 


A close examination of the preceding sections 2.1-3.7 shows that 
the borrowing abides by the following three constraints, i.e., (i) 
languages in contact do not exchange identical linguistic units, (ii) 
borrowing is blocked if it creates ambiguity, and (iii) borrowed material 
does not get fully nativized. In what follows I will examine these 
constraints and argue that together they represent constraints on 
change and a counteracting force which constrains convergence of 
languages in contact. 

4.1 Lack of exchange of identical linguistic units. The borrowings
discussed in sections 2.1-3.7 show that the languages in contact
do not exchange identical linguistic units. For example, NM has borrowed
a progressive construction from Hindi. However, NH has not borrowed
one from Marathi in exchange. Mutual exchange of identical linguistic
units is totally blocked.

4.2 Ambiguity constraint. Recall section 3.1, examples 22-24.
It is interesting to note that NH borrows the emphatic particle /c/
'only' from Marathi. The question is whether there is any reason why
NM fails to borrow the emphatic particle hi 'only' from Hindi. Notice
that the borrowing of hi 'only' would create ambiguity in Marathi
(examples 56 and 57): the native Marathi particle hi
means 'also' while the Hindi particle means 'only'. Example 58 points
out the semantic ambiguity that would result from the borrowing.
Similarly, Hindi does not borrow hi 'also' from Marathi for the same
reason.

56. M: /c/ 'only', hi 'also'

57. H: /hi/ 'only', bhi 'also'

58. M: Jan hi ala hota
John only come aux.
John only had come.

These examples show that borrowing is blocked if it creates
ambiguity.
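
The ambiguity constraint can be made concrete with a small sketch. The Python fragment below is only an illustration and not part of the analysis itself: the names PARTICLES and creates_ambiguity are hypothetical, and the inventories are the simplified (diacritic-free) forms of examples 56 and 57. A borrowing is flagged when the recipient language already has a native particle of the same shape with a different meaning.

```python
# Particle inventories from examples 56 and 57 (romanization simplified;
# the dictionary layout is an assumption made for this illustration).
PARTICLES = {
    "Marathi": {"c": "only", "hi": "also"},
    "Hindi": {"hi": "only", "bhi": "also"},
}


def creates_ambiguity(donor, recipient, form):
    """Return True if borrowing `form` from `donor` into `recipient`
    would clash with a native form of the same shape but another meaning."""
    donor_meaning = PARTICLES[donor][form]
    native_meaning = PARTICLES[recipient].get(form)
    return native_meaning is not None and native_meaning != donor_meaning


# Hindi hi 'only' into Marathi: clashes with native Marathi hi 'also'.
print(creates_ambiguity("Hindi", "Marathi", "hi"))
# Marathi /c/ 'only' into Hindi: no native Hindi particle of that shape.
print(creates_ambiguity("Marathi", "Hindi", "c"))
```

Under these assumptions the first call reports a clash and the second does not, which matches the observed pattern: NH borrows /c/, while NM does not borrow hi.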

4.3 Lack of nativization. Another interesting constraint noticed
in the above context is that borrowed material does not get completely
nativized in the languages, as the following evidence shows.
Not all the native grammatical processes apply to borrowed material.
Thus, the adjective lazalu 'bashful' can be derived from the native
word laz (60), but the suffix -alu cannot be added to the borrowed
noun sarm (example 59). Similarly, in NH the second causative cannot
be derived from the causative borrowed from Marathi into Hindi (63),
while it is derived from the native causative (64).

59. M: sarm 'bashfulness' -> *sarmalu 'bashful'
(borrowing from Hindi)

60. laz -> lazalu 'bashful'

61. Causative:
NH: deneko lagana 'to cause to give'
(borrowed pattern from Marathi)

62. H: dilana 'to cause to give'

63. Second causative:
NH: *deneko lagwana 'to cause X to cause Y to give'

64. NH, SH: dilwana 'to cause X to cause Y to give'

In section 3.3 it is pointed out that the pronoun apan 'we'
(borrowed from Marathi) in NH does not replace the native Hindi pronoun
ham 'we' in every context.

The above constraints on the process of borrowing and on borrowed 
material share one function in common, i.e., they prevent the merger 
of Hindi and Marathi into one language. Notice that the exchange of
the same linguistic item and the complete nativization of a borrowed item
would contribute to lessening the linguistic differences and to the loss
of the linguistic identity of the languages in contact. Therefore,
it is plausible to assume that in a bilingual sociolinguistic context 
such as the one in the Nagpur area, language change abides by two 
opposing forces, linguistic convergence and maintenance of the linguistic 
identity. This argument is further supported by the evidence from 
phonological data from NM and NH. 

The following data show that as a result of contact with Marathi,
NH has lost the retroflex flaps /r/ and /rh/, which exist in SH but not in
Marathi. Thus the retroflex flaps /r/ and /rh/ merge with the retroflex
stops /d/ and /dh/ in NH. Let us first consider the loss of the retroflex
flaps /r/ and /rh/ in NH.

SH: /d/, /dh/, /r/, and /rh/
NH: /d/, /dh/, ∅, ∅

     SH                                  NH
67.  dal-na 'to put'              ->  67a. dal-na 'to put'
68.  per 'tree'                   ->  68a. ped 'tree'
69.  dhal 'shield'                ->  69a. dhal 'shield'
70.  parh-na 'to read, to study'  ->  70a. padh-na 'to read, to study'

Notice that SH consistently maintains the contrast between the
retroflex stops /d/ and /dh/ and the retroflex flaps /r/ and /rh/
respectively (examples 67-70). In contrast, NH substitutes
the retroflex stops /d/ and /dh/ for the retroflex flaps /r/ and
/rh/ respectively (examples 68a-70a). Notice that this loss does not
affect the structure of Hindi, since the loss of the flaps does
not create ambiguity: the retroflex flaps do not carry any
semantic load in Hindi.
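The substitution just described can be stated as a simple mapping. The following Python sketch is an illustration only. It assumes a hypothetical ASCII transliteration, not used in this paper, in which D and Dh stand for the retroflex stops and R and Rh for the retroflex flaps, and it applies the NH substitution to the SH forms in 67-70.

```python
# Hypothetical transliteration (an assumption for this sketch):
#   D, Dh = retroflex stops; R, Rh = retroflex flaps.
FLAP_TO_STOP = {"Rh": "Dh", "R": "D"}


def sh_to_nh(word: str) -> str:
    """Replace SH retroflex flaps with the corresponding NH retroflex stops."""
    # Handle the aspirated flap first so that "Rh" is not split into "D" + "h".
    for flap in ("Rh", "R"):
        word = word.replace(flap, FLAP_TO_STOP[flap])
    return word


# SH forms of examples 67-70 in the assumed transliteration.
SH_FORMS = ["Dal-na", "peR", "Dhal", "paRh-na"]
NH_FORMS = [sh_to_nh(w) for w in SH_FORMS]
```

Forms that already contain a retroflex stop (67, 69) pass through unchanged, which mirrors the observation that only the flaps are affected.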


However, Marathi does not lose the lateral flap /ḷ/ as a result
of its contact with Hindi, which typically lacks it. Notice that the
semantic difference between 71-73 and their respective counterparts
71a-73a is due to the fact that while 71-73 use the lateral stop /l/,
their counterparts 71a-73a use the lateral flap /ḷ/.

71. mala 'to me'            71a. maḷa 'garden'

72. kala 'art'              72a. kaḷa 'pains'

73. ukal 'disentangle'      73a. ukaḷ 'boil'


Notice that the loss of the lateral flap would create semantic 
ambiguity in examples, such as 71-73. 

The question may arise then as to why the above languages borrow 
at all since mutual borrowings make the languages more like each other, 
especially when languages such as the above borrow even at the cost 
of complicating their grammatical systems. Notice that Marathi has 
a native progressive construction. Why does it borrow another progressive
construction from Hindi? Similarly, Hindi has a conditional construction. 
Why does it borrow another one from Marathi? 

5. Borrowings: Possible explanations. In the following discussion
I will point out that some of the borrowings are justifiable on the
grounds of 'gap filling', 'grammar simplification', and the principle of
'maximum difference'. However, not all of the borrowings can be explained
by the above principles. Therefore, I will argue that in order to
provide a unified explanation for the borrowings and the constraints
on the borrowings, it is essential to take into account the sociocultural
setting of the language contact situation in Central India.

Let us consider the borrowings in section 2.2 which show the
formation of new compound/conjunct verbs in NM as a result of the
influence of Hindi. The use of the explicators (auxiliary verbs)
ghene 'to take' and dene 'to give' is already common in Marathi, but the
use of these explicators in contexts such as 5 and 6 is blocked in
PM and other varieties of Marathi. Examples 5 and 6 show that the
influence of Hindi has filled in the gaps in the Marathi compound verb
system by providing compound verbs such as 5 and 6 on the model of
karun dene (do give) 'to do for someone else' and karun ghene (do take)
'to do for oneself'.

Conjunct verbs are also very commonly used in Marathi. For example,

74. usir hone          usir karne 'to delay'
    delay to become    delay to do

75. kalji asne         kalji karne
    worry to be        worry to do

However, the noun rag 'anger' does not have such a pair of conjunct
verbs, i.e., while PM has rag yene (anger come) 'to become angry', its
expected counterpart rag karne (anger to do) 'to make angry' is absent.
The borrowing of the word ghussa 'anger' (Hindi) regularizes the pattern
by producing ghussa yene (anger come) 'to become angry' and ghussa karne
(anger to do) 'to get angry' (to choose to get angry).

Now let us consider the progressive construction borrowed into NM
from Hindi (section 2.4). A possible explanation is as
follows. In PM, the difference between the simple imperfect and the
progressive construction is marginal. For example, compare the following
two sentences.

76. present imperfect:

    ti  gana mhante
    she song sings
    She sings songs.

77. present progressive:

    ti  gana mhante ahe
    she song sing   is (prog.)
    She is singing a song.

Notice that the auxiliary ahe 'is' is the only factor which dis- 
tinguishes 76 from 77. In the spoken language the auxiliary ahe 
is replaced by a semi-vowel /y/, i.e., mhantey = 'is singing', which 
brings the two forms, i.e., present imperfect and progressive even 
closer to each other. 

Now compare example 77 with the progressive construction in NM
(example 11) borrowed from Hindi. Notice that the additional auxiliary
rahne 'to remain (progressive)' clearly marks the progressive construction
off from the present imperfect construction. Thus by the principle
of maximum difference, this construction is readily allowed in the
grammatical system of NM. The negative constructions discussed in
sections 2.5 and 2.6 can be viewed as the result of simplification
of the negation patterns in PM. Recall examples 13, 14, 17, and 18
in PM. While a negative sentence in the future tense requires a
change in the verb form plus the insertion of the negation word nahi 'no',
the negative of the pahije-construction replaces the verb by the negative
verb nako 'do not want' (cf. 18). In contrast, the pattern of
negation in Hindi is simple (16, 20, 21) in that it requires only the
insertion of the negation word nahi 'no' before the verb, without
any change in the form of the verb. By borrowing the negation pattern
from Hindi, NM has in fact simplified its system of negation.


Let us now consider the borrowings from Marathi into NH. Some of the
borrowings are explainable on the basis of their 'gap-filling' function.
For example, recall section 3.5. Hindi lacks quotative markers. Therefore,
the borrowing of kar ke 'thus' is readily acceptable to the
grammar of Hindi. Similarly, the borrowing of the coercive causative
(section 3.4) from Marathi into NH is justifiable on the
same basis, i.e., it fills the gap of
coercive causatives in NH.

Now recall section 3.3. The use of apan 'we' in NH instead of
ham 'we' (SH), on the model of the pronoun apan 'we' in Marathi, is
justifiable on the following basis: the pronoun ham 'we' in
SH is ambiguous with reference to the hearer, i.e., ham 'we' means
either 'I + others (excluding the hearer)' or 'I + you (hearer)'.
In contrast, the Marathi pronoun apan 'we' unambiguously
includes the hearer. Thus its borrowing into NH is justifiable, since
it disambiguates the meaning of the pronoun.

In contrast to the above, the borrowing of the conditional (section
3.2), obligational (section 3.6), and abilitative (section 3.7) constructions
is not justifiable on the grounds of 'gap-filling', 'grammar simplification',
etc. There are constructions equivalent to the above in
the borrowing language.

5.1 Sociocultural context: A possible explanation. The preceding
discussion clearly points out that there are two counteracting 'pulls'
operating on the language change phenomenon, i.e., the 'pull' or 'force'
which motivates 'convergence' of languages in contact and the other
'force' which operates as a 'buffer' to prevent total merger of the languages.

In addition to being inadequate (section 5), a hypothesis which
takes into account only the grammatical structures of the languages
in contact totally ignores the sociolinguistic function of
mutual borrowings in the languages under focus. Sankoff (1980:48),
while discussing variation, argues that "... the distribution of linguistic
features cannot be understood solely in terms of their internal relationships
within grammar, but must be seen as part of the broader sociocultural
context in which they occur."

The sociolinguistic attitudes of the people and the function of NM and
NH in society provide clues for a better understanding of the situation
in the Nagpur area. Let us first consider the function of NM and NH.
These two are recognized as the regional varieties. Thus they serve
as 'codes' for the cultural and emotional identity of the speakers in
the Nagpur area (i.e., Nagpuri lok (M) 'people from Nagpur'). Thus
NM and NH have a Janus-like character. As linguistic systems, they
represent a convergence of Marathi and Hindi in the Nagpur area;

as codes they represent the regional identity of the people in that 
area. Fishman (1972:16) points out "dialects may easily come to 
represent (to stand for, to connote, to symbolize) other factors than 
geographic ones." 

Bilingualism in the Nagpur area represents the mixture of the
Hindi-speaking and Marathi-speaking speech communities. The mutual
linguistic influence of Hindi and Marathi has a definite social function
in society. The terms Nagpuri Marathi and Nagpuri Hindi certainly
indicate those varieties of Marathi and Hindi which (among other
features) show a marked influence of Hindi and Marathi, respectively.
However, NM and NH do not represent language varieties only; rather,
they stand for the people, culture, and society in the Nagpur area.
Thus those regional varieties mark the speaker's cultural and social identity.

In Halliday's (1978:51) words those varieties have a certain 
"meaning potential". The "meaning potential" here is the cultural/ 
social identity of the people in the Nagpur area. If we take into account 
the social function of these varieties, it becomes clear that these 
are 'codes' (Hasan 1973:258) which according to Halliday (1978:68) 
are "types of social semiotic, symbolic orders of meaning generated 
by social systems". Thus these 'codes' (i.e., NM and NH) transmit 
underlying patterns of the mixed culture which exist in Central India. 
The above discussion makes it plausible to assume that it is the mixed 
culture, bilingualism, and the social system in the area which is 
responsible for the language change, in terms of the convergence of 
Marathi and Hindi spoken in the Nagpur area. 

Now let us consider the possible motivation for the other side
of the picture. Maintenance of separate linguistic identities is, I think,
the motivation for preventing a total merger of the languages.
In earlier work (1981) I have pointed out that bilinguals in the Nagpur 
area have four linguistic 'codes' available to them. The following 
diagram summarizes the distribution of codes according to their social 

Speakers of Marathi (L1) and Hindi (L2):

  informal contexts    formal contexts          market,         1. interstate
  (home, peer          (school, official        business           communication
  group, etc.)         correspondence,          transaction     2. used mostly as a
                       news media, etc.)                           written language
                                                                   (in school)

Speakers of Hindi (L1) and Marathi (L2):

  (home, peer          formal ...               with            used only in
  group, etc.)         news media, ...          friends, ...    written form/
                                                                rarely spoken

Notice that while NM and NH are used in the 'informal' context,
PM and SH are used in the 'formal' context. While the emotional identity
of Marathi (L1) and Hindi (L1) speakers is expressed by NM and NH
respectively, their competence in the respective standard varieties
(i.e., PM and SH) is important to them: it is essential for wider
communication with Marathi and Hindi speakers in other parts of India.
Moreover, the insistence of the bilinguals in the Nagpur area on using the
respective standard varieties in formal contexts indicates that the
Marathi- and Hindi-speaking communities (i.e., those which use Marathi and
Hindi as their L1, respectively) are aware of their independent linguistic
identities and in fact intend to maintain them. This
attitude of bilinguals is at least partially responsible for the resistance
to a total merger of the two languages in contact.

The hypothesis in this paper is especially relevant for a better 
understanding of the form and function of widespread stable bilingualism/ 
multilingualism in India. Recent studies (Masica 1976, Kachru 1980, 
Hook 1982) provide various points of comparison for the languages 
spoken in India. These studies are aimed at investigating the similarities 
across languages which support Emeneau's (1956) hypothesis about India 
as a linguistic area. The hypothesis proposed here throws light on 
the other side of the picture, i.e., it points out that there are 
certain 'buffers' operating as linguistic mechanisms which control 
a merger of languages in contact. Further investigation is necessary 
to determine whether the constraints discussed here are applicable to 
the convergence of other languages in India. 

Another question which needs to be discussed is whether the 
constraints discussed here are applicable to the process of borrowing 
in general or whether they are restricted only to the mutual borrowing 
of languages spoken by bilinguals. A large body of data needs to be 
investigated before any conclusive statement is made. 

There are two implicit assumptions in the hypothesis proposed here:
(a) when the maintenance of an independent linguistic identity is
necessary, important, or possible for speakers, we expect that the
constraints proposed here will apply, and (b) even if one of the
two languages in contact is more prestigious than the other, the above
constraints will operate if the linguistic identity of the languages
needs to be maintained.

A great deal of research is necessary in order to either strengthen
or falsify these assumptions. For example, the validity of assumption
(a) can be examined in the context of immigrant languages in the
U.S.A., such as German, Hindi, Norwegian, etc. Speakers of these languages
are generally bilinguals, i.e., they speak English in addition to
their native language. It is difficult, though not impossible, for
the speakers to maintain their linguistic identity in the U.S.A.,
where English is used in almost all walks of life, except perhaps in
the home and in a few other social contexts. In this situation,
we expect that the above constraints would not operate on borrowings
from English into the native languages spoken by bilingual immigrants.

The validity of assumption (b) can be examined in the context 
of the borrowing from English into the native modern Indian languages. 
A majority of speakers of English in India are bilinguals, i.e., they 
speak English as well as at least one modern Indian language. Kachru 
(1982) discusses and defines the relative domains of the sociolinguistic 
function of English and of modern Indian languages. From studies on 
the bilingual/multilingual setting in India (Kachru 1981, 1982, 
Pandharipande 1982, and Sridhar 1982), it is clear that it is necessary
to maintain an independent identity of both English and the native 
Indian languages of the bilinguals. A closer examination of the 
borrowings from English into Indian languages would provide insights 
into the applicability of the constraints discussed in this paper. 

2-3.6 as opposed to SH and PM, respectively, is to show the contrast 
between Hindi and Marathi in general. No particular variety of 
Hindi (NH, SH, etc.) or of Marathi (NM, PM, etc.) is represented by 
the terms Hindi and Marathi . 

It is not clear at this point why NH has borrowed the more
restricted quotative karke instead of mhanun, which is more commonly
used in Marathi. It is interesting to note that a form of the verb
karne 'to do' in Marathi is used to convey purpose or cause. Consider
the following examples:

(a) purpose: to hindi sikaya karta   bharatala gela
             he Hindi learn-in order India-to  went
             He went to India in order to learn Hindi.

(b) cause:   paise nahit yakarta   ti  sikat   nahi
             money not   therefore she studies neg.
             She does not have money. Therefore, she does not study,
             i.e., does not go to school.

CLYNE, M. 1972. Perspectives on language contact. Based on a study
    of German in Australia. Melbourne: Hawthorn Press.
DAMLE, M. K. 1911. (Reprinted in 1965). Sastriya Marathi vyakaran.
EMENEAU, M. B. 1956. India as a linguistic area. Language 32:1.8-16.
FISHMAN, J. A. 1972. Language and nationalism. Rowley, MA: Newbury
GUMPERZ, J. J. 1964. Linguistic and social interaction in two communities.
    American Anthropologist 66:2.137-54.
_____. 1968. Types of linguistic communities. In Joshua Fishman,
    ed.: Readings in the sociology of language. The Hague: Mouton.
HALLIDAY, M. A. K. 1978. Language as a social semiotic: the social
    interpretation of language and meaning. Baltimore: University
    Park Press.
HASAN, R. 1973. Code, register and social dialect. In Basil Bernstein,
    ed.: Class, codes and control 2: applied studies towards a sociology
    of language (Primary socialization, language and education).
    London: Routledge and Kegan Paul.
HAUGEN, E. 1953. The Norwegian language in America: a study of bilingual
    behavior. Philadelphia: University of Pennsylvania Press.
HOOK, P. E. 1982. South Asia as a semantic area: convergent meanings
    and divergent forms. Paper presented at the Fourth South Asian
    Languages Analysis held in Syracuse, NY, May, 1982.
JOSHI, S. K. 1965. Sabdacya zati. In M. K. Damle, Sastriya Marathi
    vyakaran. Pune: Venus Prakasan.
KACHRU, B. 1978. Toward structuring code-mixing: an Indian perspective.
    International Journal of the Sociology of Language
_____. 1980. The bilingual's linguistic repertoire. To appear
    in B. Hartford and A. Valdman, eds.: Issues in international
    bilingual education: the role of the vernacular. New York:
_____. 1981. The Indianization of English: the English language
    in India. New Delhi: Oxford University Press.
_____. 1982. Models of non-native Englishes. In B. Kachru, ed.:
    The other tongue, pp. 31-57. Urbana: University of Illinois Press.
KACHRU, Y. 1979. The quotative in South Asian languages. South
    Asian Languages Analysis 1.63-78.
MASICA, C. 1976. Defining a linguistic area: India. Chicago:
    University of Chicago Press.
PANDHARIPANDE, R. 1978. Nativization of loanwords in Marathi, a
    matter of linguistic attitude? Paper presented at the LSA
    summer meeting at the University of Illinois, Urbana.
_____. 1980. Language contact and language variation: Nagpuri
    Marathi. Proceedings of Second International Conference on
    South Asian Languages and Linguistics.
_____. 1981. Two faces of language change: Hindi and Marathi
    in Central India. Paper presented at the Language Contact
    Symposium held at the University of Wisconsin, Milwaukee.
_____. 1982. Language contact and language change: studies in
    Indian multilingualism. (Unpublished manuscript, University of Illinois.)
SANKOFF, G. 1980. The social life of language. Philadelphia:
    University of Pennsylvania Press.
SRIDHAR, K. K. 1982. English in a South Indian urban context.
    In B. Kachru, ed.: The other tongue, pp. 141-53. Urbana:
    University of Illinois Press.

Studies in the Linguistic Sciences 
Volume 12, Number 2, Fall 1982 

Elizabeth Pearce 

Morin and St-Amour (1977) claim that all Old French 
infinitival complements are base-generated VPs and that a 
similar analysis applies to Modern French. This situation 
is said to have arisen through the loss of underlying S 
infinitival complements in Late Latin. The evidence for 
these claims largely comes from case marking and the place- 
ment of clitics. Changes in Modern French are attributed 
not to change in base structure, but to change in a 'perme- 
ability' feature governing the placement of clitic pronouns. 

By reexamining the evidence of Old French and comparing 
it to the structures of Modern French and other Romance 
languages, this paper comes to rather different conclusions: 
(a) Evidence supporting surface VP infinitival complements 
in Old French is convincing, but later developments in 
French do not support such an analysis. (b) Where Morin and 
St-Amour claim a differentiation of infinitival complement 
structures only for Modern French, the evidence shows that it 
goes back to Old French. (c) These and other findings suggest 
the need for a reexamination of the historical developments 
from Latin, via Old French, to Modern French. 

0. Introduction 

Historical developments in Romance languages show differing resolu- 
tions of the forms taken by infinitival complements. This paper will 
draw on evidence from Old French to consider how infinitival complements 
in the earliest attested stage of French can be analyzed and how the 
proposed synchronic analysis then bears on the analysis of the diachronic 

Discussions of developments in infinitival complements in Romance 
languages focus on the forms of two classes of infinitival complements, 
represented abstractly in (1) and (2): 

(1) NP_i V [NP_i Inf . . .]

(2) NP_i V [NP_j Inf . . .]

The NPs shown in (1) and (2) indicate the role of subject either of a 
governing verb (= 'V') or of an infinitive complement. The subscripts 
indicate referential properties of the NPs. In (1) the subject of the 
governing verb is non-distinct from the subject of the infinitival 
complement. In (2) the subject of the governing verb and the subject 
of the infinitival are distinct. Evidence of diachronic change in the 
surface forms of types (1) and (2) appears in French as well as in 
other Romance languages. 


For type (1), whereas Latin permitted a surface reflex of the subject 
of the infinitive in the form of a reflexive pronoun (see Section 2 below 
for further discussion), no such surface reflex appears in any attested 
stage of French. However, one observable diachronic change for type (1) 
in French is that, in Old French, pronoun complements of the infinitives 
were attached to the governing verb, whereas in Modern French such pronoun 
complements are associated with their infinitive. This change leads to 
differing analyses for the two stages in question.

Analyses of stages in the patterns of evolution of type (2) focus,
on the one hand, on the emergence of the 'causative' construction,
containing a dative-marked subject of the infinitive, as in Mod. Fr. faire
faire quelque chose à quelqu'un 'to make (to) someone do something'; and,
on the other hand, on the fate of the 'accusative + infinitive'
construction, as in Lat. facere eum domum aedificare 'to make him build a
house', in which the subject of the infinitive, eum, is marked as an
accusative. The causative construction is well attested in the earliest
stages of the Modern Romance languages and the accusative + infinitive 
construction is well attested in Latin. Speculation therefore centres 
around the emergence of the causative construction in the Romance 
languages as they develop from Latin and the relative status of the 
accusative + infinitive construction in subsequent developments. The 
outcome in Modern French, for example, is that the causative construction 
is the only possible construction with the verb faire , whereas laisser 
'to permit/let/allow' and the perception verbs show competing causative 
and accusative + infinitive constructions. 

This paper will consider the developments outlined above primarily 
from the point of view of whether the brackets shown in (1) and (2) are 
to be analyzed as underlying Ss or as underlying VPs and of what either 
one of these analyses means in terms of the historical developments. 
The data to be examined, for the most part, will be that of constructions 
of type (2) in early Old French. I will discuss, in particular, the 
analysis of Morin and St-Amour (1977) in Section 1, and I will consider 
their claims as they apply to a systematic collection of data from the 
earliest stages of Old French in Section 3. In the remaining sections 
of the paper, I will discuss analyses of modern Romance data which are 
relevant to the Old French material in question (Section 2) and con- 
clusions which can be drawn on the basis of the possible paths of 
evolution of French given the available analyses of the historical changes 
considered in earlier sections (Section 4) . 

1. The analysis of Morin and St-Amour (1977) 

1.0 Morin and St-Amour (1977) claim that all infinitivals in Old
French are base-generated VPs and that a similar analysis applies to
infinitivals in Modern French. They claim that developments in Late
Latin point to the loss of underlying S infinitival complements, which,
in their view, come to be replaced by VPs. The evidence of clitic placement
in verb + infinitive constructions in Old French and the evidence
of the use of causative type constructions in Old French form the
basis of their argument that Old French provides a clear case of
base-generated VP infinitival complements. The later introduction of the
accusative + infinitive construction and the changes in clitic pronoun 


placement do not, however, lead them to the conclusion that there is a 
change from VP to S complementation. Rather, the evidence, even in 
Modern French, for variability in pronoun placement and variability in 
the placement possibilities for floating quantifiers in verb + infinitive 
constructions leads to the suggestion that the changes in pronoun place- 
ment simply reflect a change in a 'permeability' feature associated with 
governing verbs. Similarly, the development of the accusative + infini- 
tive construction does not lead them to account for this change as a shift 
from VP to S complements, but to another type of structural differentiation 
based on the introduction of an NP complement as the object of the governing verb.

In this section, the claims of Morin and St-Amour (1977) will be
outlined as follows: Section 1.1 will present the basis of their claims 
about the evolution from Late Latin to Old French; Section 1.2 will dis- 
cuss the claims about infinitival complements in Old French; and Section 
1.3 will consider the evidence put forward for the continuation of VP 
complementation to Modern French. 

1.1 Morin and St-Amour attribute the loss of the Latin constructions 
represented by (3) and (4) below to the replacement of underlying S 
infinitival complements by VP infinitivals in Late Latin. 

(3) (is) me venire vult 
'he wants me to come' 

(4) (is) se venire vult 

'he wants (himself) to come' 

In (3) and (4) the subject of the infinitive appears as an accusative
(me in (3) and the reflexive se in (4)). These forms are in contrast
with their Modern French equivalents:

(5) il veut que je vienne 

(6) il veut venir 

which do not, for the governing verbs in question, allow an accusative 
marked subject of an infinitive. As in (5), the only possible version 
of (3) in Modern French has a clausal rather than an infinitival comple- 
ment. The modern reflex of (4) has an infinitival complement without 
any surface manifestation particular to the subject of the infinitive. 
Thus, the position taken by Morin and St-Amour is that the Late Latin 
developments show a shift in complementation possibilities to allow 
either an underlying S complement which is manifested as a surface 
tensed clause or an underlying VP complement which is manifested as 
an infinitival and which does not contain in the surface (nor in the 
underlying structure) a subject of the infinitive. For the verbs in 
question, the tensed S complement is the only available complement 
type in the type (2) construction and the VP complement is necessarily 
restricted to the type (1) variety. 

This account of developments in Late Latin is one part of the 
argumentation which Morin and St-Amour use to substantiate their claim 


that all infinitival complements in Old French are underlying VPs. That 
is, the changes outlined here would not in themselves be proof of the 
hypothesis for infinitivals in Old French (even just for a particular 
class of verbs), but they are not in conflict with, and tend to support, 
the hypothesis when they are viewed in combination with the analysis 
of the evidence from Old French. 

1.2 There are two characteristics, in particular, of the structure of
Old French that lead Morin and St-Amour to propose that all infinitivals
in Old French are underlying VPs: the attachment of clitic pronouns to
governing verbs and the use of 'clause union' type constructions (hitherto
referred to as the 'causative' construction) in the case of governing
verbs which enter into the class (2) construction but which do not enter
into the class (1) construction.

In Old French, pronouns which are complements of infinitives are 
cliticized to the governing verb rather than, as in Modern French, to 
the infinitive. The examples below show the relevant grammaticality
ratings for Old French and for Modern French. 


                               OF   MF
(7) a. Je les veux manger      ✓    x
    b. Je veux les manger      x    ✓

       'I want to eat them'

However, when an infinitive is governed by a preposition, its 
complement pronouns may not move out of the infinitival phrase. In 
Old French such pronouns appear in stressed form, whereas in Modern
French they are cliticized to the infinitive.

(8) a. (Il est venu) pour moi ocire (OF)
    b.               pour me tuer   (MF)

'(he came) in order to kill me'

Morin and St-Amour claim that the preposition (thus, a PP node) in 
structures like (8a) blocks the movement of complement pronouns of the 
infinitive, whereas there is no such additional node present to block 
movement in constructions like (7a). They conclude that the infinitival
phrase in a structure like (7a) is a base-generated VP, directly embedded
under the higher VP, roughly as in:

(9) [S [NP je] [VP [V veux] [VP [V manger] [Cl les]]]]

A movement rule then applies to structures like (9) to derive an output
like (7a).


The second argument put forward by Morin and St-Amour in support of 
their claim that infinitival complements are base-generated as VPs in Old 
French concerns the nature of verb + infinitive constructions of type (2). 
They claim that, before the 2nd half of the 13th century, all such con- 
structions are of the clause union type, thus surface VPs, and could, 
therefore, be directly generated as VPs. The clause union construction 
can be characterized by the fact that the case marking of the subject of 
the infinitive varies in accordance with whether the infinitive is intran- 
sitive or transitive. When the infinitive is intransitive its subject is 
marked as an accusative, and when the infinitive is transitive its subject 
is marked as a dative. The examples below illustrate this property for 
a range of governing verbs and should be taken as representative of the 
verb classes included (faire in (10) and laisser in (11) = 'causative';
voir in (12) = 'perception'; estevoir in (13) = impersonal; commander in (14) =
'order/say').

(10)a. En seintes flurs il les facet gesir. (Rol 1856) 
'in holy flowers he makes them lie' 

b. A mil Franceis funt ben cercer la vile, (Rol 3661) 
'they make a thousand Frenchmen encircle the town' 

(11)a. Ainz dist qu'il le laissast uncore reposer, (Beck 2044)
'so he said that he would let him rest still more' 

b. Bien lur deit hum laissier lur costumes tenir; (Beck 2787) 
'well must one allow them to keep their customs' 

(12)a. Vus le verrez murrir encui." (Brend 340)
'you will see him die today' 

b. E li abes le veit traire 

A cent malfez chil funt braire . (Brend 1205) 

'and the abbot sees him pulled along 

by a hundred evil-doers who make him call out' 

(13)a. Or est le jur qu'el£ estuvrat murir." (Rol 1242) 

'now is the day that it will be fitting for them to die' 

b. dune estuvera a celui ki I'avera entre mains numer sun guarant, 
'thus it will be fitting for the one who will have it between his 
hands to name his guarantee' (Lois 21) 

(14)a. Par penitence les cumande(t) a ferir. (Rol 1138)
'through penitence he orders them to strike'

b. L'empereür li cumande(t) a guarder. (Rol 2527)
'he orders him to guard the emperor'

In (10) - (14) the (a) examples contain intransitive infinitives with accusa- 
tive marked subjects and the (b) examples have transitive infinitives with 
dative marked subjects. The data thus exemplifies the use of the clause 
union construction which, Morin and St-Amour claim, is general for verbs 
having the type (2) pattern in Old French. 
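
The generalization illustrated in (10) - (14) can be stated as a one-line rule. The following is a minimal sketch only: the function name and the boolean encoding of transitivity are assumptions made for illustration and are not part of the analysis of Morin and St-Amour.

```python
# Illustrative sketch of the clause union case-marking generalization.
# The function name and data layout are hypothetical, chosen for this example.

def subject_case(infinitive_is_transitive: bool) -> str:
    """Case predicted for the subject of the infinitive under clause union."""
    return "dative" if infinitive_is_transitive else "accusative"

# Transitivity values follow the glosses given for (10a) and (10b).
examples = {
    "(10a) il les facet gesir": False,                 # 'lie': intransitive
    "(10b) A mil Franceis funt cercer la vile": True,  # takes the object 'la vile'
}

for label, transitive in examples.items():
    print(label, "->", subject_case(transitive))
```

The rule is deliberately categorical; the counts reported in Section 3 bear on how far the attested data depart from it.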

If, as we have just seen, both type (1) and type (2) constructions can 
be analyzed as surface VPs, and furthermore, if there are no 'non-surface VP'
infinitives in Old French, then, as Morin and St-Amour argue, we can
consider the possibility that all infinitivals in Old French are base- 
generated as VPs and that there are no base-generated S infinitivals 
in Old French. The advantage of such a proposal is that it eliminates 
the need for additional rules that would be required to reduce base- 
generated Ss to surface VPs. However, the proposal calls for a special 
resolution for the semantics of the subject of the infinitive in the 
clause union constructions. Morin and St-Amour propose that intransitive 
infinitives in clause union constructions will be base-generated as in 
(16a) and transitive infinitives as in (16b). 

(15)a. André fait partir Jean.
'André makes Jean leave'
b. André fait manger les carottes à Jean.
'André makes Jean eat the carrots'

(16)a. [S [NP André] [VP [V fait] [VP [V partir] [NP Jean]]]]

b. [S [NP André] [VP [V fait] [VP [V manger] [NP les carottes] [PP à Jean]]]]

In the analysis of Morin and St-Amour, the interpretive component will have 
access to subcategorization specifications to derive the appropriate semantic 
representations from such structures. Thus, for [ partir Jean] in (16a), 
the interpretive component will have access to the information that partir 
is intransitive and will assign the only possible argument relation of 
subject to Jean. The verb manger , on the other hand, will be defined as 
(optionally) transitive and the role of subject of manger in (16b) will be 
assigned to the dative marked à Jean. Other possibilities with unspecified
subjects will derive semantic representations including a 'PRO' subject. 

Thus, in the terms of the analysis of Morin and St-Amour, developments 
in Late Latin exhibit a tendency towards the loss of infinitival S comple- 
ments as they come to be replaced by infinitival VP complements and, in the 
period of the earliest attested material in French, the generalization of 
underlying VP infinitival complements is demonstrated as complete with 
extension to the class (2) type as manifested in the 'clause union' construction.

1.3. The particular characteristics of infinitival constructions in Old 
French which have been outlined in Section 1.2 and which form the basis 
of the VP complement proposal put forward by Morin and St-Amour do not, 
however, remain stable in the subsequent evolution of the language. Pronoun 
complements of infinitives begin to evolve in the direction of the Modern 
French forms (cf. (7b)) in which they remain associated with the infini- 
tive and can no longer be attached to the governing verb. As shown in 
Gougenheim (1929) and in the data presented in Galet (1971) the innovating 
form comes to be preponderant in the 2nd half of the 17th century. The 
eventual fate of the clause union construction (at least to the present 
time) is that it becomes restricted to a small list of governing verbs 

(obligatory with faire , 'optional' with laisser , and reaching a very low 
frequency of occurrence with the perception verbs and the verbs envoyer 
and mener ) . The competitor for the clause union construction is a newly 
introduced accusative + infinitive type, which, according to Morin and 
St-Amour, begins to emerge in the 2nd half of the 13th century. 

Since, as we have seen, Morin and St-Amour use the evidence of the
attachment of pronoun complements of infinitives to their governing verb 
to support the analysis of VP infinitival complementation, then we might 
expect that the reversal of this positioning of pronoun complements would 
lead to the analysis of the new forms as S infinitival complements. 
However, this is not the case. Morin and St-Amour have preferred to 
analyse the new forms as differing not in structure from the old; but 
differing in a feature applying to the governing verb, which they call the 
'permeability' feature. The reason for this approach is that there is 
evidence of some variation in the movement possibilities of certain types 
of pronouns and of quantifiers. Thus, Modern French examples like those 
in (17) (from Morin and St-Amour, p. 143) and (19) are in contrast with
(18) and (20) respectively. 

(17)a. Tu devrais laisser en acheter (quelques-uns) à ta fille.
'You ought to let your daughter buy some (of them).'

b. (des questions), ça a fait s'en poser (plusieurs) aux auditeurs.
'(questions), that made the listeners ask themselves (several)'

(18)a'. Tu devrais les laisser acheter à ta fille.
a". *Tu devrais laisser les acheter à ta fille.
'You ought to let your daughter buy them.'

b'. Ça les a fait oublier aux auditeurs.
b". *Ça a fait les oublier aux auditeurs.
'That made the listeners forget them.'

(19)a. J'ai voulu les réparer tous.
b. J'ai tous voulu les réparer.

'I wanted to repair them all.'

(20)a. J'ai certifié les avoir tous lus.
b. *J'ai tous certifié les avoir lus.

'I certified having read them all.'

Although (17a) and (17b) are clause union constructions as evidenced by 
the dative marking for the subject of the infinitive ( a ta fille and aux 
auditeurs ) , the pronoun en in (17a) and in (17b) and the reflexive pronoun 
se (s') in (17b) remain associated with the infinitive. That this is not
possible with the complement pronoun les is shown by the acceptability 
ratings in (18). Morin and St-Amour argue, therefore, that the movement 
possibility is not blocked by structural characteristics, since both (17a) 
and (17b) have surface VPs as complements of faire and laisser in the 
clause union construction. The examples in (17) therefore show absence
of movement out of an embedded VP. Extension of this observation to the 
(7b) type ( je veux les manger ) leads to the conclusion that the structure 
in (7b) also has a VP infinitival complement and that the possibility of
movement is attributable to a feature on the governing verb. Similarly,
(19) and (20) show differing movement possibilities for the floating
quantifier tous, which, again, is analyzed as a function of a feature on
the governing verb: the permeability feature.

Obviously, it would be preferable to derive the contrast shown in 
(17) - (20) through general properties of the grammar of the language 
rather than through idiosyncratic feature specifications located on 
governing verbs. It does appear, however, that, at least for the case 
of floating quantifiers, the feature specification approach will be 
necessary, especially when we consider that quantifiers may float out 
of 'tensed' clauses with certain governing verbs. Thus, Pollock (1978; 
102-107) gives the following acceptability ratings: 

(21)a. ?Je veux tous qu'ils partent. 
'I want them all to leave.' 

b. ??Je dis tous qu'ils partent.

'I say that they all leave.'

c. Je dis tous qu'ils sont partis.
'I say that they have all left.'

(22)a. Il faut tous que Marie les lise.

'It is necessary that Marie read them all.'

b. Pierre déclare tous que Marie les a lus.

'Pierre declares that Marie has read them all.'

The contrasts in acceptability in (21) and in (22) represent what 
Pollock terms a difference between a 'close' and a 'weak' semantic 
connection between a main clause and an embedded clause. The 'strength' 
of the connection may be affected by whether the complements are infinitival 
or clausal, but it is apparent also from the examples in (21) and (22), and 
in (19) and (20), that 'infinitival' versus 'clausal' does not provide an 
adequate characterization of the observable contrasts. It would seem, 
therefore, that Morin and St-Amour are correct in proposing that some 
specification on the governing verb will be necessary to account for the 

On the other hand, for the case of the positioning of pronoun complements,
the evidence points to pronoun rather than to verb idiosyncrasies,
such as are indicated in the contrast between (17) and (18) . The only 
construction in Modern French which permits movement of pronouns is the 
clause union construction, that is, a sub-set of the class (2) type 
constructions. Since this construction is a special case, however it is 
to be analyzed, it would seem necessary to include an additional feature 
on the verbs that enter into the clause union construction to block the 
movement of a sub-set of pronouns. We may take the view that it is the 
pronouns rather than the verbs which are idiosyncratic in their behaviour. 

The permeability feature is thus a highly suspect device if it is 
to be used as a means of capturing distinctions in the placement possibi- 
lities of pronoun complements of infinitives. Such a device may have a 
role to play in specifying contrasts on quantifier movement, but it cannot
be seen as descriptively illuminating if used to capture the distinction in
pronoun placement exhibited in (7b) and (18). In general, in Modern French,
movement of pronouns is possible in clause union constructions but not in 
other verb + infinitive constructions and it is thus the construction type 
which should be viewed as providing the differentiating characteristic. 

The use of the permeability feature to preserve the analysis of underlying
VP infinitivals is, therefore, also suspect, although for the initial
stages of the change such a device may be descriptively relevant. However, 
it is only through close analysis of the change in question (the reposi- 
tioning of pronoun complements) that conclusions could be drawn as to the 
validity of such a hypothesis. 

The second change discussed in Morin and St-Amour is the introduction of
the accusative + infinitive type for verbs of the class (2) type. This 
construction is limited to class (2) in French, that is, it does not extend 
to class (1) as it did in Latin. Morin and St-Amour suggest that this new 
development in French should be analyzed as the development of a construction 
in which the semantic subject of the infinitive is base-generated as a comple- 
ment of the governing verb, roughly as in: 

(23)a. J'ai laissé Jean manger les carottes.
'I let Jean eat the carrots'

b. [S [NP je] [VP [V ai laissé] [NP Jean] [VP [V manger] [NP les carottes]]]]

The structure proposed in (23b) preserves the notion of the base-generated 
VP complement. Morin and St-Amour argue that (23b) is the appropriate 
underlying structure because the verbs that permit the accusative + 
infinitive construction are those which can take a direct object NP in 
simple clauses. Thus, they suggest that faire has not followed the same 
path as laisser or the perception verbs in adopting the accusative + 
infinitive alternative because 'lorsqu'on voit partir Pierre, on voit
Pierre, lorsqu'on laisse partir Pierre, on laisse Pierre, etc.' ('when one
sees Pierre leave, one sees Pierre, when one allows Pierre to leave, one
allows Pierre, etc.') and that 'Cependant lorsqu'on fait partir Pierre
ce n'est pas le cas qu'on fasse Pierre' (p. 140) ('However, when one makes
Pierre leave it is not the case that one makes Pierre').

The adoption of the structure (23b), however, does not in itself account 
for the distinction between the positioning of the pronoun complements in: 

(24)a. J'ai laissé manger les tomates à Pierre.

b. Je les lui ai laissé manger.

c. *Je lui ai laissé les manger.

'I let Pierre/him eat the tomatoes/them.'

(25)a. J'ai laissé Pierre manger les tomates.

b. Je l'ai laissé les manger.

c. *Je le les ai laissé manger.

If the permeability feature were to apply in (25) to block the movement 
of the complement pronoun les , it would have to be able to distinguish 
between the clause union construction and the accusative + infinitive. 
However, for Old French the relationship between the introduction of the 
accusative + infinitive and the introduction of changes in the placement 
of pronoun complements of infinitives has not yet been examined. We 
will take up this question in Section 3 below. 

2. Infinitival complements in Modern Italian 

In this section I will outline briefly the general nature of some 
proposals which have been made with respect to infinitival complements 
in Modern Italian. The relevance of the analysis of Modern Italian to 
that of Old French is that the constructions to be discussed share certain 
properties in the two languages. Whereas Modern French does not allow 
pronoun complements of infinitives to attach to governing verbs in class 
(1) type constructions, such pronoun placement is observed with certain 
governing verbs in Modern Italian (volere ('to want'), potere ('to be
able'), dovere ('ought/must'), . . .). Where permitted, the association
of the complement pronoun with the governing verb in Italian is 'optional',
as in: 

(26)a. Voglio riparare la macchina. 

b. Voglio ripararla 

c. La voglio riparare. 

'I want to repair the car/it' 

Rizzi (1976, 1978) has applied the term 'restructuring verbs' to 
those verbs which permit construction (26c) on the basis of an analysis
in which both (26b) and (26c) have the same underlying structure from 
which (26c) is obtained by the application of a 'restructuring' rule. 
Thus, approximately: 

(27)a. voglio [S riparare la]        Equi-NP deletion

b. [V voglio riparare] la        Restructuring

c. la voglio riparare        Clitic placement

The embedded S node disappears under restructuring and the clitic pronoun 
is therefore free to attach to the higher verb. In the derivation of 
(26b) restructuring (an optional rule) does not apply and movement of
the pronoun is therefore blocked. The basic surface structure distinction 
between (26b) and (26c) is that (26b) includes an S node which is absent 
in (26c). The absence of the S node in (26c) is comparable to the absence 
of the S node in the Morin and St-Amour analysis of parallel constructions 
in Old French. The basic difference between the analyses of Rizzi and of 
Morin and St-Amour, however, is that in the former case the surface structure 
is derived by a syntactic rule and in the latter case it is directly
base-generated.
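
The two outcomes in (26b) and (26c) can be mimicked by a toy derivation in which a single boolean stands in for the optional restructuring rule. This is a sketch under simplifying assumptions, not Rizzi's formalization: the function name is hypothetical, strings stand in for syntactic structure, and enclisis is reduced to dropping the final -e of the infinitive.

```python
# Toy model of the derivation in (27). All names are hypothetical.

def derive(governing_verb: str, infinitive: str, clitic: str, restructure: bool) -> str:
    """Return the surface string for governing verb + infinitive + clitic."""
    if restructure:
        # The embedded S node is removed, so the clitic is free to attach
        # to the higher verb, as in (26c).
        return f"{clitic} {governing_verb} {infinitive}"
    # Restructuring does not apply: the clitic stays on the infinitive,
    # as in (26b). Dropping the final -e is a simplification of enclisis.
    return f"{governing_verb} {infinitive[:-1]}{clitic}"

print(derive("voglio", "riparare", "la", restructure=True))
print(derive("voglio", "riparare", "la", restructure=False))
```

On the base-generation view, by contrast, no rule corresponds to the boolean: the first string would simply be generated with a VP complement from the start.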


The important aspect of Rizzi's analysis for our consideration of 
Old French material is that it makes use of a variety of evidence to show 
that there is no embedded surface S in the structures in question. It 
will not be possible to examine similar phenomena from material in Old 
French (Object Preposing, Wh-Movt., Cleft sentence formation), but,
failing counter-evidence, we may assume that the 'reduced' structures in 
Modern Italian are generally comparable with the only available structure 
for such verb types (class (1)) in Old French. However, one particularly 
salient point of comparison which can be observed in Old French is the 
change of auxiliary phenomenon. In Modern Italian this is represented 
as in (from Rizzi, 1978:136 (84)):

(28)a. Maria ha dovuto venirci molte volte.
b. Maria c'è dovuta venire molte volte.
c. *?Maria ci ha dovuto venire molte volte.

'Maria has had to come here many times' 

The verb dovere is conjugated with the auxiliary avere , as in ha dovuto in 
(28a). In (28b), however, the use of the auxiliary essere, with è dovuta,
is related to the presence of venire as the infinitival complement of
dovere. (Note that if used by itself, venire regularly takes the auxiliary
essere.) The 'reduced' nature of the structure in (28b) is evidenced by the
association of the clitic c' (= ci) with dovere rather than with venire (cf.
(28a)). The unacceptability of (28c) is due to conflicting manifestations
of restructuring: attraction of the clitic, but non-attraction of the
auxiliary. Gougenheim (1929:172) cites evidence for comparable auxiliary
attraction with verbs governing infinitives in Old French. (Here again, 
the use of est etc. as auxiliary is motivated by venire , not by voulu etc.) 

(29)a. Li mareschaux n' estoit voulu venir a lui. (Livre de la Conqueste, p. 412) 

b. Vous estes volue apparoir. (Miracles N.D., I, 460) 

c. La flambe du puis où elle estoit deue cheoir. (La Tour Landry, p. 75)

d. Sur l'asnesse est volu monter. (Arnould Greban, Le Mystère de la Passion, v. 16135)

The data from Old French in (29) does not include clitic pronouns, but it 
derives from the period in which pronoun complements were attached to the 
governing verbs and therefore should be regarded as comparable to the 
'restructured' type in (28b). On this evidence, and on the evidence of 
clitic pronoun placement in Old French, we may assume that verb + infinitive 
constructions belonging to class (1) in Old French have surface syntactic 
properties in common with 'restructured' forms in Modern Italian. 

It is another question whether the surface forms in Old French are to 
be analyzed as 'restructured' or as base-generated. There are, in fact, a 
variety of analyses for the data discussed above from Modern Italian. In 
addition to Rizzi's proposal for a Restructuring rule, we find also 'V 
Raising' (Van Tiel-di Maio, 1978), 'VP Raising' (Burzio, 1981), and
base-generation of VPs (Strozer, 1981). What all of these analyses have in
common, however, is the absence of S domination of the infinitive in the 
surface structure. The aim of this Section has been to show that the 
surface VP analysis for class (1) verbs in Old French is consistent with 
analyses of parallel constructions in Modern Italian. 

3. Further evidence from Old French 

3.0. The discussion presented in this Section will be based on the 
evidence provided by a collection of data from the earliest stages of 
Old French. 

The data has been collected from 3 groups of texts. Group I covers 
the earliest available material from the year 842 to the end of the 
11thC. The texts are: the Serments de Strasbourg (approx. 7 lines),
Sainte Eulalie (29 lines), the Sermon on Jonas (mixed Latin and French,
approx. 226 lines), the Vie de Saint Léger (240 lines), the Passion of
Clermont-Ferrand (516 lines), the Sponsus (40 lines). Group II consists
of (Anglo-)Norman texts, including one text from the 2nd half of the 11thC:
the Vie de Saint Alexis (625 lines); and with the remaining texts all from
the period 1100-1125: the Chanson de Roland (4,002 lines), the Cumpoz of
Ph. de Thaun (3,550 lines), the Voyage de Saint Brendan (1,840 lines), the
Lois de Gu. le Conquérant (approx. 52 paras), and the Declaration de Grég.
II sur les images (21 lines). Group III covers the period 1150-1175 for
Francien (including one Norman text) and includes: the Prise d'Orange
(1,888 lines), Aïol (Part I, lines 943-1,623, 1,885-3,205), Erec et
Enyde (lines 2,021-4,025), the Roman de Rou (Part II, lines 1,001-3,013),
and the Vie de Saint Thomas Becket (2,001-4,000).

Group I provides approximately 1,000 lines of text and each of 
Groups II and III has approximately 10,000 lines of text. Group III, 
however, provides the largest sample of data because it contains verses 
with longer lines (up to 12 syllables). 

From all 3 groups of texts an exhaustive collection has been made 
of the examples of verb + complement constructions (clausal and infini- 
tival complements) for those governing verbs which appear in class (2) 
but not in class (1). The verbs collected from the texts fall into
categories as: 

(a) cause/permit: faire , laisser , laier 

(b) perception: voir , oir 

(c) impersonal: loisir, estevoir, convenir, plaire

(d) order/say: (com)mander , rover , prier , requerre 

(e) others: donner , guarder 

The sample provides a total of 937 instances of such verb + complement 
types, including clausal as well as infinitival complements. 

In the discussion to follow, we will exclude from consideration 
those governing verbs which occur fewer than 6 times in the total data 
sample because the infrequency of their occurrence means that the forms 
that they exhibit must be taken as less significant for the patterns 
that might be established. The total of occurrences remaining then 
comes to 881. Table A below lists the number of occurrences for each 
governing verb, showing the totals for both infinitival and clausal 
complements and the percentages for infinitival complements. 

Table A shows that, among the more frequently occurring verbs, 
faire, laisser, estevoir , convenir , and rover clearly prefer infinitival 
complements (see note 3 for reasons for the non-inclusion of clausal
complements with perception verbs), whilst prier shows a clear preference
in the direction of clausal complements. The overall infinitival per- 
centages for each group indicate an increase in the use of infinitival 
complements, although this general pattern for the groups is matched 
only by one verb ( rover ) and the totals for the individual verbs in Group 
I are too small for the results in Group I to be significant for the 
pattern as a whole. The increase in the use of infinitival complements 
from Group II to Group III would indicate an increase in the use of
synthetic as against analytic constructions, although this tendency within 
verse texts could be a function of the evolution of aspects of the literary 
style. Analysis of the behaviour of individual verbs will be the focus of 
the discussion to follow. 









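TABLE A: governing verbs + infinitival (Inf) / clausal (Cl) complements
[Verb labels, percentages, and some cells did not survive reproduction; the
surviving Inf Cl pairs are listed row by row in their original order.]

    Inf Cl     Inf Cl     Inf Cl

    93 1       249 3      356 4
    21 3       86 3
    4 4        20 6       24 10
    4 -        6 1        19 -
    16 1       19 1
    2 1        8 1
    4 1        6 1        18 5
    4 12       9 17       15 30
    1 26       - 20       1 53
    - -        1 8        1 8
    2 1        3 1        5 2
    1 3        1 3        2 6

Tot. (Group I, Group II, Group III, Groups I - III):
    37 12      216 51     504 61     757 124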


3.1. We will now consider aspects of the behaviour of faire, the most frequently
occurring verb for all 3 groups. 

Of the infinitival complement types with faire , only a relatively small 
proportion have transitive infinitives with lexically realized subjects. 
Table B below lists the numbers of occurrences of faire + infinitival com- 
plement according to characteristics of the complement type. This Table 
includes as a point of comparison the figures for the same phenomena from 
a Modern Italian text, a collection of short stories by Giorgio Saviane, 
La donna di legno . 

TABLE B: faire + infinitive types

               A     %     B     %     C     %    Tot

Group I        7    50     4    29     3    21     14
Group II      49    53    37    40     7     8     93
Group III    148    60    63    26    31    13    242
Tot:         204    58   104    30    41    12    349

Saviane       14    35    16    40    10    25     40

A: subject of infinitive is unspecified
B: intransitive infinitive with specified subject
C: transitive infinitive with specified subject
Z: conjoined complements including distinct complement types

A: (30) Par multes terres fait querre sun amfant; (Alex 112) 
'throughout many lands he makes seek his child' 

B: (31) Ço dist l'imagene: "Fai l'ume Deu venir, (Alex 171)
'this said the image: "Make the man of God come' 

C: (32) La dreite vide nus funt tresoblier, (Alex 619) 
'(they) make us forget the straight path' 

Z: (33) En quei Deus te trovad, cum il t'a fait munter 

E creistre e enrichir e tun regne afermer. (Beck 2933) 

'In which God found you, as he made you rise 
and grow and become rich and affirm your reign' 

The percentages of Table B show a relatively low frequency of occurrences 
of Category C for the Old French data, with a somewhat higher frequency for 
this category in the Modern Italian text. The point of interest will now 
be to consider the patterns attested with the B and C categories, both of 
which have lexically specified subjects. 
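
The percentages in Table B follow from the raw counts, and the arithmetic can be rechecked with a short script. This is only a sketch: the dictionary layout and function name are assumptions made for the example, and the counts are copied from Table B as printed. Recomputation by rounding gives 61 for Category A in Group III where the table prints 60, so a one-point difference there is to be expected.

```python
# Recompute the Table B percentages from the raw counts (A, B, C).
# Variable and function names are illustrative only.

TABLE_B = {
    "Group I": (7, 4, 3),
    "Group II": (49, 37, 7),
    "Group III": (148, 63, 31),
    "Saviane": (14, 16, 10),
}

def percentages(counts):
    """Share of each category, rounded to the nearest whole percent."""
    total = sum(counts)
    return tuple(round(100 * c / total) for c in counts)

# Pool the three Old French groups to obtain the 'Tot' row.
old_french = [TABLE_B[g] for g in ("Group I", "Group II", "Group III")]
pooled = tuple(sum(column) for column in zip(*old_french))

for name, counts in TABLE_B.items():
    print(name, counts, sum(counts), percentages(counts))
print("Tot", pooled, sum(pooled), percentages(pooled))
```

The pooled Old French share for Category C comes out at 12 per cent, against 25 per cent in the Saviane text, which is the contrast the discussion relies on.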

Out of the total of 104 instances of Category B over Groups I - III, 
there is a total of 2 occurrences (both from Group III) in which the subject 
of the infinitive is marked as a dative, the remainder being either accusative 
(= 90) or morphologically ambiguous (= 12). The 2 datives are in: 

(34)a. si li fe(rai) souffrir, m(e)ngier amer et sur." (Rou 2301)
'I will make him suffer, eat bitter and sour' 

b. Encor faiseit il plus al cors mal endurer: (Beck 3941) 
'he still made his body endure bad(ly) more' 

It is possible that the predicative nature of the complements in both of 
these examples means that the infinitives are being treated as transitives. 
Furthermore, it is not clear whether souffrir should be regarded as an 
intransitive verb or as an impersonal verb. If the latter, (34a) would 
rightly belong in the conjoined class, Category Z. These 2 occurrences,
therefore, do not provide a basis for assuming that a dative marked subject 
of an intransitive is valid for the period under consideration or, indeed, 
that there is any degree of fluctuation between dative and accusative. 


In Category C the subject of the infinitive can be marked as accusative,
dative, or as an agentive (i.e. governed by the preposition par). Table C
below shows the totals of each type for each Group.

TABLE C: faire + transitive infinitives

Subj. of infinitive:    Acc.    Dat.    Agent.    Ambig.

Group I                   1       1       -         1
Group II                  -       3       -         4
Group III                 4      22       2         3
Tot.:                     5      26       2         8

Whereas the accusative marking for the subject of the intransitive infini- 
tive (Category B) has been shown above as well established, the dative 
marking for the subject of the transitive infinitive, although preponderant,
appears to be less firmly established, with the total for this type
being 26 against 5 accusative. The examples in Category C which have an 
accusative-marked subject of the infinitive are as follows: 

(35)a. voldrent la faire diaule servir. (Eul 4)
'they want to make her serve the devil' 

b. Loeys le ferai tout otroier, (Aïol 288)
'I will make Louis authorize it all' 

c. Dune fist li reis Henris Randulf del Broc crier 
Par tute Norhantune que l'um laissast aler

Les hummes l'arcevesque quitement le jur cler; (Beck 2051-3)

'thus King Henry made Randulf del Broc proclaim through 
the whole of Northampton that the archbishop's men be 
allowed to go freely in daylight' 

d. Sis volt faire par force sainte iglise tenir. (Beck 2349) 

'and he wants to make them (the laws) hold the holy church by force' 

e. L'empereur Archadie fist iglise voidier 

Innocenz l'apostolie, nel volt pur li laissier, (Beck 2998-9)

'Pope Innocent made the Emperor Arcadius leave the church, . . .'

Morin and St-Amour (1977) have claimed that accusative marked subjects 
of transitive infinitive complements occur so rarely in pre-1250 Old French 
that those which do occur should be regarded as aberrations. Let us con- 
sider the examples in (35). Firstly, for the case of (35a), Morin and St- 
Amour suggest that the use of the accusative in this example is related to 
the fact that some verbs in Old French, including servir , exhibit alternations 
between accusative and dative complements (that is, between servir qqn . and 
servir à qqn.). They suggest that the object of servir, diaule, is an
underlying dative which appears as accusative on the surface. However,
Tobler-Lommatzsch classify servir as transitive and their listing of examples
with servir includes only one instance with a dative complement (Deus
l'exaltat cui el servid (Leg 29)). On this evidence, therefore, the dative
as a complement of servir should be regarded as an aberration and we do 
not have support for the claim that diaule is not an accusative, even at 
the level of underlying structure. 

Three of the remaining 4 examples in (35) are all from the same 
text, Becket . However, this text also provides 7 of the examples with 
dative marked subjects of transitive infinitives, and so follows the
general pattern of preference for the dative in this category. The 
example (35c) is an instance in which the infinitive is classified as 
transitive because it takes a clausal complement. 

Overall, the number of instances of accusative-marked subjects of 
transitive infinitives is small enough to allow them to be regarded as 
aberrations. However, when we compare the total of 5 accusatives and 
the total of 26 datives for the transitive infinitives with the total 
of 2 datives and 90 accusatives for the intransitives, then this makes 
a greater proportion of aberrations with the transitive infinitives and 
would seem to indicate that the construction with the dative is less 
firmly established than the accusative + intransitive counterpart. 
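
The comparison of 'aberration' rates can be made explicit. The sketch below uses only the counts quoted in the text (5 accusatives against 26 datives for transitive infinitives; 2 datives against 90 accusatives for intransitives, with morphologically ambiguous cases left out); the function name is an assumption introduced for the example.

```python
# Share of the minority case marking in each complement type with faire.
# The helper name is hypothetical; the counts are those cited in the text.

def minority_share(minority: int, majority: int) -> float:
    """Percentage of unambiguous tokens showing the minority marking."""
    return 100 * minority / (minority + majority)

transitive = minority_share(5, 26)    # accusative subjects, transitive infinitives
intransitive = minority_share(2, 90)  # dative subjects, intransitive infinitives

print(f"transitive: {transitive:.1f}%  intransitive: {intransitive:.1f}%")
```

On these figures the minority marking accounts for roughly 16 per cent of the transitive cases but only about 2 per cent of the intransitive ones.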

Agentive subjects of infinitives with par do not appear in the 
data sample until Group III and they appear in conjunction with the 
introduction of the instrumental use of par in the same period, e.g.: 

(36) et Anquetil le prouz fist par engin tuer, 

et Baute d'Espaingne o un escu garder; (Rou 1364-5) 

'and he had the valiant Anquetil killed by ruse , and Baute 
of Spain guarded with/by a man of arms'

The 2 examples with agentive subjects are as follows: 

(37)a. Puis a fet un suen escuier 

par une pucele apeler, (Erec 2612-3) 

'then she had her servant called by a maiden' 

b. Par duze le fesist la justise prover, (Beck 2453) 
'he has it proved right (?) by 12 (men)'

The low level of occurrence of agentives in the sample means that their 
use cannot be analyzed. 

Thus, we have seen that the least numerous of the infinitival comple- 
ment types with faire, those with transitive infinitives, show a degree 
of fluctuation in the treatment of the case marking on the subject of 
the infinitive. The more numerous category of intransitive infinitives, 
on the other hand, shows the clearly set pattern of accusative marking 
for the subject of the infinitive. The low frequency of occurrence (cf. 
the comparable data for a text in Modern Italian in Table B) in combination
with the fluctuation in case marking with the transitive type
indicates that the construction is not fully set in a syntactic mould 
(or that it is breaking out of an already established mould). In the 
next part of this Section we will consider how other governing verbs 
behave in comparison with faire. 

3.2. In the previous part of this Section we have seen that the use of 
the verb faire in the data collected from the early period of Old French
can be characterized in terms of the relative frequency of infinitival 
complement types and in terms of the case marking of the logical subject 
of intransitive infinitives versus transitive infinitives. We will now 
consider how other verbs of the sample behave with respect to the same 

Firstly, let us consider the phenomenon which with faire showed the 
greatest degree of fluctuation--that of the case marking for the subject 
of transitive infinitival complements. Table D below lists the occurrences 
of accusative versus dative marking for all the verbs of the sample which 
are attested with transitive infinitival complements. 




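TABLE D: acc./dat. + transitive infinitive (Groups I - III)
[Row labels and the acc. column did not survive reproduction; the surviving
cells are listed in their original order.]

    dat.    %acc.

    26      16
    3       40
    3       50
    2       33
    5       17
    1       50
    53      20

Verb types (surviving label): 'speak'/'command'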



















Although, apart from faire , no single verb in Table D shows a high 
number of occurrences of either type, when the verbs are arranged in
groups, the results call for further comment. It is of interest to note 
that it is precisely the verbs which have grammatical accusative + infini- 
tive constructions in Modern French ( laisser and the perception verbs) 
which show the highest frequency of occurrence for this construction in 
this period of Old French. The figure of 40-44% accusative for laisser
and the perception verbs (versus 16% for faire ) goes against the claim 
made by Morin and St-Amour that the accusative + infinitive construction 
was not a part of the grammar of Old French in this period. We may, 
indeed, readily assume on the basis of the data collected here that
laisser and the perception verbs allowed both types of case marking for
the subjects of transitive infinitival complements (what we have been 
calling the 'accusative + infinitive' type and the 'clause union' type). 
Furthermore, when we include the additional observation that laisser and 
the perception verbs have a combined total of 132 accusative marked 
subjects of intransitive infinitives and no dative marked subjects 
(there are an additional 19 morphologically ambiguous occurrences), it 
is clear that the variation observed with the transitive infinitival 
complements is specific to this complement type. 

However, with the remaining two groups, those classified as 'impersonal'
and those falling into the 'speak'/'command' category, we
have evidence of some variation in the case marking of the subject of
intransitive infinitives. For the verbs represented in Table D, the
figures for Groups I - III are as follows:

TABLE E: acc./dat. + intransitive infinitive

                         dat.        Ambig.
impersonal
                         2           14
                         1            9
  Total                  3 (=21%)    25
'speak'/'command'
                         1            3
  Total                  1 (=14%)     5

[The acc. column and the individual verb labels did not survive extraction.]

The one instance of dat. with rover in the 'speak'/'command' category
in Table E cannot be taken as indicative of a dative alternative for this
class. With the impersonal verbs, the higher proportion of ambiguous
cases comes from heavier use of the complement pronouns me and vous,
which can be either accusative or dative. The total of 3 examples of
dative marked subjects in this category indicates a degree of variation 
in contrast with the lack of such variation seen above for faire and for 
laisser and the perception verbs. 

Inspection of the distribution of accusative and dative markings 
for subjects of both intransitive and transitive infinitival complements, 
therefore, has shown that governing verbs in Old French are not
undifferentiated in terms of the markings that they give to infinitival
complements. We have seen a definite contrast in the behaviour of two
frequently occurring governing verb 'types', on the one hand faire, and,
on the other hand, laisser and the perception verbs, with respect to
the case marking of the subject of transitive infinitival complements.
With the less frequently occurring classes of impersonal verbs and
'speak'/'command' verbs, we have found that their behaviour with
transitive infinitival complements appears to be comparable to that of faire,
but that the impersonal class shows some evidence of variation in case
marking with intransitive complements, which, on the other hand, is
not evidenced with faire. The data is therefore in conflict with two of
the claims made by Morin and St-Amour: (i) the accusative + infinitive
complement type did not become grammatical in Old French until the 2nd
half of the 13th Century, (ii) infinitival complementation is a unitary
phenomenon, undifferentiated according to governing verb type. We
have seen that a relatively high frequency of accusative + infinitive 
with transitive infinitives governed by laisser and the perception verbs 
falsifies both claim (i) and claim (ii) . And claim (ii) is further 
falsified (but more weakly, given lower totals of occurrence) by evidence 
of variation in the case marking of subjects of intransitive infinitives 
governed by impersonal verbs. 

3.3. In this sub-section we will explore further aspects of complementation
in Old French which have bearing on the analysis of the infinitival comple- 
ments discussed in Sections 3.1 and 3.2. We will consider, firstly, fur- 
ther evidence of differentiation in complement types and, secondly, we 
will consider hypotheses suggested by the data as to the analysis of the 
accusative + infinitive with the transitive infinitives. 

In the previous parts of this Section we have seen that there is 
evidence of variation in the case marking of the subjects of transitive 
infinitives. The Old French data also includes examples of clausal 
complements accompanied by an NP complement which may be either accusative 
or dative. The following two examples illustrate this construction, (38) 
having an accusative NP complement and (39) with a dative NP complement. 

(38) uncore le mande l'un que il plege truse e vienge a dreit (Lois para. 47)
'until one orders him that he pledge truce and come to the law'

(39) Quant Deus del cel li mandat par sun a(n)gle
Qu'il te dunast a un cunte cataignie; (Rol 2319)

'when from heaven God ordered him by his angel that he
give you to one of his captain counts'

Table F below shows the number of occurrences of accusative versus 
dative marking of NP complements accompanied by clausal complements. 



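[TABLE F (accusative versus dative marking of NP complements accompanied
by clausal complements, Groups I - III; columns: acc., dat.). The table
body did not survive extraction. Recoverable entries: (com)mander - 1 3;
a further row reading 13 4.]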














The largest number of examples in Table F comes from the 'speaking'
class of verbs: rover, (com)mander, prier, requerre. As shown in Table
A, two of these verbs, prier and requerre, heavily favour clausal
complements, having only one occurrence in each case of an infinitival
complement. If we do not assume a structural difference between Verb NPacc S
and Verb NPdat S, we must assign the fluctuation between accusative and
dative as shown in Table F to other factors. The examples representing
the 52 occurrences in Table F show some evidence of a differentiation 
between pronominal and substantive NPs. However, this is more clearly 
the case with prier , which has many instances of what must be a fixed 
expression, prier Dieu , contrasting with li prier . Table G below shows 
the totals for each type for all the verbs of Table F and with prier 
extracted as a separate case. 

TABLE G: acc./dat. + clause, and NP/pro

                 pro     NP
                  9       5
                 11       1
                 20       6

When prier is extracted from the total in Table G it is not clear 
whether we can attribute any significance to the figures for the pro/NP
alternation with the remaining verbs. The percentage for accusative
with the remaining verbs is 26%, showing an overall preference for the 
dative with these constructions. 

If more evidence were available it might be possible to make a con- 
nection between the dative marking in the clausal construction and the 
dative marking in constructions with infinitive complements that would 
suggest a parallelism between the two structures represented below in (40). 

(40) a.      S                       b.      S
           /   \                           /   \
         NP     VP                       NP     VP
              /  |  \                         /  |  \
             V   NP   S                      V   NP   Inf
               [+dat] (tensed clause)          [+dat]

Since we must assume that (40a) is the appropriate structure for the 
majority of cases with dative marking of the NP in V NP [S ...], if we
were to find parallel case marking accompanying infinitival complements, 
then we could be led to posit a connection between the two types. Ard, 
for example, has discovered that historical change in complementation in 
English can be characterized by the following schema for verbs in the 
'order' class (1977:24): 



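(41)  Stages: early OE, late OE, Mid. E., Mod. E.
      Complement types: (i) + clause; (ii) + NP + clause; (iii) + NP + inf.
      [The frequency cells are only partly recoverable: 'very frequent',
      'less frequent', 'very rare', 'less rare'.]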



The schema in (41) indicates a direction of change following the 
complement types from (i) to (iii), which would seem to indicate 
the structure of (40b) as appropriate (at least in the developing 
stages) for the NP + Inf. type in English. Ard argues, for instance, 
that the development of (41iii) does not come about through the intro- 
duction of a rule of Subject Raising to Object. 

The present data from Old French does not supply us with compa- 
rable evidence of development in the language, except that we may note 
from Table A that both rover and (com)mander show a higher percentage 
of infinitival over clausal complements in Group III than they do for 
Groups I and II combined (rover: Groups I-II = 75% inf., Group III =
86%; (com)mander: Groups I-II = 32%, Group III = 53%). However,
(com)mander is the only verb for which this change could be significant
because rover, which eventually disappears from the language, is
already progressively less well-attested proportionally in Groups II
and III. If the case marking associated with the NP complement in the
structure (com)mander NP [S ...] was being carried over to the infinitival
construction as (com)mander NP Inf, then we would expect a
prevalence of the dative in the case marking of the NP in the latter 
structure. However, (com)mander does not diverge from the mainstream 
pattern of case marking with infinitive, since over the whole data 
sample it has three accusative NPs with intransitive infinitives 
and five dative NPs with transitive infinitives and shows no varia- 
tion from this pattern. The concentration of the present data sample 
in the relatively short time span of 75 years (1100-1175) does not 
provide us with indicators of a pattern of change such as discovered 
by Ard for data in English covering a much broader time span. 

What remains of interest in Table G is the variation in case 
marking, attested even with the verb prier extracted from the sample 
(% acc. = 26%). We might postulate that the variation in V NP S is
an effect of lexical idiosyncrasy (the lexical subcategorization of
the case marking of verbal complements) in contrast to the more clearly 
defined (and much less variable) case marking in particular syntactic 
constructions. On the other hand, we must also note that the accusative 
case marking which is overwhelmingly attested for the subject of intran- 
sitive infinitives tends to occur in constructions with a high frequency 
of occurrence. Table H below shows the distribution of verb types ac- 
cording to whether their infinitive complements have unspecified sub- 
jects, are intransitive, or are transitive. 


TABLE H: Infinitival complement types

                                A            B            C        Tot.
                            Tot.   %     Tot.   %     Tot.   %
faire                       203   58     104   30      42   12     349
laisser/laier +
  perception verbs           68   28     154   64      18    8     240
impersonal                   22   25      41   47      25   28      88
'say'/'order'                12   34      14   40       9   26      35

A: subject of infinitive is unspecified
B: intransitive infinitive with specified subject
C: transitive infinitive with specified subject
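
The percentages in Table H can be rechecked from the raw totals. The following is only a minimal sketch in Python: the counts are transcribed from Table H, while the names TABLE_H, shares and summarize are hypothetical helpers introduced here for illustration and are not part of the original study. The row label for laisser follows the grouping described in the text (laisser, laier and the perception verbs).

```python
# Counts transcribed from Table H, in the order (A, B, C):
# A = subject of infinitive unspecified,
# B = intransitive infinitive with specified subject,
# C = transitive infinitive with specified subject.
TABLE_H = {
    "faire": (203, 104, 42),
    "laisser/laier + perception verbs": (68, 154, 18),
    "impersonal": (22, 41, 25),
    "'say'/'order'": (12, 14, 9),
}


def shares(counts):
    """Return the row total and each count as a percentage of that total."""
    total = sum(counts)
    return total, [100 * c / total for c in counts]


def summarize(table):
    """Map each verb group to (row total, [pct A, pct B, pct C])."""
    return {verb: shares(counts) for verb, counts in table.items()}


if __name__ == "__main__":
    for verb, (total, pcts) in summarize(TABLE_H).items():
        cells = "  ".join(f"{label}={p:5.1f}%" for label, p in zip("ABC", pcts))
        print(f"{verb:34s} n={total:3d}  {cells}")
```

Run as a script, this prints one line per verb group, so that the two contrasts drawn from the table (the high share of type A with faire, and the high share of type B with laisser and the perception verbs) can be read off directly.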

We have seen from Table D above that laisser and the perception 
verbs distinguish themselves from other verbs by having a higher pro- 
portion of accusative marked subjects of transitive infinitives. We 
find in Table H that these verbs further distinguish themselves from 
the other groups in that they have a higher proportion of intransitive 
infinitives. The accusative marking of the subject of intransitive 
infinitives is clearly established (especially for the laisser class) 
in contrast with the degree of fluctuation between dative and accusa- 
tive with other complement types. It would appear that the higher 
proportion of accusative marked subjects of transitive infinitives with 
this same class of verbs is an influence of the higher frequency of 
intransitive infinitival complements. We may hypothesize that the 
tendency for dative marking of subjects in other constructions is not 
yet fully 'syntacticized' in the language and that, in the case of 
laisser and the perception verbs, the influence of the frequency of 
the accusative + infinitive construction is such as to inhibit the
development with the dative and to prevent it from becoming fully set as
the only construction available with transitive infinitival complements.
To this extent, the influence of the accusative + infinitive
construction is a preserving rather than an innovating influence. 

The second observation that can be made from Table H is that 
faire is clearly distinguished from the other groups of verbs in that 
it has a much larger proportion of unspecified subject infinitival 
complements. We saw earlier (Table B) that faire in Old French ap- 
peared to have a higher proportion of such complement types than fare 
in Modern Italian. And now we see from Table H that its behaviour is 
idiosyncratic in this respect vis-a-vis other verbs in Old French.
It is as if faire might have originally governed an infinitival comple- 
ment type lacking a specified subject and open to the analysis of an 
underlying VP. This would imply that the emergence of the type with a 
lexical subject was derived originally by analogy with verbs that were 
subcategorized for NP complements as well as infinitives. 

The special development of the construction type contrasting accu- 
sative + intransitive infinitive with dative + transitive infinitive 
appears to be a development particular to Romance. Thus, Norberg (1945)
characterizes the accusative/dative variation of this type as not yet
syntactically defined in Old French: 'En ancien français, surtout, on
trouve assez souvent le datif, même si l'infinitif est intransitif, et
il y a aussi des exemples d'un accusatif avec un infinitif transitif'
(p. 94) ('In Old French, especially, the dative is found quite often,
even if the infinitive is intransitive, and there are also examples of
an accusative with a transitive infinitive'). The evidence provided by
the data examined here, however, has shown that the syntactic definition
of accusative versus dative is already a marked tendency in early Old
French, although one which varies in strength according to the governing
verb type. In the analysis of Norberg, both the accusative and the dative
markings derive from developments from Latin into French, which can be
characterized roughly as:

(42) (i)   VerbA + NPacc. + Inf             VerbB + NPdat. + Inf
     (ii)  VerbC + NPacc./dat. + Inf
     (iii) VerbC + NPacc. + Inf(intrans)    VerbC + NPdat. + Inf(trans)

In fact, as Norberg describes the situation, Latin also provides 
evidence of shifts in case marking of the NP complement with the infin- 
itive. For example, licet took an accusative complement in archaic 
Latin which subsequently shifted to the dative in Late Latin. Other 
impersonal verbs, such as decet , pudet , piget , and oportet also developed 
a dative NP complement. On the other hand, the verbs mandare, imperare,
concedere, and permittere show evidence of change from dative to
accusative NP complements. And iubere, pati, sinere, and facere show changes
from the accusative to the dative in Late Latin. In the schema in (42),
(42ii) therefore represents a pre-Romance stage of variation in case
marking, which according to Norberg continued into early Old French.
However, as we have seen with the data that has been analyzed up to 
this point, the syntactic specialization represented in (42iii) is 
already evident, if not fully set, in early Old French. 

According to Norberg, an essential part of the development in
(42iii) with facere is the tendency for facere to become linked with
the following infinitive ('à se lier à l'infinitif suivant' (p. 92)).
As a result, 'il a fallu que le datif se dégage de la dépendance
primitive et directe du verbe principal et qu'il se rattache à toute
l'unité verbale' ('it was necessary for the dative to detach itself
from the initial direct dependency on the governing verb and to attach
itself to the verbal expression as a whole'). In the terms of the
previous discussion in this paper, the close connection between governing
verb and infinitive has been described as a function of clitic pronoun
placement and the 'change of auxiliary' phenomenon--what we have accepted
(in agreement with Morin and St-Amour (1977)) as at least a surface
VP configuration for the infinitive phrase. No doubt the same argument
about the close connection between governing verb and infinitive applies
also to the accusative + infinitive type. The question is then to
determine, assuming that such developments occurred more or less
simultaneously (i.e., the 'unification' of governing verb and infinitive
with both accusative and dative complements), by what means the syntactic
differentiation in (42iii) between the two constructions was established.
It could be that the accusative (the 'simplest' solution?) absorbed the
majority of instances (those with intransitive infinitives) and the dative
came to be reserved for the more 'complex' case of transitive infinitives.
It would appear from the high proportion of infinitives with unspecified
subjects with faire in Table H and the low proportion of transitive
infinitives with specified subjects that the transitive infinitive type with
faire is indeed more unusual and therefore adopts the more 'complex'
construction.


The present discussion, however, implies that there was no funda- 
mental syntactic difference between infinitival complement types, apart 
from the presence or absence of a direct object of the infinitive. 
Modern French, on the other hand, shows a clear syntactic difference 
between the two complement types. The Modern French constraints on word 
order in the dative + transitive infinitive versus the accusative + 
intransitive infinitive constructions are illustrated in (43) and (44) 

(43) a. J'ai fait manger les tomates à Paul.

b. *J'ai fait à Paul manger les tomates.

c. Je les ai fait manger à Paul.

d. *J'ai fait les manger à Paul.

'I made Paul eat the tomatoes/them'

(44) a. J'ai laissé Paul manger les tomates.

b. *J'ai laissé manger les tomates Paul.

c. J'ai laissé Paul les manger.

d. *Je les ai laissé Paul manger.

'I let Paul eat the tomatoes/them'

The examples in (43) illustrate the construction with the dative and 
in (44) the construction with the accusative. The (a) and (b) examples 
illustrate the grammatical order for the NPs in the respective construc- 
tions. Examples (c) and (d) illustrate the fact that the pronoun comple- 
ment of the infinitive is attached to the higher verb in the construction 
with the dative and to the infinitive in the construction with the accusative.

The data that we have been considering from Old French is charac- 
terized by relative freedom of word order and by the fact that clitic 
pronouns do not attach to infinitives. For the constructions in question, 
two observations can be made on the basis of the sample: (i) word order 
of accusative versus dative marked NP subjects of transitive infinitives 
does not give evidence of distinct patterns, (ii) the evidence available 
does not suggest a distinction between the (43c) case and the (44c) case. 

The data sample contains a total of 25 examples including an accusative
or dative-marked substantive subject of a transitive infinitive. The number
of occurrences in terms of order of the subject of the infinitive in
relation to the governing verb and the infinitive is given below.

TABLE I: Order of NP subjects of transitive infinitives

                      + acc.    + dat.
Main verb                3         9
Inf.                     -         3
Main verb Inf.           4         6
                         7        18      Tot. = 25

It is not possible to determine distinguishing patterns of word order from 
these figures, except that there may be a greater tendency for preposed
dative marked subjects. The material provided by the data sample has the
considerable disadvantage that the amount of prose included is so small as 
to be negligible and the word order in verse texts operates under some 
constraints and perhaps exhibits a greater freedom in other respects. On 
the basis of the data that we are considering here, however, we must con- 
clude that we have no evidence in support of a surface structure difference 
between transitive infinitival complements with accusative-marked subjects 
and those with dative-marked subjects. 

Of a total of seven examples of accusative-marked subject constructions with 
clitic pronouns, two include clitic pronouns which are complements of the 
infinitive. The seven examples are listed below with the accusative subjects 

(45) a. voldrent la faire diaule servir (Eul 4)
'they wanted to make her serve the devil'

b. E sis rovet cel receivre. (Brend 358)
'and so he asks them to receive that'

c. Loeys le ferai tout otroier, (Aiol 2879)
'I will make Louis authorize it all'

d. Sis volt faire par force sainte iglise tenir. (Beck 2349)
'and he wants to make them hold holy church by force'

e. ne le voudrent lessier, si firent grant savoir,
lors villes essillier et lor mesons ardoir, (Rou 2760-1)
'they do not want to let him, so they had great wisdom,
exile their towns and burn their houses'

f. envie out qu'il le vit lez Franceis vergonder, (Rou 1379)
'he was envious as he saw him insult the French'

g. Dunc l'esteüst l'evesque al vescunte mustrer; (Beck 2424)
'thus it was fitting for the bishop to show it to the count'

The examples (45c) and (45g) contain pronouns which are complements of 
infinitives and, in both cases, they occur attached to the governing verb. 
Although only two such examples occur in the data, I believe that we can take 
them as indicative for the simple reason that we could not expect to find 
a large number of occurrences of this type. Pronoun placement, therefore, 
does not indicate that there is a structural distinction between transitive 
complements with accusative-marked NP subjects and those with dative-marked 
NP subjects. 

4. Conclusions 

Firstly, let us summarize the arguments and evidence that have been 
considered in the previous Sections. 

In Section 1 we considered the arguments put forward by Morin and
St-Amour (1977) in support of the analysis of all infinitivals as VPs at the
levels of both underlying and surface structure and in both Old French 
and Modern French. 

In Section 2 we saw how additional evidence in support of the VP surface
structure for infinitivals in Old French is provided by the analysis, in
particular that of Rizzi (1976, 1978), of parallel material from Modern
Italian.

On the basis of the arguments presented in Sections 1 and 2 we concluded 
that evidence supporting surface VP infinitival complements in Old French 
is sufficiently convincing, but that later developments in French do not 
lend similar support to such an analysis. 

In Section 3 we examined a number of aspects of the material in a 
set of data from early Old French. We found that the data from this period 
shows clear evidence of differentiation in infinitival complement types in 
terms of case marking, and also exhibits differing tendencies according to 
governing verb type. The evidence put forward refuted the claim of Morin 
and St-Amour (1977) that the 'accusative + infinitive' with transitive
infinitives was not viable in this stage of Old French, and also refuted
the claim of these authors that all verbs governing infinitival comple- 
ments of the class (2) structure behave alike in terms of the infinitival 
complement types with which they associate. 

We then examined further aspects of the data collection in order to 
consider what evidence could be obtained that could shed some light on 
the analysis, in particular, of the constructions with transitive infini- 
tival complements. We considered hypotheses put forward by Norberg (1945) 
as to the historical development of the constructions in question, in 
particular the proposal that fluctuation between accusative and dative 
case marking is an inheritance from Latin and that the syntactic bifurca- 
tion in Romance, such as with faire , developed under conditions of a close 
unity between governing verb and infinitive. The latter argument we found 
of particular interest, since the 'unity' notion seemed to reflect the 
conclusion of the discussion in earlier sections in support of the surface 
VP analysis for infinitival complements in Old French. Further examina- 
tion of aspects of word order in constructions with transitive infinitives 
led us to the conclusion that variance between dative and accusative 
marking of subjects of infinitives in such constructions does not indi- 
cate additional syntactic distinctions. 

All in all, the entrance of a new systematically defined set of data 
onto the scene of an old debate has caused some dust to be raised. However, 
I think it has been shown that the dust will not just settle back in the 
same configurations as before. The major alteration is that the governing
verbs have been placed in patterns according to the behaviour that they 
exhibit. The question of what is under the surface of the individual 
patterns and under the schema as a whole will depend on the organization 
of the theoretical framework adopted. The interpretation of the patterns 
may be available directly from the surface configurations or, alternatively, 
may make reference to a further underlying layer of organization. 

These theoretical questions require further elaboration. In the meantime,
we have material for reflection in the suggestions of Norberg (1945)
as to the possibility of a developing unit between a governing
verb and its infinitive complement which may be comparable to the phenomenon
of the development: dicere habeo (Lat.) > dirò (Ital.), dirai (Fr.), etc.
('I must speak' > 'I will speak' = Fut.). The dust has not yet settled.


The term 'clause union' stems from the original notion of 'verb 
raising' presented in Aissen (1974) and subsequently developed as 'clause 
union' in the Relational Grammar framework, as in Aissen and Perlmutter 
(1976). The aspect of the clause union analysis which concerns us 
here is that the derived surface structure in such constructions contains 
only one clause, that is, only one S node. The term can be applied in a 
general sense to other analyses which, for the same constructions, derive 
outputs with a single surface clause, including versions in Extended 
Standard Theory (e.g. Rizzi (1976, 1978), Burzio (1978)). 

As Rizzi points out (fn.26) the paradigm is logically completed by: 
(i) Maria è dovuta venirci molte volte.
which, in fact, seems to be acceptable. The facts are not perfectly clear, 
but sentence (ii) is parallel to (i) and is unacceptable. 
(ii)*?Siamo potuti venirci solo poche volte. 

'We have been able to come here only a few times.' 

The data, however, does not include perception verbs with clausal 
complements, which are semantically distinct from the infinitive
complements, e.g.: (i) je le vois venir

'I see him coming' 
(ii) je vois qu'il est venu 

'I see that he came' 
In (i) the governing verb voir is used with a 'perception' meaning 
implying a response of the senses, whereas (ii) implies a mental process 
close in meaning to a statement like 'It is apparent to me that he came.' 
I believe that such a degree of difference in meaning between construc- 
tions with infinitival and clausal complements is not observed with the 
other verbs collected in the sample. 

Including the following example in contrast with (35e) : 
(i) Mais qu'um li peüst bien faire iglise voidier.

'since one might well make him (dat.) leave the church'

Cf. the use of the dative with clausal complements of the infinitives
in: (i) Richart lor a rendu, puiz lor a fait entendre 
qu'il I'avoit tant tenu por cortoisie aprendre, 
et norrir en sa court tant que le veist rendre. (Rou 2125-7) 
(ii) Iluec voleit il faire as evesques iurer 

Que nul d'els pur apel ne passereit mais mer (Beck 2644-5) 

Cf. (i) A duze hummes fereit la verite prover, (Beck 2428)
'he would have 12 men prove the truth' 

Laier is included in the grouping with laisser and the
perception verbs because it is a morphologically distinct variant of 
laisser . However, its use varies from that of laisser , as it is reserved 
largely for the future tense and for constructions with the negative as: 
NEG + laier + clause. The totals for laier included in Table H are: A = 2, 
B = 22, C = 0. 

which will be undertaken in Pearce (in preparation). 

Stras--Les Serments de Strasbourg de 842, in Karl Bartsch, Chrestomathie
     de l'ancien français (12th edition). New York and London: Hafner,
     1969, pp. 2-3.
Eul--Cantilène de Sainte Eulalie, Ibid., p. 4.
Jonas--Guy de Poerck. Le sermon bilingue sur Jonas du ms. Valenciennes
     521 (475). Romanica Gandensia 4 (1955), pp. 31-66.
Leg--Joseph Linskill. Saint Léger. Paris: Droz, 1937.
Pass--D'Arco Silvio Avalle. Cultura e lingua francese delle origini
     nella 'Passion' di Clermont-Ferrand. Milan: Ricciardi, 1962.
Spons--Lucien-Paul Thomas. Le 'Sponsus'. Paris: Presses Universitaires
     de France, 1951.
Alex--Christopher Storey. La Vie de Saint Alexis. Geneva: Droz, 1968.
Rol--Alfons Hilka. Das altfranzösische Rolandslied nach der Oxforder
     Handschrift (6th edition). Tübingen: Niemeyer, 1974.
Cump--Eduard Mall. Li Cumpoz Philippe de Thaun. Strasburg: Trübner, 1873.
Brend--E.G.R. Waters. The Anglo-Norman Voyage of St. Brendan by Benedeit.
     Oxford: Clarendon Press, 1928.
Lois--John E. Matzke. Lois de Guillaume le Conquérant (Collection de
     textes pour servir à l'étude et à l'enseignement de l'histoire).
     Paris: Picard, 1899.
Greg--Déclaration de Grégoire II sur les images, in E. Stengel. La
     Cançun de Saint Alexis. Marburg: Elwertsche, 1882.
Or--Claude Régnier. La Prise d'Orange. Chanson de geste de la fin du
     XIIe siècle (4th edition). Paris: Klincksieck, 1972.
Aiol--Wendelin Foerster. Aiol et Mirabel und Elie de Saint Gille (edited
     by J. Verdam). Heilbronn: Henninger, 1876-1882.
Erec--Mario Roques. Les Romans de Chrétien de Troyes édités d'après la
     copie de Guiot, I: Erec et Enide. Paris: Champion, 1952.
Rou--A.J. Holden. Le Roman de Rou de Wace (3 vols). Paris: Picard,
Beck--Emmanuel Walberg. La Vie de Saint Thomas le Martyr par Guernes de
     Pont-Sainte-Maxence. Poème historique du XIIe siècle (1172-1174).
     Lund: Gleerup, 1922.
Saviane, Giorgio. La donna di legno. Milan: Rizzoli, 1979.


AISSEN, Judith. 1974. Verb raising. Linguistic Inquiry 5.325-66.
AISSEN, Judith, and David Perlmutter. 1976. Clause reduction in Spanish.
     Proceedings of the Second Annual Meeting of the Berkeley Linguistics
     Society, 1-30.
ARD, William Josh. 1977. Raising and word order in diachronic syntax.
     Indiana University Linguistics Club.
BURZIO, Luigi. 1981. Intransitive verbs and Italian auxiliaries. MIT
     Ph.D. dissertation.
GALET, Yvette. 1971. L'Evolution de l'ordre des mots dans la phrase
     française de 1600 à 1700. Paris: PUF.
GOUGENHEIM, Georges. 1929. Etude sur les périphrases verbales de la
     langue française. Paris: Nizet. Reprint 1971.
KEYSER, S. Jay (ed.). 1978. Recent transformational studies in European
     syntax. Cambridge, MA: MIT Press.
MORIN, Yves-Charles, and Marielle St-Amour. 1977. Description historique
     des constructions infinitives du français. Montreal Working Papers
     in Linguistics 9.113-152.
NORBERG, Dag. 1945. 'Faire faire quelque chose à quelqu'un': Recherches
     sur l'origine latine de la construction romane. Uppsala Universitets
     Årsskrift 12.65-106.
PEARCE, Elizabeth. A study of the history of verb + infinitive
     constructions in selected Romance languages. University of Illinois Ph.D.
     dissertation (in preparation).
POLLOCK, Jean-Yves. 1978. Trace theory and French syntax. Keyser 1978:
RIZZI, Luigi. 1976. Ristrutturazione. Rivista di grammatica generativa
     . 1978. A restructuring rule in Italian syntax. Keyser 1978:
STROZER, Judith. 1981. An alternative to restructuring in Romance
     syntax. Papers in Romance 3, Supplement 2, 177-184.
TOBLER-LOMMATZSCH. Altfranzösisches Wörterbuch. Vols. 1-2, Berlin:
     Weidmann, 1925-36. Vol. 3- , Wiesbaden: Steiner, 1954- .
VAN TIEL-DI MAIO, Maria Francesca. 1978. Sur le phénomène dit du
     déplacement 'long' des clitiques et, en particulier, sur les
     constructions causatives. Journal of Italian Linguistics 2.73-176.

Studies in the Linguistic Sciences 
Volume 12, Number 2, Fall 1982 


William D. Wallace 

While the Eastern Indian languages developed nominative/accusative 
syntax in the perfective tenses, and the Western Indian languages pre- 
served the ergativity of the Old Indo-Aryan perfective participle, 
which was inherited as a past tense by all the Modern Indo-Aryan 
languages, Nepali developed "western" morphology with "eastern" syntax. 
The syntax of the Nepali perfective tenses, however, appears to be
historically motivated rather than amalgamated from the syntax of
juxtaposed languages. Beginning with its earliest records (c. 1350 A.D.),
there is evidence that Nepali has modified the ergative perfective 
participle by using pronominal affixes in the perfective past and the 
conjugated auxiliary cha in the compound perfective tenses to agree 
with the underlying subject/agent rather than the ergative subject. 
An ergative postposition le is introduced in the 16th century into
those environments in which ergativity had been inherited from OIA, 
but by the 18th century, this postposition was spreading to mark 
"nominative" transitive and intransitive subjects. A new form of the 
perfective participle arose in attributive clauses in the 16th century, 
and the ergative syntax of these clauses spread to main clauses with 
the compound perfective verbs. Thus, ergativity was reintroduced into 
the language, although the construction in which it appears is eventu- 
ally regularized to conform with the nonergative perfective tenses. 
In this paper I discuss these developments and show how the syntax of 
the Nepali perfective tenses fits into the larger Indo-Aryan context. 
The data presented here have bearing on recent research on how change 
progresses in a grammar through variation and also on recently proposed 
restrictions on syntactic change in an EST framework. 

The study of ergative syntax in Nepali involves principally the study 
of its elimination from the language. Both the earliest and most modern 
forms of Nepali do not exhibit the common forms of ergative syntax found in 
the contemporary Western Indian languages; but during its history, Nepali 
reintroduces ergative syntax into the perfective tenses and eliminates it 
once again. And the means by which Old Nepali becomes partially ergative 
is the same as that by which the early Modern Indo-Aryan languages inherited 
ergativity from OIA--the development of an attributive past participle as 
a compound verb with ergative syntax. 

In this paper I shall discuss the syntax of the Nepali perfective tenses 
from 1350 A.D. to the present; I shall be concerned with the simple past and 
present/past perfective tenses, agent-marking, parallel developments in 

various Indo-Aryan languages of the area, and the influence of Tibetan 
languages on the Indian languages of the Himalayan region. The scope of 
such a study is vast; the data are often mysterious; and the details that 
need discussion are too numerous to treat completely in this paper. But 
the lines of development are clear, and I have attempted to examine rele- 
vant aspects of the topic in as much detail as is necessary. 1 

This study is divided into the following sections: 

1. The Modern Indo-Aryan Perfective Tenses 

2. A Linguistic History of Nepal 

3. A Guide to Terminology and Nepali Forms 

4. Old Malla Nepali 

5. Old Shah Nepali 

6. The Agent Marker le 

7. The Nepali Perfective Participle 

8. The Nepali Perfective Past Tense 

9. The Nepali Compound Perfective Tenses 
10. Conclusions 

One new concept is introduced in order to better discuss the Nepali
data: agentive syntax. By using this concept, I want to contrast ergative
and nominative syntax, in both of which the unmarked term controls verb
agreement, with the situation in Nepali where the subject term--whether
marked or unmarked--controls verb agreement. Essentially, agentive syntax
represents the combination of nominative verb agreement with ergative NP
marking.

1. The Modern Indo-Aryan Perfective Tenses.

1.1. The reflexes of the Old Indic perfective passive participle in -ta,
which came to be a generalized past tense in Middle Indo-Aryan, have gen-
erally followed two paths: (1) the perfective tenses of the Western Indian
languages (e.g., Hindi-Urdu, Gujarati, Marwari, Braj Bhakha) have preserved 
the ergative character of this OIA participle, while (2) the perfective 
tenses of the Eastern Indian languages (e.g., Bengali, Assamese, Awadhi, 
Maithili) have eliminated ergativity, so that the perfective tenses in these 
languages are nominative/accusative in syntax like the present, future, 
and other tenses of both the Eastern and Western languages. 

In Sanskrit, the ta-participle is ergative, so the subjects of intransi- 
tive verbs (1) and the objects of transitive verbs (2) control verb agree- 
ment, and the agent or logical subject of transitive verbs appears in the 
instrumental case (2): 2 

(1) a. ye ' yam atra srian'artham agat" (16.1) 

who she-FS-nom. here to bathe go-pp-FS-nom. 

'she who came here to bathe' 

b. kany'avacan'ad raja 'nyatra gatal) (100.9) 

girl-call king-MS-nom. away go-pp-MS-nom. 

'at the girl's bidding, the king went away' 

(2) a. tato raji^g ' 'darepa kanya pr^ta (100.5) 

king-MS-ins. girl-FS-nom. ask-pp-FS-nom. 

'then the king respectfully asked the girl' 

b. etTvati bhagavatya hiranyavatyai svapno dattal; (98.1) 

Devi-MS-ins. dream-MS-nom. give-pp-MS-nom. 

'at that time Devi sent a dream to Hiranyavati' 

(Source: Vetalapancavimsati , Emeneau 1934) 

In Hindi-Urdu, the past tense and the present perfect tense--both 
formed with the past participle--are ergative. The sentences in (3) show 
that the subject controls verb agreement for an intransitive verb, while 
those in (4) show that the transitive verb agrees with its direct object, 
unless the direct object is marked with the postposition ko. In this last
case (4d, 4f), the verb appears in the neutral third singular masculine form.

(3) a. larka ayā 'the boy came'

boy-MS come-pp-MS

b. ve larkiya calī 'those girls left'

those girl-FP

c. maĩ calā 'I have left'

I-MS go-pp-MS

(4) a. ram-ne kitab paṛhi 'Ram read the book'

-agt. book-FS

b. larkiyõ-ne khana khaya 'the girls ate the food'

girl-FP-agt. food-MS eat-pp-MS

c. maĩ-ne phal toṛe 'I plucked the fruit'

I-agt. fruit-MP pluck-pp-MP

d. ram-ne sitā-ko dekhā 'Ram saw Sita'

-agt. -dat. see-pp-MS

e. bhai-ne patr likha hai

brother-agt. letter-MS write-pp-MS be-3S

'the brother has written a letter'

f. naukrāni-ko bulaya

-agt. servant-FS-dat. call-pp-MS be-3S

'Kamla has sent for the maid servant'

(Source: Central Hindi Directorate 1975) 
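
The agreement pattern described for the Hindi-Urdu perfective can be restated as a small decision procedure. The Python sketch below is illustrative only: the function name, the (gender, number) feature tuples, and the object_has_ko flag are assumptions introduced here for exposition, and the feature values are read off the glosses of (3) and (4) (the masculine singular value for Ram is likewise assumed).

```python
from typing import Optional, Tuple

Features = Tuple[str, str]  # (gender, number), e.g. ("F", "S")

# Neutral form used when no NP can control agreement: third singular masculine.
DEFAULT: Features = ("M", "S")


def perfective_agreement(
    subject: Features,
    direct_object: Optional[Features] = None,
    object_has_ko: bool = False,
) -> Features:
    """Return the features shown by a Hindi-Urdu perfective participle.

    Intransitive verbs (no direct object) agree with the subject.
    Transitive verbs agree with the direct object, unless the object
    carries the postposition ko, in which case the neutral form appears.
    """
    if direct_object is None:
        return subject
    if object_has_ko:
        return DEFAULT
    return direct_object


# Feature values taken from the glosses of (3) and (4).
print(perfective_agreement(("M", "S")))                    # boy-MS, intransitive
print(perfective_agreement(("M", "S"), ("F", "S")))        # object book-FS
print(perfective_agreement(("F", "P"), ("M", "S")))        # object food-MS
print(perfective_agreement(("M", "S"), ("F", "S"), True))  # object marked with ko
```

Note that the agent never contributes its own features in the transitive cases; that is what makes the pattern ergative rather than nominative.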

On the other hand, in Modern Awadhi, intransitive and transitive verbs 
agree with the subject, and the subjects of both intransitive and transitive 
sentences appear in the unmarked nominative case: 




'I went'

I-MS go-pp-1S

mai ga hau 'I have gone'

I-MS go-pp-S be-1S

mai dekheu 'I saw'

I-MS see-pp-1S

mai dekhe hau 'I have seen'

I-MS see-pp-1/2 be-1S

'they went'

they go-pp-3P

gae haĩ 'they have gone'

go-pp-P be-3P

'they saw'

dekhini hai 'they have seen'

they see-pp-3P be

(Source: Lakhimpuri dialect, Saksena 1971) 

1.2. The preservative and eliminative historical developments which have
produced this split in the Indian languages are well-documented in Hock
1981 and Stump To Appear. Briefly, we may cite the leveling of MIA case
distinctions, the use of a dative/accusative postposition which blocks verb
agreement with the direct object (cf. ko in Hindi-Urdu), and most important-
ly, the introduction of pronominal suffixes on the past participle agreeing
with the agent as contributing to the loss of ergative syntax in the East; 3
whereas in the West, an ergative postposition was often introduced to pre- 
serve agent forms (cf. ne in Hindi-Urdu), and agreement with the object was 
extended even to direct objects marked with the new dative/accusative post- 
position in a few languages, thus preserving the ergative pattern. 

1.3. The perfective tenses in Modern Nepali represent a mixture of these 
two paths of development, for on the one hand, agents are marked with the 
postposition le (which is also the instrumental postposition), while on the
other hand, the perfective verb agrees with the intransitive subject and 
transitive agent in person, number, and gender. Thus: 



syam-sita bajar 


ga*^ 'I went to the bazaar with Shyam' 







'I ate dinner'

my-MS friend-MS stay-pp-MS be-3MS

'my friend has lived in Kathmandu'

b. ram-le yo kitap pareko cha

-agt. this book read-pp-MS be-3MS 

'Ram has read this book' 

(9) a. hamiharu hijo śikago-baṭa ayau

we-pl.-agt. yesterday -from come-pp-1P

'we came from Chicago yesterday'

b. hamiharu-le syam-sita kura garyau

we-pl.-agt. -with talk do-pp-1P

'we talked with Shyam'

(10) a. keṭiharu patan gaeki chin

girl-pl. go-pp-FP be-3FP

'the girls went to Patan'

b. keṭiharu-le kam gareki chin

girl-pl.-agt. work do-pp-FP be-3FP

'the girls have done the work'

(Cf. Clark 1963; Verma & Sharma 1979)

In addition, it is not uncommon to find the ergative marker le extended to
transitive subjects in other tenses (cf. Clark 1963: 92, 126): 

(11) a. ram-le kam garla 'Ram will do the work'

-agt. work do-3S-fut. 

b. rame-le ali din-pachi ghadi pāucha

-agt. few day-after watch get-prs. -be-3S 

'Rame will get a watch in a few days'

1.4. The evolution of the Nepali ergative system has received little, if 
any, attention, although the synchronic aspects of Nepali ergativity have 
been discussed by various linguists, e.g., Abadie 1974, Verma 1976, and 
Kachru & Pandharipande 1979. Grierson (1916, iv: 26-27) suggests that the
development of this mixed system occurs through the influence of Tibetan 
languages, some of which use an ergative marker for the agent in all tenses. 
In Tibetan, for example, intransitive subjects (12) and direct objects (13) 
are unmarked, while agents of transitive verbs occur in the ergative case (13): 

(12) a. sang-nyi yok-po dro-ki-re 

tomorrow servant-nom. go-fut. 

'the servant will go tomorrow' 

b. kho shi-ga-tse ne pha-ri che-ne bang-tok la 

he-nom. from via to 

chhim-pa-re 'he went from Shigatse to Bangtok via Phari' 

(13) a. yok-po kho dung-gi-du 'the servant is beating him' 

servant-agt. he-nom. beat-prs. 

b. khö nga-la dung-song 'he beat me'

he-agt. I-obj. beat-pst. 

(Sources: Bell 1919; Roerich & Lhalungpa 1972)

However, Tibetan ergativity is a nominal-marking system; verbs need not
agree with any NP in the sentence. While the close contact Nepali speakers
have had with speakers of Tibetan languages may explain the extension of
le to other tenses besides the perfective ones, it cannot account for the
verb agreement system of either Modern Nepali or older forms of the language. 

Below, I shall sketch the history of Nepali ergative syntax in the
perfective tenses from the earliest records (c. 1350 A.D.) through develop-
ments in the modern language. In this examination of the evidence, I shall
point out: (1) Nepali eliminated ergative verb agreement very early in its 
history; (2) ergative syntax is later reintroduced through the development 
of a new perfective participle and is lost again in the early modern period; 
(3) Tibetan models may have influenced the syntax of Nepali, but the "in- 
fluence" seems to be an areal phenomenon; (4) although Modern Nepali mixes 
"western" and "eastern" syntax in the perfective tenses, developments in 
Nepali ergativity can be motivated historically. 

2. A Linguistic History of Nepal.

2.1. The earliest recorded Nepali is found in the inscriptions and decrees 
of the Indo-Aryan Malla rulers of Nepal's Karnali River Basin. This Malla 
dynasty controlled an area including Western Nepal, the Kumauni and Garh- 
wali regions of Northern India, and portions of Southwestern Tibet between 
the 10th and 14th centuries (Tucci 1956; Petech 1958; Regmi 1966, I: 710- 
735). In the late 14th century, the Malla line ran out, and the cities 

of the kingdom passed into the hands of various petty rulers (Tucci 1956: 
121-128; Stiller 1973: chap. 2; Regmi 1975, I: 1-28). 

The inscriptional records from about 1500 A.D. show a mixture of dialects, 
but the principal one is that of various Indo-Aryan rulers of Western 
Nepal city-states, one family of which--from Gorkha--conquered and unified 
what is the present-day kingdom of Nepal in the 18th century. 

I shall refer to the language of the inscriptions up to c. 1450 A.D. as 
Old Malla Nepali after the old ruling family, and the language from about 
1500 to 1800 as Old Shah Nepali after the subsequent rulers of Nepal. 4 As 
we shall see, the language of the Shahs is quite different from that of the 
Mallas, particularly with respect to developments in the syntax of the 
perfective tenses. 

2.2. The origins of these Indo-Aryan peoples of Nepal and Northern India 
have been much discussed by modern linguists and historians (cf. Grierson 
1916, iv: 1-18; Hodgson 1833; Bendall 1903; Tucci 1956; Petech 1958; Regmi 
1966, 1975; Pokharel 1974: 41-63). And adding to the interpretation of
linguistic and cultural evidence are several vaṃśāvalīs, or chronicles, of
various ruling families (Wright 1877; Hasrat 1970; Regmi 1966, III-IV).

In general, it is accepted that the later Indo-Aryan rulers of Nepal, 
the Shahs of Gorkha, originate in Rajputana and flee to the Northern 
regions during the 11th to 15th century Muslim invasions and conquest of 
central India. Indeed, one Gorkha vaṃśāvalī traces that family back to
the city of Udaipur in the Mewari district of India (Wright 1877: 273ff.).

Support for this explanation comes from the many similarities between
the Pahari languages--Nepali, Kumauni, and Garhwali--and the languages of
Rajputana. For example, both the Rajasthani and Himalayan languages have
o in the singular and a in the plural of strong masculine nouns, whereas the
other Western languages have a and e, respectively (Kellogg 1893: §155,
§169-170, and Table III). Also, the Western Rajasthani languages and the
Pahari languages, as well as Marathi, have developed a future in -l-; thus,
Marwari marulo, Nepali marula 'I will strike' (Beames 1872-1879, III: §55;
Kellogg 1893: §502, §514, §523, and Table XX; Bloch 1914: §240-242;
Chatterji 1926: §728).

2.3. When these Indo-Aryan tribes came to Nepal is not so easy to estab- 
lish. We must assume that some, e.g., the Mallas, had been in the Himalayan 
regions for several centuries prior to their first extant inscriptions.
And if we can believe the Gorkha vaṃśāvalī, that particular family arrived
in the late 15th century. However, it's probably more reasonable to think
of these regions as being settled in waves from Rajputana, the new-comers 
gradually marrying into or taking over the already-settled Indo-Aryan 
peoples. So, both Old Malla and Old Shah Nepali may have existed in the 
region for several centuries, but only the ruling families would be likely 
to leave records for any one period, thus producing the apparent view of 
succession in real time. 

Certainly, the evidence of religion would support this view. The Mallas 
were probably Buddhists, for they always invoke Buddha, Dharma, and Sangha
in their inscriptions. But there was no doubt a sizeable Hindu population, 
as occasionally their inscriptions also cite Brahma, Visnu, and Isvar. 
The Gorkhas are Hindu, and most of the Tibetan tribes of the region mix 
elements of Buddhism and Hinduism in their religious practices (Wright 
1877: chap. 2; Hodgson 1874; Tucci 1956: 109-112). 

2.4. Besides the various dialects of Nepali, of course, numerous Tibeto-
Burman languages were and are spoken in Nepal (Grierson 1909, i). The
principal Tibetan language of medieval Nepal was Newari, the language spoken 
by the Tibetan rulers of the Nepal Valley, i.e., the cities of Patan, Bhat- 
gaon, and Kathmandu. These kingdoms had flourishing civilizations during 
the time of political fragmentation of the Indo-Aryan tribes in Western 
Nepal (Regmi 1966), although they all eventually fell to Prithvinarayan 
Shah in the late 18th century. 

There was contact between the Newars and the Indo-Aryans, as evidenced
by inscriptions and decrees from the Nepal Valley written in Nepali (e.g.,
Clark 1957), and the influence on the Newari language (Regmi 1966, II: 826ff.).

In addition, the Eastern Indian languages, Awadhi, Bengali, and Maithili, 
were used in the courts of the Newar kings (Chatterji 1926: 10). And 
many dramatic works are found written in Maithili at the Newar courts 
(Regmi 1966, II: 844-846). 

So, in addition to there being various Nepali dialects, determined by 
region and settlement patterns, we find in Nepal the conditions for con- 
tact among Nepali, the Eastern Indian languages, the Western Indian languages, 
old Himalayan Indo-Aryan dialects, and the indigenous Tibetan languages. 

3. A Guide to Terminology and Nepali Forms.

3.1. Before presenting the Old Nepali data, I think it will be useful to 
explain a number of terms and forms that appear in this study with meanings 
specifically related to the Nepali data, which may not coincide with their 
general meanings in other literature. 

3.2. First, we shall be concerned with verbs formed from the perfective 
participle, of which we must distinguish two forms in this data. The old 
Nepali perfective participle is formed with the verb root plus the suffix 
yo/ya. The verb stem + yo/ya form is that inherited from Middle Indo-Aryan, 
and it is the form we find in Old Malla Nepali data. 5 In Old Shah Nepali,
a new form of the perfective participle also appears, verb stem + ya + ko,
in which the ko appears to be the genitive postposition. As we shall see,
the verb stem + ya + ko form is extended to most environments in which the
verb stem + yo/ya form was found. In order to distinguish these forms, I
shall refer to them as the ya-participle and the yako-participle.

3.3. There are three finite tenses formed from the perfective participle 
which shall be discussed in this study. One is the simple past or
perfective past, which in Nepali is formed with the ya-participle plus personal
endings, e.g., garyl 'I did', garyo 'he did', garyau 'we did'. These 
endings are the same for transitive and intransitive verbs. Primarily, the 
Nepali data on the simple past comes from Old Shah Nepali, but most of the 
Indo-Aryan languages use the perfective participle alone as a past tense, 
so we can assume it existed in Old Malla Nepali too. 

The other two tenses are compound perfect tenses--the present perfect
and past perfect--which are formed with the perfective participle and the
conjugated auxiliary verb in the present (cha) or past (thiyo), respectively.
As there are two forms of the perfective participle, there are two forms
for each of the present and past perfect tenses. In Old Malla Nepali, we
find these tenses formed with the ya-participle; but in Old Shah Nepali, we
find these tenses formed with the ya-participle and yako-participle. So,
to distinguish them in the later period, I shall refer to them as the ya-
perfect and yako-perfect (since we will primarily discuss the present per-
fect, that will be the unspecified case, and the past perfect will be spe-
cified when necessary).
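
The inventory of compound perfect tenses just described can be enumerated mechanically. The following Python sketch is only a bookkeeping aid under stated assumptions: the dictionary and function names are invented for illustration, the auxiliaries are given in their citation forms cha and thiyo (in actual sentences they are conjugated), and the label "(past)" stands in for the text's practice of specifying the past perfect only when necessary.

```python
# Participle forms attested at each stage, as stated in the text.
STAGE_PARTICIPLES = {
    "Old Malla Nepali": ("ya",),
    "Old Shah Nepali": ("ya", "yako"),
}

# Auxiliary (citation form) used for each compound perfect tense.
AUXILIARIES = {
    "present perfect": "cha",
    "past perfect": "thiyo",
}


def compound_perfects(stage):
    """List (label, participle, auxiliary) for every compound perfect of a stage."""
    forms = []
    for participle in STAGE_PARTICIPLES[stage]:
        for tense, auxiliary in AUXILIARIES.items():
            # The present perfect is the unspecified case; the past is flagged.
            label = f"{participle}-perfect"
            if tense == "past perfect":
                label += " (past)"
            forms.append((label, f"{participle}-participle", auxiliary))
    return forms


for stage in STAGE_PARTICIPLES:
    print(stage)
    for label, participle, auxiliary in compound_perfects(stage):
        print(f"  {label}: {participle} + {auxiliary}")
```

Old Malla Nepali thus yields two compound perfects and Old Shah Nepali four, which is why the later period needs the ya-perfect / yako-perfect distinction.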

In addition to these finite tense forms, the perfective participle can 
be used as the main verb of dependent clauses. 

3.4. As for syntax, we must note that we are dealing with participles for 
most of this study, and so agreement in number and gender with the modified 
NP will be found with these verb forms, though not always. 


In order to refer to the controlling NP, we must discuss what con-
stitutes the "subject" of these various verb forms. The perfective partici-
ple as inherited from OIA is ergative; however, that syntax has not been
retained in the Nepali tenses. I shall here describe the syntax found in
the perfective tenses in this data; after Dixon 1979, I shall use A
(transitive subject/agent), S (intransitive subject), and O (transitive
direct object) to refer to the various NPs in a sentence.

In ergative syntax, the S and O are unmarked and they both control verb
agreement, while the A is marked:

(14) a. chori patan gaeki che 'the girl went to Patan'

go-pp-FS be-3FS

b. sipaiharu patan gaeka chan 'the soldiers went to Patan'

go-pp-MP be-3MP

c. *ram-le chori dekheki che 'Ram saw the girl'

-agt. girl-FS see-pp-FS be-3FS

d. *ram-le sipaiharu dekheka chan 'Ram saw the soldiers'

-agt. soldier-MP see-pp-MP be-3MP

*not grammatical in Modern Nepali

In nominative syntax (or nominative/accusative syntax), the S and A are
unmarked and control verb agreement, while the O may be marked:

(15) a. ma patan janchu 'I go to Patan'

I go-prs.-be-1S

b. sipaiharu patan janchan 'the soldiers go to Patan'

soldier-MP go-prs.-be-3MP

c. ma ram-lai dekhchu 'I see Ram'

I -dat. see-prs.-be-1S

d. sipaiharu ram-lai dekhchan 'the soldiers see Ram'
soldier-MP -dat. see-prs.-be-3MP

Agentive syntax I shall use to describe the situation in which S and A
control verb agreement, S is unmarked, O may be marked, and A is always
marked:

(16) a. ma patan gaya 'I went to Patan'

b. 'the soldiers went to Patan'

c. I-agt. -dat. see-pp-1S

d. sipaiharu-le ram-lai dekhya 'the soldiers saw Ram'
soldier-MP-agt. -dat. see-pp-3MP

Each of these three terms--ergative, nominative, agentive--represents
a specific system of verb agreement and NP marking; and each represents a
syntactic system which can be found in this Nepali data. A major point in
this research is that Nepali evolved an agentive syntax in the perfective
tenses from the ergative syntax inherited from MIA, and it is the processes
through which this agentive syntax developed with which this paper is
concerned.
3.4. In summary, I present Table 1 which outlines the tenses and syntax 
found at the major stages of the language to be discussed in this study. 

Table 1. Perfective Tenses in the History of Nepali

Time Period: Middle Indo-Aryan; Old Malla Nepali (c. 1350-1450); Old Shah Nepali; Modern Nepali

4. Old Malla Nepali.

4.1. Pokharel 1974 gives five Nepali inscriptions of the Mallas dating 
from 1336 to 1376 A.D., and five inscriptions of their immediate successors 
in the region dating from 1393 to 1437. These ten inscriptions, represent-
ing Old Malla Nepali (OMN), are repetitious and formulaic, so they really
don't provide all the evidence we would like for describing the grammar of 
the language. Certain aspects of the perfective tense system, however, are 

Examples of intransitive (20) and transitive (21) verbs in the perfect-
ive tenses are given below: 6

(20) a. sTTryagrahana sarvakar akar kari ^la S 

sun-eclipse all-tax exempt do-cp field-MP 

pasa bhaya (1337) 

gift be-pp-MP 

'on the occasion of the eclipse of the sun, having exempted 

them from all taxes, these five fields have become a gift' 

b. viutharpu raja kari akrya" bha?"? pasa" bhai (1337) 

rule do-cp exempt-pp bond-FS gift be-pp-FS 

'having been authorized in Piutharpu, this tax-free bond 
has become a gift' 

c. mah'arajadhiraja valirajT-ki maya bhaicha (1398) 

-gen.-FS gift-FS be-pp-FS-be-3S 

'this was a gift of Maharaja Baliraj' 


d. ramadasa padhya tin laya suda maya bhaicha (1398) 

three for with gift-FS be-pp-FS-be-3S 
'this was a gift for Ramdas Padhya and three others' 

e. udaivrahma ajita vrahma-lai m'as vasta-ko cau^hag 

-agt. month property-gen. -MS quarter 

toli set-udho dhig vagaj-ghaf^a ij^marpata-ko 

lower field-up to hillock river bank-watermill -gen. -MS 

cauthag ekatra gh'ali pasa bhayo cha (1437) 

quarter together put-cp gift be-pp-MS be-3S 

'through Udaya Brahma and Ajita Brahma, there has become a 

gift of a quarter of Damarpata including the river bank, 
watermill, and hillock up to the lower field together 
with a quarter of the monthly duty' 

We can see that an intransitive verb in the simple past (20a, b) or present
perfect (20c, d, e) agrees with its subject in gender and number.

(21) a. asela ma pani acefdrarkasthayi sarvavadhavinirmukta 

now I also moon-sun-stand all-trouble-relieved 

catutisima paryanta visuddha, sarvakara akar , 
four-border ends pure all-tax exempt 

sarvasevlvirahit kanakapatra-ki hjia^a kari sasan 

all-service-free copperplate-gen. -FS bond-FS do-cp copy 

doholi kari golhu joisT pasa kari akrya chtf (1356) 
double do-cp -dat. gift do-cp exempt-pp be-lS 

'now I have granted this tax-free bond of this copperplate to 
Golhu Joisi, having made two copies, having freed from 
all service, having exempted it from all taxes, having 
made it pure to the four borders, having relieved it of 
all duty as long as the sun and moon endure' 


b. athaga ekatkalyama-ko jhusu joisT pasa kiya cHu (1393) 
eighth -gen. -MS -dat. gift do-pp be-lS 

'I have given an eighth of Ekatkalyama to Jhusu Joisi' 


c. caudilagava-ko 'aio 1, haku-ko alo 1 yekatra a*!" 2 

-gen. -MS field -gen. -MS field together field-MP 

tile kuse sahit sakalpa ghali that kari 

sesamum seed grass with avowal place-cp decide do-cp 
siusarma visnudasa pasa kiya chau (1404) 

gift do-pp be-lP 

'having made the decision, having made the avowal with sesamum 
seed, we have granted Shiva Sharma and Visnu Das together 
two fields, one from the Caudila village, and one from Haku' 

cacj^i sivanirm^ila gari 

lift-cp -pure do-cp 

pasS" ki akr/g chau (1437) 

gift do-pp exempt -pp be-lP 

'having done the sacrifice in Somnath and made it Shivapure, 
we have granted this to Andhu Joisi' 

The sentences of (21) show that the agent of a transitive verb in the 
present perfect can control verb agreement in number and person with the 
auxiliary. (No examples are found in this data of transitive verbs in the 
perfective past with agents expressed.) 

4.2. The transitive verb phrases that appear in these texts present some
etymological problems, because the past participle found most frequently,
akrya, has no equivalent verb in either Modern Nepali or Sanskrit. But in-
ternal evidence indicates that we should accept it as the perfective part-
iciple of a verb *akr- 'to grant a gift, to exempt from duty'.

4.2.1. All the sentences in (21) are Nepali versions of common Sanskrit 
formulas found in copperplate inscriptions throughout India, as we can see 
by comparing them with the two Sanskrit (22a, b) and one Oriya (22c) in- 
scriptions below: 

(22) a. maya ... chuyipako n?ma grTmal? ... t{r] i-chatvartsad - 
I-ins. name village-MS-nom.

atharvvani-ka-kule-bhyo=graharTkkritya datta [h] 

family-P-dat. land-grant-cp give-pp-MS-nom. 

'I have given the village named Chuyipaka as a land grant to
forty-three families of Brahmans who study the Atharva-Veda'

(Source: Fleet 1883. Date: 628 A.D.) 

b. bhik^iTna^ dvijadharmabhaijakapani sutradhargkanaip ca / 

-P-gen. Brahman-preacher-P-gen. artisan-P-gen. and 

nijar'ajye sarvakaras ten^^camdrarkafSrakam tyaktal? 

kingdom all-tax-MP-nom. he- ins . -moon- sun- endure relinquish-pp-MP-nom. 

'he has forgiven all the taxes of the Bhiksus, Brahmans, 

preachers, and artisans as long as the sun and moon endure' 

(Source: Tucci 1956. Date: 1354 A.D.) 

c. puru^ottamapura sasana bhumT caudasa-a§tottara 

rent-free-estate land 

h'S 1408 ti dana deluip 

gift give-pp-lS 

'I have given 1408 batis of land as a rent-free estate' 

(Source: Tripathi 1962: 284-285. Date: 1472 A.D.) 

In such phrases expressing the acts of giving or of exempting estates from
taxes, we expect to find a verb of giving in the perfective tense. In the
Sanskrit example (22a), the form is the ta-participle of dā 'to give'. In
the Sanskrit example (22b), we find the perfect participle of tyaj 'to
relinquish'. And in (22c), we find the Oriya reflex of the ta-participle,
de-l-, which is not ergative but is still the Oriya perfective past tense.
Thus, *akr- by context should be a verb of giving in the perfect in the
OMN inscriptions.

4.2.2. The verb *akr- is also found paralleling injunctive forms in these

(23) a. yo bhasa abhayamalla-ki sakha pasa ki

this bond-FS -gen.-FP descendents gift do-pp-FS 

akra joisi mahir'aja joisi-ki sakh'a celi-ko 

exempt- inj. -gen-FP descendents daughter-gen. -MS 

celo ^i bhuca (1376) 

son etc. enjoy-inj . 

'let the descendents of Abhayamalla grant this bond; let the 

descendents of Joisi Maharaja, the sons of daughters, etc. 
enjoy this' 

b. medini brahma-ki sT^a celi-ko 

-gen.-FP descendents daughter-gen. -MS son 

pasa" kara / dattu joisi deuraja joisi jhusu 
gift do-inj. 

joisi-ko celi-ko celo bhiyca (1393) 

gen. -MS daughter-gen. -MS son enjoy-inj. 

'let the descendents, sons of daughters, of Medini Brahma keep 
this promise; let the sons of daughters of Dattu Joisi, 
Devaraj Joisi, and Jhusu Joisi enjoy this' 

So, use in context and the ability to be conjugated support the conclusion
that *akr- is a verb.

4.2.3. Finally, the form of *akr- in the inscriptions parallels that of a
verb like kar- 'to do', as in (21b, c). The ya suffix is the Nepali devel-
opment of the MIA ia for the perfective affix. The suffix is added to the
perfect verb stem, thus ki + ya 'done' and akr + ya 'exempted'.

So, despite the fact there is no Modern Nepali verb *akranu, meaning
'to bequeath, to make tax-exempt', we can assume these forms represent
periphrastic perfect verbs from an OMN verb *akr-.

4.3. It is also difficult to determine the form of the few expressed 
subjects of these transitive verbs. All we find is ma as in (21a). This 
must be some form of the first person singular pronoun, and it appears to 
be uninflected. At this stage of the language, such a form may well 
represent the leveling of nominative and instrumental pronoun case forms 
(Hoernle 1880: §430; Hock 1981), and so we cannot really tell whether ma
is instrumental or nominative.

4.4. The OMN data does provide some evidence for the IA ergative
construction, as opposed to the examples presented in section 4.1. In (24a),
the logical direct object eti vrtti controls verb agreement with the past
participle akri, while in the first clause of (24b), again the DO jo bhasa
controls ki thi, a past perfect form.

(24) a. eti vrtti purvili "adityamalla rai-ko pupyamalla rai-ko 
such grant-FS ancestors -gen. -MS -gen. -MS 

t'ar ad e i go sa i t i paya himjiu-ki suryagrahapa 

three clans -gen. -FS sun-eclipse 

candragrahaqa sakalpa ghali pas'? kari akri (1356) 

moon-eclipse avowal put-cp gift do-cp exempt-pp-FS 

'in the three clans of Taradevi, Gosai, and Himjiu, having 
made the avowal at the time of the eclipse of the moon 
and at the time of the eclipse of the sun, such a land 
grant has been promised through our ancestors Adityamalla 
the king and Punyamalla the king' 

b. jo bha?? prthvimalla rai-k'a pasa ki thi 

which bond-FS -gen. -MP gift do-pp-FS be-pst-FS 

tai bha'sa ma pasa ki akrya" chTT (1376) 

that-emp. bond I gift do-pp-FS exempt-pp be- IS 

'which bond had been promised by Prithvimalla the king, 
that bond I have granted' 

In the second clause of (24b), the same verb phrase pasa ki appears, but
here, the auxiliary verb chu agrees with the logical agent ma, and so the
complete verb phrase agrees with the agent. In both ergative clauses, the
logical agent appears in the genitive, marked by ko. Whether the genitive
phrases in these sentences are agent phrases is unclear. The genitive
could be used for agents in Sanskrit (Speijer 1886: §114; Hock In Press),
and it does appear marking agents in some Himalayan IA dialects (Grierson
1916, iv: 502, 570, 694). And if we look back at (20c) and (20e), we find
formulaic sentence types with pasa/maya bhayo 'there has been a gift', in
which the logical agent is marked by the genitive (20c) or instrumental
(20e), which also suggests there may be some variation in the use of these
two cases in certain contexts.

These data raise some questions in the interpretation of the OMN data,
which ultimately must be left without definite answers because of the lack
of crucial evidence. It appears that like other early IA languages, OMN
used the ergative construction; however, the only transitive sentences in
which this can be readily apparent are those with feminine or plural mascu-
line direct objects. Very often in other early IA languages, the agent
and nominative noun phrase had no distinguishing marker/inflection; and
very often neutral third singular masculine verb agreement is found (cf.
Chand in Beames 1872-1879, II: §57).

In general, OMN forms such as vytti akri and ma pasT akrya could be 
accounted for by these early IA characteristics, in that the former DO 
clearly agrees with the verb, and the latter DO may or may not agree, i.e. 
the sentence is ambiguous as to verb agreement. However, the obvious 
alternations in the auxiliary verb dependent upon the first person agent in 
transitive sentences require more explanation. 

4.5. All the examples in which we find the ergative construction have 
third person subjects, and those in which the auxiliary agrees with the 
agent have first person subjects. This might be significant, for specialization 
of ergative and nonergative constructions in the perfective tenses 
for certain persons has occurred in Marathi (Hock 1981; Stumg To Appear; 
Master 1964; Bloch 1914: §252; Kachru & Pandharipande 1979). 


In Old Marathi, there is a personal affix for second person agents 
with transitive verbs in the perfective tense (Master 1964: 132). Bloch 
(1914: 263-264) states that in Modern Marathi dialects, the construction 
in which the transitive subject is nominative and the verb agrees with it 
is strongest in the first and second persons, and in the third persons, the 
ergative and nominative constructions alternate. Grierson 1905 documents 
a variety of dialectal forms ranging from partial to full nominative systems 
with the perfective participle. 

We can illustrate the Marathi situation with some data from the Konkani 
dialects. In (25), the agent suffix--s for second singular, n for third 
singular, in for third plural--is added directly to the inflected perfective 
participle: 

(25) a. tu kam kel-'e-s 'you have done the work' 

you-S-nom. work-NS do-pp-NS-2S 

b. ;^^ pothi lihil-i-s 'you have written the book' 

you-S-nom. book-FS write-pp-FS-2S 

c. blpa-n ■ ■ . mithi marl-i-n ani 

father-MS-ins. embracing-FS strike-pp-FS-3S ins. and 

te-ts^ muko ghetl-o-n 

him-to kiss-MS take-pp-MS-3S ins. 

'his father embraced and kissed him' 

d. saheba-nT ma-la dil-e-nT 'the sahabs gave me a tip' 

-MP-ins. me-to give-pp-MS-3P ins. 

(Sources: Bloch 1914: 262-263; Grierson 1905: 210-216) 

In these forms, the third person agents (25c, d) are both marked with the 
instrumental postposition, and it is this postposition that is used as a 
suffix on the participle to indicate what the agent is. The second person 
form tu is nominative or instrumental, the two cases having fallen together, 
and the s suffix is borrowed from the second person personal endings 
used with intransitive perfective participles. 

Two points of similarity appear between the OMN data and these Marathi 
forms. First, the first and second person agents are innovative in chang- 
ing from the inherited ergative syntax to a syntax with a "nominative" 
subject and personal verb suffixes in the perfective tenses. Second, the 
innovation of agent verb marking is not as much a process of change in 
forms as it is a modification of the existing forms. Compare: 

(26) a. tu kam ke-l-e'-s Marathi 

2S-nom./ins. NS-nom. V-pp-NS-2S 

b. ma pasa akr-ya chtT Old Malla Nepali 

1S-nom./ins.? MS-nom.? V-pp-MS? 1S 

In both the Marathi and Nepali sentences, the agent is not distinguished 
as to case, and the verb is a form that could agree with the direct object, 
although in each case there is an addition to the verb phrase which marks 
the agent. 

If the analysis of the OMN data is parallel to that of the Marathi 
evidence, it would offer us some explanation of the forms that occur in OMN. 
The tendency to mark a nonthird person agent on the perfective participle 
may be the starting point for the weakening of ergative agreement. As we 
shall see in the Old Shah Nepali data, personal suffixes are also used on 
the perfective participle to agree with the agent; the use of the conjugated 
auxiliary in the compound tenses may be a parallel development. First and 
second person agents may be more susceptible because pronoun forms can be 
leveled, thereby obscuring case distinctions, while third person agents and 
nouns are more likely to retain some form of marking: the instrumental postposition 
in Marathi, and perhaps ko or le (the genitive or instrumental) 
in Nepali. And, as in Marathi, a complete restructuring of the verb phrase 
is not necessary to advance this type of innovation. In Marathi, of course, 
the full range of verb forms is available, but in the Nepali of this period, 
we have only the feminine forms ki^ and akri and the nonfeminine forms kiy? 


and akrya* , which are probably incomplete, and/or may reflect the beginning 
of the Nepali two-gender system. In any event, their distribution is not 
sufficient to allow us to make definite conclusions. 

4.6. We know then that OMN transitive verbs in the present perfect could 
agree with their agents, at least if the agent is first person. There is 
no apparent agent marker, but the case of the agent is indeterminate--we 
find the uninflected form ma, which could be nominative and agentive, and 
we find genitive phrases in some situations where third person agents are 
expected. Whether this represents a distinction made between nonthird 
persons and third persons, or pronoun vs. noun, or some other opposition, 
is not clear from this evidence. Evidence from Marathi suggests that nonthird 
person marking may be a starting point for any innovations in perfective 
verb marking, but without further OMN evidence, we cannot really 
say whether the differences in syntax between sentences with first person 
agents and those with third person agents (in genitive phrases) are 
systematic. 

5. Old Shah Nepali. 

5.1. The language of the inscriptions after 1500 represents quite a different 
form of Nepali. For example, OMN uses the bare stem for most subject 
and object NPs, or anusvara for dative/accusative objects, but Old Shah 
Nepali (OSN) has a set of postpositions for the various case roles, including 
le for ergative/instrumental, Oi / kana for dative/accusative (cf. 
Wallace 1981); OMN has kar- for the stem of 'to do', but OSN has gar-; the 
perfective participle in OMN is formed from the verb stem plus ^, while 

in OSN we find ya and yako for the perfective participle form, the latter 
apparently being ya extended by the genitive postposition ko; and OMN uses 
the nominative for the subject of injunctive verbs (27a), but OSN uses 
the ergative (27b): 

(27) a. chidy'a-k'a gava-ki cari k"ai:ana kohi na-pTva (1356) 

-gen.-O village-gen. -FS land to cut someone not-allow-inj . 

'let not anyone be allowed to take away land from the 
village of Chidya' 

b. anya"- anya svavasa pravasa kasai-le 

other-other own-descendent other-descendent someone-agt. 

dharma na-gh'ala (1529) 

not-destroy-inj . 

'let not anyone, one's own descendents, someone else's 

descendents, or any other person, destroy this dharma' 

5.2. The perfective tenses of OSN are much more fully documented than those 
of OMN, and they present a greater diversity of forms than those of OMN. 

We shall be concerned with three tense forms in OSN: first, the perfective 
past, which is formed from the yT-participle plus personal suffixes: 

(28) a. 7 rupiya agl~ 7 visa rupiya 53 than kapa4~ 
each twenty cloth 

ukildara-tira lagya (1751) 

overseer-toward attach-pp-3MP 

'the overseers received seven rupees each, 140 rupees total, 
plus 53 thans of cloth' 

b. devatarppana pitaratarppana sadhya "adi 

god-oblation ancestor-oblation evening-oblation etc. 

samasta karma jas-le garyo (1670) 

all who-agt. do-pp-3MS 

'whoever did all the duties, god-worship, ancestor-worship, 
evening-worship, etc' 

c. sri ra'ghau josi-le sri visvesvara stha'pana' garya (1712) 

-agt. enshrinement do-pp-3MH 

'Shri Raghau Joisi established this shrine for Shiva' 

second, the ya-stem present (and past) perfect, which is equivalent to 
the present perfect in OMN: 

(29) a. vadi-ka" set-mathi dasa mana 10 sakalpa bhayo cha (1679) 
stream-gen. -0 field-on 10 avowal be-pp-MS be-3S 

'a promise has been made for 10 manas of land on the flood- 

b. motipur-ka" 2 ala^ vaksi 

field-MP -gen. -MP field-MP give-cp 

diya" chau (1529) 

give-pp-MP be-1P 

'we have given two fields of Motipur' 

c. kusmapadhya-le r'aj^-k'a vac? p'ay'a bhaich a (1722) 

-agt. king-gen. -MP promise receive-pp-MP be-pp-be-3S 

'Kusmapadhya has received the promise of the king' 

and third, the yako present perfect, which incorporates a new form of the 
perfective participle, introduced into the inscriptions in the 16th century: 

(30) a. tes udi-ka ;et-mathi sakalpa bhayako cha (1679) 

that flood-gen. -0 field-on avowal be-pp-MS be-3S 

'a promise has been made on that flood-field' 

b. van pT^o inu tin jana'-le hlmu diyako cha (1590) 
forest hill these three man-agt. we-dat. give-pp-MS be-3S 

'these three men have given us the forest and hillside' 

c. musikot-m^ hftnra" maiya diySki hun (1800) 

-in our-H sister-FH give-pp-FH be-3H 

'our sister was married in Musikot' 

In the perfective past and ya-perfect, the subjects of intransitive verbs 
and agents of transitive verbs control verb agreement, even though the 
agents are marked with the ergative postposition le. The agents of the 
yako-perfect are also marked with le, but they are unable to control verb 
agreement. Subjects of intransitive yako-participles control verb agreement, 
and direct objects of transitive yako-participles may control verb 
agreement. The compound yako-perfect tenses are ergative. The other two 
tenses are neither nominative nor ergative, for the subject of an intransitive 
verb is unmarked and controls verb agreement, the object of a transitive 
verb is also unmarked--unless animate--but cannot control verb agreement, 
while the agent is always marked and always controls verb agreement. 

5.3. Whether the system illustrated by the perfective past and ya-perfect 
should be labelled "ergative" is debatable. Certainly, it would be useful 
to distinguish the syntax of these two tenses from that of the yako-perfect, 
which does conform to common definitions of ergativity.8 So, I shall adopt 
the term "agentive" for the situation in which all three terms--subject of 
intransitive verb, agent of transitive verb, and direct object of transitive 
verb--are treated differently. 
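
The three-way labelling used here can be stated mechanically. The following is only an illustrative sketch, not part of the original analysis: it assumes that each tense can be summarized by two features per argument (overt marking and control of agreement), and the names Treatment and classify_alignment are hypothetical. The feature values are simplified from the description in 5.2; in particular, direct objects in the yako-perfect are said only to be able to control agreement, and animate objects may be marked.

```python
from typing import NamedTuple


class Treatment(NamedTuple):
    marked: bool   # does the NP carry an overt postposition (e.g. le)?
    agrees: bool   # does the NP control verb agreement?


def classify_alignment(s: Treatment, a: Treatment, o: Treatment) -> str:
    """Label a tense by which of S, A, O are treated alike.

    s = subject of intransitive, a = agent of transitive,
    o = direct object of transitive.
    """
    if s == a and s != o:
        return "nominative"
    if s == o and s != a:
        return "ergative"
    if s != a and s != o and a != o:
        return "agentive"   # all three terms treated differently
    return "other"          # e.g. no distinctions at all


# Simplified feature values, assumed from the prose description in 5.2.
PERFECTIVE_PAST = dict(
    s=Treatment(marked=False, agrees=True),
    a=Treatment(marked=True, agrees=True),
    o=Treatment(marked=False, agrees=False),
)
YAKO_PERFECT = dict(
    s=Treatment(marked=False, agrees=True),
    a=Treatment(marked=True, agrees=False),
    o=Treatment(marked=False, agrees=True),
)

if __name__ == "__main__":
    print(classify_alignment(**PERFECTIVE_PAST))
    print(classify_alignment(**YAKO_PERFECT))
```

Under these assumed values the perfective past (and likewise the ya-perfect) comes out as "agentive" and the yako-perfect as "ergative", which is the distinction the terminology is meant to capture.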

5.4. Despite the fact that we know little about the perfective system of 
OMN, we can see there are two major differences between it and the perfective 
system of OSN. First, OSN has the agent marker le for transitive 
subjects and OMN does not. Second, ergative syntax in OSN is dependent 
upon the form of the perfective participle used in the present and past 
perfect tenses, whereas ergative syntax in OMN is (a) optional or (b) 
dependent upon the person of the agent. 

The perfective past is not a particularly common form in the early 
literature, but we can reconstruct much of its history from its later form 
and comparison to past tense developments in other languages. The OSN 
ya-perfect appears to be a continuation of the ya-perfect in OMN, but the 
yako-perfect is a new development. Below, I shall discuss the various 
aspects of the perfective system in OSN and their developments in later 
stages separately, beginning with the introduction of le, then moving to 
the perfective past, and finally to the two forms of the compound perfect. 
6. The Agent Marker le. 

6.1. The introduction of ergative le in the 16th century coincides with 
that of ne in the Literary Hindi of the same period (Hoernle 1880: §311; 
Beames 1872-1879, I: 270; Kellogg 1893: §196), and so it is tempting to 
conclude that the settlers who spoke OSN had indeed come recently from 
Rajputana and brought with them the Western Indian concept of using an 
agent postposition, or that they reinforced its use among those Nepali 
speakers in the area already. And certainly since Nepali shares so many 
other Western Indian characteristics (Hoernle 1880: Introduction; Chatterji 
1926: Introduction), this seems a reasonable assumption. 

6.2. In the early data from the OSN period, it is clear that the agent-marker 
le has about the same distribution as ergative postpositions in 
other Western Indian languages (cf. Hock 1981). For example, le marks 
transitive agents in the perfective past (31a), compound present perfect 
(31b), and injunctive (31c), and any subject in the obligational construction 
(31d,e). 

(31) a. sri r~ghau josi-le sri visVesvara sthapari^ garya" (1712) 
-agt. enshrinement do-pp-3H 

'Shri Raghau Joisi has established this shrine for Shiva' 

jana-le hamu diyako cha (1590) 

forest hill these three men-agt. we-dat. give-pp-MS be-3S 
'these three men have given us the forest and hillside' 

anya-anya svavasa prav 


other-other own-descendents other-descendents someone-agt. 
na-ghala (1529) 

not-destroy-inj . 

'let not anyone, his own descendents, someone else's descend- 
ents, or anyone else, destroy the dharma ' 

d. vaman-ke gai marniya-le tin din upav'as garnu (1723) 

brahman-gen. cow kill-inf . -agt. three day fast to do 

'he who kills the cow of a Brahman must do three days' fast' 

e. k'aski-1'ai choji gorkh"a-le kaKa j'anu cha (1755) 

-dat. leave-cp -agt. where to go be-3S 

'having left Kaski, where must Gorkha go?' 

But at least later in the period, during the 18th and early 19th centuries, 
we find the use of le spreading to agents in the nonperfective tenses. In 
the examples below, le marks the agents of the imperfective (32a, b), the 
present (32c, d, e), and the future (32f). 

(32) a. kali^ga de^a-ko tyo raja-kana sabai-le 

country-gen. -MS he king-dat. all-emp. -agt. 

madaichan (1825, VI) 

consider-impf . -emp. -be-3MP 

'everyone knows that he is the king of the land of Kalinga' 

b. manu$ya pasu p'3'chi sabai j'ata-ko bha§a 

man animal bird all-emp. being-gen. -MS language 

mai-le jadachu (1825, V9) 

I-agt. know-impf.-be-1S 

'I know the languages of all beings--men, animals, and birds' 


parantu hami-le ta kapat; ta misnya chainS ti (1766) 

furthermore we-agt. deceit mix-inf. be-neg.-1P 

'furthermore, we will not tolerate deceit' 

mana-ma nis'caya gari bTrabara-le ca'diki debT-ko 
mind-in certain do-cp -agt. goddess-gen. -MS 

garcha (1824, V4) 

praise do-prs. -be-3MS 

'having decided, Birbar praises the goddess Candika' 

e. kyan bhanaul"? vahida m'anchya-le darvar-ma 
why say-2P-fut. outside man-agt. palace-in 

vithiti garauchan (1775) 

treachery do-cause-prs. -be-3MP 

'why do you say that foreigners would cause disorder in the 
palace? ' 

f. mai-le hukara garagla (1825, V7) 
I-agt. command do-cause-1S-fut. 

'I will have you do my command' 

In addition, we can also find le marking the subjects of intransitive verbs: 

(33) a. tava vyadha-le tas talau-m? pahi pani "ayo . (1776) 
then hunter-agt. that pool-in water also come-pp-3MS 

'then too the hunter came to the water in that pool' 

b. samudradatta-le ratnaJTpa-le ujy'glo bhayako koth'a-in'a 
-agt. good-lamp-ins. light be-pp-MS room-in 

sutna-kana calyo (1825, V3) 

to sleep-obj. move-pp-3MS 

'Samudradatta went to sleep in a room lit by a beautiful lamp' 

6.3. It is not common among the Western Indian languages to use the ergative 
marker in this way (cf. Kachru & Pandharipande 1979); however, two factors-- 
one internal, one external--have probably contributed to the spread of le 
into these environments in Nepali. The first factor we shall consider is 
the structure of reduced clauses formed with the conjunctive participle and 
the interaction of these clauses with main clauses through the rule of con- 
junction reduction; and the second is the distribution of ergative syntax 
in the languages with which Nepali speakers are in contact. 

6.3.1. The conjunctive participle in -i, in more modern texts also -era 
(from -i plus ra 'and'), and also -ikana (from -i plus kana, the old dative/ 
accusative postposition), is formed with the perfective stem of the verb; 
thus, it is a perfective participle and should exhibit the properties of 
other perfective participles in the language, including ergative syntax. 
A construction with a conjunctive participial clause indicates two actions 
that closely follow one another or occur at the same time, or two actions 
one of which causes the other. When the underlying subjects of the two 
clauses are identical, conjunction reduction applies; thus, the agent or 
subject of the dependent clause would be deleted under identity with the 
agent or subject of the main clause (34a, b, c), and vice versa (34d, e, f). 

(34) a. jetKaba'-ko chora ma h ina din-ko 

father's eldest brother-gen. -MS son-H month day-gen. -MS 

bida paera hi jo kalkatt;a-bata ghar "ae 

leave get-cp yesterday -from house come-pp-3MH 

'yesterday my cousin came from Calcutta on a month's leave 
' " ~ ■ ■ iryo 

he sick be-cp die-pp-3MS 
'he became sick and died' 

c . saman bokne mahche-le ek chin ariera "aram garyo 
luggage carry-inf. man-agt. one moment halt-cp rest do-pp-3MS 

'the man carrying the luggage stopped and rested for a moment' 

d. mai-le tyas-lai bhetera ghar-ma g^ 
I-agt. he-dat. meet-cp house-in go-pp-1S 

'after I met him, I went home' 

e. us-le bikh khaera maryo 
he-agt. poison eat-cp die-pp-3MS 

'he died from eating poison' 

f . ma turanta b^hira dagurera gaTkana mera 

I at once outside run-cp go-cp my-MP 

chimekiharu-1'ai sodhe 
neighbor-pl.-dat. ask-pp-1S 

'I immediately ran outside and asked my neighbors' 

(Sources: Clark 1963: 160-177; Meerendonk 1949: 103-105) 

In these examples from Modern Nepali, the "nominative" NPs chora and u in 
(34a, b) are expected subjects for the intransitive main verbs ae and maryo, 
respectively, while the ergative NP manche-le is expected for the perfective 
transitive verb garyo in (34c). On the other hand, the main verbs of (34d, e), 
gae and maryo, are intransitive, so the surface subjects mai-le and us-le, 
respectively, are the agents of the transitive conjunctive participles 
bhetera and khaera, and it is these ergative NPs that must control deletion 
of the main verb subjects, which would be nominative. Similarly, the nominative 
NP ma, the subject of either intransitive dagurera or gaTkana, must 
control deletion of the expected ergative agent of sodhe in (34f). 

As we can see in the modern examples in (34) and the OSN examples in 
(35), nominative subjects and ergative agents are treated as being alike 
by the rule of conjunction reduction; therefore, conjunction reduction must 
be sensitive to underlying or logical subjects rather than just surface roles. 
In (35a, b), the underlying subjects of the main clauses are marked as ergative 
on the surface, and they have controlled deletion of the subjects of intransitive 
conjunctive participial phrases; while in (35c), the subject of an intransitive 
main verb controls deletion of the agent of a transitive participle. 

(35) a. butjhiyaT-le pani ina-ko taraha heri ba(huta) 

old woman-agt. also them-gen.-MS type see-cp very 

khusi bhai "aHarapurbaka ghara-ma bas^i (1825, VI) 
happy be-cp respect-custom house-in stay-cause-pp-3FS 

NB: heri--trans.; bhai--intrans.; basai--trans. & perf. 

'the old woman, having seen what kind of men they were, 
happily lodged them in her house with respect' 

b. mai-le smasaha-m'a basi mantra s'adhana' garnu cha (1825, VI) 
I-agt. cemetery-in stay-cp practice to do be-3MS 

NB: basi--intrans.; garnu cha--obligational & erg. 

'I, seated in the burning ground, must do some incantations' 

c. i^aputra pani tasai matrikan/a-ko hydaya-mT 

prince also that-emp. minister-daughter-gen. -MS heart-in 

dhyaha rakhi aphna* nagara-bige calya* (1825, VI) 
attention place-cp self city-in move-pp-3MH 

NB: rakhi--trans.; caly'a--intrans. & perf. 

'the prince, thinking only of the minister's daughter in his 
heart, returned to his own city' 

In sentences like those in (35), it is clear which subject has been deleted 
by conjunction reduction: in all three of the above cases, the subject of the 
embedded participial clause has been deleted. However, when both subjects 
are nominative on the surface (36a), or both are marked as ergative NPs (36b), 
then the structure of the sentence is ambiguous as to which NP has been deleted. 

(36) a. rajaputra ghod?-ma asvavara bhai caly? (1825, VI) 

prince horse-in mounted be-cp move-pp-3MH 

NB: bhai--intrans.; calyT--intrans. & perf. 
'the prince left mounted on a horse' 

b. matrlkanyT-le sanmana gari bujhiya-kana 

minister-daughter-agt. honor do-cp old-woman-dat. 

bas?yT bhala kusari gar in (1825, VI) 

stay-cause-cp good greeting do-pp-3FH 

NB: gari--trans.; basa'yT--trans.; garin--trans. & perf. 

'the minister's daughter greeted the old woman well, had her 
sit down, and treated her with respect' 

For sentences like those in (36), there are two possible surface 
structure analyses; these are illustrated in (37) for intransitive main 
and participial clauses like (36a), and in (38) for transitive main and 
participial clauses like (36b).9 

(37) a. ma khusi bhai pa^an gae' 

I happy be-cp go-pp-1S 

'I went to Patan happily' 

b. [ma [0 khusi bhai ] patan gae ] 

c. [ [ ma khusi bhai ] patan gae ] 

(38) a. mai-le pas" gari alo di^ 

I-agt. gift do-cp field give-pp-1S 

'I presented the field as a gift' 

b. [ mai-le [ pasa gari ] ^lo die ] 

c. [ [ mai-le pasa" gari ] "Slo di'e ] 

In both these cases, since the main and dependent clauses have the same 
transitivity and the same tense, the surface structures are ambiguous. 

The examples of conjunction reduction in OSN we've looked at so far 
have all resulted in the subject of the main verb controlling deletion and 
the subject of the participial clause being deleted, or else the roles of 
controller and victim were ambiguous because both clauses had either "nom- 
inative" or "ergative" underlying subjects. 

We've also seen that there is no restriction in the application of con- 
junction reduction on the identity of the surface roles of the controller 
and victim (cf. 35), i.e., ergative agents may control and cause deletion 
of nonergative subjects, and vice versa. There are, of course, also no 
restrictions on the tense or transitivity of the main clause. 

The conjunctive participial clause, however, will always be ergative, 
whatever the tense or transitivity of the main clause. When the main clause 
verb is nonperfective and/or intransitive, and the conjunctive participle 
is transitive, there would be a conflict between the surface case of the 
two subjects--a transitive agent in the conjunctive participial clause would 
be ergative, and the subject of the main clause would be nominative. A 
conflict would also occur when the participial clause was intransitive and 
the main clause was transitive and perfective. 

Conjunction reduction would still apply in either of the above cases, 
and surface structures like those in (39), (40), and (41) could arise from 
the conjunction of the two clauses in (39a), (40a), and (41a), respectively. 

(39) a. mai-le pas'? gare ra ma patan gae 

I-agt. gift do-pp-1S and I go-pp-1S 

'I made the gift and I went to Patan' 

b. ma pasa gari pa^an ga^ 

c. mai-le pasa* gari pa^an gae 

'having made the gift, I went to Patan' 

(40) a. mai-le pasa* gare* ra ma alo dinchu 

I-agt. gift do-pp-1S and I field give-prs.-be-1S 

'I made the gift and I shall give the field' 

b. ma pas? gari alo dinchu 

c. mai-le pasa* gari alo dinchu 

'having made the gift, I shall present the field' 

(41) a. ma pa^an ga%^ ra mai-le pas? gare* 

I go-pp-1S and I-agt. gift do-pp-1S 

'I went to Patan and I made the gift' 

b. mai-le pa'tan gai pasa gare 

c. ma patan gai pasa gare 

'having gone to Patan, I made the gift' 

In these sentences, the main clause subject may control deletion of the 
participial clause subject (the 'b' sentences) and so appear in the surface 
structure. Such sentences are not particularly unusual, because the main clause 
subjects usually do control deletion. However, when the subject of the reduced 
participial clause controls deletion (the 'c' sentences), then the 
case of the surface subject conflicts with the expected case of the subject 
of the main verb. Thus, in (39c) and (40c), the expected case of the main 
verb subject would be nominative, and the surface subject is ergative; and in 
(41c), the expected case of the main verb subject is ergative, and the surface 
subject is nominative. (I shall temporarily ignore sentences like (41c) in 
most of the following discussion.) 
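
The case conflicts just described follow from two simple rules, which can be sketched in code. This is a hedged illustration only: the function names are hypothetical, and the sketch assumes a simplified grammar in which a finite clause expects a le-marked subject only when it is both transitive and perfective (the injunctive and obligational uses of le in (31) are ignored), while a conjunctive participle, being built on the perfective stem, expects a le-marked agent whenever it is transitive.

```python
def expected_case(transitive: bool, perfective: bool) -> str:
    """Case expected for the subject of a finite main clause (simplified)."""
    return "le" if (transitive and perfective) else "nom"


def participle_case(transitive: bool) -> str:
    """Case expected for the subject of a conjunctive participle.

    The participle is formed on the perfective stem, so only its
    transitivity matters in this simplified model.
    """
    return "le" if transitive else "nom"


def conjunction_reduction(part_transitive: bool,
                          main_transitive: bool,
                          main_perfective: bool,
                          controller: str) -> tuple:
    """Return (surface_case, conflict) for a reduced sentence.

    controller is "main" or "participle": the clause whose subject
    survives on the surface after deletion under identity.
    conflict is True when the surviving subject's case differs from
    the case the main verb would expect.
    """
    if controller not in ("main", "participle"):
        raise ValueError("controller must be 'main' or 'participle'")
    main = expected_case(main_transitive, main_perfective)
    part = participle_case(part_transitive)
    surface = main if controller == "main" else part
    return surface, surface != main


if __name__ == "__main__":
    # (39c): transitive participle, intransitive perfective main verb
    print(conjunction_reduction(True, False, True, "participle"))
    # (40c): transitive participle, transitive nonperfective main verb
    print(conjunction_reduction(True, True, False, "participle"))
    # (41c): intransitive participle, transitive perfective main verb
    print(conjunction_reduction(False, True, True, "participle"))
```

On these assumptions the 'b' sentences never show a conflict, since the surviving subject is by definition the main clause subject, whereas each 'c' sentence does.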

We have seen that sentences in which the participial clause subjects 
control deletion can occur in Modern Nepali; sentences in which the case of 
the surface subject and the expected case of the subject of the main verb 
conflict also are found in the OSN data. For example: 

(42) a. tas vagh-le snTn gari Hat-nia kus likana 
that tiger-agt. bath do-cp hand-in grass take-cp 

va4o talau-k? tir-ma ubhi-rahyo (1776) 

big pool-gen. -0 direction-in stand-remain-pp-3MS 

'having taken a bath, that tiger was standing near the big 
pool with grass in his hand' 

b. pheri baija* parakramT raja bikramasena-le tas-kana 
again great powerful king -agt. him-dat. 

kadha-nia r'3khi calya (1825, Vll) 

shoulder-in place-cp move-pp-3MH 

'after placing him on his shoulder again, the great and 
powerful King Bikramasena left' 

c . tahi basi rahyaka coraharu-le basumatT-k'a 

there stay-cp remain-pp-MP thief-pl . -agt. -gen.-O 

jar a puruga-kana jhudai marT aphuharu 

adulterer man-dat. hang-impf . -emp. kill-cp oneself-pl. 

bhIgT gayachan (1825, V3) 

escape-cp go-pp-be-3MP 

'the thieves who had remained behind had themselves fled 
after they killed Basumati's lover by hanging' 

(43) a. jnani jana-le dhan-jiv $arccikana pani paropak'ar 

wise man-agt. wealth-life spend-cp also charity 

garchan (1776) 


'having used up their lives and wealth, wise men do charity' 

b. tyo suni bicara gari mai-le chini diul'a (1825, V3) 
that hear-cp thought do-cp I-agt. resolve-cp give-lS-fut. 

'having heard that and thought about it, I shall decide' 

c. taha pugy" pachi garuda-le bhaksana gari 
there arrive-pp after -agt. eating do-cp 

rakhnya' chaina (1825, V16) 

place-inf. be-neg.-3S 

'after you have arrived there, Garuda will not keep you 
without eating you' 

In (42), the main verbs, ubhirahyo , calya" , and gay'achan , are all intransitive, 
but the surface subjects, v'agh-le , bikramasena- le , and coraharu-le , respect- 
ively, are all marked as ergative; and each sentence has at least one de- 
pendent, participial clause which is transitive and so would have its under- 
lying subject marked as ergative. In (43), the main verbs are all transitive 
and nonperfective, and for these, too, we would expect nominative subjects; 
yet, in each sentence, there is a conjunctive participial phrase the ergative 
NP of which has apparently controlled deletion of the main clause subject. 
So, in (42) and (43), the main clause nominative subject has been deleted. 

Sentences in which the ergative NP of the main clause is deleted through 
identity with a nominative conjunctive participle subject also occur: 

(44) k^atisTla prasanna bhai raja-ko bahutai stuti garyo (1825, V25) 

pleased be-cp king-gen. -MS much-emp. praise do-pp-3MS 

'pleased, Kshamtishila praised the king greatly' 

In (44), the apparent surface realization of the underlying subject of the 
transitive, perfective verb garyo is kgatisTla , a nominative NP and subject 
of bhai , an intransitive conjunctive participle. 

All these OSN examples involve a conflict between the case of the 
surface subject and the expected case of the subject of the main verb, 
rather than the structure of the sentences being ambiguous because the sur- 
face subject would be appropriate as the subject of either the dependent 
or main clause. As we have seen, the only restriction on the application of 
conjunction reduction is that the underlying subjects of the verbs be identical; 
however, in the OSN data sentences in which the main clause subject controls 
conjunction reduction occur much more frequently than those in which the 
conjunctive participle subject controls deletion. Given that there are exam- 
ples in which the surface subject is ambiguously either the main clause or 
participial clause subject, and the fact that the surface subject is gener- 
ally the main clause subject, sentences like those in (42) and (43) could be 
misanalyzed as having surface structures in which the surface subjects are 
understood as being the subjects of the main clauses; thus, they could also 
be "ambiguous" and have two possible analyses like those sentences in (36), 
cf. (37) and (38). Thus: 

(45) a. mai-le pasT gari pa'tan ga'e' 

'having made the gift, I went to Patan' 

b. [S [S mai-le pas" gari ] patan ga'e ] 

c. [S mai-le [S pasa! gari ] patan gae* ] 

(46) a. mai-le pasa* gari alo dinchu 

'having made the gift, I shall give the field' 

b. [ [ mai-le pasIT gari ] alo dinchu ] 

c. [ mai-le [ pasl gari ] alo dinchu ] 

The sentences in (42) and (43) could be reanalyzed as having le-marked sub- 
jects in both clauses, or as having a le-marked NP as the underlying subject 
of the main clause, which controls conjunction reduction and appears on the 
surface. Analyses such as those in (45c) and (46c) would be more consonant 
with the predominant sentence structure in which the surface subject is the 
main clause subject. 

Once le-marked NPs are reanalyzed as subjects of intransitive or non- 
perfective transitive verbs in these constructions with conjunctive partici- 
pial clauses, then by analogy they could appear as subjects of nonperfective 
and/or intransitive verbs in sentences without participial clauses. Consider 
the solutions to the four-part analogies below: 

(47) mai-le pasa gari "31o die* : mai-le "Slo die :: 
mai-le pasa gari alo dinchu : mai-le "Slo dinchu 

mai-le pas's gari patan gae : mai-le pl^an gae 

Surface subjects of sentences with conjunctive participles are gener- 
ally the subjects of the main verb, and so ambiguous sentences produced by 
the operation of conjunction reduction may be resolved in that direction, 
with misanalyses of surface structures like the solutions above resulting. 
The solutions to the analogy equations are sentences with nonhistorical 
ergative subjects. This provides an explanation for the occurrence of such 
subjects in Nepali and not, for example, in Hindi-Urdu; in the latter lan- 
guage, conjunction reduction operates in sentences with the kar construction, 
but it is always the subject of the dependent clause that is deleted, e.g.: 

(48) ram-ne khan'a kha" kar kitab partTT 

-agt. food eat- cp book-FS read-pp-FS 

'after he ate, Ram read a book' 

(49) * ram-ne khan'a kha kar dilli gayS 

-agt. food eat-cp Delhi go-pp-MS 

'after he ate, Ram went to Delhi' 

In Hindi-Urdu, the opportunity for misanalysis of such constructions would 
not arise, because only one of the two subjects can control deletion, and 
so there would be no ambiguity. 

So, one factor that may be contributing to the spread of le-marked NPs 
in historically nominative environments is the conjunctive participial clause, 
which is formed from the perfective participle and thus is ergative. Through 
conjunction reduction the ergative agents of these conjunctive participles 
may appear to be the surface subjects of nonperfective and/or intransitive 
verbs, thus allowing speakers to analyze a le-marked ergative subject as 
being appropriate in a nominative context. 
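
The four-part analogies in (47) can be worked through mechanically. The sketch below is illustrative only and rests on stated assumptions: the transliterations are normalized without diacritics, the helper name solve_analogy is hypothetical, and the "analogy" is reduced to removing from the target sentence whatever words the model pair drops (here the participial clause pasa gari).

```python
def solve_analogy(model_full: str, model_short: str, target_full: str) -> str:
    """Solve  model_full : model_short :: target_full : X.

    X is obtained by deleting from target_full the words that occur in
    model_full but not in model_short (a deliberately crude stand-in
    for dropping the conjunctive participial clause).
    """
    kept = set(model_short.split())
    dropped = {word for word in model_full.split() if word not in kept}
    return " ".join(word for word in target_full.split() if word not in dropped)


if __name__ == "__main__":
    model_full = "mai-le pasa gari alo die"    # historical ergative subject
    model_short = "mai-le alo die"
    # Nonperfective transitive main verb
    print(solve_analogy(model_full, model_short, "mai-le pasa gari alo dinchu"))
    # Intransitive main verb
    print(solve_analogy(model_full, model_short, "mai-le pasa gari patan gae"))
```

Both solutions keep mai-le as the subject of a verb that historically takes a nominative subject, which is the nonhistorical pattern the analogy is meant to explain.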

Whatever trend there is in Nepali to neutralize the le/Ø distinction in 
marking subjects, it appears to be toward the mai-le alo dinchu and mai-le 
patan gae types rather than toward the ma alo die solution, arising from 
sentences like those in (44), which we have ignored up to now. That is, 
rather than falling out of use, the postposition le has been increasing the 
number of environments in which it is allowed (cf. Abadie 1974; Kachru & 
Pandharipande 1979). One reason why this solution may be favored is the 
second factor mentioned above--Nepali is spoken in contact with languages 
that do use an ergative marker for all transitive agents. 

6.3.2. The Tibetan languages bordering on the Indo-Aryan regions of India 
and Nepal have ergative morphology in most tenses (cf. Bauman 1979); and the 
Indo-Aryan languages of the Himalayan region do show a tendency to expand 
the use of the ergative case. 

For example, Newari, a Tibetan language, marks transitive agents in all
tenses:


(50) a. ji paradesan vaya 'I come from a foreign country'

I-nom. foreign country-ins. come-prs. 

b. amo dhu jin mocake dhuno 'I killed this tiger' 

this tiger-nom. I-agt. destroy finish-pst. 

(51) a. thva bikramadit raja thava rajy vanam

this king-nom. self kingdom come-pst.

'King Bikramaditya went to his own kingdom'

b. thathe raj^-syay ajna biyava 

when king-agt. order-nom. give-prs. 

'when the king gives an order' 

(Source: Jørgensen 1941)

Among the IA languages the use of an agent-marker can be found throughout
the Himalayan region without regard to tense. Even in Sanskrit chronicles
written in the Newari-speaking Nepal Valley kingdoms, we find that the
instrumental case is used for subjects of active transitive verbs, where
the nominative case would be correct (Petech 1958: 117):

(52) a. raja sri vijayadeva varsa 31 tena lalitapuri

king year he-ins.

arddha rajyam karoti

half reign-acc. do-prs.-3S

'King Shri Vijayadeva ruled half the kingdom of Lalitapuri
for 31 years'

b. raja sri balavantadeva varsa 12 tena atyantasubhikam

king year he-ins. very-lucky

rajyam karoti / tena ca haripura krtam

reign-acc. do-prs.-3S he-ins. and do-pp-NS

'King Shri Balavantadeva ruled Haripur and reigned 
benevolently for 12 years' 

(Source: Petech 1958: 220) 

And in Shina, an Indo-Aryan language of Northwest India also bordering on 
the Tibetan region, we find the Nepali situation carried one step further-- 
the agents of all transitive verbs, no matter what tense, appear in the 
ergative case. 

(53) a. ash ma bodi dure zho pe'adal vatus

today I-nom. very far from walking come-pst.-1S

'today I walked here from far away'

b. pui^e bodu §idegas

I-agt. that-gen. son-acc. much beat-pst.-1S
'I beat his son badly'

c. anise k^ryo mas anisei ?shpi fa take- 1 haremus

this for I-agt. this-gen. horse-pl. pound-to take-prs.-1S

'I am taking his horses to the pound for him'

d. aly-o s'ab bahadur-se tu-^ rafali ga

there-from -agt. you-dat. rifles and

kartushe dei

cartridges give-fut.-3S

'Sahab Bahadur will give you the rifles and cartridges
from there'

(Source: Bailey 1924)

The use and distribution of le in Nepali could thus be influenced
by Tibetan languages with which Nepali-speakers are in contact. The fact
that contact languages have ergative markers has apparently affected the
IA languages of the region so that they, too, use an ergative postposition
to mark all transitive agents. However, this language contact evidence alone
cannot explain why the ergative marker le is spreading to intransitive
subjects as well, as we saw in the previous section. While contact with
Tibetan languages may not be the source of the use and distribution of le
in Nepali, such contact may have reinforced its use, so that a choice
between a marked or an unmarked subject in nonperfective and/or intransitive
environments would be resolved in favor of the marked NP.

6.4. In this section, we have seen that during the 16th century the Nepali
postposition le was introduced into those environments which were
historically ergative in MIA. At this point, its distribution was probably
similar to that of ergative postpositions in other Indian languages. We
saw examples from the 18th and 19th centuries which indicated that the use
of le was spreading to nonperfective transitive agents and also to some
intransitive subjects. It was then shown that syntactically such nonhistorical
uses could arise through the reanalysis of ergative agents of conjunctive
participial clauses as the le-marked subjects of nonperfective and/or
intransitive verbs in main clauses. We also saw that the conditions for the
expansion in use of le were present because of the close contact between
Nepali speakers and speakers of ergative Tibetan languages. While language
contact can explain the spread of le as an ergative marker in all tenses,
the syntactic analysis of the interaction of conjunctive participles with
main clauses offers an explanation for the neutralization of subject/agent
distinctions in favor of le-marked NPs, for which there is some evidence in
the language.

7. The Nepali Perfective Participle.

7.1. The perfective participle of the Indo-Aryan languages is the reflex 
of the Sanskrit ta-participle, and unless restructured, the perfective parti- 
ciple has ergative syntax. Using the term ergative to describe a participle 
means that its head or modified noun is either its subject if intransitive 
or direct object if transitive. The head noun would, of course, appear in 
its surface structure role in the main sentence, and the agent of a transi- 
tive participle would appear in the ergative/instrumental case. The parti- 
ciple itself could agree in number and gender with its head noun. This 
ergativity would apply whether the participle were used as an attributive 
participle or as a main verb perhaps in conjunction with an auxiliary. A
participle with nominative syntax would, on the other hand, have as its
head noun either the underlying subject or agent of an intransitive or
transitive verb, respectively, and the direct object of such a participle
would appear in the accusative/objective case.
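
The contrast between the two kinds of participle can be restated as a small decision rule. The following Python sketch is only an illustration of the definitions just given; the function names and the string labels are assumptions introduced here for exposition and are not terminology taken from the sources cited.

```python
def head_noun(alignment: str, transitive: bool) -> str:
    """Return the underlying role that serves as the participle's head noun."""
    if alignment == "ergative":
        # Ergative participle: intransitive subject or transitive direct object.
        return "direct object" if transitive else "subject"
    if alignment == "nominative":
        # Nominative participle: intransitive subject or transitive agent.
        return "agent" if transitive else "subject"
    raise ValueError(f"unknown alignment: {alignment!r}")


def dependent_argument(alignment: str, transitive: bool):
    """Return (role, case) of the non-head argument, or None for intransitives."""
    if alignment not in ("ergative", "nominative"):
        raise ValueError(f"unknown alignment: {alignment!r}")
    if not transitive:
        return None
    if alignment == "ergative":
        return ("agent", "ergative/instrumental")
    return ("direct object", "accusative/objective")


if __name__ == "__main__":
    for alignment in ("ergative", "nominative"):
        for transitive in (False, True):
            print(alignment, transitive,
                  head_noun(alignment, transitive),
                  dependent_argument(alignment, transitive))
```

The two systems differ only in the transitive case, which is why the discussion that follows concentrates on transitive agents and direct objects.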

7.2. The IA perfective participle in Nepali is found in four constructions
which will be important to the following discussions of its use and
development as a finite verb. First, a set of personal endings may be added to
the perfective participle to form the perfective past or simple past tense,
as in (54):

(54) a. ma nepal-ma base 'I lived in Nepal'

I -in stay-pp-1S

b. hami-le us-lai cithi pathayau 'we sent him a letter'
we-agt. him-dat. letter send-pp-1P

Second, the perfective participle appears in conjunction with an auxiliary
verb as a compound verb; for example, the Modern Nepali eko-participle plus
the auxiliary cha is used for the present perfect tense:

(55) a. ma nepal-ma baseko chu 'I have lived in Nepal'

I -in stay-pp-MS be-1S

b. hami-le bhat khaeka chau 'we have eaten dinner'
we-agt. meal eat-pp-MP be-1P

Third, the perfective participle is used as the main verb of a dependent
conditional clause, as the e-participle in Modern Nepali:

(56) a. ram gae syam jancha 'if Ram goes, Shyam will go'

go-pp go-prs.-be-3MS

b. timi-le okhati na khae marne chau
you-agt. medicine not eat-pp die-inf. be-2MP

'if you don't take the medicine, you will die'

Finally, the perfective participle may be used attributively:

(57) a. hijo aeko manche aj calyo

yesterday come-pp-MS man today move-pp-3MS

'the man who came yesterday left today'

b. buva-le lekheko cithi kaha cha

dad-agt. write-pp-MS letter where be-3MS

'where is the letter dad wrote?'

In the following discussions, I shall refer to these four functions as the
perfective past, compound perfect, conditional, and attributive uses,
respectively, for all periods of the language.

We should note that there is a natural split in these uses of the
perfective participle between the perfective past and the other three. In
the perfective past, the form of the participle is modified by the addition
of personal suffixes, so the participle is treated as a verb stem. In the
other contexts, the form of the participle either remains constant except for
gender and number agreement, like an adjective, or never changes at all, as
in the conditional use. The history of the perfective past
is also somewhat distinct from the histories of the other three uses of the
participle, and so I shall first discuss the Nepali perfective past in
section 8, and then turn to the history of the compound perfective tenses
and other constructions in section 9.

8. The Nepali Perfective Past Tense.

8.1.1. The perfective or simple past tense in Nepali is formed by adding
personal suffixes agreeing with the underlying subject to the perfective
participle stem. The transitive agent is marked with le, so the syntax of
the simple past is agentive. This is true of Modern Nepali (58) and (59),
as well as Old Shah Nepali (60) and (61).

(58) a. ma nepal-ma base 'I lived in Nepal'

I -in stay-pp-1S

b. timi nepal-ma basyau 'you lived in Nepal'
you-MP -in stay-pp-2MH

c. ram nepal-ma basyo 'Ram lived in Nepal'

-in stay-pp-3MS

d. sita nepal-ma basi 'Sita lived in Nepal'

-in stay-pp-3FS

(59) a. mai-le kitap pare 'I read the book'

I-agt. book read-pp-1S

b. timi-le kitap paryau 'you read the book'
you-MP-agt. book read-pp-2MH

c. ram-le kitap paryo 'Ram read the book'

-agt. book read-pp-3MS

d. sita-le kitap pari 'Sita read the book'

-agt. book read-pp-3FS

(60) a. inu-lai gata jagyo (1751)

these-dat. favor-MS rise-pp-3MS

'these have been paid'

b. 55 rupiya 11a 800 rupiya 15 koda kapada
-P each-MP -P cloth

oda-tir lagya (1751)

brick-toward attach-pp-3MP

'the bricklayers received 55 rupees each, 800 rupees in all
and 15 kors of cloth'

c. hamra kaji-ka sneha-le vahutai banyo (1757)
our ~gen.-0 love-ins. much-emp. be made-pp-3MS

'much has occurred through the devotion of our kaji'

d. ma thamina 'I was satisfied' (1766)

I halt-pp-1MS

e. aru sardar-bhanda timi mukhya bhayau (1770)
other -than you-MP chief be-pp-2MH

'you became more important than the other sardars'

(61) a. kai-le paso solo gelyo bhanya (1591)

someone-agt. dice play-pp-3MS if

'if anyone gambles'

b. samasta karma jas-le garyo (1670)
all duty who-agt. do-pp-3MS

'whoever did all the duties'

c. sri raghau josi-le sri visvesvara sthapana garya (1712)

-agt. enshrinement do-pp-3MH

'Shri Raghau Joisi established this shrine for Shiva'

d. bhotya-saga mai-le yati bolya (1766)
Tibetan-with I-agt. thus speak-pp-1S

'I said this to the Tibetans'

e. vaki bhaju-deu taudika nevar-ka dui hajar

rest -gen.-MP two thousand

mahidramali vapat mapha gari vaksyau (1766)

mohor transaction pardon do-cp give-pp-1P

'of the rest we have approved two thousand mohors for the
Newar Bhaju Dev Taudik'

f. tati gharyadi-samet hami-le vaksyau (1767)
thus house-etc.-with we-agt. give-pp-1P

'we have given thus for the house and other things'

We can see in these examples that the same personal suffixes are used for 
intransitive (58) and (60) and transitive (59) and (61) verbs. 

8.1.2. The perfective participle stems and the personal suffixes used in
Nepali may be broken down as in Table 2. The
data that clearly present the nonthird person personal endings are mostly
from the 18th century onward, and it may be that by this time the perfective
stem and the personal suffixes are treated as a single morphological unit.

           Singular     Plural

    1      ya + m       ya + u

    2      i + s        ya + u

    3M     yo +         ya +

    3F     i +          i + n

    Nepali Perfective Past Personal

    Table 2

There is some evidence, however, that the personal suffixes are
independent of the stem even at this time, because, as in the verb forms
in (62) and (63), the suffixes remain constant no matter what form the stem
takes:
(62) a. aru sardar-bhanda timi mukhya bhaya-u (1770) 

other -than you-MP chief be-pp-2MH 

'you became more important than the other sardars' 

b. tara alika dhilo gare-u / cado garnya kam ho (1766)
but somewhat slow do-pp-2MH fast do-inf. work be-3MS

'you have worked somewhat slowly; the work must be done faster'

(63) a. vidhivistar sunya-u (1774)

decree-detail hear-pp-1P

'we have heard the contents of the decree'

b. bhanyako suni-u (1766)

say-pp-MS hear-pp-1P

'we have heard what was said'

c. ra uhi sayat-ma mahamandala-ma ukle-u? (1775)
and that time-in great-circle-in ascend-pp-1P

'and did we enter the alliance at that time?'

I shall assume that in OSN or at some earlier time the personal suffixes 
were separable, and thus, that they are additions to the perfective partici- 
ple. The Modern Nepali personal endings have arisen from the fusion of 
a form of the perfective stem plus these personal suffixes. 

8.2. In the formation of the simple past, Nepali disagrees with most other
Western Indian languages. In Hindi-Urdu, for example, the perfective
participle used as a past tense is ergative, and it retains the adjectival
characteristic of only showing agreement in gender and number:

(64) a. mai cala 'I left'

I-MS go-pp-MS

b. tum cale 'you left'

you-MP go-pp-MP 


c. ram cala 'Ram left'

-MS go-pp-MS

d. sita cali 'Sita left'

-FS go-pp-FS 

(65) a. mai-ne kitab parhi 'I read a book'

I-agt. book-FS read-pp-FS

b. tum-ne kitab parhi 'you read a book'

you-agt. book-FS read-pp-FS

c. ram-ne khana khaya 'Ram ate'

-agt. food-MS eat-pp-MS

d. sita-ne khana khaya 'Sita ate'

-agt. food-MS eat-pp-MS 

8.3. Nepali, however, is not the only IA language to have developed
nonergative agreement in the perfective past; and there are languages which,
in some respects, combine ergative verbal patterns with subject/agent
agreement. Since all these languages have evolved from MIA languages with
basically the same characteristics in this tense, all the nonergative
languages have evolved from earlier ergative stages. In order to see how the
syntax of Nepali may have evolved from an ergative stage to its current
agentive stage, it will be useful to examine briefly other forms of
nonergative systems in the IA languages. (More complete discussions may be
found in Hock 1981 and Stump To Appear.)

8.3.1. The Northwestern Indian languages--Sindhi, Lahnda, and Kashmiri--have
retained the Old Indo-Aryan enclitic pronouns and use them as pronominal
suffixes on both nouns and verbs. Sindhi has three sets of these suffixes for
nominative, agentive, and other oblique agreement; Lahnda has two--one
nominative, one any case; and Kashmiri has two, one nominative and one oblique
(Grierson 1919).

These suffixes may be added to the perfective participle when used as 
a main verb. The perfective participle itself, like that in Hindi, agrees 
with the ergative subject, i.e. an intransitive subject or transitive direct 
object. The pronominal affixes may be used to mark the ergative subject as 
in Lahnda: 

(66) a. jateu-m 'I knew' 

know-pp-MS-1S nom.

he-agt. beat-pp-MS-1S nom.

(Source: Grierson 1919, i: 270) 

And they may be used to mark the transitive agent, as in the Lahnda examples 
in (67). 

he-obj. beat-pp-MS-1S agt.

b. ga ditthi-m 'I saw the cow'

cow-FS see-pp-FS-1S agt.

c. kis-nu marea-i 'whom did you beat'
who-obj. beat-pp-MS-2S agt.

d. ma-nu marea-s 'he beat me'

I-obj. beat-pp-MS-3S agt.

(Source: Grierson 1919, i: 270) 

The perfective past here is still ergative in that the ergative subject
still controls agreement in gender and number with the participle; and
different suffixes may be used in marking intransitive and transitive
subjects, as in Sindhi:

(68) a. ih'a rate miore* tikiu-se 'this night I stayed in Moro' 
this night moro-in stay-pp-MS-1S nom.

b. una-kha pucchiu-me 'I asked him'

him-dat. ask-pp-MS-1S agt.

(Source: Grierson 1919, i: 71, 91)

The pronominal affixes are optional in Sindhi and Lahnda; however, in Kash- 
miri the nominative agreement suffixes with intransitive verbs are obliga- 
tory (Grierson 1919, ii: 291), and the second person suffixes indicating 
transitive agents are also obligatory (312). 

These pronominal affixes represent a modification of the basic IA
ergative agreement found in Hindi, because the underlying subject may be marked
on the perfective participle used as a main verb. As in Lahnda, underlying
intransitive and transitive subjects may be marked alike (cf. 66a, 67a, b),
neutralizing the syntactic distinction between subject and agent. Further,
the obligatory use of suffixes on intransitive verbs introduces person and
number agreement for the participle in Kashmiri, like that used for other
finite verbs; and the obligatory use of the second person agentive suffixes
introduces agent-verb agreement into an otherwise ergative system.

8.3.2. Marathi has also developed pronominal suffixes for the perfective
participle, although their connection with those of the Northwestern
languages is uncertain. In Old Marathi, these were ergative in agreement (Hock
1981; Master 1964: 130-132), but in Modern Marathi, they agree with the
underlying subject. In Standard Marathi, pronominal affixes may optionally
be added to the ergative participle in the second persons (Kachru &
Pandharipande 1979: 199):

(69) a. mi kame keli 'I did the jobs'

I job-NP do-pp-NP

b. tu kame keli(s) 'you did the jobs'

you-S job-NP do-pp-NP(-2S)

c. tumhi kame keli(t) 'you-all did the jobs'

you-P job-NP do-pp-NP(-2P)

In the Konkani dialects, the use of the pronominal suffixes on the
perfective participle is more extensive, and in some cases, obligatory (Grierson
1905: 163ff.). For example, in Chitpavani (Grierson 1905: 210-216), -s
is used to mark second person singular agents, -t second person plural agents,
and -n third person singular agents.

(70) a. ^ te-che-sathT ek mejvanT dilTs 

you-agt. him-of-for one feast-FS give-pp-FS-2S 

'you prepared a feast for him' 

b. te-che bapan te-la baghitlan . . . te-che gale-la

him-of father-agt. him-dat. see-pp-NS-3S him-of neck-dat.

mithi marlin ani te-tso mukcT ghetlon

embracing-FS strike-pp-FS-3S and him-of kiss-MS take-pp-MS-3S

'his father saw him, and then embraced and kissed him'

As we can see in (69) and (70), the ergative agreement pattern is retained, 
as the participle agrees in gender and number with the direct object; but 
in Konkani, agent-verb agreement has been introduced as well, for the agent 
suffixes are usually present. 

One factor influencing the spread of pronominal suffixes in Marathi is 
the development of a set of personal endings for intransitive perfective 
participles, probably through borrowing the nonthird person endings of the 
present tense, and adding them to the perfective participle. Thus, we find 
gelo * 'I(M) went', gelo 'he went', gelos 'you(M) went', etc., unlike in Hindi,
in which gaya would be used for all three of the preceding persons.

In addition, there has been a leveling of the distinction between
nominative and agentive pronouns in Marathi (Hock 1981; Master 1964: 85), so
forms like tu represent nominative and agentive cases. Bloch (1914: 262-264)
states that the use of nominative pronouns and underlying subject-verb
agreement patterns with the perfective participle is found with a certain class
of verbs, effectively creating a nominative, active paradigm.

While Standard Marathi still has ergative syntax in the perfective past, 
several modifications have occurred in the standard language and various 
dialects. The use of pronominal suffixes produces partial verb agreement with 
the transitive agent, while intransitive participles agree in gender, number, 
and person with their subjects. The nondistinctness of nominative and agent- 
ive pronoun forms may contribute to analogical verb agreement with the agent. 

8.3.3. The past tense in Shina is formed by adding personal suffixes to the 
past stem, which agree with the underlying subjects, and there is no agree- 
ment between the direct object and perfective participle. For example: 

(71) a. resai nialu-s daru ge rese-^ nasiat th^gu 

his father-agt. out go-cp him-dat. advice make-pst.-3S agt. 

'his father went out and talked to him' 

"~ ~ ginTga 

this you-agt. whom from price take-pst.-2S agt. 
'from whom did you buy this?' 

(Source: Bailey 1924) 

While both intransitive and transitive verbs use personal endings, there
are different endings for singular subjects and singular agents of the same
person:

(72) a. ash ma bodi dure zho peTdal vatus

today I very far from walking come-pst.-1S nom.

'today I walked here from far away'

b. mas khuda ga thai hake-r gun"? thegas

I-agt. God and your right-in sin make-pst.-1S agt.

'I have sinned before you and God'

The personal endings are the same for plural subjects and agents of the same
person.

The subjects of intransitive verbs appear in the nominative, unmarked
case, while the agents of transitive verbs are marked with -s or -se. And
as mentioned in an earlier discussion, all transitive verbs--whether
perfective or not--have subjects marked with the agent marker.

Assuming that the past tense in Shina reflects an earlier ergative
system, like that found in other IA languages, it has made significant changes
in this tense. Personal affixes agreeing with the underlying subject have
developed for both intransitive and transitive verbs, and these have replaced
any former agreement between the direct object of transitive verbs and a
perfective verb stem. The intransitive and transitive verbal suffixes are
distinct in the singular persons, but they have been neutralized in the
plural. The ergative marker has been extended to all transitive agents, so
the past tense is not the only tense in which there is a difference in
marking between subjects and agents.

8.3.4. The Eastern Indian languages have established nominative agreement 
in the perfective past, completely replacing the old ergative system. For 
example, in Bengali: 

(73) a. tumi tar janya bara bhoj dile

you-nom. him for big feast give-pp-2S

'you gave a big feast for him'

b. tate se tar bishay tadige bhag-kare

hereon he-nom. his property to them sharing

'so he divided his property between them'

(Source: Grierson 1903, i: 67-68)

Pronominal affixes also developed in the Eastern languages, which, although 
they may not have had the same source as those in the Northwestern languages 
(Grierson 1895, 1896; Chatterji 1926: 973), developed in the same way, as 
affixes on the ergative perfective participle (Chatterji 1926: 973-987). So 
the suffixes agreeing with the underlying subject simply replaced whatever 
agreement patterns there were. In some early forms of the Eastern languages, 
there were some distinctions between the suffixes used for transitive and 
intransitive subject agreement, but these have generally merged in the 
modern languages. 

The distinction between nominative and agentive cases was lost also in 
the Eastern languages in both pronouns and nouns, so the patterns of agree- 
ment between subject and verb in the perfective past are nominative, with 
both transitive and intransitive verbs agreeing in person and number with 
the subject/agent. Thus, the syntax of the simple past in the Eastern lan- 
guages has been completely changed from ergative to nominative both in 
subject marking and verb agreement. 

8.4. If we now compare the perfective tense paradigms of all these lan- 
guages, we can find further similarities: 































Table 3. Comparison of Perfective Past [table body not recoverable]

*Not necessarily in use in all dialects; forms in ( )
may not be used in dialects using unmarked forms.

In this distribution of forms, we can see a general pattern of development
in all the languages except for Hindi: nonthird person contexts are those
most susceptible to modification. The adjectival base forms of the third
person singular and plural--intransitive and transitive--have been retained,
for the most part, in Marathi, Nepali, Shina, and Bengali, despite the
restructuring that may have taken place in other persons. The Nepali third
person forms were once used to agree with an ergative subject, i.e., the
intransitive subject or transitive direct object, and now they are used to
agree with the third person underlying subject, i.e., intransitive subject
and transitive agent.

There are, of course, various similarities in form between the Nepali
personal affixes and those of the other languages, e.g., nasalization in the
first singular agentive suffix in Nepali, Lahnda, Marathi, and Bengali.
Perhaps the closest match of these languages presented is between Nepali and
Marathi--both show -s in the second person singular, both have nasalization
in the first persons, etc. There are no apparent geographical or political
explanations for this similarity, so contact, except prior to any migrations
from the Central Hindi area, does not seem to be a satisfactory explanation.
It has been suggested (cf. Grierson 1895, 1896; Stump To Appear) that both
Marathi and Nepali extended personal suffixes from the OIA radical present
tense to the perfective participle. This explanation raises some problems
for Nepali, because the verb system in the language has undergone a complete
restructuring, so most tenses are periphrastic in origin, and the remnant of
the OIA present, now the injunctive, does not have similar personal endings.
We do find similar personal endings in use for the present of the auxiliary
verbs cha (chu, chas, cha, chau, chau, chan) and ho (hu, hos, ho, hau, hau,
hun), and there is no evidence to suggest that an older borrowing could not
have taken place. In fact, there is really no evidence to indicate definitely
one way or another what the origin of the Nepali personal suffixes is.

Given the issues that must be raised and the evidence that must be pre- 
sented for such a discussion, the origins of the Nepali personal suffixes of 
the perfective past and their relation to those of the other languages cannot 
be dealt with adequately in this study. I shall continue to use the generic 
term pronominal affix to refer to all the cases looked at in this section. 

Whether the origin of these pronominal suffixes is in the OIA enclitic
pronouns, or new MIA pronouns (as Chatterji 1926 suggests for the Eastern
Indian languages), or endings borrowed from the OIA radical present is not
as crucial as the fact that we see the same syntactic pattern recurring in
all the languages, that is, the pronominal affixes are suffixed to the
ergative perfective participle, and from this beginning, modifications may
develop. Let us then assume that the syntactic origins of the pronominal
affixes in these languages are similar, and discuss the mechanisms by which
the ergativity of the perfective past is lost or weakened.

8.5. By comparing the development of pronominal affixes in these languages,
we can see that there are certain steps that some IA languages have gone
through in order to modify the OIA ergative system in the perfective past.
These include:

1. development of pronominal affixes on the participle (of
   some uncertain type and origin)

2. use of pronominal affixes agreeing with the intransitive
   subject (S) and transitive agent (A)

3. neutralization of distinctions between affixes for S and A

4. obligatory use of pronominal affixes for S and A agreement,
   perhaps partial at first, but which essentially
   establishes a personal agreement paradigm for S

5. neutralization of morphological distinctions between the
   subject NP and agent NP

6. neutralization of syntactic distinction between subject NP
   and agent NP (i.e., the distinction that while S may
   agree with the perfective verb, at best A only partially
   agrees with the perfective verb)

7. replacement of DO-verb agreement with A-verb agreement for
   transitive verbs, thus establishing a personal agreement
   paradigm for A

These steps are not intended to be discrete or strictly chronological, and the
list excludes other factors which may have intruded in the cases discussed
here, for example, development of an ergative postposition separate from
agentive inflection, or extension of a direct object postposition which
developed in all IA languages and could block verb agreement with the
perfective participle, or influence from neighboring languages such as Tibetan
on Shina and Eastern Indian on Marathi dialects.

All these properties are attested in various stages of the languages we
have looked at in this section, although not all the languages participate
in them all. The Eastern Indian languages have mostly gone through all seven
steps, although not all morphological distinctions between S and A have been
neutralized in all those languages. Shina has also acquired most, but it
has only partially neutralized the morphological distinction between S and
A in the perfective by extending the ergative marker to all agents. Nepali
has developed along similar lines, but has also not resolved the S vs. A
morphological distinction, although there is at least some evidence that
points to le spreading to all transitive agents and some intransitive
subjects. Sindhi and Lahnda participate in 1-3; Kashmiri 1-4; while Marathi
at least partially 1-5, and in some dialects 1-6.
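
The distribution just described can be tabulated mechanically. The Python sketch below is a convenience for comparing the languages, not part of the analysis itself. The step labels abbreviate the list in 8.5, the dictionary and function names are assumptions introduced here, and only the languages for which the text gives explicit step ranges are included. Shina and Nepali are left out because no step numbers are stated for them, and the Eastern Indian entry simplifies "mostly all seven" to all seven.

```python
# Step labels abbreviate the seven developments listed in 8.5.
STEPS = {
    1: "pronominal affixes develop on the participle",
    2: "affixes agree with intransitive subject (S) and transitive agent (A)",
    3: "affix distinctions between S and A are neutralized",
    4: "affixes become obligatory for S and A agreement",
    5: "morphological distinctions between subject NP and agent NP are neutralized",
    6: "syntactic distinction between subject NP and agent NP is neutralized",
    7: "DO-verb agreement is replaced by A-verb agreement",
}

# Step ranges as stated in the text; the keys are labels chosen for this sketch.
ATTESTED = {
    "Sindhi": set(range(1, 4)),
    "Lahnda": set(range(1, 4)),
    "Kashmiri": set(range(1, 5)),
    "Marathi (standard)": set(range(1, 6)),        # "at least partially 1-5"
    "Marathi (some dialects)": set(range(1, 7)),
    "Eastern Indian languages": set(range(1, 8)),  # "mostly" all seven
}


def steps_not_attested(language: str) -> list:
    """Return the step numbers the text does not attribute to this language."""
    return [step for step in STEPS if step not in ATTESTED[language]]


def languages_attesting(step: int) -> list:
    """Return, in alphabetical order, the languages listed for a given step."""
    return sorted(name for name, steps in ATTESTED.items() if step in steps)


if __name__ == "__main__":
    for name in ATTESTED:
        print(name, "lacks steps", steps_not_attested(name))
    print("Step 7:", languages_attesting(7))
```

On this tabulation only the Eastern Indian languages reach step 7, which is the step the remainder of the section tries to account for in Nepali.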

This comparison of the IA languages that modify the ergative perfective
past (assuming Hindi represents the basic system inherited from MIA) has
provided us with some information about the steps Nepali may have gone through
to develop its agentive syntax. However, the acquisition of these
nonergative properties does not explain how the critical step is taken, i.e.,
how agent-verb agreement replaces DO-verb agreement for transitive verbs.

8.6.1. The question then remains: what role did these factors play in the 
loss of ergativity in the Nepali perfective past? Obviously, ergative syntax 
is not inherently unstable, because Hindi and other Western Indian languages 
have retained the ergativity of the perfective past. Also, the introduction 
of pronominal affixes alone is not sufficient to weaken an ergative system, 
cf. Sindhi, Lahnda, etc. 


As Hock 1981 and Stump To Appear suggest, it must be a combination of
factors that results in the elimination or weakening of ergative syntax in
the IA perfective tenses. In each language, there are probably certain
developments which have combined to allow more of these properties to be
acquired. And the interaction of developments independent of verb morphology
or underlying subject marking and the acquisition of more of these properties
may result in the weakening of ergative syntax.

In Nepali, we might consider the development of these properties in
conjunction with (1) the drastic reduction in the IA grammatical gender
system, and (2) the introduction of an animate object postposition. The former
seems to be a Nepali-specific development, as many IA languages have retained
grammatical gender; and the latter is a general development in the Indian
languages. But both are independent of change in the perfective past.

8.6.2. Nepali now has a two gender system: all female animate beings are
considered feminine, and all other nouns are considered masculine. Complete
development of this system appears to have occurred in the modern IA period,
as traces of grammatical gender in nouns can be found in OMN and OSN data.
This reduction in grammatical gender would sharply curtail the frequency with
which adjectives, in general, and participles, in particular, would
morphologically differentiate the noun to which they refer in any given sentence.
While some Indian languages may distinguish among masculine, feminine, and
neuter nouns by morphological marking on adjectives and participles, Nepali
can only distinguish between masculine and feminine, and the latter
represents only a very small class of nouns compared with the former. So, during
the development of Nepali, the possibility that any two given nouns will be
distinguished by the agreement markers on the adjectives or participles that
refer to them has been reduced.

Most Indian languages have developed a postposition used to mark dative/ 
accusative animate objects obligatorily, and other definite objects optional- 
ly. In Nepali, this postposition is currently lai, but several different 
postpositions were used throughout the older periods (cf. Wallace 1981). 
The appearance of this postposition with a direct object in a sentence with 
a perfective verb blocks verb agreement with that direct object, as in the 
Hindi-Urdu examples (4d,f) in which ko blocks agreement of the perfective 
participle with the direct object of the sentence. When ergative verb agree- 
ment is so blocked, the verb appears in the neutral form, usually the mascu- 
line singular. 

As a result of these two developments, most feminine nouns in Nepali 
when used as direct objects would be marked with lai; and most unmarked direct 
objects would be masculine. A transitive perfective participle used as a 
main verb would thus hardly ever make gender distinctions, if ergative syn- 
tax were employed. Most feminine nouns with which it could agree would be 
marked with lai, and so agreement would be blocked. Thus, whether the noun 
were feminine or masculine, the morphology of a transitive perfective parti- 
ciple would be masculine. In fact, the most common form of the perfective 
participle during the OSN period is the verb stem plus ya, the masculine 
oblique or plural form. 
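
The interaction of the two developments can be made explicit with a small sketch. It is only an illustration under simplifying assumptions, not a claim about how the changes actually proceeded: the names Noun, takes_postposition, and participle_agreement are hypothetical, every animate object is assumed to take the postposition, and the optional marking of definite inanimate objects is ignored. The last noun is a hypothetical feminine inanimate of the kind found in languages that retain grammatical gender; under the gender system described in 8.6.2 such a noun would not occur in Nepali.

```python
from dataclasses import dataclass
from typing import Tuple

# Neutral participle form used when agreement is blocked (masculine singular).
NEUTRAL: Tuple[str, str] = ("M", "S")


@dataclass(frozen=True)
class Noun:
    gloss: str
    gender: str    # "M" or "F"
    number: str    # "S" or "P"
    animate: bool


def takes_postposition(obj: Noun) -> bool:
    # Simplifying assumption: every animate object is marked; optional
    # marking of definite inanimate objects is ignored.
    return obj.animate


def participle_agreement(obj: Noun) -> Tuple[str, str]:
    """Gender and number shown by an ergative perfective participle."""
    if takes_postposition(obj):
        return NEUTRAL              # marked object: agreement is blocked
    return (obj.gender, obj.number)  # unmarked object: participle agrees


# Objects from (74b) and (74c), classified by the two-gender system.
book = Noun("book", "M", "S", animate=False)
girl = Noun("girl", "F", "S", animate=True)
# Hypothetical noun, only possible where grammatical gender is retained.
retained = Noun("feminine inanimate (hypothetical)", "F", "S", animate=False)

for noun in (book, girl, retained):
    print(noun.gloss, participle_agreement(noun))
```

Under these assumptions a feminine object is always animate and therefore always marked, so the participle surfaces as feminine only for the hypothetical third noun. This is the sense in which gender agreement with the direct object becomes nearly invisible on the surface.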

Given that Nepali developed a simplified gender system, an obligatory 
postposition to mark animate direct objects, and pronominal affixes agreeing 
with the underlying subject/agent for the perfective participle when used as 
a main verb, we would expect forms like those in (74) to occur: 

(74) a. * (mai-le) bakhro maryF-iu 'I struck the goat' 
I-agt. goat-MS strike-pp-MS-1S 

b. * (ta-le) kitap dekhya-s 'you saw the book' 

you-agt. book-MS see-pp-MS-2S 

c. * (hami-le) chori-lai rakhya-'u 'we kept the girl' 

we-agt. girl-FS-dat. keep-pp-MS-1P 

In such sentences, the only analyzable distinction made by the verb is that 
the pronominal affixes vary according to the person of the agent; the gender 
and number of the direct object are of minimal importance even though the 
syntax of the sentence with the perfective participle is ergative. Thus, 
the fact that the perfective participle is ergative is not very salient on 
the surface. Since verbs in other tenses are distinguished according to the 
person of the subject/agent, such forms might suggest that it is the under- 
lying subject that agrees with the perfective participle, and not the direct 
object, especially if similar pronominal affixes had become obligatory for 
intransitive verbs (as has happened in Kashmiri and Marathi). Thus, rein- 
terpretation of ergative perfective participles plus subject/agent pronominal 
affix as perfective verb stems plus personal endings agreeing with the under- 
lying subject would make the perfective past tense more regular with respect to 
verb agreement in other tenses. 

8.7. In this section, we have seen that Nepali developed personal endings in 
the perfective past tense agreeing with the underlying subject from "pronomi- 
nal affixes" of some uncertain origin. Comparing data from other IA lan- 
guages in which perfective tense ergativity like that in Hindi is modified 
in some way, we saw that these languages may develop nonergative from erga- 
tive syntax and in so doing, acquire certain properties of verbal morphology 
and subject noun marking. I then suggested that the acquisition of these 
properties was due to (a) their own interaction and development and (b) 
changes/factors independent of the perfective tense system. And in Nepali, 
the loss of grammatical gender and development of the dative/accusative post- 
position would have created a situation in which the morphology of the 
perfective participle when used as a main verb would have been drastically 
reduced. Then the most salient distinction the participle would have made 
would be the differences in the pronominal affixes. Reanalysis of these 
distinctions as being made obligatorily by the participle could lead to non- 
ergative agreement. Thus, the properties of languages with pronominal af- 
fixes and independent developments in the languages may conspire to intro- 
duce nonergative syntax in the perfective tense inherited from the OIA erga- 
tive system. 

9. The Nepali Compound Perfective Tenses. 

9.1. As mentioned in section 7, the Nepali perfective participle is used as 
part of compound verbs, as an attributive participle, and as the main verb 
of conditional dependent clauses. As a compound verb and attributive parti- 
ciple, the perfective participle functions as an adjective, agreeing with 
its head noun in gender and number; the conditional participle functions 
as a main verb, but unlike the participle in the simple past, it does not 
change form for agreement in person with its subject. The compound verb, 
attributive, and conditional participles are also grouped together because 
the OSN-developed perfective participle formed with the verb stem plus yako 
seems to appear in all these uses at some point, whereas this new form of 
the perfective participle never is used for the perfective past tense. 

The Nepali perfective participle would have inherited MIA ergative syn- 
tax in these functions as well as in the simple past, but there have been 
changes in the syntax of the participle in these environments during the 
history of the language. In this section, I shall discuss the development 
of the perfective participle in these three functions, and examine the 
changes that have taken place in its syntax. 

9.2.1. When we look at the perfective participle in older forms of Nepali, 
we find that all three functions discussed above are present, but that OMN 
and OSN differ in the forms of the perfective participle used. 

In the OMN data, the ya-participle, that which was inherited from MIA, 
is used as a main verb (75) and as an attributive participle (76): 

(75) tai bh'asa' ma pas? ki akry'5 chu (1376) 

that-emp. promise-FS I gift do-pp-FS exempt-pp be-1S 

'I have made this promise a gift' 

(76) piutharpu raija kari akrya* bhakhl pas? bhai (1336) 

reign do-cp exempt-pp promise-FS gift be-pp-FS 

'having been authorized in Piutharpu, this exempted promise 
has become a gift' 

There are no clear examples of the conditional ya-participle in the OMN data. 

9.2.2. In OSN, all three functions are attested with the ya-participle. 
Examples are given below of intransitive (77) and transitive (78) compound 
perfective tenses, attributive participles (79), and conditional clauses with 
the perfective participle (80). 

(77) a. sakram s'ai-ki yesi may? bhai-cha (1604) 

-gen.-FS such gift-FS be-pp-FS-be-3S 

'such has become a gift through Sangram Shah' 

b. rupaya* dasa rayT k§an (1679) 

ten remain-pp-MP be-3P 

'ten rupees remained' 

c. mathesa-vata mahajan-ka" manisharu rupiya likana 

-from -gen. -MP man-pl. take-cp 

iha ai rahya chan (1766) 

here come-cp remain-pp-MP be-3P 

'having taken money, having come here, Mahajan's men from 
Mathesa have stayed' 

d. vairi-k? cafjo nabujhi pansiiitar-k'a palta-cheu 

enemy-gen. -0 plan not-understand-cp -gen. -MP next-near 

hamra guha'r jahya manis pltlai pugna gaya-ch'a (1792) 
our-0 help go-inf. man few-emp. to arrive go-pp-MP-be-3P 

'not understanding the plan of the enemy, the men who arrived 
to help us near Pangsingtar came in few numbers' 

(78) a. 2 ala motipur-k'a 2 ala vaksi diya chau (1529) 

field-MP -gen. -MP field-MP give-cp give-pp be-1P 

'we have given two fields in Motipur' 

b. buijh'a vagh-le va4a sima-maha pari vatuva-kana 
old-0 tiger-agt. big-0 marsh-in trap-cp traveller-dat. 

maryo thyo (1776) 

kill-pp-MS be-pst.-3S 

'having entrapped him in the big swamp, the old tiger 
killed the traveller' 

c. pachillT cithi-ko javaph hami-le leji-r'akhya-thyau (1792) 

last-0 letter-gen. -MS answer we-agt. write-cp-place-pp-be-pst.-1P 

'we wrote our answer to the last letter' 

d. timiheru-le aneka prakar-ka* pap gari rasy?-chau (1798) 
you-pl.-agt. many way-gen. -MP sin do-cp place-pp-be-2P 

'you have committed sins in many ways' 

(79) a. tilaga-vata lyayl tilauro gho4a 1 gayako cha (1590) 

-from bring-pp black-white horse go-pp-MS be-3S 

'one black-and-white horse brought from Tilanga was presented' 

b. gai na-bhaya mola diiju (1723) 

cow not-be-pp price to give 

'to give a price that isn't a cow' 

c. strido^a bhaya stri-ko culatho dhoi pina 

woman-sin be-pp woman-gen. -MS braid wash-cp drink-inf. 

dinu (1752) 

to give 

'to give to drink water washed through the hair of a woman 
having a woman's disease' 

d. ka'thmadau-le khosya adha gorkh?-kana dinu (1757) 
-agt. open-pp half -dat. to give 

'to g