Metrical Phonology
Quantity and Prominence



2.0. Introduction

A syllable is a unit of pronunciation typically larger than a single sound and smaller than a word. The term syllable has been in the English language since at least the time of Chaucer, and it is frequent in linguistic descriptions at different planes; yet a precise definition of the term is still lacking.

In fact, in linguistic descriptions the term syllable is often used in an imprecise, intuitive way. Such usage reflects an intuitive reality available in the graphological and phonological systems of a language. For example, in English a word may be pronounced one syllable at a time, as in /ne-və-ðə-les/ 'nevertheless', and in writing words are hyphenated according to their syllabic divisions. At this level, the notion of syllable is intuitively real to native speakers.

Although this pretheoretical notion of syllables is to some extent obscure and intuition-bound, this notion seems to have become one of the indispensable and irreducible units on at least two levels of representation, the phonetic level and the phonological level. As a result attempts have been made to define the concept of syllables at both these levels. I shall now survey some of these attempts.

As a phonetic unit, the syllable was first defined by the psychologist R. H. Stetson in 1928. In his motor or pulse theory of syllable production he argued that each syllable corresponds to an increase in air-pressure, air being released from the lungs as a series of chest pulses. This can often be readily felt and measured, particularly in emphatic speech. But subsequent experimental work has shown that there is no simple correlation between pulses and intuitive syllables. The problem is especially obvious in cases like going, which is two syllables, but is usually uttered in a single muscular effort.

An alternative phonetic approach, viz. the prominence theory, instantiated in the work of Pike (1943), argues that in a string of sounds, some are intrinsically more 'sonorous' than others, and that each peak of sonority corresponds to the center of a syllable. Peaks are best illustrated by more sonorous sounds like vowels, whereas less sonorous sounds such as stops mark the valleys of prominence. In a more recent formulation of the theory Ladefoged (1982) defines sonority as the loudness of a sound relative to that of other sounds with the same length, stress and pitch.

This approach, in terms of peaks and valleys of sonority, gives a useful general guideline, but it does not always indicate clearly where the boundary between adjacent syllables falls.

Some other useful proposals regarding the phonetics of syllables and related units were made by J. C. Catford (1977a). Catford suggests that speech is produced in measured bursts of initiator power, or feet, which are the basic rhythmic units of a language. In English, for example, each initiator-burst corresponds to a stressed syllable, and the intervals between stressed syllables are roughly equal. So English is said to be 'stress-timed', as opposed, for instance, to French, where there seems to be one burst per syllable regardless of accent. Thus French and Spanish (and possibly Bangla - a matter to be investigated) are 'syllable-timed'.

Accepting these proposals as workable ones, it may be said that, since in syllable-timed languages each initiator-burst generally coincides with a syllable, the concept of 'phonetic syllable' seems reasonably straightforward in such languages.

Phonetic approaches of this kind attempt to provide a definition of the syllable valid for all languages. In contrast, theories concerned with the phonological aspect of the syllable focus on the ways sounds combine in individual languages to produce typical sentences. In fact, the phonological syllable, for my purpose, requires a more precise definition, especially with respect to boundaries and internal structures.

According to Lass (1984: 250) "the phonological syllable might be a kind of minimal phonotactic unit, say with a vowel as a nucleus, flanked by consonantal segments, or legal clusterings, or the domain for stating rules for accent, tone, quantity and the like."

Thus the phonetic syllable is a 'performance' unit, whose entire reality is phonetic; the phonological syllable is a structural unit, perhaps with non-phonetic properties as well.

Keeping in mind these two aspects of the syllabic unit, the present chapter will discuss the length, quantity and prominence of Bangla syllables.

Length, according to Lass (1984: 254), is a durational property of segments, and thus purely phonetic. Hence syllable length, in the terminology I adopt, will stand for the phonetic duration of a Bangla syllable, the arithmetical aggregate of the durations of the constituent segments.

Quantity, again following Lass (1984: 254), is a structural property of syllables, and thus phonological. Hence syllable quantity, here, will signify the phonological quantity of a Bangla syllable.

Syllable prominence corresponds to syllable stress, which, in Bangla, is a purely predictable phonetic phenomenon.

Bearing in mind the observation by Robins (1964: 137) that

"The linguist decides by criteria that may include purely phonetic factors what sequences comprise the syllables of the language he is describing in the course of devising a full phonological analysis thereof and it is normally found that there is a considerable correspondence between syllables established on purely phonetic criteria and those established for the purpose of further phonological description.",

in the present chapter I wish to investigate the correspondence between two levels, viz. phonetic and phonological, regarding the length and quantity of Bangla syllables. To be more specific, I shall try to extend the above assumption of correspondence in syllabification between the two levels to two syllabic properties, viz. the length and the quantity of the syllable.

Moreover, I shall attempt to locate the position of phonetic stress in Bangla words in terms of phonological information.

Section 1 will illustrate the syllabification rules in Bangla as they are available in the existing literature; section 2 will give an account of syllable types, relevant for my purpose; section 3 will provide an account of the phonetic length of Bangla syllables; section 4 will account for the phonological quantity of Bangla syllables; section 5 will be an attempt to correlate sections 3 and 4; and syllable prominence will be dealt with in section 6 which will be followed by the concluding section.

2.1. Syllabification rules in Bangla

The rules of syllabification in Bangla, as they are available in the literature, will be illustrated in this section.

Bangla syllabification has hardly been the subject of primary interest in the studies of Bangla phonology as almost all the works in this field develop their studies on the basis of so-called intuitive syllables.

Even in the field of Bangla phonetics, the two most representative works, viz. Chatterji (1928) and Kostic and Das (1972), dealing with the articulatory and acoustic aspects of Bangla phonemes respectively, do not discuss Bangla syllabification at all.

As a result, not even a handful of studies deal directly with the syllabification of Bangla words. Among these few studies I shall mention here three, viz. Hai (1964), Sarkar (1979) and Sarkar (1986). All these three works are descriptive in principle and appear to agree with Pike's (1943:116) statement that "Real syllables are those, which the ear is psychologically capable of distinguishing". However, among these three, Sarkar (1979), for the present purpose, is more useful than Hai (1964), as Sarkar (1979) strictly deals with the Standard Colloquial Bangla of Kolkata, rather than of Dhaka or Rajshahi; and since Sarkar (1986) is an improved version of Sarkar (1979), here I shall mainly follow Sarkar (1986).

A fairly straightforward recapitulation rather than a critical discussion of Sarkar (1986) is presented below.

According to Jakobson (1941) CV is the unmarked syllable pattern. In Bangla also the most preferred as well as the most natural syllable pattern is CV, a fact that conforms to the syllabic universals proposed by Jakobson (1941) and others.

The syllable boundaries in Bangla too, in conformity with Hooper's (1972, 1976) 'Universalist' approach, are assigned to the beginnings and ends of words, though the rules of syllabification cut across morpheme boundaries and thus facilitate a higher frequency of occurrence of the unmarked syllable pattern CV than others. For example, the word nace, i.e. nac 'dance' + e 'third person, inferior, present' is syllabified as na-ce.

The phonetic restrictions involved in syllabifications are as follows:

R-1. Except for vowels no other sounds in Bangla bear the feature [+syll] which denotes the ability to be the peak of a syllable.

R-2. In Bangla the unitary V+V sequences, i.e. the true diphthongs,1 never consist of a first member which is higher than the second and interpretable as being [-syll].

R-3. In Bangla only semivowels can occur as the second member of a diphthong.

R-4. Word initial semivowels and glides are not very frequent in Bangla, and hence a word internal sequence of a diphthong followed by a vowel is often syllabified as diphthong plus syllable boundary plus vowel, e.g. OYon 'orbit' is syllabified as OY-on.

The preceding restriction is not an uncontroversial one, but rather a problem area in Bangla phonology. This will be dealt with in chapter 5.

The treatment of consonant sequence is given below.

In the existing literature there are two quite exhaustive lists of Bangla consonant sequences, viz. Mallik (1960) and Hai (1964).

One notable point regarding the terminology of both Mallik (1960) and Hai (1964) is that neither of them uses the term consonant sequence. Mallik (1960) does not distinguish between sequence and cluster; he prefers to name all the sequences, including the geminated sounds, clusters. Hai (1964), on the other hand, prefers the term compound consonant for all types of consonant sequences.

Mallik (1960) gives a list of over 270 consonant sequences, whose distribution could be tabulated as follows:


Type    Word initial                        Word medial
        No.   Remarks                       No.   Remarks

CC      25    either 1st member is s,       235   ------
              or 2nd member is r / l

CCC     1     str                           8     3rd member is always r

CCCC    --    --                            1     NSkr

Mallik (1960) does not note the word initial cluster spr, which also occurs in Bangla. Sarkar (1986) shows that the words containing CC(C) clusters belong to the borrowed level of Bangla vocabulary. In the case of final CC sequences, Sarkar (1986) records that, on the whole, SCB phonology tends to avoid more than one consonant word finally, even in borrowed items, with a few exceptions.

According to Hai (1964: 323) Bangla has 36 consonant clusters which remain intact word initially; 26 homorganic consonant sequences and 250 other consonant sequences.

But, for syllabification, the data used by Hai (1964) are less useful than those of Mallik (1960), because Hai (1964) considers the consonant sequences in terms of continuous speech and as a result quite often there is a word boundary between the members of a consonant sequence.

In contrast, the data used by Mallik (1960) are drawn only from intraword consonant sequences, truly relevant for syllabification.

The above discussion establishes the fact that Bangla has various types of intraword consonant sequences, and the syllabification rules of such sequences may be listed as follows:

R-5. The word initial clusters2 become the onset of the following syllable, e.g. the cluster tr is the onset to the peak i in the word tri-no 'grass'.

R-6. Word final CC sequences3 form the coda of the preceding syllable, e.g. the sequence rD is the coda to the peak a in the word hom-garD 'home guard'.

R-7. Word medial intervocalic CC sequences, except those with r/l as the second member occurring in tatsama items, i.e. Sanskrit words that are more or less unmodified in Bangla, may be either homorganic or heterorganic. However, all intervocalic CC sequences (regardless of whether they are homorganic or heterorganic) are heterosyllabic, the first member of the sequence being the coda of the syllable to the left and the last member being the onset of the following one, for example, the sequences rb, Sc, tt and kkh are divided between two syllables in the words pur-bo 'east', poS-cim 'west', ut-tor 'north' and dok-khin 'south'.

R-8. In the case of a word medial CC sequence with r/l as the second member, the first member of the sequence is geminated and thus results in a CCC sequence. Then the first member of the CCC, i.e. the CCr/l sequence, becomes the coda of the preceding peak, and the second and the third members, i.e. Cr/l, together form a cluster which becomes the onset of the following syllable. For example, the sequences ttr and mml are divided in this way in the words put-tro 'son', mat-tra 'mora', Om-mlan 'untarnished' etc.4

R-9. The word medial CCC sequences, as Mallik (1960) observes correctly, have r as their third member.5 In such sequences the syllable boundary is placed right after the first member. In other words, the first member becomes the coda to the preceding peak, and the second and the third members together form a cluster which becomes the onset to the following peak, for example, the sequences str, ntr, and Spr, in the words Os-tro 'weapon', mon-tri 'minister', niS-pran 'lifeless' etc.

R-10. There is only one word medial CCCC sequence, which is NSkr (Mallik, 1960). The syllable boundary is placed between the second and the third members of the sequence; that is, here the first and second members form a 'closer combination than sequence',6 which becomes the coda to the preceding peak, and the third and fourth members form a cluster which becomes the onset to the following peak, e.g. SONS-kriti 'culture'.

Among the above rules, rule 7 has the highest number of exponents in the language in terms of both types and tokens.
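The placement of syllable boundaries described in R-4 to R-10 can be illustrated with a short Python sketch. It is only an approximation under stated assumptions, not Sarkar's (1986) own formulation: the names segments and syllabify, the vowel and semivowel inventories, and the treatment of every stop followed by h as a single aspirate are all assumptions made for the ASCII transcription used in this chapter.

```python
# Illustrative sketch only; inventories and function names are assumptions.
VOWELS = set("ieEaOou")        # the only [+syll] segments (R-1)
SEMIVOWELS = set("ywYW")       # second members of diphthongs (R-3)
ASPIRABLE = set("kgcjTDtdpb")  # stops assumed to merge with a following h


def segments(word):
    """Split a transcribed word into segments, keeping aspirates (kh, bh ...) whole."""
    segs, i = [], 0
    while i < len(word):
        if word[i] in ASPIRABLE and word[i + 1:i + 2] == "h":
            segs.append(word[i:i + 2])
            i += 2
        else:
            segs.append(word[i])
            i += 1
    return segs


def syllabify(word):
    """Insert syllable boundaries between successive vowel peaks."""
    segs = segments(word)
    peaks = [i for i, s in enumerate(segs) if s in VOWELS]
    cuts = []
    for left, right in zip(peaks, peaks[1:]):
        start = left + 1
        # R-3/R-4: a semivowel directly after the peak stays with that peak.
        if start < right and segs[start] in SEMIVOWELS:
            start += 1
        n = right - start          # consonants left between the two peaks
        if n <= 1:
            cut = start            # a lone C is the onset of the next syllable
        elif n == 4:
            cut = start + 2        # R-10: CC | CC
        else:
            cut = start + 1        # R-7, R-8, R-9: C | C and C | CC
        cuts.append(cut)
    bounds = [0] + cuts + [len(segs)]
    return "-".join("".join(segs[a:b]) for a, b in zip(bounds, bounds[1:]))


for w in ["nace", "trino", "purbo", "dokkhin", "puttro", "montri", "SONSkriti", "OYon"]:
    print(w, "->", syllabify(w))
```

Word initial and word final sequences need no separate step here, since everything before the first peak and after the last peak stays in the edge syllables (R-5, R-6). The sketch marks every boundary, so it divides SONSkriti into three syllables where R-10 shows only the boundary relevant to the CCCC sequence.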

On the basis of the above 10 rules, Sarkar (1979, 1986) established 16 canonical syllable patterns in Bangla, which, arranged in descending order of frequency, as worked out in Sarkar (1986), are as follows: CV, CVC, V, VC, VV, CVV, CCV, CCVC, CVVC, CCVV, CCVVC, CVCC, CCCV, CCCVC, VVC, and CCCVV.

Among these 16 patterns, CV has the maximum number of exponents, approximately 54% in the language.

In this list VV sequences stand for diphthongs. Sarkar (1985-86) shows that the second members of these VV sequences are necessarily semivowels, and thus non-syllabic. Hence, for the purpose of generalization, he at places treats these second members as C, as in the general structure below.

On the basis of these 16 canonical shapes Sarkar (1986) establishes the general structure of the SCB syllables as C₀³VC₀² (i.e. a vowel preceded by zero to three consonants and followed by zero to two), the optimal pattern of which is CCCVCC, which does not have any exponent in the language.
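The relation between the 16 canonical patterns and the general structure can be checked mechanically. The following Python sketch is an illustration only: it assumes that the general structure allows zero to three consonants before the vowel and zero to two after it, and that the second V of a VV pattern counts as C, as stated above. The names ATTESTED, reduce_diphthong, template and unattested are hypothetical.

```python
from itertools import product

# The 16 canonical patterns listed above, in Sarkar's (1986) order of frequency.
ATTESTED = ["CV", "CVC", "V", "VC", "VV", "CVV", "CCV", "CCVC", "CVVC",
            "CCVV", "CCVVC", "CVCC", "CCCV", "CCCVC", "VVC", "CCCVV"]


def reduce_diphthong(pattern):
    """Treat the semivowel of a diphthong (the second V) as a C."""
    return pattern.replace("VV", "VC")


# Every shape allowed by the assumed template: 0-3 onset Cs, one V, 0-2 coda Cs.
template = {"C" * onset + "V" + "C" * coda
            for onset, coda in product(range(4), range(3))}

attested = {reduce_diphthong(p) for p in ATTESTED}
unattested = sorted(template - attested)

print(len(template), len(attested), unattested)
```

Under these assumptions every attested pattern falls within the template, and the only template shape without an attested counterpart is CCCVCC, which agrees with the statement above.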

2.2. Types of syllables

According to the traditional view 'a syllable which ends in a vowel is called open, and a syllable where the vowel is followed by one or more consonants is called closed' (Malmberg, 1963: 65). Examples of open and closed syllables in Bangla are:

1. O-po-ra-ji-ta 'name of a flower'
ta-ra 'star'
E-ka 'alone'

2. jON-gol 'forest'
rok-tim 'red'

3. Son-dha 'dusk'
ma-dol 'a kind of drum'

The three examples of 1 consist entirely of open syllables; those of 2 consist of two closed syllables each; and in 3 the syllables dha and ma are open and Son and dol are closed.

As long as a syllable ends in a clear vowel or a clear consonant no problem arises regarding the above binary classification. But in case of a syllable ending in a diphthong it is difficult to decide whether it is open or closed. For example,

4. Siw-li 'name of a flower'

Is Siw, the initial syllable of 4, open or closed?

In the present section I shall discuss the Bangla diphthongs, as enumerated by Sarkar (1985-86); two existing treatments regarding the classification of syllables containing diphthongs, from two different planes, along with counterarguments; and then I shall propose to adopt the idea of classifying Bangla syllables in terms of light and heavy, a classification which appears to be more suitable than the open-closed dichotomy for the present problem.

2.2.1. Bangla diphthongs

On the basis of the principle 'a diphthong must necessarily consist of one semivowel' (Jones, 1962), Sarkar (1985-86) establishes 17 falling diphthongs in Bangla, viz. iy, iw, uy, ey, ew, oy, ow, oY, oW, ay, aw, EW, EY, aY, aW, OY, and OW and justifies them.

The second members of these 17 diphthongs are always semivowels. The Bangla semivowels are controversial: according to Chatterjee (1962: 25) they are non-syllabic and predictably so, and therefore non-phonemic; whereas according to Ray et al. (1966: 4) they are non-syllabic but phonemic. Moreover, not only the semivowels but also the diphthongs of Bangla are not free from controversy. After establishing and justifying the 17 Bangla diphthongs, Sarkar (1985-86) himself had to state that in Bangla a diphthong followed by a final single consonant becomes longer and TENDS to be disyllabic, i.e. does not remain a diphthong any more; whereas a diphthong followed by a single consonant followed by another vowel is shorter and maintains its diphthongal character, e.g.

5. de-ul 'temple'
go-ur 'a proper name'

6. dewl

7. dew-Ti 'lamp'
gow-ri 'a proper name'

According to Chatterjee (1962: 26) another possible syllabification of 5 and 6 is 6a,

6a. dew-ul

However, the plausibility of 5, 6 and 6a suggests that in a particular environment the diphthongal character of a VV sequence may be in doubt; whereas in 7 the diphthongal character of such a sequence is quite clear.

The above discussion indicates the controversial nature of the Bangla semivowels and diphthongs. In fact, in the linguistic literature, the very concepts of semivowels and diphthongs, unlike vowels and consonants, are still to some extent undefined, overlapping and language specific.

2.2.2. Two treatments of diphthongal syllables

Of the two treatments we consider, one is by Prabodh Chandra Sen (1974, 1986), dealing with different types of syllables in respect of their relevance as metrical units, and the other is by Sarkar (1979, and personal communication) dealing with different types of syllables as the phonological units of the language.

Though these two studies belong to two different disciplines, metrics and phonology, both of them apparently touch the problem of diphthongal syllables and prescribe the closed category for them. I shall discuss first the metrical study and then the phonological one.

The study of metrics

Sen (1974, 1986) captures the three basic patterns of Bangla meter, defined in terms of two different types of units, viz. the syllable and the mora, and accounts for the moric styles in terms of open and closed syllables. A detailed discussion of his treatment is to be found in section 4.

In his analysis, diphthongal syllables show the same moric properties as closed syllables; the open syllables behave differently. For example, in the simple moric meter pattern both the closed and diphthongal syllables count as dimoric, whereas the open syllables are monomoric. Let us consider a concrete instance, cited from Sen (1974: 54).

8. phalgun cOncOl phoTa phul rOY na
Obohele dEY phele puSper gOYna

(cOncOl/ probaSi/ Satish Chandra Roy)

"spring restless blossomed flower stay not
with-ease give throw flower-gen. ornament"

'in the restless Spring time the blossomed flowers do not last long, and it appears as if the restless Spring is throwing off its floral ornaments with ease.'

The scansion of 8 is 9. Each vertical stroke in 9 marks one mora, and the small groups shown in 9 are feet, formed in terms of mora counting.

    ||   ||  ||  ||  |   |  ||   ||  |
9.  phal-gun cOn-cOl pho-Ta-phul rOY-na

    | |  |  |  ||  |   |  ||  ||  ||  |
    O-bo-he-le dEY-phe-le puS-per gOY-na

Here the three diphthongal syllables, viz. rOY, dEY, and gOY are considered as dimoric like the other seven consonant-final closed syllables, viz. phal, gun, cOn, cOl, phul, puS, and per.
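The mora count behind this scansion can be reproduced with a small Python sketch. It is an illustration under stated assumptions rather than Sen's own procedure: the names morae and scan and the vowel inventory are hypothetical, a syllable is taken as monomoric when it ends in a vowel and as dimoric otherwise (so that semivowel-final syllables pattern with consonant-final ones), and the space-separated groups of 9 are taken to be the feet.

```python
# Illustrative sketch only; inventory and function names are assumptions.
VOWELS = set("ieEaOou")


def morae(syllable):
    """1 mora for an open syllable, 2 for a closed or diphthongal one."""
    return 1 if syllable[-1] in VOWELS else 2


def scan(line):
    """Return the mora count of each foot in a hyphenated, space-grouped line."""
    return [sum(morae(s) for s in foot.split("-")) for foot in line.split()]


print(scan("phal-gun cOn-cOl pho-Ta-phul rOY-na"))
print(scan("O-bo-he-le dEY-phe-le puS-per gOY-na"))
```

On this count both lines scan as three feet of four morae followed by a final foot of three morae.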

On the basis of the above behaviour Sen considers the diphthongal syllables as closed syllables and terms the vowel nuclei of the syllables as assreta 'sheltering' sounds and the syllable final consonants along with the semivowels of the diphthongs as assrito 'sheltered' sounds (Sen, 1974).

An improved version of Sen (1974) is Sen (1986), where he terms the semivowels and the syllable final consonants as vowel particles and consonants respectively, which together form a class called dependent segments.

The phonological treatment

According to Sarkar (1985-86, and personal communication), diphthongal syllables are closed syllables and their syllabification is as follows:


          syllable
          /      \
      onset      rime
                 /  \
              peak  coda
               |      |
             vowel  semivowel

Thus he claims that, unlike vowels and like consonants, Bangla semivowels have the feature [-syll]. Chatterjee (1962: 30 fn) hints at a similar view.

He justifies his claim in terms of a morphological construction, viz. the genitive construction.

The Bangla genitive suffix has two phonologically conditioned allomorphs, viz. r, occurring after vowel ending stems, and er, occurring after consonant ending stems, e.g.

10. pakhi-r 'bird's'
alo-r 'of the light'

11. *pakhi-er

12. deS-er 'of the country'
nij-er 'of one's own'

13. *deS-r

11 and 13 show that the addition of er and r forms with vowel ending and consonant ending stems respectively yields unacceptable outcomes.

14. bhay-er 'brother's'
bow-er 'wife's'

15. *bhay-r

14 shows that like stem final consonants semivowels too take er, rather than r (cf. 15).
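The distribution of the two allomorphs in 10 to 15 amounts to a single test on the last segment of the stem, which may be sketched in Python as follows. The function name genitive and the vowel inventory are assumptions made for illustration; the sketch encodes only the regular distribution and takes no account of tatsama stems or lexical exceptions.

```python
# Illustrative sketch only; inventory and function name are assumptions.
VOWELS = set("ieEaOou")


def genitive(stem):
    """Attach r after a vowel-final stem, er after any other stem."""
    suffix = "r" if stem[-1] in VOWELS else "er"
    return f"{stem}-{suffix}"


for stem in ["pakhi", "alo", "deS", "nij", "bhay", "bow"]:
    print(genitive(stem))
```

Since y and w are not in the vowel set, semivowel-final stems such as bhay and bow receive er, exactly like consonant-final stems.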

But such a distributional demarcation of r/er does not always hold good. At least two sets of exceptions, both of which involve vowel ending stems, may be cited in this regard.

The first set is taken from Dasgupta (1985: 44) and it consists of three subsets involving stems ending in phonetic [o], e.g.

16. alo 'light' alo-r 'light's'
gero 'knot' gero-r 'knot's'

17. kendro 'centre' kendr-er 'centre's'
groho 'planet' groh-er 'planet's'

18. broto '(quasi-) religious vow' brot-er / broto-r '(quasi-) religious vow's'
SumOntro 'a name' SumOntr-er / SumOntro-r

The second set is also from Dasgupta (1985: 46) and it consists of stems ending in a, e.g.

19. ma 'mother' ma-r / ma-er 'mother's'
pa 'foot' pa-r / pa-er 'foot's'
ca 'tea' ca-r / ca-er 'tea's'

The examples of 16, the first subset, agree perfectly with the distributional demarcation of r/er, whereas those of 17 and 18 pose a problem for it.

Dasgupta (1985: 45) accounts for the examples of 17 and 18 as follows:

The examples of 17 have tatsama stems ending in underlying /O/, that gets deleted when followed by a vowel. This process feeds the rule that deletes the e of the genitive desinence er if the stem to which it is attached ends in a vowel.

The examples of 18 have two underlying forms each, one with stem final /o/, and the other with stem final /O/, depending sometimes on non-learned (e.g. brotor) vs. learned (e.g. broter) discourse, and some other times on a redundancy rule accounting for free variation (e.g. SumOntror / SumOntrer).

The above treatment of the first set agrees with the distributional demarcation of r/er.

Dasgupta (1985: 46) treats maer, paer, and caer, the exceptional forms of 19, as lexical relics, "which don't reflect current forces".

The stems belonging to these two sets, however, end in phonetic [o] and [a] and thus they do not provide any counterexample to Sarkar's (1985-86) claim that Bangla semivowels have the [-syll] feature.

2.2.3. A proposal

However, on the basis of the foregoing discussion, the point I would like to make here is that the diphthongal syllables hinder a clear-cut binary classification of syllables and such syllables have to be accommodated in the theory along with additional justifications.

In order to avoid the controversies of these "additional justifications", I prefer to follow McCarthy (1979: 451), Lass (1984: 252), Hogg and McCully (1987: 35) and others, all from the field of metrical phonology, and classify Bangla syllables on the basis of their internal hierarchical structure.

A syllable (s) consists of an onset (O) and a rime (R); and the rime consists of a peak (P) and a coda (Co), e.g. in the monosyllabic words such as din 'day', rup 'form', cokh 'eye', and dOS 'ten' of Bangla d, r, c, and d are the onsets; i, u, o, and O are the peaks; and n, p, kh, and S are the codas respectively.

Apart from P, either of the other two categories, viz. onset and coda, may be empty, e.g. in pa 'leg' the coda is empty; in am 'mango' the onset is empty.

However, the internal structures of these syllables, in terms of onset, rime and then rime going to peak and coda, may be conventionally represented in terms of branching tree diagrams as follows:

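20.    s              s              s
      / \            / \            / \
     O   R          O   R          O   R
     |  / \         |  / \         |  / \
     |  P  Co       |  P  Co       |  P  Co
     |  |  |        |  |  |        |  |  |
     d  i  n        r  u  p        c  o  kh

     din            rup            cokh

       s              s              s
      / \            / \             |
     O   R          O   R            R
     |  / \         |   |           / \
     |  P  Co       |   P          P   Co
     |  |  |        |   |          |   |
     d  O  S        p   a          a   m

     dOS            pa             am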

Now, in terms of the traditional classification, viz. open and closed syllables, what we observe here is that for all the closed syllables the rime branches (e.g. din, rup, cokh, dOS and am); whereas for the open syllables it does not branch (e.g. pa).

Moreover, the concepts of branching and non-branching rimes may easily be equated with those of canonical heaviness and lightness respectively; and thus a syllable consisting of a branching rime may be termed a heavy syllable, whereas a syllable consisting of non-branching rime may be termed a light one.

The above classification of syllables into heavy and light on the basis of their canonical quantity, which in turn is a structural property, applies to the diphthongal syllables in a quite straightforward manner. Since a diphthongal syllable has a branching rime, it may be considered heavy, e.g. the internal structures of bhOY 'fear', tuy 'you (inf.)', bow 'wife', mEW 'cat's call' etc. are as follows:

21.    s            s            s            s
      / \          / \          / \          / \
     O   R        O   R        O   R        O   R
     |  / \       |  / \       |  / \       |  / \
     bh O  Y      t  u  y      b  o  w      m  E  W   etc.

     bhOY         tuy          bow          mEW

Thus the branching of R is the only factor responsible for classifying the syllables into two categories, viz. light and heavy. The identity of the terminal nodes under the branching node R will be dealt with in chapter 5.

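Thus the 16 canonical forms of Bangla syllables of sec. 2.1 fit into this heavy/light categorization without any problem: the four patterns with a non-branching rime, viz. CV, V, CCV and CCCV, are light, and the remaining twelve, all of which have a branching rime, are heavy.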
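The categorization of the 16 canonical patterns follows directly from the branching criterion, and it can be sketched in Python. This is an illustration only: the names PATTERNS and weight are hypothetical, and a rime is assumed to branch whenever anything follows the first V of the pattern, whether a C or the second V of a diphthong.

```python
# Illustrative sketch only; names are assumptions.
PATTERNS = ["CV", "CVC", "V", "VC", "VV", "CVV", "CCV", "CCVC", "CVVC",
            "CCVV", "CCVVC", "CVCC", "CCCV", "CCCVC", "VVC", "CCCVV"]


def weight(pattern):
    """A syllable is heavy if its rime (the pattern minus the onset Cs) branches."""
    rime = pattern.lstrip("C")
    return "heavy" if len(rime) > 1 else "light"


light = [p for p in PATTERNS if weight(p) == "light"]
heavy = [p for p in PATTERNS if weight(p) == "heavy"]
print("light:", light)
print("heavy:", heavy)
```

On this criterion four patterns (CV, V, CCV, CCCV) come out light and the other twelve heavy.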


2.3. Phonetic length of syllables

Hai (1964) made a few observations regarding the phonetic length of vowels and consonants in Bangla which bear directly on the topic of this section.

The observations are as follows:

R-11. Hai (1964: 145) observes that in normal pronunciation the vowel of the final syllable is longer than others, for example, the vowels of monosyllabic words like ta 'that', aj 'today' etc. are long. But the same vowels in polysyllabic words are reduced and the ratio of reduction is proportionate to the position of the syllable concerned in the word, starting from the end and going towards the beginning.

R-12. In terms of length, final single consonants are longest; word initial single Cs are of medium length; and intervocalic single Cs are shortest (p. 160). A notable point here, which has already been discussed in sec. 1, is that Bangla phonotactics does not allow word final CC, except in borrowed items.

R-13. In terms of length, the unreleased first member is longer than the released second member of the medial CC sequences (p. 285).

Kostic and Das (1972) also observe that the Bangla vowels show normal length in polysyllabic words, whereas they show the tendency of being elongated in monosyllabic words, which confirms R-11.

However, at this point let me mention two instrumental studies of the phonetic length of Bangla syllables, viz. (1) the kymograph tracing by Hai (1964: 357), and (2) the digital sonagraph study by me, which is a pilot survey.

2.3.1. Kymograph tracing

Among the examples of Hai only three items are composed of one heavy syllable followed by one light syllable, while all the others are heavy monosyllabics. The items are:

23. bonn-hi 'fire'
pak-na 'let him get'
pakh-na 'wing'
kal 'yesterday / tomorrow'
khal 'canal'

The time duration of each of the syllables of 23 is as follows:

24.     1st syll.   time in    2nd syll.   time in
        (heavy)     seconds    (light)     seconds
    a)  bonn        .48        hi          .40
    b)  pak         .37        na          .20
    c)  pakh        .33        na          .24
    d)  kal         .58        -           -
    e)  khal        .60        -           -

24 shows that, although, following R-11, the initial heavy syllables of (a), (b), and (c) are shorter than the heavy monosyllables of (d) and (e), still they are longer than the final syllables of (a), (b), and (c), which, being final syllables, again according to R-11, are supposed to be longer than any preceding syllables. Hence the conclusion may be drawn here that in terms of duration, in normal speech, heavy syllables are longer than light ones.
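The comparison drawn from 24 can be made explicit with a few lines of Python. The figures are those of 24; the names KYMOGRAPH, MONOSYLLABLES and ratios are hypothetical, and the sketch only restates the arithmetic of the argument.

```python
# Durations in seconds, copied from 24: (initial heavy syllable, final light syllable).
KYMOGRAPH = {
    "bonn-hi": (0.48, 0.40),
    "pak-na": (0.37, 0.20),
    "pakh-na": (0.33, 0.24),
}
MONOSYLLABLES = {"kal": 0.58, "khal": 0.60}

# Ratio of the heavy syllable to the following light syllable in each word.
ratios = {word: heavy / light for word, (heavy, light) in KYMOGRAPH.items()}
for word, r in ratios.items():
    print(word, round(r, 2))

# The initial heavy syllables are shorter than the heavy monosyllables (R-11) ...
longest_initial = max(heavy for heavy, _ in KYMOGRAPH.values())
print(longest_initial < min(MONOSYLLABLES.values()))
# ... but each is longer than the light syllable that follows it.
print(all(r > 1 for r in ratios.values()))
```

In all three disyllabic items the ratio is greater than 1, i.e. the initial heavy syllable outlasts the final light syllable despite the final lengthening described in R-11.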

2.3.2. Digital sonagraph

The time duration of the syllables of twelve words has been studied with the help of digital sonagraph. These words, comprising ten disyllabics and two monosyllabics, are as follows:

25. khe-tam 'I used to eat'
khel-tam 'I used to play'
a-ta 'custard apple'
al-ta 'liquid red cosmetic item'
pa-ta 'leaf'
pan-ta 'stale rice'
ja-ga 'to rise up from sleep'
jaY-ga 'space'
ja-gaY '(he) wakes up someone'
khaW-aY 'feeds'
praY 'almost'
aY 'come'
Whenever the initial sound of the second syllable is a plosive, I took a separate reading of the occlusion period because the occlusion period of the initial plosives of the initial syllables could not be measured reliably.

The time reading of the second syllables is exclusive of the occlusion period (whenever there is an initial plosive), because the time reading of the initial syllables, with initial plosives, also excludes the occlusion period. The time reading of the syllables of 25, in seconds, is as follows:

26.     token      1st syll.   occlusion   2nd syll.
    a)  khe-tam    .180        .154        .342
    b)  khel-tam   .300        .115        .351
    c)  a-ta       .145        .170        .269
    d)  al-ta      .290        .128        .236
    e)  pa-ta      .165        .170        .266
    f)  pan-ta     .362        .084        .222
    g)  ja-ga      .370        .097        .312
    h)  jaY-ga     .403        .066        .258
    i)  ja-gaY     .323        .067        .436
    j)  khaW-aY    .323        -           .384
    k)  praY       .547        -           -
    l)  aY         .456        -           -

The time ranges of heavy and light syllables at different positions are as follows:

27. (a) Heavy syllables: total 11

        initially       finally         isolatedly
        .290 to .403    .342 to .436    .456 to .547

    (b) Light syllables: total 11

        initially       finally
        .145 to .370    .222 to .312

A comparison between 27(a) and 27(b) shows that, position for position, heavy syllables tend to be longer than light ones, although the two sets of ranges overlap.

Among these twelve words only five consist of one light and one heavy syllable. In these five words the heavy syllable is mostly between 50% and 100% longer than the contiguous light syllable. The remaining seven words, consisting of two heavy syllables, or two light syllables, or one syllable each, conform to R-11.

The phonetic study of syllabic length shows only that the length of the heavy syllables generally oscillates between 1½ and 2 times that of the light syllables.
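The ranges of 27 and the heavy-to-light ratios can be recomputed from the readings of 26 with the following Python sketch. It is an illustration only: the readings are copied from 26 (occlusion periods left out), a syllable is assumed to be light when it ends in a vowel and heavy otherwise, and the names READINGS, weight, ranges and ratios are hypothetical.

```python
# Illustrative sketch only; inventory and names are assumptions.
VOWELS = set("ieEaOou")

# Syllable durations in seconds, copied from 26 (occlusion periods omitted).
READINGS = {
    "khe-tam": (0.180, 0.342), "khel-tam": (0.300, 0.351),
    "a-ta": (0.145, 0.269), "al-ta": (0.290, 0.236),
    "pa-ta": (0.165, 0.266), "pan-ta": (0.362, 0.222),
    "ja-ga": (0.370, 0.312), "jaY-ga": (0.403, 0.258),
    "ja-gaY": (0.323, 0.436), "khaW-aY": (0.323, 0.384),
    "praY": (0.547,), "aY": (0.456,),
}


def weight(syllable):
    return "light" if syllable[-1] in VOWELS else "heavy"


groups, ratios = {}, {}
for word, times in READINGS.items():
    sylls = word.split("-")
    positions = ["isolated"] if len(sylls) == 1 else ["initial", "final"]
    for syl, pos, t in zip(sylls, positions, times):
        groups.setdefault((weight(syl), pos), []).append(t)
    weights = [weight(s) for s in sylls]
    if sorted(weights) == ["heavy", "light"]:   # one heavy and one light syllable
        ratios[word] = times[weights.index("heavy")] / times[weights.index("light")]

ranges = {key: (min(v), max(v)) for key, v in groups.items()}
for key in sorted(ranges):
    print(key, ranges[key])
for word, r in ratios.items():
    print(word, round(r, 2))
```

Under these assumptions the computed ranges coincide with those given in 27, and five words contain one heavy and one light syllable. In three of the five (khe-tam, pan-ta, jaY-ga) the heavy syllable is between 1½ and 2 times as long as the light one, while in al-ta and ja-gaY the ratio is lower.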

2.4. Phonological quantity of syllables

The phonological quantity of Bangla syllables can best be studied with the help of indirect evidence, especially the evidence from poetry. Bangla poetry is chiefly of two types, viz. rhythmic verse and prose verse. According to Tewari (1988: 6) Bangla rhythmic verse consists of regular and well-measured foot formation; whereas Bangla prose verse is comparatively free from such regular and well-measured foot formation, though other than rhythm it has all the qualities of being poetry. In fact, rhythmic verse is that field of language use where the canonical quantity of syllables matters directly. Hence I shall keep the prose verse out of the scope of the present chapter and use the two terms, viz. rhythmic verse and poetry, interchangeably.

In the present section I shall discuss the findings regarding the syllable quantity in Bangla metrics.

The existing corpus of indigenous work on Bangla metrics is quite rich. Several scholars, among them Tagore, Dwijendralal Roy, Satyendranath Dutta, Pramathanath Choudhury, Sasanka Mohan Sen, Ajit Kumar Chakraborty, Prabodh Chandra Sen, and Amulyadhan Mukhopadhyay, have contributed to the analysis and theorization of Bangla metrics. Among these I shall follow Prabodh Chandra Sen for the following reasons:

i) His work provides (a) a fixed set of terminology used scientifically, and (b) an ample and clear mechanism for analysis.
ii) It is the most exhaustive among all the available works.
iii) All along the course of its development, viz. from 1922 to 1986, his theory was always receptive to innovations, as it was always open for discussion.

Moreover, it has been so extensively accepted and appreciated by scholars and poets that for me the wisest approach would be to talk in terms of the Sen school - a major metrical theory which has developed in Bengal independently of work done elsewhere.

Sen captures three basic styles of Bangla metrics, viz. (1) syllabic style, (2) moric style, and (3) composite style, depending especially on the behaviour of the heavy syllables. The light syllables, in terms of their quantity, however, are less interesting than the heavy ones, as they behave in a similar way in all the three styles, that is, in all these three styles the light syllables are always considered to be monomoric. In fact, all the significant metrical findings in terms of syllable quantity cluster around the heavy syllables; they are presented below.

2.4.1. Syllabic style

In this style both light and heavy syllables are considered to be monomoric, and thus equimoric. Sen (1986: 2) points out that because of the equimoric status of both heavy and light syllables, in syllabic style the unit of measure is the syllable rather than the mora. For example,

28. apatoto ey anonde gOrbe bERay nece
kalidaS to namey achen ami achi beMce
(Sekal/ konika/ Tagore)

"for-the-time-being this joy-with pride-with roam-around dancing
Kalidas particle name-in-emp. is-hon. I am alive"

'For the time being let me dance around and be happy and proud being satisfied with the fact that Kalidas exists only in name but I exist in life.'

28, cited from Sen (1986: 36), is an example of the tetra-syllabic measure/group, whose scansion is given in 29:

| | | | | | | | | | | | | |
29. a-pa-to-to ey-a-non-de gOr-be-bE-Ray ne-ce

| | | | | | | | | | | | | |
ka-li-daS-to na-mey-a-chen a-mi-a-chi beM-ce

Both the lines of 29 consist of 14 units each; the first line has 10 light and 4 heavy, i.e. total 14 syllables, representing 14 moras, and the second line has 11 light and 3 heavy, i.e. total 14 syllables, representing 14 moras.

2.4.2. Moric style

In the moric style the light syllables are, as mentioned earlier, monomoric, whereas the heavy syllables are always dimoric. For example (cf. examples 9 and 10),

| | | | | | | | | | | | | | |
30. phal-gun cOn-cOl pho-Ta-phul rOY-na

| | | | | | | | | | | | | | |
O-bo-he-le dEY-phe-le puS-per gOY-na

30, the scansion of the above tetra-moric group, shows that the first line consists of 6 heavy syllables, i.e. 12 moras and 3 light syllables, i.e. 3 moras, i.e. total 15 moras, and the second line consists of 4 heavy syllables, i.e. 8 moras and 7 light syllables, i.e. 7 moras, i.e. total 15 moras.
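
The mora counts reported for 29 and 30 can be reproduced with a minimal sketch like the following. It assigns one mora to every light syllable and one or two moras to every heavy syllable, depending on the style. The function names and the weight strings (one H or L per syllable) are illustrative assumptions; the weights are annotated by hand from the scansions rather than computed from the segments.

```python
# Illustrative sketch: mora counts in the syllabic and moric styles.
# The H/L weight strings are hand annotations (an assumption of this sketch).

HEAVY_VALUE = {"syllabic": 1, "moric": 2}  # moras assigned to a heavy syllable


def parse(line, weights):
    """Pair each syllable of a scanned line with its weight, 'H' or 'L'."""
    syllables = line.replace(" ", "-").split("-")
    if len(syllables) != len(weights):
        raise ValueError("one weight symbol is needed per syllable")
    return list(zip(syllables, weights))


def count_moras(line, weights, style):
    """Light syllables count 1; heavy syllables count 1 or 2 by style."""
    heavy = HEAVY_VALUE[style]
    return sum(heavy if w == "H" else 1 for _, w in parse(line, weights))


# Example 29 (syllabic style), first line: 10 light + 4 heavy syllables.
print(count_moras("a-pa-to-to ey-a-non-de gOr-be-bE-Ray ne-ce",
                  "LLLLHLHLHLLHLL", "syllabic"))   # 14
# Example 30 (moric style), both lines.
print(count_moras("phal-gun cOn-cOl pho-Ta-phul rOY-na",
                  "HHHHLLHHL", "moric"))           # 15
print(count_moras("O-bo-he-le dEY-phe-le puS-per gOY-na",
                  "LLLLHLLHHHL", "moric"))         # 15
```

The only difference between the two styles in this sketch is the value stored for heavy syllables, which mirrors the statement that light syllables are monomoric throughout.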

2.4.3. Composite style

Both types of syllabification, viz. the syllabic type as well as the moric type, are available in the composite style. In this style the light syllables are, as they are in other styles, monomoric, whereas the heavy syllables may be monomoric or dimoric. The distribution of heavy syllables, in terms of quantity, in this composite style is as follows:

R-14. Heavy syllables → dimoric / ___ #
Heavy syllables → monomoric / elsewhere.

R-14 says that a heavy syllable is dimoric when it is immediately followed by a word boundary, or in other words, word finally and in monosyllabic words heavy syllables are dimoric, as they are in the moric style; and non-word-finally heavy syllables are monomoric, as they are in the syllabic style. For example,

31. aSSine utSOb-Saje SOrot SundOr Subbhro kOre
Sephalir Saji nie dEkha dibe tomar OngOne
(Sottendronath dOtto/ purobi/ Tagore)

"the-month-of-Assin-in festive-dress-in Autumn beautiful white hand-in
the-flower-of Sephali-of basket taking look-give your courtyard-in"

'In the month of Assin Autumn will appear in her festive-dress in your courtyard with basket full of Shephali flowers in her beautiful white hands.'

The scansion of the above diclausal verse, consisting of 8+10 moras in each line, cited from Sen (1974: 160), is 32:

| | | | | | | | | | | | | | | | | |
32. aS-Si-ne ut-SOb Sa-je SO-rot Sun-dOr Sub-bhro kO-re

| | | | | | | | | | | | | | | | | |
Se-pha-lir Sa-ji nie dE-kha di-be to-mar ON-gO-ne

In 32 the first line consists of 8 light syllables, 4 non-final heavy syllables, and 3 final heavy syllables, i.e. a total of 8+4+6=18 moras; and the second line consists of 11 light syllables, 1 non-final heavy syllable, and 3 final heavy syllables, i.e. a total of 11+1+6=18 moras.
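
As a hedged illustration of R-14, the sketch below counts the moras of the two lines scanned in 32. The function name and the weight annotation (one group of H/L symbols per word) are assumptions made for illustration; word divisions follow the scansion in 32, and the weights are supplied by hand.

```python
# Illustrative sketch of R-14 (composite style); weights are hand-annotated.

def composite_moras(line, weights):
    """Heavy syllables count 2 word-finally (monosyllables included), else 1."""
    words = line.split()
    word_weights = weights.split()
    if len(words) != len(word_weights):
        raise ValueError("one weight group is needed per word")
    total = 0
    for word, marks in zip(words, word_weights):
        if len(word.split("-")) != len(marks):
            raise ValueError("one weight symbol is needed per syllable")
        for position, mark in enumerate(marks):
            word_final = position == len(marks) - 1
            # R-14: dimoric before a word boundary, monomoric elsewhere.
            total += 2 if (mark == "H" and word_final) else 1
    return total


print(composite_moras("aS-Si-ne ut-SOb Sa-je SO-rot Sun-dOr Sub-bhro kO-re",
                      "HLL HH LL LH HH HL LL"))      # 18
print(composite_moras("Se-pha-lir Sa-ji nie dE-kha di-be to-mar ON-gO-ne",
                      "LLH LL H LL LL LH HLL"))      # 18
```

Because the rule refers only to the position of a heavy syllable within its word, the sketch needs word divisions in addition to syllable weights, unlike a count for the syllabic or moric style.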

On the basis of the discussion of the above three styles the present section shows that the phonological quantity of the structurally light syllables is 1, which is fixed and discrete; whereas that of the structurally heavy syllables is not fixed, as it oscillates between 1 and 2 depending on the particular metrical style.

2.5. Correlation between the physical and psychological realizations of syllables

Section 2.3 discussed the syllable duration at the physical phonetic level in terms of the physical realization of syllable length. Section 2.4 provided the moric representation of syllable quantity; these moric values may be regarded as psychological realizations of syllable quantity.

Section 2.3 showed that the heavy syllables are phonetically longer than the light syllables. Section 2.4 showed that, depending on the particular style chosen, heavy syllables of Bangla may have three different distributions of phonological quantity assignment. Now the problem is that of correlating these two findings.

In the present section I shall only raise a few questions regarding this problem, as it is too early to propose any definitive solution to an issue which is sure to remain on the agenda of Bangla phonology for a long time to come.

2.5.1. Discrete or non-discrete

The indigenous research tradition has shown that phonologically Bangla heavy syllables are ambivalent between single and double mora. Their ambivalent nature is more prominent especially in the composite style, where the heavy syllables may freely count as either monomoric or dimoric, depending on particular environments. Now, in terms of quantity hierarchy, these three styles may be arranged as in 33.

33.   M ---------- C ---------- S
    (H=2)        (H=?)        (H=1)

33 indicates that M, i.e. the moric style, where the heavy syllables are dimoric, occupies one end of the scale; and S, i.e. the syllabic style, where the heavy syllables are monomoric, occupies the other end of the scale; and C, i.e. the composite style, occupies the mid-point between M and S, from where it has access to both the points M and S, as in C the heavy syllables exhibit both types of mora counting, the M type and the S type.

At the mid-point C, one would most naturally wish to assign the numerical value 1½ to H, thus claiming that, at C, a heavy syllable has a theoretical moric value of 1½, realized in practice variably, sometimes as 1 and sometimes as 2.

The question here is what the psychologically real moric phonology of the heavy syllables is. Should the phonology be constructed in terms of discrete concepts like 1 and 2, considering the M and S points of the scale, or in terms of a non-discrete concept like 1½, considering the C point of the scale?

2.5.2. Regarding the impressionistic grouping of languages

Conventionally, on an impressionistic basis, languages are categorized into two groups, viz. syllable-timed and stress-timed. The more widely studied syllable-timed languages, such as Spanish, French etc. show impressionistically only equimoric syllables. Bangla appears to be a type of syllable-timed language where heavy syllables often have an extra mora compared to light syllables.

The question here is how to accommodate this characteristic of Bangla in a theory proposing a binary classification of languages, viz. syllable-timed and stress-timed. Does the theory need to be modified?

2.5.3. Relation between speech and metrical styles

I would like to deal with the relation between speech and metrical style, mainly with respect to the tempo of speech.

In order to establish some speech tempo norms for Bangla I would like to postulate a four-point scale which is as follows:

34. *-----*--@--*-----*
L An (N) Al P

The L and the P points, i.e. the first and the fourth points of this scale, represent the largo and presto values of the speech tempo parameter. An and Al, i.e. the second and third points, stand for the andante and allegretto values. And the N point, N for norm, is a notional average intermediate between An and Al and represents the normal tempo of Bangla speech.

In terms of this four-point scale, on an impressionistic basis, which of the three metrical styles corresponds to the normal speed of speech in Bangla? Let us consider all the three styles.


(i) The moric style, impressionistically, belongs to some point between L and An of the scale, because a dimoric value for every heavy syllable calls for a carefully slow and thus somewhat artificial pronunciation. Hence, option (i) may be eliminated.


(ii) Regarding the syllabic style, Sen's (1974: 14) own response is as follows:

"The syllabic style is neither borrowed from nor is it influenced by any other language…..It is even free from any carry-over from the Sadhu/ high variety of Bangla, which is supposed to be the affair of the educated class….Rather this style records the typical mood of the colit/ low/ colloquial Bangla, which may be considered as every one's language….The syllabic style is the only popular style for folk-rhymes, folk-songs, religious folk-verses, i.e. folk-literature, which are mostly composed by the so-called uneducated common folk."

The above intuition of Sen does recommend the syllabic style as the representative of normal speed of Bangla speech quite strongly.


(iii) The composite style, where heavy syllables are dimoric word finally and monomoric elsewhere, also agrees with R-11, R-12, and R-13, which are the impressionistic rules of syllable length of normal speech. R-11, R-12, and R-13 imply that the heavy monosyllables and word final heavy syllables are longer than any other syllable, which closely corresponds to the mode of mora-counting for the composite style.


The justifications given in (ii) and (iii) show that both these styles are plausible candidates for the status of the true representatives of the normal tempo of Bangla speech.

Does any one of these two styles consistently represent the normal speed of Bangla speech? Is one of them more artificial in its elocution than the normal speech rhythm and tempo?

We prefer to leave these questions open at this point. They cannot be answered on the basis of the tools developed in this book.

2.6. Prominence

Bangla, as has been discussed, is impressionistically a syllable-timed rather than a stress-timed language. Moreover, stress or prominence in this language is a predictable feature of a word, a point that several scholars have made.

Chatterji (1928: 22), for example, says:

"Stress is not significant in standard Bengali so far as the individual word is concerned. In the standard colloquial, stress is dominantly initial in isolated words. This stress, however, is not so strong as in English. Word-stress is always subsidiary to sentence-stress, and in conversation the sentence-stress is most commonly on the initial syllable of the first important word in a sense-group…The stress on individual words disappears in favour of the sentence-stress, since one sense-group has only one dominant stress at or near the beginning."

And Bykova (1981: 29) writes:

"It is largely the first syllable that takes the stress. Yet in a sentence which falls into rhythmical groups or syntagms, the stress system is dominated by the rhythmical organization of the whole sentence. In this case it is the first syllable of the first word of the syntagm that is stressed, while all the other words of the syntagm are unstressed. Word stress is thus of no phonological value in Bengali since it is not a distinctive feature of the word and cannot serve as a means of singling out a word in a sentence; it only serves to achieve the syntagmatic segmentation of a sentence."

The treatment of "the rhythmical organization of the whole sentence" largely depends on the "syntagmatic segmentation of a sentence", which in turn depends on the non-distinctive word stress of Bangla; accordingly in the present section, I wish to highlight the predictable word stress or word prominence of Bangla in terms of two suggestive prosody-related processes of Bangla, viz. vowel harmony and O-o alternation within the word domain.

2.6.1. Vowel harmony

This section on vowel harmony and the following section on the alternation between O and o are concerned with word level phonological processes affecting vowel height.

In section 2.6.1 we shall look at a process whereby a high vowel trigger raises (by one degree) the vowel height of the syllable immediately preceding it, and at some related phenomena. Section 2.6.2 will examine the distribution of surface O, which manifests underlying O in relatively prominent syllables, and of surface o, which merges the manifestations of (i) those instances of underlying o that are not affected by processes like vowel harmony, (ii) underlying O raised to o by a high vowel trigger, and (iii) underlying O weakened to o in relatively non-prominent syllables.

We begin the work of the present section by considering the basic vowel system of Bangla:

i           u
e           o
E     a     O

One of the most important factors affecting the pattern of vowels in Bangla words is a process that we may call Vertical Vowel Harmony.

In Bangla words, depending on the occurrence of a following high vowel trigger, the vowel of the preceding syllable goes one notch upward. This phenomenon operates in terms of vowel height assimilation; the literature describes it as anticipatory vowel harmony (Chatterji, 1926: 395) or vowel height assimilation or vertical vowel harmony (Sarkar, 1983-84) or vowel harmony or vowel attraction (Nath, 1997: 22).

I prefer the term vertical vowel harmony or simply vowel harmony for my purpose because otherwise it becomes problematic to indicate the degree of assimilation: sometimes there is complete assimilation (high to high), and in other cases the vowel height is modified towards, but not quite up to, the height of the triggering high vowel. In the case of the vowels e and o, i.e. the mid vowels, it applies fully, e.g. lekh+i = likhi 'I write', So+i = Sui 'I sleep' etc. In the case of E and O, i.e. the low vowels, assimilation applies partially, e.g. dEkh+uk = dekhuk 'let him/ her see', kOr+i = kori 'I do' etc.

I dismiss the term anticipatory vowel harmony as this term includes the regressive direction of application but excludes the progressive direction, which is also there in Bangla.

Moreover, Bangla also shows another different type of phonological process in which, as opposed to vertical vowel harmony, a high vowel is often followed by a low vowel, e.g. milOn 'union' etc. This phonological process I shall describe as vowel disharmony and discuss later on. Vertical vowel harmony operates at least in three different grammatical categories, viz. verbs, nouns, and pronouns in Bangla. The examples are as follows:

Nouns: In case of nouns the synchronic rule (the term is from Sarkar, 1983-84: 53) says that the following high vowels i/u pull the preceding vowels, viz. e, o, E, and O, one notch upward to i, u, e, and o respectively. For example,

35. nOd vs. nodi 'big river' vs. 'river'
khoka vs. khuki 'baby boy' vs. 'baby girl'
pEMca vs. peMci 'male owl' vs. 'female owl'
deS vs. diSi 'country' vs. 'country made'

Pronouns: In case of pronouns too the above synchronic rule applies, e.g.

36. tora vs. tui 'you-inf.-plu.' vs. 'you-inf.-sg.'
tomra vs. tumi 'you-ord.-plu.' vs. 'you-ord.-sg.'
eMra vs. ini 'they-formal-prox.' vs. 's/he-formal-prox.'

Verbs: In case of verb roots too the above mentioned vowel alternation takes place as an obligatory phonological process, for example, Bangla verb roots always show two alternate forms (the form with the higher vowel occurs before a high vocoid):

37. lekh - likh 'to write'       ken - kin 'to buy'
    dEkh - dekh 'to see'         khEl - khel 'to play'
    So - Su 'to lie down'        oTh - uTh 'to rise'
    bOl - bol 'to speak'         cOl - col 'to move' etc.
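
The one-notch raising described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name attach is hypothetical, the trigger is taken to be the first vowel of the suffix, and only the last vowel of the stem is affected. It is not a formulation of the rule for any particular grammatical category.

```python
# Illustrative sketch of vertical vowel harmony at a stem + suffix juncture.
# Transcription follows the chapter: E, O are the low vowels; e, o the mid ones.

VOWELS = set("iueoEOa")
HIGH = {"i", "u"}
RAISE = {"e": "i", "o": "u", "E": "e", "O": "o"}  # one notch upward


def attach(stem, suffix):
    """Join stem and suffix, raising the stem's last vowel before i/u."""
    trigger = next((ch for ch in suffix if ch in VOWELS), None)
    if trigger not in HIGH:
        return stem + suffix          # no high vowel trigger: nothing changes
    for i in range(len(stem) - 1, -1, -1):
        if stem[i] in VOWELS:
            raised = RAISE.get(stem[i], stem[i])   # i, u and a are left alone
            return stem[:i] + raised + stem[i + 1:] + suffix
    return stem + suffix


print(attach("lekh", "i"))    # likhi
print(attach("So", "i"))      # Sui
print(attach("dEkh", "uk"))   # dekhuk
print(attach("kOr", "i"))     # kori
print(attach("nOd", "i"))     # nodi
```

The mapping makes the difference between full and partial assimilation visible: mid vowels end up high, whereas low vowels end up mid.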

In fact, because of the existence of the phonological process of vertical vowel harmony, the vowel phonology of Bangla appears to constitute quite a complicated system. This particular phonological process, as it applies to the sector of Bangla verbs and results in two alternate forms of the verb roots, will be dealt with in the next chapter in terms of phonological rules, though the applicability of exactly the same rules to the other sectors, viz. nouns and pronouns, is questionable.

In order to account for the process of vertical vowel harmony in different grammatical categories in Bangla, we may even have to go along with Chatterjee's (1962) hypothesis that the phonological system may be different for different grammatical categories of a particular language and combine this hypothesis with the proposal of Nath (1997: 21) that like Telugu in Bangla too the process of vowel harmony is conditioned by grammatical categories.

Following Ramarao (1976) where it has been pointed out that in Telugu in the process of vowel harmony, though conditioned by vowels, the type and direction of change are determined by grammatical categories, Nath (1997: 40) prefers to deal with some cases of vowel harmony within the CVC type of verbal category of Bangla in terms of grammatical conditioning while with the rest in terms of phonological conditioning within the same category.

Here we prefer his concept of grammatical categories as conditioning factors for vowel harmony and suggest that in Bangla the process of vertical vowel harmony applies differently to different grammatical categories. We, however, do not subscribe to his arguments as in the next chapter we shall show that except for some irregular forms and a few exceptional forms in the whole gamut of verb morphology in Bangla the process of vowel harmony is phonologically conditioned.

In fact, the vertical vowel harmony of Bangla remains a fascinating open area for future investigation.

The limited point that I want to make here is that vertical vowel harmony is quite a widely distributed phonological process, and has the effect of raising many vowels in Bangla words.

This process is not the only one that raises vowel height. Another process of the same type is Sarkar's (1983-84: 52) progressive assimilation, exemplified below:

38. dhula dhulo 'dust'
khucra khucro 'change'
iccha icche 'wish'

Yet another process responsible for moving vowels up by one degree of height, though by a different mechanism, is the weakening of low O to mid o in non-prominent syllables, a phenomenon which is discussed in detail in section 2.6.2 and which may be observed in the pair Onto 'end' vs. Ononto (underlyingly /OnOnto/) 'endless'.

We turn now to a counterbalancing process, the vowel disharmony phenomenon mentioned earlier. In some words, where normal patterns of prominence would lead us to expect the weakening process to turn certain instances of low /O/ to mid o, we observe that a high vowel in the preceding syllable inhibits this process; thus, instead of the expected utSob, cikon, iSot, iSSor, ujjol, digonto we get:

39. utSOb 'festival' iSSOr 'god'
cikOn 'delicate' ujjOl 'bright'
iSOt 'little' digOnto 'horizon'

Contrast these forms with cases where the preceding syllable has a non-high vowel and thus the expected weakening affects the second syllable:

40. bastob 'real' : *bastOb
kaMkon 'bangle' : *kaMkOn
ator 'perfume' : *atOr
bakol 'bark (of a tree)' : *bakOl
pOrob 'festival' : *pOrOb
morog 'cock' : *morOg
kEmon 'how' : *kEmOn
SekoR 'root' : *SekOR
Ononto 'endless' : *OnOnto
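
The contrast between 39 and 40 can be summarized in a small sketch, given here only as an illustration. It assumes underlying forms with O in the non-initial syllable (for 40 these coincide with the starred forms), it leaves the first vowel of the word untouched, and it weakens any later O to o unless the immediately preceding vowel is high. The function name weaken is hypothetical.

```python
# Illustrative sketch: weakening of non-initial O, blocked after a high vowel.

VOWELS = set("iueoEOa")
HIGH = {"i", "u"}


def weaken(underlying):
    """Turn non-initial O into o unless the preceding vowel is i or u."""
    surface = []
    previous_vowel = None              # underlying vowel of the last syllable
    for ch in underlying:
        if ch in VOWELS:
            blocked = previous_vowel in HIGH        # vowel disharmony
            initial = previous_vowel is None        # first vowel is protected
            if ch == "O" and not initial and not blocked:
                surface.append("o")
            else:
                surface.append(ch)
            previous_vowel = ch
        else:
            surface.append(ch)
    return "".join(surface)


print(weaken("bastOb"))    # bastob
print(weaken("OnOnto"))    # Ononto
print(weaken("utSOb"))     # utSOb
print(weaken("digOnto"))   # digOnto
```

The sketch covers only the forms of 39 and 40; it takes no account of word internal boundaries.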

In fact, the following pairs of Bangla words too show that the two opposite directional processes, viz. vowel disharmony and vowel harmony, are current in Bangla.

41. pitOl vs. petol 'brass'
SikOl vs. Sekol 'chain'
SikOR vs. SekoR 'root'

One member of each pair, the one with a high vowel in the preceding syllable, shows vowel disharmony, while the other member shows some sort of mutual vowel harmony.

One notable point that distinguishes our vowel disharmony from Sarkar's progressive vowel harmony is that the process of vowel disharmony applies to underlying O and stops it from being raised to o; whereas the process of progressive vowel harmony applies to underlying a and raises it to o.

2.6.2. O-o alternation

As is mentioned in the preceding section the present section examines the distribution of surface O and o in different environments.

Section 2.6.1 has already shown the 7-way vowel distinction of Bangla phonology. Among these 7 at least two, viz. E and O occur freely as the syllable nucleus of the word initial syllables and of monosyllabic words. But in other environments in a word the frequency of occurrence of E and O is remarkably low, compared to that of e and o.

Of these two pairs, viz. O-o and E-e, it is the O-o pair which calls for investigation in the present context, as it has been suggested that O becomes o in prosodically weak syllables.

In word final position, except for a few monosyllabic words, O is simply unacceptable according to the phonotactic rules of the language, a fact that I shall use crucially in section 2.6.3. In medial positions, both O and o are available but with a very low frequency of occurrence of O. For my present purpose I shall differentiate among these three environments and study the factors conditioning the occurrence of O-o in these three environments, viz. in word initial, word medial and word final syllables.

The basic distribution of O is as follows:

In word initial syllables O shows both contrasting and complementary distribution with o. For example,

42. ghORa vs. ghoRa 'big water pot' vs. 'horse'
gOla vs. gola 'throat' vs. 'to stir'
dOla vs. dola 'lump' vs. 'swing' etc.

In the forms of 42, O contrasts with o in word initial syllables.

The examples of 35, 36, and 37 of section 2.6.1, i.e. vowel harmony, show that in word initial syllables O is quite often in complementary distribution with o too.

Word finally O and o are neutralized into o.

Word medially O is not so frequent as o. The related forms of the pair bhugol 'geography' and gol 'round' show that an underlying o may surface as o word medially.

A set like Onto 'end', Ononto 'endless', and digOnto 'horizon' reveals that word medially an underlying O may surface either as o or as O under specific conditions. The phonological process of vowel harmony (of section 2.6.1) occurs even beyond the word initial syllables and raises underlying O to o, e.g.

43. binOY vs. binoYi 'modesty' vs. 'modest'
ObonOto vs. Obonoti 'lowered' vs. 'decline'

Though word medially both o and O are available underlyingly, the underlying O surfaces mostly as o. The two consequences of this phenomenon are as follows:

Firstly, the word medial occurrence of surface O is very scanty, and secondly, very clean contrast between word medial surface O and o is rarely available.

In other words, word medially O and o are mostly in complementary distribution.

Since the main concern of the present section is the word medial distribution of O, I wish to exclude here the cases of underlying o surfacing as o word medially and to consider the cases of underlying O surfacing as o as well as O. In other words, instances of o that contrast with O are excluded, whereas instances of o that are in complementary distribution with O are considered here.

The distribution of O in all the three positions, word initially, word medially, and word finally, that is considered here is as follows:

Word initially a 7-way distinction of Bangla vowels is considered here, whereas word medially and word finally a less than 7-way distinction is considered. Word medially, except for the cases of underlying o, O is in complementary distribution with o; and word finally also, except for a few monosyllabic content words, O is neutralized with o.

In brief, the distribution of O and o that I wish to account for here is as follows:

a) Word initially Bangla shows both O and o. Some cases of word initial o are underlyingly simply o; while some other cases, viz. the output of the rule of vowel harmony, are underlyingly O.

b) Word medially both O and o are underlyingly O.

c) Word final o too is underlyingly O.

Though the complementary distribution of O and o in word medial syllables is often explained in terms of a following high vowel trigger, as is shown in section 2.6.1, and also in the examples of 43, there remains the unexplained bulk of exceptions to the process of vowel harmony, e.g.

44. OkkhOm 'unable'
OtOl 'bottomless'
OnurOkto 'fond of'
OnobOroto 'continuously' etc.

I wish to account for the O of the word medial syllables of 44 in terms of prosodic strength and weakness.

I assume that the predictable phonetic stress of Bangla has a strong bearing on these occurrences of O-o and claim that word medially O becomes o in prosodically weak syllables.

The predictable phonetic stress in Bangla always occurs word initially, i.e. after a word boundary. According to Chatterji (1928: 23),

"the dominant initial stress of SCB has given rise on a large scale to Loss of Vowels in the interior of words, to Vowel Modification."

In other words, the vowels of the word initial syllables generally have the predictable stress protecting them. For example,

45. pagol vs. pagli 'mad (male)' vs. 'mad (female)' instead of *pagoli
Sobuj vs. Sobje 'green' vs. 'greenish' instead of *Sobuje
holud vs. holde 'yellow' vs. 'yellowish' instead of *holude

But SOt vs. Soti 'honest' vs. 'chaste' instead of *sti etc.

Except for the fourth one, in all the other examples the vowels of the unstressed syllables are lost; whereas in the fourth one the vowel of the stressed syllable, i.e. the initial syllable is not lost. Forms of the fourth type with permissible sequences, however, are not very frequent in the language.

However, Bangla is one of those few New Indo-Aryan languages that lack a schwa in their phonological inventory; instead Bangla has O and o. Schwa in quite a few NIA languages represents the unstressed vowel of the language. Hence it may not be unjustified to assume that in Bangla, word medially, O is the stressed counterpart of o. I do not see any negative effect of such an assumption; rather, I feel it is capable of throwing light on the word formation processes of Bangla. Moreover, it apparently conforms to Chatterji's (1928: 23) Vowel-Modification in unstressed syllables, of which he does not give any example.

Even Tagore (1395 B.S., i.e. 1988: 607) mentions this O member of the O-o alternation pair as the short o, and a non-stressed vowel is indeed a short one in terms of phonetic length.

However, in this chapter I shall proceed along with the assumption that in word medial positions O surfaces as O when stressed; and as o elsewhere.

In brief, the phonological factors that condition the O to surface as o in different positions in a word may be shown as follows:

O → o

                      word initially   word medially   word finally
Vowel harmony               ✓                ✓               ✗
Prosodic weakening          ✗                ✓               ✗
Phonotactic rule            ✗                ✗               ✓

2.6.3. Formal and informal styles of speech

Before handling the actual data, I wish to comment briefly on the stylistic variations of pronunciation of Bangla.

In terms of the degree of artificial pronunciation three distinct styles could be observed in the language, viz. the colloquial style, the reading style, and the style of recitation and song. Among these three the first one shows least artificial pronunciation, while the last one shows most, and the second one stands somewhere between these two, e.g.

46. 1st 2nd 3rd
aboron aborOn abOrOn 'cover'
aradhona aradhona/ aradhOna 'prayer'
astabol astabOl astabOl 'stable'
mon mon mOn 'mind' etc.

Broadly speaking, Kolkata Standard Bangla comprises all these three styles though the phonological processes of each visibly differ from those of the other.

In fact, the style for recitation and song, especially of those poems and songs composed by Tagore, forms a major part of Bengali cultural life. The phonetics of this should be treated under Malmberg's (1963) normative phonetics.

However, keeping aside styles two and three, here I shall concentrate mainly on the colloquial style.

Consider the following examples with word initial O.

47. OkkhOY 'non-decaying'
48. OkkhOm 'unable'
49. OtOl 'bottomless'
50. OSObOrno 'not of the same caste'
51. akOrno 'up to the ear'
52. agOto 'come'
53. OkoSSaMt 'suddenly'
54. OSTom 'eighth'

In 47 to 50 there are (we shall conjecture) word boundaries between the negative prefix O and the morpheme following it. These morphemes are, as usual, initially stressed and so O, rather than o, occurs at these stressed positions. 50 has one more reading as OsobOrno, which will be accounted for later on.

According to the above assumption 47 and 48 would look like 55 and 56 respectively (I shall indicate word boundary by ::):

55. ::O::kkhOY::
56. ::O::kkhOm::

49 too could be taken care of in a similar way.

The stress rule could be formulated as:

R-15 The syllable immediately following a word boundary is a stressed syllable.

R-15 accounts for 50 and 51 as follows:

57. ::O::SO::bOrno::
58. ::a::kOrno::

53 and 54 are accounted for by R-15 as:

59. ::OkoSSaMt::
60. ::OSTom::

R-15 is justified by the following set of compound words where the word internal word boundaries have an intuitively stronger existence than they have in the above examples.

61. ::ain::SON::gOto:: 'according to law'
::lok::bOl:: 'population-strength'

Words with adpositions and classifiers also speak in favour of R-15:

62. ::SOntoS::jOnok:: 'satisfactory'
::jOno::gOn:: 'people'

The assumption that the stressed position is always immediately preceded by a word boundary, and O cannot occur unless that position is a stressed position could be justified even by the monosyllabic words as follows:

63. cO 'let's go'
thO 'shocked'

64. to 'particle'
go 'particle'

Examples of 64 are the function words of the language which do not bear stress; hence they do not have O; whereas those of 63, i.e. the words containing O, are stressed content words of the language.

One notable point here is that the examples of 63 seem to violate the phonotactic rules of the language. But if we bear in mind that monosyllabic words often behave differently from words of other configurations, then those forms may no longer appear problematic.

Now let us consider the forms that stand as counterexamples to

65. OSohaY / OSOhaY 'helpless'
OSoNkho / OSONkho 'numberless'

The forms of 65 have two realizations: one with a stressed O following the second word boundary, which can be accounted for in terms of R-15; and the other without this stressed O, which according to my analysis lacks the word internal word boundary. One notable point is that both the forms here are tatsama items. For another case where such free variation between realizations of tatsama lexical items exists, see Dasgupta (1985: 45).

In the case of 65, the realizations with stressed O suggest a conscious handling of the negative derivation of the items concerned; whereas in the case of the realizations with o the speakers have neutralized the items to such an extent that the negative derivation has lost its transparency; hence the word boundary too does not exist. Thus 65 may have two simultaneous structures as follows:

66. ::O::SOhaY:: / ::OSohaY::
::O::SONkho:: / ::OSoNkho::

Now let me consider the following sets of data:

67. SOSTho 'sixth'
SoSThi 'the sixth day of the lunar month'

68. Ogonito 'countless'
Orthokori 'lucrative'
Obodomito 'suppressed'

69. OdriSSo 'invisible'
Olikhito 'non-written'

In 67, comparison with the first item shows that the second item is a case of vowel harmony.

In 68 the vowel positions preceding the conditioning vowel i are stressed positions as they are preceded by word boundaries. But here the following vowel i, in terms of vowel harmony, pulls up the height of the preceding vowel and we get o, rather than O.

69 shows that the rule of vertical vowel harmony, which henceforth I shall refer to as R-16, does not operate across a word boundary. The full formulation of R-16, as it applies to different grammatical categories in Bangla, is beyond the scope of this chapter. However, a portion of this rule, viz. as it applies to verb morphology, will be formulated in chapter 3.

I shall assume that in 67 and 68 there may be some boundary between the conditioned and conditioning vowels of R-16 which is weaker than a word boundary. Dasgupta (1984: 5) too shows that in Bangla morphology the rule of vowel harmony operates across a morpheme boundary but not across a word boundary.
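The behaviour illustrated in 67 to 69 can be sketched in code. This is not the formulation of R-16, which is left open above; it is only an illustration restricted to the pattern seen in these examples, namely that O surfaces as o when the next vowel within the same ::-delimited word is i. The function name raise_before_i, the vowel set VOWELS, and the pre-harmony input forms with O (such as gOnito) are assumptions made for the sake of the example.

```python
# Assumed vowel symbols of the transcription (illustrative only).
VOWELS = set("aeiouOE")

def raise_before_i(form):
    """Raise O to o when the next vowel in the same '::'-delimited word is i.

    Word boundaries ('::') block the process; any weaker boundary written
    with a single ':' is treated as transparent.
    """
    out_words = []
    for word in form.split("::"):
        chars = list(word)
        vowel_positions = [i for i, ch in enumerate(chars) if ch in VOWELS]
        # Compare each vowel with the vowel of the following syllable.
        for here, nxt in zip(vowel_positions, vowel_positions[1:]):
            if chars[here] == "O" and chars[nxt] == "i":
                chars[here] = "o"
        out_words.append("".join(chars))
    return "::".join(out_words)
```

With the hypothetical input ::O::gOnito:: the sketch returns ::O::gonito::, in which the prefixal O survives because it is separated from the rest by a word boundary; with ::O::driSSo:: nothing changes, in line with 69.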

70. bilOmbo 'delay'
sriNkhOla 'discipline'
uttOm 'best'
urbOr 'fertile'
biSOY 'matter'

In the examples of 70 the process of vowel disharmony operates. Henceforth I shall refer to the rule behind the process of vowel disharmony as R-17, the approximate form of which is as follows:

R-17. A preceding high vowel does not pull the following vowel one notch upward.

Moreover, I shall assume that, like R-16, R-17 too does not apply across a word boundary; rather, it applies across a morpheme boundary. Such an assumption is also justified by the examples of 70, which do not show any kind of intuitive word boundary in between.

Now I shall consider an interesting pair 71 and 72:

71. nirOkto 'bloodless'
72. birokto 'annoyed'

71 seems to be a potential output of both the rules R-15, and R-17. If I assume 71 to be the output of R-15, then the input structure of 71 to R-15 is 73:

73. ::ni::rOkto::

whereas with respect to R-17, the input structure of 71 is 74 (by : I shall mean some sort of weaker boundary, possibly the morpheme boundary):

74. ::ni:rOkto::

However, the existence of the frequently used lexical item rOkto 'blood' in Bangla justifies the assumption that 71 is the output of R-15, rather than of R-17; and the input structure of 71 is 73 rather than 74.

72 does not agree with the above analysis. R-15 does not apply to 72 as here the stressed O of the rOkto has become unstressed o. Hence we can assume that there is no word boundary between bi and rokto, which can protect the stressed position of rOkto. Nor does R-17 apply to 72. R-17, as is shown in 70, applies across morpheme boundaries. Hence the non-application of R-17 here suggests that in 72 between bi and rokto there is neither word boundary nor morpheme boundary. But some sort of boundary is there which stops the application of R-17, and thus which is obviously weaker than a word boundary and stronger than a morpheme boundary. In other words, in cases like 72, I assume some sort of intermediate boundary which in terms of opacity stands somewhere in between the word boundary and morpheme boundary.

Such an assumption of different types of boundary of different strength agrees with the mechanisms provided in SPE (1968: 364).

Even the very nature of the item of 72 justifies this assumption. Both 71 and 72 share a similar derivational process. But compared to 71, 72 is no doubt a more frequent item in Bangla. As a result, speakers are no longer conscious of its derivational structure; rather, they use it as a derivationally opaque form. In other words, the sublexical structure of this form has become fossilized, and thus the word boundary between bi and rokto no longer exists in the conscious mental processes of the speakers. However, some remnant of the word boundary still exists which stops R-17 from applying, and hence I claim such a boundary to be weaker than a word boundary and stronger than a morpheme boundary.

The fossilized sublexical structure of 72 could be realized in terms of the items of 75:

75. birokto 'annoyed'
birag 'non-favour'
onurOkto 'fond of'
onurag 'fondness'

Though these four items apparently form a morphemic square, to the speakers of Bangla the relationship between onurag and onurOkto is more transparent than that between birag and birokto.

However, now let us consider the following examples, which too have fossilized sublexical structures and which are problematic for all the three rules, viz. R-15, R-16, and R-17.

76. a) OnobOroto 'continuously'
b) OnorgOl 'non-stop'
c) OnoSOn 'starvation'

To a Bangla speaker the forms of 76 are derivationally opaque, yet the plausible and partial sublexical structures of 76a, b, and c are 77a, b, and c respectively:

77. a) ::On::Obo::rOto::
b) ::On::Orgol::
c) ::On::OSon::

The discrepancy between 76 and 77, i.e. the lexical items and their sublexical structures, is precisely because of the shift of stress. The initial stress of the second items of 77a, b, and c should have been reflected in the second syllables of the examples of 76, but the stress has been shifted to the third syllables.

In other words, R-15 did not apply here, as according to it the second syllables in 76 should have been stressed. R-16 too does not apply here, as there is no visible segmental conditioning factor. Nor does R-17 apply here, as there are no high vowels in the second syllables, preceding the Os of the third syllables.

Then how do we account for the forms of 76?

In fact, the forms of 76 may be accounted for in terms of the theory of metrical grid as discussed in Hogg & McCully (1987: 130).

The phonological theory of the metrical grid was developed mainly during the first half of the 1980s, in order to handle the rhythmic structure of stress-timed languages, especially English, by Liberman & Prince (1977), Prince (1983), Selkirk (1984b), Hayes (1983, 1984), Giegerich (1985) and others.

Metrical grid indicates metrical levels projecting metrical strength in terms of relative stress-strength at different prosodic levels, like syllable, stress-foot, word, phrase etc., in accordance with the principle 'strong is stronger than weak'. For example, the stress pattern of the phrase thirteen men is derived as follows:

78.           x    L3 / word or phrase level
         x    x    L2 / stress-foot level
    x    x    x    L1 / s-level
    thir-teen men
    w    s    s
     \  /

In terms of the stress-strength shown in the tree diagram underneath, each terminal node of the tree is assigned a position on the metrical grid, i.e. the levels shown above. L1 simply aligns the terminal nodes of the tree one-to-one with the grid positions. The 'strong' syllables (terminally s) of the tree are defined on L2 as having more metrical strength than the 'weak' (terminally w) syllables; therefore -teen and men are grid-marked on L2, whereas thir-, being a weak syllable, is not. On L3 men gets a grid mark, as it is defined on the tree as having more strength than -teen; this is indicated by the solitary topmost x in 78.

Now in terms of the three basic concepts of the theory of metrical grid, viz. adjacent, alternating, and clashing, the stress-shift of the phrase thirteen men is accounted for in 79.

Liberman & Prince (1977: 314) define these three basic concepts as follows:
"Elements are metrically adjacent if they are on the same level and no other elements of that level intervene between them; adjacent elements are metrically alternating if, in the next lower level, the elements corresponding to them (if any) are not adjacent; adjacent elements are metrically clashing if their counterparts on the level down are adjacent."

79. a) L3           x    b)             x
       L2      x----x     →   x         x
       L1 x    x    x         x    x    x
          thir-teen men       thir-teen men
         (w)  (s)   s         s    w    s
           \  /                \  /
            w                   w

In 79(a) (cf. 78) the grid marks at L2 are clashing, which triggers the movement rule called Iambic Reversal. This rule reverses the circled nodes of the tree of 79(a) (shown here in parentheses) and results in 79(b).
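The definitions of adjacent, alternating, and clashing quoted above from Liberman & Prince (1977) lend themselves to a small computational illustration. The sketch below is only one possible encoding, offered under assumptions of my own: the grid is represented as a list of column heights (the number of grid marks stacked over each syllable), and the function name classify_pairs is hypothetical.

```python
def classify_pairs(heights, level):
    """Classify each pair of adjacent grid marks on `level`.

    heights[i] is the number of grid marks over syllable i (L1 counts as 1).
    Two marks on `level` are adjacent if no other mark of that level lies
    between them; they are clashing if their counterparts one level down
    are also adjacent, and alternating otherwise.
    """
    if level < 2:
        raise ValueError("a lower level is needed for the comparison")
    on_level = [i for i, h in enumerate(heights) if h >= level]
    on_lower = [i for i, h in enumerate(heights) if h >= level - 1]
    result = []
    for left, right in zip(on_level, on_level[1:]):
        intervening = [i for i in on_lower if left < i < right]
        label = "alternating" if intervening else "clashing"
        result.append((left, right, label))
    return result
```

With syllables numbered from 0, the grid of 79(a) corresponds to the heights [1, 2, 3] and that of 79(b) to [2, 1, 3]. On L2 the sketch labels the pair (1, 2) of 79(a) as clashing and the pair (0, 2) of 79(b) as alternating, which matches the description of Iambic Reversal as a repair of the clash.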

However, among the phonologists there are differences of opinion regarding the status of the metrical grid in a theory of rhythm phonology. For example, Prince (1983) and Selkirk (1984b) advocate a 'grid-only metrical phonology' as they believe that the prominence relations and rhythmic structures are better captured in grids alone; and that therefore tree structure is unnecessary.

A second view was held by Hayes (1983, 1984), who argues for both trees and grids in a theory of metrical phonology, as he believes that these two tools play two sharply distinct roles, trees being related to stress and grids to the rhythmic structures of the language. Hence both trees and grids are necessary.

A third view, held by Giegerich (1985), advocates a 'tree-only metrical phonology' as he believes that phonological constituency and prominence relations are best expressed in binary branching metrical trees.

However, instead of going into these major controversies I shall attempt to account for the forms of 76 with the help of the three concepts, viz. adjacent, alternating, and clashing, of the theory of metrical grid.

The sublexical structures of 76a, b, and c, as are shown in 77a, b, and c, according to R-15, have the stress positions as 80a, b and c. In the absence of a theory of stress-tree formulation for the so-called syllable-timed languages here I work in terms of partial trees, showing only stressed positions.

80. a) ::On::Obo::rOto::
         |   |     |
         s   s     s

    b) ::On::Orgol::
         |   |
         s   s

    c) ::On::OSon::
         |   |
         s   s

I assume that at some temporal point the process of fossilization of the sublexical structures of the items of 80a, b, and c has either deleted or weakened the intraword word boundary therein.

Now in the absence of the internal word boundary the assumed metrical grid of 80a, b, and c will be 81a, b, and c respectively:

81. a) Lx   x-x   x         x   x-x         x   x   x
       L1   x x x x x       x x x x x       x x x x x
          ::OnOborOto:: → ::OnobOrOto:: → ::OnobOrotO::
            | |   |         |   | |         |   |   |
            s s   s         s   s s         s   s   s

          → phonotactic rule → OnobOroto

    b) Lx   x-x           x    x
       L1   x x  x        x x  x
          ::OnOrgol:: → ::OnorgOl::
            | |           |    |
            s s           s    s

    c) Lx   x-x          x   x
       L1   x x x        x x x
          ::OnOSon:: → ::OnoSOn::
            | |          |   |
            s s          s   s

where L1 is the s-level and Lx is higher than L1 but no higher than word-level. On Lx clashing grid marks (indicated by dotted lines) triggered the movement rule, viz. R-18, which shifts the stress of the second syllable to the third one.

The approximate formulation of R-18 is as follows:

R-18. Move the non-initial clashing stress to the next syllable.

R-18 can even account for the form OSobOrno, the alternative reading of 50. With the fossilization of the sublexical structure of the said item the internal word boundaries therein are either deleted or weakened. Thus the metrical grid of OSobOrno is 82:

82. a) Lx   x-x-x     b)   x   x
       L1   x x x  x       x x x  x
          ::OSObOrno:: → ::OSobOrno::
            | | |          |   |
            s s s          s   s

In 82(a) there is no place for the rightmost non-initial clashing stress to shift to, so we start working with the penultimate non-initial clashing stress. R-18 applies to it and shifts the stress of the second syllable to the third one, where it no longer clashes but rather coincides with the rightmost clashing stress, as is shown in 82(b).
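The derivations in 81 and 82 can be traced step by step with a short program. What follows is a minimal sketch of R-18 as it is applied in those derivations, not a definitive implementation. Several details are assumptions of mine: stresses are given as 0-based syllable indices, the rightmost movable clashing stress is tried first, a stress that lands on an already stressed syllable simply merges with it, and the parameter blocked is a hypothetical device for stating by hand that a syllable offers no landing site (as is said of the last syllable in 82). The final phonotactic adjustment of 81a is not modelled.

```python
def apply_r18(stressed, n_syllables, blocked=()):
    """Repeatedly move a non-initial clashing stress one syllable rightwards.

    stressed    : iterable of 0-based indices of stressed syllables
    n_syllables : number of syllables in the word
    blocked     : syllables assumed to be unavailable as landing sites
    Returns the sorted list of stressed syllables once no move is possible.
    """
    stressed = set(stressed)
    while True:
        # Non-initial stresses that have a stressed neighbour on either side.
        clashing = sorted(
            i for i in stressed
            if i > 0 and ((i - 1) in stressed or (i + 1) in stressed)
        )
        for i in reversed(clashing):      # try the rightmost one first
            target = i + 1
            if target < n_syllables and target not in blocked:
                stressed.remove(i)
                stressed.add(target)      # merges if target is already stressed
                break
        else:
            # No clashing stress could be moved (or none is left).
            return sorted(stressed)
```

For 81a the call apply_r18({0, 1, 3}, 5) passes through {0, 2, 3} and ends at [0, 2, 4], mirroring the three stages OnOborOto, OnobOrOto, OnobOrotO. For 82 the call apply_r18({0, 1, 2}, 4, blocked={3}) ends at [0, 2], i.e. OSobOrno. The blocked argument has to be supplied for 82; the sketch does not derive why the last syllable is unavailable there.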

The theory of metrical grid as it applies to Bangla agrees with a long-attested tendency of Bangla pronunciation. This tendency has been described in different terms by different linguists. For example, Chatterji (1921) was the first to attest this tendency, and he preferred to term it bimorism and dimetrism; Sen (1968) preferred to call it bimorism or bisyllabism, though his description differs a bit from that of Chatterji (1921); whereas Sarkar (1982-83) prefers to call it second vowel loss.

Sarkar (1982-83: 85) cites from Chatterji (1921):
"Isolated words tend to take up a standardized time-beat or mora. A normal Bengali word takes two time-beats, or units of time, or morae ... polysyllables are cut short or divided into groups of syllables which take each the normalized length of time", e.g. Opra-jita for OpOrajita 'a flower' etc.

According to Sen (1968: 218):
"However long a word may be, according to the current tendency of Bangla colloquial pronunciation of West Bengal it will be divided into disyllabic, i.e. dimoric divisions", e.g. Op-ra + ji-ta for O-po-ra-ji-ta etc.

According to Sarkar (1982-83: 92):

Firstly, the specific process behind the bimorism or dimetrism of Chatterji (1921) and bimorism or bisyllabism of Sen (1968) is the loss of second vowel in a word under specific conditions. In other words, the loss of second vowel is the process while bimorism / dimetrism / bisyllabism is the effect of that very process.

Secondly, if this particular 'tendency' of Bangla pronunciation is termed bimorism / dimetrism / bisyllabism, then this process will have a good number of counterexamples. But the principle of the loss of second vowel applies more viably and thus produces fewer counterexamples than the process of bimorism / dimetrism / bisyllabism.

Though Sarkar (1982-83) is apparently more convincing than Chatterji (1921) and Sen (1968), I do not currently wish to participate in this particular debate. Rather, the limited points that I wish to make here are:

1. None of these proposed processes, viz. bimorism, dimetrism, bisyllabism and loss of second vowel, protects the second vowel of a word.
2. Stressed vowels are not lost.
3. In terms of the principles of the theory of metrical grid, accounting for the prominence relations and rhythmic structures of Bangla, the stress of the second vowel is shifted to the third one.

In brief, the effects of the two proposals, viz. bimorism / dimetrism / bisyllabism / loss of second vowel on the one hand, and the metrical grid on the other, are in agreement with each other.

2.7. Conclusion

Bangla syllables and their quantity and prominence have been dealt with in the present chapter.

The rules of syllabification, types of syllables, their phonetic length and phonological quantity have been discussed, to a large extent, as they are available in the literature, both phonetic and phonological, and to a small extent, in terms of the tools of metrical phonology. Syllable prominence, by contrast, has been dealt with entirely in accordance with the tools of metrical phonology since, except for a few scattered lines here and there in the literature, hardly any treatment of the predictable stress of Bangla is available.

The present metrical treatment of Bangla word stress is quite tentative and partial in nature, but it is at least consistent with the claims made in Hayes and Lahiri (1991), henceforth HL, the first and, as far as we know, the only attempt at formally characterizing intonation in Bangla.

HL (1991) deal with a higher level phonology, viz. the phrasal phonology. In that course they deal with Bangla stress, viz. word stress and phrasal stress, as according to HL (1991: 55) "since the docking sites for pitch accents are stressed syllables, the first task in an intonation analysis is to determine where stress is located".

HL (1991) follow the simple word stress rule of Chatterji (1921) and Klaiman (1987) to build up their intonational theory. The rule says "stress the initial syllable of a word" (HL 1991: 55). In our work we accept this rule and we further investigate other minute stress rules depending on evidence from segmental phonology. Thus we formulate a theory of Bangla word stress in a much more specific fashion than the existing ones.

Moreover, our formulation apparently feeds this higher level intonational phonology of HL (1991) without any problem as follows:

Firstly, in both the formulations, viz. the intonational phrasal phonology of HL and the word stress phonology of ours, there exists no contradictory assumption, either in the rule sector or in the bulk of examples.
Secondly, both the formulations apparently depend on one basic principle of metrical phonology, viz. 'strong is stronger than weak', as follows:

The present formulation aims to account for the rhythmic structure of Bangla word stress in terms of metrical grid. The concept of rhythm entails the concept of stress undulations in a word. Even the partial tree structures used for capturing the rhythmic word stress directly imply the existence of more than one stress level with different strength.

The treatment of HL (1991: 56), though it imposes a caveat, viz. "stress in Bengali is usually quite weak phonetically, sometimes to the point of being almost inaudible", also hints at more than one stress level with different strength in Bangla as follows:

HL (1991: 55) "Compound words also have their strongest stress on their initial syllables. There may also be a weaker stress on the initial syllable of their second member, but this is hard to hear."

HL (1991: 79) "In general, heads, which are weakly stressed, have smaller pitch excursions than the main-stressed nuclei."

Apart from the length, quantity, and prominence the other aspects of the syllable unit of Bangla will be dealt with in chapter 5 in a framework of metrical phonology.


1. See Sarkar (1985-86).

2. Words consisting of CC(C) clusters belong to the borrowed level of Bangla vocabulary (Sarkar, 1986).

3. Only a few borrowed elements show final CC sequences. On the whole SCB phonology tends to avoid more than one consonant word finally, even in the case of borrowed items (Sarkar, 1986).

4. This rule of gemination applies only to the level of tatsama vocabulary as is mentioned in rule 7. In case of typical Bangla words such a rule does not apply, e.g. SaMt-ra 'a surname', Sap-la 'a type of water weed'.

5. The only exception to this observation is the sequence NSk, to which the application of rule 9 produces quite counterintuitive results. For example, SON-Skar 'renovation' is less acceptable than SONS-kar.

The syllabification of this particular tatsama sequence, viz. NSk may be justified, firstly, on the basis of its simplified 'epenthesized' counterpart in Bangla, e.g., the simplified epenthesized Bangla version of SONSkar is SONoSkar, which is syllabified as SO-NoS-kar in accordance with rule 7.

Secondly, the word initial evidence shows that the sibilant+k cluster becomes sk in Bangla, rather than Sk, e.g. skOndho 'shoulder'. Hence it is reasonable to count S as part of the preceding syllable and treat Sk as a sequence, rather than a cluster.

However, since this particular controversy is not directly relevant to the concept of heavy/light syllables in Bangla (cf. 2.3), I leave the problem open.

6. I prefer to call such a syllable-final but not word-final sequence a 'closer-combination-than-sequence' because it differs, on the one hand, from clusters (which may be considered the closest combination), since unlike clusters it has no initial or onset occurrence; and, on the other hand, from word-final CC sequences, since, unlike the second member of a word-final CC sequence, the second member of this CC sequence has a choice between the preceding and the following peak to join. This amounts to saying that the second member of the syllable-final but not word-final CC has become part of the coda of the preceding peak by choice rather than by force.