Experimental phonology
This article shows that the use of experimental methods makes it possible to formulate hypotheses about the phonological category and its primitive, as well as about the way speakers control their articulators. The aim is to demonstrate that phonological problems and hypotheses can be formulated and tested through the experimental method. Falsifiable hypotheses are part of the never-ending progress of the scientific endeavor, of which the study of language and phonology are undeniable parts.
A century after Rousselot’s publication of ‘Principes de phonétique expérimentale’ (1904) the experimental method is finally taking its appropriate place in linguistics. Experimental or laboratory phonologies (OHALA & JAEGER, 1986; KINGSTON & BECKMAN, 1990; DOCHERTY & LADD, 1992; KEATING, 1995; CONNELL & ARVANITI, 1995; BROE & PIERREHUMBERT, 2000; GUSSENHOVEN & WARNER, 2002; LOCAL, OGDEN, & TEMPLE, 2003; GOLDSTEIN, BEST, & WHALEN, 2005; COLE & HUALDE, 2007; FOUGERON, KÜHNERT, D’IMPERIO, & VALLÉE, 2010) are now well established and are gradually becoming dominant in the field. A new journal, ‘Laboratory Phonology,’ has been founded to promote this new paradigm.
Fundamental issues such as the systematic and quantified description of sound systems and sound phenomena are now evaluated differently than when phonetics and phonology were considered separate by the structuralist and generativist frameworks (e.g. TRUBETZKOY, 1939; CHOMSKY & HALLE, 1968). The search for adequate primitives, the types of evidence considered, the nature of explanation, the nature of phonological representations, and the types of experimental paradigms used in phonological research are also central issues in Laboratory Phonology. Rousselot expressed similar concerns in his various publications (1891, 1904, 1923). The ‘Leçon d’ouverture au Collège de France’ (1923) is probably the best synthesis of his ideas and shows that the founder of experimental phonetics had anticipated much of what is now becoming routine in linguistics. Two thirds of a century later he was followed by OHALA (1987), who argued for the establishment of phonology as an experimental discipline. Ohala’s first statement was expressed as a reaction “…to escape the endless and agonizing cycle of birth and death of trendy theories, schools, frameworks, etc. and achieve oneness with the spirit and principles that guide all scientific endeavor”. COHN (2010) calls for integrated theoretical models in laboratory phonology. CROOT (2010) suggests that some findings are becoming central to the emergence of a paradigm in laboratory phonology. One is the occurrence of linguistic categories identified and analyzed using verbal/symbolic categories. Another is the gradience that appears at all levels of analysis, reflecting the probabilistic nature of sound structures (PIERREHUMBERT, 2001).
Most phonologists would likely accept that phonology studies the logical, functional and behavioral aspects of speech sounds. Such studies require the categorization of sounds or features, and imply mental representations and other cognitive aspects of speech sounds. Phonology is thus concerned with the description and the comparison of the sound systems of human languages. The discipline also aspires to a set of explanatory first principles whereby the sound phenomena found in languages may be understood. Like any scientific endeavor, the discipline is characterized by questions that researchers are trying to answer. Even if the following list is not exhaustive, most phonologists would probably consider these questions as part of their research activities: How are acoustic features categorized? How do we explain the sources of sound change? How does speech perception influence sound change? What can we say about the direction of sound change? How are allophones controlled and categorized? Do we account for sounds better in terms of features or in terms of gestures? How can we account for articulatory control? What is the minimal distance between segments to be distinguished in perception? How can we account for the emergence of sound patterns in ontogeny and phylogeny? What are the correlates of syllables? Are typologies of any use to explain sound patterns? What are the best primitives? What kind of explanation is required by the observed phenomena? What are the constraints acting on phonetics and phonological processes? How do we explain universals? What are the universals? Obviously, to answer these questions, our knowledge of speech production and speech perception needs to be included in an integrated field of phonetics and phonology.
Between physics and cognition
The interaction between the physical and the cognitive aspects of speech sounds is emphasized by KINGSTON & BECKMAN (1990) in their introductory note to the first volume of Laboratory Phonology. The model of articulatory phonology (BROWMAN & GOLDSTEIN, 1989, 1992) promotes similar views in a different framework. Whatever the limits of articulatory phonology and whether or not one agrees with the model, it is difficult not to acknowledge that it is a serious attempt to integrate the domains of phonetics and phonology. Indeed, in articulatory phonology, phonological units are discrete gestures having both an abstract and a concrete (dynamic) side. This model of phonology takes into account time (the dynamic aspect of gestures) in phonology and allows consideration of processes such as assimilation and epenthesis, for example, as variations in the execution or phasing of gestures. HUME & JOHNSON (2001) also emphasize the role of perception in phonology. Their proposals on the interplay of speech perception and phonology enable the integration of the cognitive aspects of speech sounds in phonology, and they show how phonological systems influence speech perception, for example in that listeners are more adept at perceiving sounds of their native language than those of a second language. HUME & JOHNSON also show several influences of speech perception on phonological systems, including the failure to perceptually compensate for articulatory effects, the avoidance of weakly perceptible contrasts, and the avoidance of noticeable alternations. The influence of speech perception in phonology is particularly obvious on what they call phonological repair strategies that can either preserve contrasts (epenthesis, dissimilation and metathesis) or sacrifice contrasts (assimilation and deletion). 
What is important in HUME & JOHNSON’s model (2001: 20) is its emphasis on the fact that the interplay between speech perception and phonology must be defined in a way that includes the cognitive and formal representations of phonological systems.
1. Experimentation in phonology
1.1. A bit of history
Experimental methods and the theory of evolution, the two main pillars of contemporary science, have long been used to study speech. In this respect Rousselot’s work still provides an excellent example of the benefits of experimentation for the study of many aspects of speech and phonology, from the physics of sound to dialectology. However, the results and methods of experimental studies have not been adequately incorporated into the framework of mainstream phonology, perhaps for the reason Rousselot identified more than a century ago:
…the procedures of the experimental sciences are rather foreign to linguists. A kind of superstitious terror seizes them as soon as it comes to touching the simplest mechanism. It was therefore necessary…to give them a glimpse of the immense field that experimentation opens before them (1904: 1).
This still applies to generative phonology and several other contemporary approaches to phonology. ROUSSELOT (1923) stated a crucial point (that is still heard occasionally today) about the relation between science and linguistics and the status of experimentation:
Linguistics has been denied the title of science, on the grounds that it borrows its method from history, that it simply records facts without being able to reproduce them, and that it is consequently powerless to attain the certainty that the sciences proper provide (1923: 17).
Rousselot strongly questioned this position and promoted his own view of the science of language, of which phonetics (and therefore phonology in his view) was a part. Debating issues related to experimental phonetics, Rousselot advocated a program that any speech scientist can still adopt today:
It [experimental phonetics] asks the organism itself to reveal the physiological conditions involved; it isolates the active elements which, at a certain moment in evolution, were brought together, then seeks to recognize them in the treasury of human speech; finally, when it has been fortunate enough to find them in one and the same mouth, it brings them together; and then, as surely as if it were a chemical manipulation, it sees the expected phenomenon reproduced. That is the proper work of experimental phonetics (1923: 17).
The last part of this quotation shows that Rousselot clearly understood the necessity to be able to recombine elements of speech and to be able to reproduce them in laboratory conditions. This is similar to OHALA’s statement (1974) that one of the main goals of such an endeavor is to reproduce the phenomenon under investigation in controlled laboratory conditions. The intent of both men is that the experimental method should be used in phonology as it is in any other scientific discipline. The multiple dimensions involved, i.e. ranging across both the physical and cognitive dimensions of phonological systems, make the enterprise anything but trivial.
1.2. Why experiments?
The question of experimentation can be discussed in a way very similar to that evoked by Claude BERNARD (1865) when he established the principles of experimental medicine. For Bernard, it was much harder to carry out experimentation in medicine than in any other science, and precisely because of this experiments were indispensable. For BERNARD (1865: 2-3): “The more complex the science, the more essential it is, in fact, to establish a good experimental standard, so as to obtain comparable facts, free from sources of error”. The comparison with language and phonology is striking, and our own field may be at a stage comparable to that of medicine in Bernard’s time. No one will doubt that language is a very complex phenomenon and that, to understand the observed phenomena, multiple disciplines should be invoked. Many examples could be given to demonstrate that without combining physiology, acoustics, aerodynamics, and a variety of experimental paradigms treating perceptual and cognitive aspects of speech, it would be difficult to find any satisfactory explanations for the phenomena that we observe. The basis of experimentation lies in the fact that the world is not necessarily what it seems to be. In the world of speech this is sometimes expressed by saying that “The human ear does not perceive everything that is recorded by a machine. How does this affect the work of phonologists?” The answer is simply that the acoustic details or cues recorded by machines do not always prove relevant in the language, but neither do they always prove irrelevant, and in either case machines allow examination of the details that in fact occur. Indeed this was the starting point of Rousselot’s studies in his own dialect (ROUSSELOT, 1891). A good example of this is provided by the emergent bursts that can be observed in languages (see section 3 for more details). Most of the time they go unnoticed, but when they are noticed, they can explain the emergence of stops in those languages.
Another example is provided by clicks, which are made by all humans, but are found as phonemes only in one small language family (TRAILL, 1985). When clicks are phonologically relevant, it is important to be able to give an objective account of the phenomenon. Generative phonologists sometimes raise the question: “Did any machine ever change the work of phonologists?” The answer is, of course, yes. Just to take one obvious example, the sound spectrograph led to the recognition of formant transitions, VOT, and noise spectra, features that are essential to identifying place of articulation and to processing the categorical aspects of speech perception.
1.3. Phonology vs. Phonetics
Since the early days of structuralism there has been a tendency to consider phonetics as separate from the main core of language (this attitude has wrongly been attributed to Saussure, who was by training a Neo-grammarian and therefore aware of the importance of phonetic evidence to solve linguistic problems). This separation was stated explicitly by TRUBETZKOY (1939) who considered phonetics to be in the domain of the natural sciences and phonology as in the domain of linguistic studies. From the beginning this view was shared by generative phonology. For phonology to be an experimental discipline, in my view, phonetics and phonology must be integrated. This requires that phonologists derive fundamental units and processes deductively from independent premises anchored in physical and physiological realities. Issues such as the innateness of phonological features must be considered as working hypotheses. Specifically, the assumption that speakers’ knowledge is innate and part of their genetic endowment, an assumption common to generative phonologists (e.g. HALLE, 1990), has yet to be proven. Of course, no one challenges the assumption that humans have a genetic endowment accounting for some aspects of language. There is no question about the major role played by our biological inheritance determining our physical form and our behavior, but innateness in the sense of a specific link between genetic variation and some grammatical outcome has yet to be demonstrated (ELMAN et al., 1996: 372).
We must still understand the nature of the interaction between nature and nurture in linguistics. Substance-based works (i.e. founded on empirical data) of a phonological nature, such as (to cite just a few) MADDIESON (1984), LINDBLOM & MADDIESON (1988), VALLÉE (1994) and ROUSSET (2004), are fundamental to understanding generalizations about how phonological systems are shaped and distributed. Whatever the model of phonology adopted, phonological theory must be based, as it is in these works, on models that incorporate parameters coming from the sub-systems involved in speech communication. Among these are principles relating vocal tract shape and acoustic output, certain known aerodynamic principles, and finally certain of the principles governing our auditory extraction of information from the acoustic signal (OHALA, 1990).
In addition, feedback and control processes, such as those proposed by PERKELL (1981), MACNEILAGE (1981), and KINGSTON & DIEHL (1994) should be incorporated in such a theoretical framework. In sum, phonological theory must acknowledge and incorporate well-established facts from models of speech production and speech perception.
Within a scientific study of language, phonology without the phonetic dimension is an illusion. In the same way, phonetics without phonology brings nothing to the understanding of categories upon which language is built. About this relation OHALA (1990: 168) proposed the following:
My own view is that between phonology and phonetics, phonology is the super-ordinate discipline, not because it has accomplished more or is better developed – the opposite may be true – but simply because it looks at and seeks answers to a much broader range of phenomena involving speech behavior.
Phonetics is thus an inescapable component within phonology, while Ohala’s allusion leaves us to infer that phonology is still wanting in empirical, experimental paradigms for exploring the cognitive aspects of speech sounds. It would seem that the very rapid development of psycholinguistics and cognitive science offers phonologists a path toward such paradigms. Indeed, one can defend the view that there can be no interface between phonetics and phonology because the two domains must be integrated; that is, experimental models and theories must incorporate the abstract sides of speech such as representations and categorization.
1.4. Theories and models
Some fundamental points must be raised about models and theories. Considering phonetics and phonology as one domain assumes that models from speech production and speech perception offer a good basis for testing phonological hypotheses if phonological problems are formulated using physical primitives. Models are usually expressed in mathematical terms, to render explicit the relevant parameters involved in particular domains of the field under study, in this case speech. A reasonable definition of what constitutes a model is given by BENDER (2000): “a mathematical model is an abstract, simplified, mathematical construct related to a part of reality and created for a particular purpose”. This means that the use of models in phonology will not produce a global explanation of a system, but will instead help to formulate a particular problem, discard unimportant details and specify the interactions between the variables. Using a model can help to make predictions that can be checked against data, or even against common sense; using a model also allows the generation of simulations to compare with observed facts.
Phonological studies are essential for systematizing the data and for rendering explicit the observations made in various languages of the world. This is a time-consuming job, and there is no other way to accomplish it than the traditional methods of phonologists for describing the sound system of an unknown language. To confirm this, consider all the steps necessary to describe the sound system of an unknown, unwritten language. It requires the determination of the finite set of phonemes, the mapping of their distribution and phonetic variation, and in addition the detection and understanding of any phonological processes. Neither tools nor any machine can accomplish such tasks, and there is still no better method available to linguists than taking a piece of paper and a pencil to write down observations (i.e. start by making good, reliable phonetic transcriptions). Only when this is done can acoustics and other tools allow refinement of the description and the search for explanations of the observed phenomena. One of the best examples of this and of the cumulative nature of experimental work is provided by the study of clicks. In the first systematic descriptions of clicks, given by DOKE (1926) and BEACH (1938), the main tools used to explain the articulation of clicks were the kymograph and palatography. It is only much later, in the work developed by phoneticians such as TRAILL, that the acoustic, articulatory and aerodynamic aspects of clicks were fully understood. Traill’s work added deeper and more general explanations to Doke and Beach’s original descriptions, but the basic description of a click articulation remained unchanged.
1.5. Phonology in the laboratory
ROUSSELOT’s (1923) expectation that speech and language phenomena would ultimately be reproduced in the laboratory has eventually come true (e.g. OHALA, 1974; FOULKES, 1997). The recent development of sociophonetics and the integration of psycholinguistic paradigms into the phonetic and phonological components of language clearly go in the direction of the program he initiated a century ago. One of the major lessons from Rousselot’s work, one that other trends like generative phonology have failed to follow, is that whatever the linguistic phenomena to be explained, the linguist’s task includes developing the appropriate tools to find the correct explanation and the right theoretical framework. This implies the establishment of new methods of observation, the use of new tools, and the integration as appropriate of primitives established in other scientific disciplines.
A remark about the relation between laboratory work and spontaneous speech should be made at this point. It is sometimes said that laboratory work is only a reduction of what exists in the ‘real world’ and that essential points about the behavior of speech are missed by laboratory work. According to this view, there might be little in common between spontaneous speech and laboratory work. On the contrary, working in a laboratory setting allows control of the parameters involved in experiments; this is the essential point of the method and its main strength. There is in principle no essential difference between laboratory and spontaneous speech. The same principles apply to both. Understanding the difference between the two will eventually come from demonstrations of how the various parameters identified in the laboratory adapt to more natural conditions.
1.6. The experimental method
Discussing the experimental method in his ‘Principes de médecine expérimentale’, BERNARD (1942) made a distinction between two types of sciences: the observational sciences and the experimental sciences. From what has been said above and what is possible in modern laboratories, it is clear that phonology has shifted from an observational science towards an experimental science. Indeed, any phonological phenomenon, whether it involves sounds or processes, can be systematized by experimental methods. This permits quantitative descriptions, which can be used for statistical treatments to understand the data or an associated problem. Phonologists are thus able to make hypotheses about how sounds are produced and perceived or about how some particular process works. These can be tested in laboratories through various types of experiments. Rousselot and Ohala’s claims regarding phonology as an experimental discipline are therefore confirmed. There is however one point that has to be emphasized. Phonology at its core is about contrasts and categories in the sound system of a language, and it cannot be reduced to the biophysical aspects of speech sounds. The explanation of phonological phenomena therefore requires a cognitive dimension, which naturally renders the enterprise very complicated.
Phonologists now have to formulate hypotheses about the relation between the biophysical aspects and the cognitive aspects of speech in order to explain the phenomena they study. The question of the control that speakers have over the production and perception of sounds within a given phonological system is one such hypothesis. Of course phonologists do not make hypotheses from scratch. As in any other scientific discipline, hypotheses rest on a theoretical basis. They are made from the knowledge of the various components involved in speech. Physical laws in acoustics and aerodynamics provide a solid basis to formulate some such hypotheses. The story becomes more complicated, however, when cognitive dimensions are involved, since similar laws in that domain have not yet been established. However, it is important to note that phenomena like critical bands, masking and signal detection have cognitive dimensions. Probabilistic influences on acquisition, and anything invoking memory, also have law-like aspects that are squarely cognitive. None of this may yet be ripe for phonological application, but it will surely become so in the future. This is where the interplay between data and models becomes crucial. To conclude, we can say that phonology has now shifted from an observational science towards an experimental science. However, the complexity of the object, with its many dimensions (physical, biological, psychological, cognitive, and social), makes clear that experimentation in phonology is still in its infancy.
This section illustrates the use of different methods for describing phonetic phenomena and for clarifying problems linked to the establishment of phonological categories, processes and primitives. Methods discussed in this paper address acoustics, aerodynamics, electropalatography, and perceptual tests. The phenomena studied are: perception of vowels in Karitiana, prenasalized stops in Rwanda, and geminated consonants in Amharic. Each subsection presents a problem and shows how it can be processed with a specific method, rather than presenting data as if for a full paper about the subject. However, references will be given to papers giving a complete treatment of the problems discussed.
2.1. Perception experiments
Perceptual tests that check observations made from speech production and phonology are very useful and can be undertaken to verify how a phonological feature or category is processed. Many protocols are now available for this purpose. For example, simple tests have been proposed by HOMBERT & PUECH (1984) and DEMOLIN (1992) for use in the field. They were elaborated to explore how tones and vowels are perceived and to estimate how much phonetic variability is tolerated within a single phonological category. A perceptual test of Mangbetu vowels (DEMOLIN, 1992) showed that speakers show a great deal of variation between their production and perception; specifically, they perceive as acceptable a much greater range than they can produce. This difference is a potential source of sociophonetic variation and, ultimately, sound change.
2.1.2. Vowels in Karitiana
Karitiana, a language from the Arikem family, Tupi stock, spoken in the state of Rondonia in Brazil, shows interesting phenomena concerning vowels. Like several other languages of this linguistic stock, Karitiana has a vowel system with 5 vowel qualities (Figure 1) and shows the typological rarity of not having a high back vowel in its phonemic inventory. In order to check how Karitiana speakers perceive their vowels, and whether there was a compensatory effect for the absence of a high back vowel in the system, a perceptual experiment was performed with three subjects. This experiment was done with stimuli corresponding to short oral vowels.
2.1.3. Material and Method
A set of 53 synthetic stimuli covering the full F1-F2 vowel space was presented to three literate subjects. After training, the stimuli were presented 10 times in random order to the subjects. After listening to the stimuli, subjects had to point to one of five monosyllabic words containing one of the five Karitiana short oral vowels. Subjects pointed to an empty box when the stimulus did not correspond to any possible native vowel quality.
Vowels were considered to be correctly identified when they were recognized at least 90% of the time. Results of this test show that the subjects were able to identify the vowel qualities corresponding to Karitiana vowels among the stimuli presented. The areas in the F1/F2 space where these vowels were identified correspond to those observed in production, as shown in Figure 2. The main difference between the three subjects was that the areas in which the stimuli were identified were smaller for one of the subjects. Two striking features of the results are that no stimulus in the area of the high back /u/ was identified as a possible vowel by the Karitiana, and for one subject the central vowel /ɨ/ was not recognized more than 70% of the time (the dotted areas of Figure 2).
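The scoring procedure described above can be illustrated with a short sketch. It assumes that each stimulus is scored by its most frequent response over the 10 presentations, and that the empty-box answer is coded as "none"; the stimulus names and response data below are hypothetical and are not taken from the experiment.

```python
from collections import Counter

CRITERION = 0.90  # identification threshold stated in the text

def identification_rates(responses):
    """Map each stimulus to (most frequent label, proportion of trials with that label)."""
    rates = {}
    for stimulus, labels in responses.items():
        label, count = Counter(labels).most_common(1)[0]
        rates[stimulus] = (label, count / len(labels))
    return rates

def identified(responses, criterion=CRITERION):
    """Keep stimuli whose most frequent response is a vowel and meets the criterion."""
    return {
        stimulus: label
        for stimulus, (label, rate) in identification_rates(responses).items()
        if label != "none" and rate >= criterion
    }

# Hypothetical responses for three stimuli, 10 presentations each.
# "none" stands for the empty box (no native vowel quality).
responses = {
    "stim_01": ["i"] * 10,                 # 100% /i/: meets the criterion
    "stim_02": ["ɨ"] * 7 + ["none"] * 3,   # 70% /ɨ/: below the criterion
    "stim_03": ["none"] * 10,              # never heard as a native vowel
}

result = identified(responses)
print(result)
```

Plotting the retained stimuli at their F1/F2 coordinates would give identification areas of the kind compared with production data in Figure 2.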
The absence of the high back vowel in Karitiana is a typological rarity and therefore requires careful investigation to understand why this basic vowel is missing. Karitiana is however not unique in this respect. CROTHERS (1978) reports five languages where such systems can be found. MADDIESON (1984) and LINDBLOM (1986) have noted that a system /i, a, o, ɛ, ɨ/, although rare, exists in the world’s languages, and this system is comparable to what is found in Karitiana. The measurements made with our three subjects show that Karitiana indeed has no high back vowel /u/ in terms of production and perception. The closest back vowel is /o/, with a mean F1 of 456 Hz for the short vowels and 464 Hz for the long. The results of the perceptual experiment show that /u/ is never identified among the stimuli submitted to the subjects, and this seems to confirm that this vowel is not acceptable in Karitiana. A final point needing discussion is that, although not achieving the 90% criterion for consideration as correctly recognized, the central vowel [ɨ] is nonetheless recognized 70% of the time. This is above chance but too low for our criterion. The reason may lie in the fact that central vowels are generally shorter than peripheral vowels. The fact that all stimuli had similar duration might have confused Karitiana speakers about the identification of this vowel. Measurements (Figure 3) show that the long central vowel [ɨː] has a duration similar to the short vowels [i, e, o]. This suggests that the similar duration of stimuli confused the subjects when asked to recognize the high central vowel [ɨ]. This has yet to be proven by another experiment taking into account the average durations of vowels. This further demonstrates the benefits of experimentation in phonological research: by analyzing the limits or failures of experiments, improvements can be proposed to better check the hypotheses investigated.
Preliminary as they may be, these data point to a phonological system where the units are neither abstract nor underspecified. Their concreteness is indicated by the fact that phonetic details such as formant patterns and acoustic durations decisively affect well-formedness judgments. This has important implications for the question of whether phonetics and phonology are distinct components of the grammar.
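The statement that a 70% recognition rate is above chance can be made concrete with a rough calculation. The sketch below assumes that a guessing listener chooses with equal probability among six responses (the five vowel words plus the empty box) and that the 10 presentations of a stimulus are independent; neither assumption is stated in the experiment itself.

```python
from math import comb

def binomial_tail(k, n, p):
    """Probability of at least k successes in n independent trials with success rate p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

N_TRIALS = 10           # presentations per stimulus (from the text)
N_OPTIONS = 6           # assumption: five vowel words plus the empty box, equally likely
chance = 1 / N_OPTIONS  # probability of a given label under pure guessing

# Probability of reaching 70% (7 of 10) or 90% (9 of 10) by guessing alone.
p_70 = binomial_tail(7, N_TRIALS, chance)
p_90 = binomial_tail(9, N_TRIALS, chance)

print(f"chance level: {chance:.3f}")
print(f"P(at least 7/10 by guessing): {p_70:.6f}")
print(f"P(at least 9/10 by guessing): {p_90:.8f}")
```

Under these assumptions, seven matching responses out of ten would arise by guessing far less than once in a thousand runs, so a 70% score can fail the 90% criterion while still being clearly non-random.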
2.2. Prenasalized consonants in Rwanda
Rwanda and several other Bantu languages show variations in the articulation of complex consonants (prenasalized and velarized – plain and secondary) that render accurate description a challenge. The phonetic variation observed in the realization of these complex consonants is important for understanding and explaining the phonological patterning of consonants and syllables in Rwanda and in such other Bantu languages as Ikalanga (MATANGWANE, 1999), Shona (DOKE, 1931; MADDIESON, 1990), and Sukuma (MADDIESON, 1991). Rwanda has three groups of prenasalized stops in its phonetic inventory, i.e. (i) a set of voiced and voiceless prenasalized stops [mh, mb, mf, mv, nh, nd, ns, nz, nʃ, nʒ, ng, ŋh, ŋg ]; (ii) a set of voiced and voiceless labiovelarized prenasalized stops [mbg, mvg, ndgw, nzgw, nʒgw, ŋgw, m hn , n hŋ˚w, nskw, nʃkw, ŋ˚h] and (iii) a set of voiced and voiceless palatalized prenasalized stops [mpfy, mbɟ, nhn , ndɟ, nstʃ, n hy, ŋɟ] (JOUANNET, 1983). The labiovelarized and voiceless sounds are quite unusual and present a number of problems that require an accurate description to understand their production and their phonological status. In the voiceless set of sounds [m h, n h, ŋ˚h, m hn , n hŋ˚w, ŋ˚hw, n hŋ ŋ˚hy] there are voiceless nasals both preceding and following the aspirated part of the consonant. This very rare phenomenon must be demonstrated and explained.
The words presented in Table 1 were recorded in a short carrier sentence: vuga _____ itchumi, ‘say _____ ten times’. Each word was recorded 5 times in its carrier sentence. Seven speakers took part in the experiment.
|Consonant||Word||Gloss|
|[m h ]||[im h amba]||food for travelling|
|[ŋʱ]||[iŋ ʱ a]||cow|
|[n h ŋw]||[in h ŋwaro]||weapon|
|[ŋ h w]||[iŋ h wano]||dowry|
|[n h ]||[in h ooza]||eloquent person|
In order to understand the phenomenon, aerodynamic recordings were made using the Physiologia workstation (TESTON & GALINDO, 1990) linked to a data collection system equipped with the appropriate dedicated transducers. Oral airflow measurements were made with a small flexible silicone mask placed on the mouth. Nasal airflow was measured at the end of one nostril via a small tube linked to the data collection system. Pharyngeal pressure was recorded with a small flexible plastic tube (ID 2 mm) inserted through the nasal cavity into the oro-pharynx. Acoustic recordings were made simultaneously via a high-fidelity microphone on the rig connecting the transducers to the computer. Spectrograms and audio waveforms were processed with Signal Explorer software.
Results show that voiceless nasals are actually rare in the language and are mainly observed before voiceless fricatives. Some of the so-called aspirated sounds are fully voiced rather than voiceless, as shown by DEMOLIN & DELVAUX (2001). Therefore these voiceless prenasalized stops of Rwanda should be described as whispery-voiced nasal stops. However, alternations with voiceless aspirated stops have been observed and must be taken into account. This might reflect dialectal variation. Table 2 sums up the results of the different parameters measured.
|[m h ]||108||115||120||50||1.36||145|
|[ŋ h ]||130||155||150||50||4.08||206|
|[ŋ h w]||164||131||170||50||2.5||210|
|[n h ŋw]||156||149||140||30||2.4||216|
|[m h ŋ]||151||160||100||50||1.3||181|
The table gives the acoustic duration of prenasalized consonants and the mean values of the different aerodynamic measurements. The increase in nasal airflow takes more time for whispery-voiced nasal stops than for their non-whispery voiced counterparts (134 ms vs. 102 ms on average) in the oppositions [mh/mb, ŋh/ŋg, ŋhw/ngw]. The maximum value of nasal airflow is always much higher for whispery-voiced nasal stops (mean = 146 ml/s) than for the voiced prenasalized stops (mean = 40 ml/s). The maximum value of oral airflow measured after the stop closure release shows that there is a higher oral airflow after the non-whispery voiced nasal stops (mean = 126 ml/s) than after the whispery consonants (mean = 50 ml/s). Pharyngeal pressure, which was measured at the maximum value observed during the production of these consonants, also shows that pressure was higher during the non-whispery voiced nasal stops (mean = 5.2 hPa) than during the whispery consonants (mean = 2.6 hPa). The total duration of positive pharyngeal pressure, measured from the beginning of the increase in pressure to the return to atmospheric pressure, is longer for the whispery consonants than for their non-whispery counterparts (means: 187.6 ms compared to 97.4 ms).
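The contrasts reported in this paragraph can be restated as ratios. The sketch below uses only the mean values quoted above; the dictionary keys and the function name are illustrative labels chosen for this example, not terms from the study.

```python
# Mean values quoted in the text, as (whispery-voiced, non-whispery voiced).
# The keys are illustrative labels chosen for this sketch.
REPORTED_MEANS = {
    "nasal_airflow_rise_ms": (134.0, 102.0),
    "max_nasal_airflow_ml_s": (146.0, 40.0),
    "max_oral_airflow_ml_s": (50.0, 126.0),
    "max_pharyngeal_pressure_hPa": (2.6, 5.2),
    "positive_pressure_duration_ms": (187.6, 97.4),
}


def whispery_to_plain_ratios(means):
    """Return the whispery / non-whispery ratio for each parameter."""
    return {name: whispery / plain for name, (whispery, plain) in means.items()}


if __name__ == "__main__":
    for name, ratio in whispery_to_plain_ratios(REPORTED_MEANS).items():
        print(f"{name}: {ratio:.2f}")
```

Read this way, peak nasal airflow is between three and four times higher for the whispery-voiced stops, while peak pharyngeal pressure is half that of their non-whispery counterparts.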
Two patterns have been observed as direct consequences of variations in the timing of articulatory gestures. These facts play an important role in the phonological status of complex consonants in Rwanda. The first is that in sequences of nasal consonants such as [mŋw] and [nŋw] a burst can appear between the contiguous nasal consonants, and it is sometimes interpreted as the burst of a stop homorganic to the first nasal. This burst is in fact a click that is not phonologized in the language. The click results from a temporal overlap between a front and a back consonant in which the front closure is released first. A good example is given in Figure 4, where a click burst appears between the nasals in the word [inǃŋ'waɾo] ‘weapon’. The second pattern is the phonetic realization of a vocoid between two consecutive consonants, the second of which is always velar. An example is given in Figure 5 for the word [iməga] (/imbga/) ‘dog’. The presence of a burst or a short vocoid depends on the timing of consonant gestures in sequences, giving alternations such as [mŋw] > [mʘŋw] ~ [məŋw] or [nŋw] > [nǃŋw] ~ [nəŋw]. If the gestures of a front and a back consonant overlap and the front closure is released first, a click is produced. This is interpreted as a stop burst that is homorganic to the preceding nasal, e.g. [nŋw] > [ntw]. If no overlap occurs between the two consonants, a short vocoid is produced.
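The two timing cases described above can be summarized as a small decision rule. This is a minimal sketch, not the authors' model: the `Closure` class, the function name and the millisecond values used with it are hypothetical, and any timing relation that the text does not discuss is returned as "unspecified".

```python
from dataclasses import dataclass


@dataclass
class Closure:
    """Time interval (ms) during which an oral closure is held."""
    onset: float
    release: float


def transition_between(front: Closure, back: Closure) -> str:
    """Classify the transition from a front closure to a back (velar) closure.

    Only the two cases described in the text are labelled; every other
    timing relation is reported as 'unspecified'.
    """
    overlap = back.onset < front.release
    if overlap and front.release < back.release:
        # Both closures are held at once and the front one is released
        # first: the text describes a click burst in this case.
        return "click burst"
    if not overlap:
        # The front closure is released before the back closure is formed:
        # the text describes a short vocoid in this case.
        return "vocoid"
    return "unspecified"


if __name__ == "__main__":
    print(transition_between(Closure(0, 80), Closure(60, 150)))  # overlap
    print(transition_between(Closure(0, 80), Closure(95, 150)))  # lag
```

With these made-up intervals the first call falls in the overlap case and the second in the lag case, mirroring the alternation [nǃŋw] ~ [nəŋw].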
Cases like those presented in Figures 4 and 5 are not merely a study of fine phonetic detail in the production of prenasalized consonants. They also give indications about the categorization of acoustic features and the dynamics of phonological gestures. This can only be done using experimental methods. Specifically, aerodynamic measurements (Pio, AFo, AFn) are crucial for making inferences about the dynamics of such gestures. These parameters show how the timing and overlap of articulatory gestures may affect the phonological structure of the language. One could ask why the click bursts found in Rwanda are not interpreted as clicks. TRAILL (1994) has likely furnished the answer to this question. In a study on the perception of clicks, he showed that in cases of click loss, i.e. during sound changes that shift clicks to another category, the alveolar click shifts to a voiceless velar stop [ǃ > k]. This is because when an abrupt click (i.e. alveolar or palatal) has its articulatory setting weakened, the acoustic consequence is a weaker burst. When they are reduced by 15 dB in amplitude, the bursts of alveolar clicks can be interpreted as those of voiceless velar stops. Something similar might happen in the interpretation of the Rwanda click bursts between front and back nasals with a partial articulatory overlap. The hypothesis is then that the bursts found in Rwanda are interpreted as bursts homorganic to the first consonant. The weak amplitude of these bursts does not allow them to be interpreted as clicks. The burst is interpreted as the burst of a voiceless consonant, as in the case shown in Figure 4, because the following consonant is voiceless. When it is followed by a voiced consonant, it is interpreted as the burst of a voiced consonant.
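To give a sense of scale for the 15 dB reduction mentioned above, the following sketch converts a level change in decibels to a linear amplitude factor. It assumes the figure refers to signal amplitude under the usual 20·log10 convention; the text does not state the convention, and the function name is illustrative.

```python
def amplitude_factor(delta_db: float) -> float:
    """Linear amplitude ratio for a level change in dB (20*log10 convention)."""
    return 10 ** (delta_db / 20)


if __name__ == "__main__":
    # A 15 dB reduction, as in the click-loss case discussed in the text.
    print(round(amplitude_factor(-15.0), 3))  # 0.178
```

Under that assumption, a burst attenuated by 15 dB retains a little under one fifth of its original amplitude.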
When small vocoids appear in CC sequences, whether nasal or oral, they are the result of a sequence of consonant gestures. The data in Figure 5 show that no bilabial oral closure exists after the voiced bilabial nasal [m]. When the bilabial closure is released, there is a rearward movement of the tongue towards the velar place of articulation. This is detectable in the AFo trace, which becomes negative. Since the velar closure is not yet formed, there is a short-lived resonance in the vocal tract, which is realized as the vocoid.
Variations in the temporal realization of gestures involved in the production of prenasalized consonants were also observed by DOKE (1931) and MADDIESON (1990) in Shona.
Eastern Shona dialects show the following pattern of variation in the word for ‘dog’: [imga] ~ [ibαɤ] ~ [iməga] ~ [imbga] ~ [imʘga]. This can be related to the diachronic evolution from Proto-Bantu: *ɲ–bua > m-bwa > m-bαɤ > m-bga.
Consider now the theoretical economy of using gestures as primitives of phonological description. In traditional phonology-to-phonetics mapping accounts, two classes of intrusive segments would have to be posited for Rwanda: the weak click resulting from the overlap of the anterior and the posterior consonant, and the vocoid resulting from the short lag between the release of an anterior consonant and that of a velar. In both cases, additional apparatus would be necessary to explain the non-recoverable nature of the click and the short duration of the vocoid. In a gesture-based description, all of this falls out from timing specifications, which are an inherent part of phonological representation.
2.3. Geminated fricatives and affricates in Amharic
Amharic, a Semitic language spoken in Ethiopia, has a set of geminated consonants in its phonological inventory. One important question about these consonants is their characterization by features. LADEFOGED & MADDIESON (1996: 92) remind us that, unlike a sequence, geminates cannot be separated by an epenthetic vowel or any other interruption, nor will either half undergo a phonological process alone. Amharic’s set of fricative and affricate geminates, both plain and ejective, is thus an interesting case for testing these claims, as well as those made by HAYES (1986) and LAHIRI & HANKAMER (1988). LADEFOGED & MADDIESON (1996: 92) say that geminate affricates are very clearly different from an affricate sequence. Geminates are expected to have one long stop closure followed by one fricative portion.
Aerodynamic recordings were made using the Physiologia workstation (TESTON & GALINDO, 1990) linked to a data collection system equipped with appropriate transducers. Oral airflow measurements were taken with a small flexible silicone mask placed against the mouth. Pharyngeal pressure was recorded with a small flexible plastic tube (ID 2 mm) inserted through the nasal cavity into the oro-pharynx. Subglottal pressure (Ps) was measured with a needle (ID 2 mm) inserted in the trachea. The needle was placed after local anesthesia with 2% Xylocaine, including the subglottal mucosa. The tip of the needle was inserted immediately inferior to the cricoid cartilage.
A plastic tube (ID 2 mm) linked to a pressure transducer was connected to the needle. Acoustic recordings were made digitally with the same equipment via a high-fidelity microphone on the hardware rig. Spectrograms and audio waveforms were processed with Signal Explorer software. Seven speakers took part in the experiment.
A second dataset was acquired by electropalatography (EPG). This technique uses a special acrylic artificial palate (see Figure 6) in which is embedded an array of silver or gold electrodes that detect tongue contact. These “electropalates” are typically custom-molded to fit the speaker, with each electrode connected to its own thin wire. Bundled together, these thin wires pass behind the back molars on each side of the electropalate and exit at the corners of the mouth. The principle is that the tongue serves as a conductor that carries an electric signal from a sending electrode to a receiving electrode. Each palatal electrode is a receiver. The sending electrode is the tongue itself. This is arranged by connecting the subject to an imperceptible current via an electrode, generally on the subject’s hand or wrist. The entire oral region then conducts the current, so that when the tongue touches any of the electrically isolated pseudopalate electrodes, the circuit is completed. The electropalate is scanned via a high-input impedance amplifier for each electrode, and linguapalatal contact data are sampled at a rate of 100 Hz. The EPG data are also synchronized with the acoustic signal. Five speakers took part in the EPG experiments. Only one subject participated in both the subglottal and EPG measurements.
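Since the contact data are sampled at 100 Hz and synchronized with the acoustic signal, each EPG frame corresponds to a 10 ms step in the audio. The sketch below shows one way such an alignment could be computed. It assumes that both streams start at the same instant and that frames are counted from 0; the audio sampling rate in the usage lines is a hypothetical value, as the text does not report one.

```python
EPG_RATE_HZ = 100  # sampling rate stated in the text


def frame_time_ms(frame_index: int) -> float:
    """Onset time (ms) of an EPG frame, counting frames from 0."""
    return frame_index * 1000.0 / EPG_RATE_HZ


def audio_sample_for_frame(frame_index: int, audio_rate_hz: int) -> int:
    """Index of the audio sample that coincides with the start of a frame.

    Assumes the EPG and audio streams begin at the same instant.
    """
    return frame_index * audio_rate_hz // EPG_RATE_HZ


if __name__ == "__main__":
    hypothetical_audio_rate = 16000  # Hz, illustrative only
    print(frame_time_ms(5))                                     # 50.0
    print(audio_sample_for_frame(5, hypothetical_audio_rate))   # 800
```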
The words of the experimental corpus presented in Table 3 were pronounced by the speakers both in a short carrier sentence and in isolation.
|[ləwəsə]||‘knead flour for bread’||[tʼətʃʼːi]||‘drunkard’|
|[bəsːa]||‘he pierced’||[lutʃʼːa]||‘smooth air’|
Mean duration measurements for all six speakers are shown in Tables 4 and 5.
Aerodynamic data given below are mean values of 6 measurements made with the speaker who participated in both the aerodynamic and EPG experiments. Note that there is no plain long affricate [tʃ:].
Acoustic measurements show that ejectives are shorter than their plain counterparts. As for affricates, there is a gradual increase in duration of both stop and frication: [tʃʼ] (94.3 ms + 30.5 ms) < [tʃ] (139.5 ms + 55.8 ms) < [tʃː] (196.4 ms + 67 ms).
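The duration figures above can be broken down to see which component grows at each step. The sketch below uses only the values quoted in the text, with dictionary keys copying the transcriptions as printed there; the function names are illustrative.

```python
# (stop closure ms, frication ms) as quoted in the text; the keys copy the
# transcriptions as they are printed there.
AFFRICATE_DURATIONS_MS = {
    "tʃʼ": (94.3, 30.5),
    "tʃ": (139.5, 55.8),
    "tʃː": (196.4, 67.0),
}


def totals(durations):
    """Total duration (closure + frication) per affricate, rounded to 0.1 ms."""
    return {k: round(stop + fric, 1) for k, (stop, fric) in durations.items()}


def step_increase(durations, shorter, longer):
    """Increase in (closure, frication) duration from one affricate to another."""
    stop0, fric0 = durations[shorter]
    stop1, fric1 = durations[longer]
    return round(stop1 - stop0, 1), round(fric1 - fric0, 1)


if __name__ == "__main__":
    print(totals(AFFRICATE_DURATIONS_MS))
    print(step_increase(AFFRICATE_DURATIONS_MS, "tʃʼ", "tʃ"))
    print(step_increase(AFFRICATE_DURATIONS_MS, "tʃ", "tʃː"))
```

In both steps the closure lengthens more than the frication: by 45.2 ms against 25.3 ms in the first step, and by 56.9 ms against 11.2 ms in the second.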
Aerodynamic measurements show that there does not seem to be much difference in Ps between the ejectives and affricates, except for [sʼː] and [tʃʼ]. However, the Po reading is at 19.9 hPa for ejectives because the maximum setting was exceeded. The maximum was fixed at 20 hPa for the experiments, and that was clearly not enough. This rules out comparison among ejectives, but it still shows that Po for ejectives is generally at least twice what it is for plain consonants.
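Because the ejective readings sit at the ceiling of the transducer setting, they are best treated as lower bounds. The sketch below illustrates that reasoning. It is not part of the original analysis; the function name and the plain-consonant value in the usage lines are hypothetical.

```python
SATURATED_AT_HPA = 19.9  # reading reported in the text when the maximum was exceeded


def compare_po(ejective_hpa: float, other_hpa: float):
    """Compare an ejective Po reading with another Po reading.

    Returns None when both readings are saturated (no comparison possible),
    otherwise a (qualifier, ratio) pair where the qualifier is 'at least'
    if the ejective reading is saturated and 'exactly' if it is not.
    """
    ejective_saturated = ejective_hpa >= SATURATED_AT_HPA
    other_saturated = other_hpa >= SATURATED_AT_HPA
    if ejective_saturated and other_saturated:
        return None
    qualifier = "at least" if ejective_saturated else "exactly"
    return qualifier, ejective_hpa / other_hpa


if __name__ == "__main__":
    print(compare_po(19.9, 19.9))  # two ejectives: not comparable
    print(compare_po(19.9, 9.5))   # 9.5 hPa is a made-up plain-consonant value
```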
Figure 7 shows an interesting finding about the difference between ejective and plain fricatives. The coordination of the glottal gestures (closure and opening) differs in the two cases. Ejective fricatives are characterized by a glottal closure at the start, contrary to what happens with plain fricatives, where there is glottal opening. This is visible on the Ps and AFo (oral airflow) curves, where before and after the plain fricative there is a drop in Ps and an increase in AFo. Note that the same is true for the VOT when plain and ejective velar stops are compared, as shown by the difference between [k] and [kʼ]. At the end of the ejective fricative the glottis remains closed until the next vowel and there is no drop in Ps. This is not the case for the plain fricative, where the release of the constriction produces a drop in Ps before the following vowel. The drop in Ps naturally corresponds to an increase in oral airflow (AFo).
Some explanation may be necessary to interpret the EPG data of Figures 8 to 14. For each of the seven words presented, five EPG frames are given, followed by readouts for articulatory profile and articulatory symmetry, and finally the audio waveform. Profile, symmetry, and audio are temporally aligned, and each EPG frame is situated on them by a vertical line and the frame number (1 to 5). The profile and symmetry displays use shading to summarize levels of contact in regions along and across the vocal tract, respectively. For the profile representation, row 1 summarizes EPG grid contacts at the limit between the hard and soft palates, and rows run successively forward until row 8, which shows the area just behind the teeth. This is analogous to the orientation of the five EPG frames just above. For the symmetry representation, row 1 represents the left side of the grid and row 8 the right. If the EPG frames were rotated 90° counterclockwise, the grid and the symmetry orientations would match. For both representations, the darker the gray between white (no contact) and black (full contact), the more electrode contacts there are in the summarized row. Parameters such as the anteriority index, the centrality index, the dorso-palatal index, the total contacts and the center of gravity can also be measured from the EPG data. For a good survey of these methods see HARRINGTON (2010) and TABAIN (2011).
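The profile and symmetry readouts described above amount to row-wise and column-wise summaries of each contact frame. The following is a minimal sketch under stated assumptions: a frame is taken to be a full 8 × 8 grid of 0/1 values, stored with the first row at the hard/soft palate boundary and the first column on the left. The function names and the example frame are hypothetical and do not come from the recorded data.

```python
def profile(frame):
    """Fraction of contacted electrodes in each row (0 = white, 1 = black).

    Row 1 is the hard/soft palate boundary, row 8 the area behind the teeth.
    """
    return [sum(row) / len(row) for row in frame]


def symmetry(frame):
    """Fraction of contacted electrodes in each column, left to right."""
    return [sum(col) / len(col) for col in zip(*frame)]


def total_contacts(frame):
    """Number of contacted electrodes in the whole frame."""
    return sum(sum(row) for row in frame)


# Hypothetical frame: lateral contact along the palate and a complete
# closure in the two front-most rows.
LATERAL = [1, 0, 0, 0, 0, 0, 0, 1]
FULL = [1] * 8
EXAMPLE_FRAME = [list(LATERAL) for _ in range(6)] + [list(FULL) for _ in range(2)]

if __name__ == "__main__":
    print(profile(EXAMPLE_FRAME))
    print(symmetry(EXAMPLE_FRAME))
    print(total_contacts(EXAMPLE_FRAME))
```

For this made-up frame the profile is light in the six back rows and fully dark in the two front rows, while the symmetry readout is dark only at the two lateral edges.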
Data presented in Figures 8 to 11 show that ejective fricatives are further front and have a narrower constriction than plain fricatives. They also have a smaller oral cavity (behind the constriction) than non-ejectives. Ejective fricatives have an anterior contact, but with leakage that is visible on the audio waveform. Therefore they are almost alveolar affricates (to which they sometimes sound similar, although this is quite rare in the data). Frication noise increases towards the end of the ejective fricatives compared to plain fricatives. This is the consequence of the larynx rising with a closed glottis to generate the ejective. Affricates show that there is a palatal closure followed by a constriction in the palatal region (Figures 12 to 14). The slight differences in the closure and constriction positions are likely due to different coarticulation patterns. Indeed the short ejective affricate [tʃʼ] is more front than the plain affricate [tʃ] but it is articulated after a high back vowel [u]. The long ejective affricate is produced between two open vowels [a].
The comparison between plain and ejective fricatives shows some important differences. Compared to the constant noise of plain fricatives, frication noise increases towards the end for ejective fricatives. This is due to the larynx elevation which is necessary to produce the ejective. In the case of [sʼː] the larynx rise is delayed, as can be seen in the audio waveform, which shows an increase in frication noise towards the end. As the air resources within the oral cavity are not extensible, it would seem at first glance difficult to geminate an ejective fricative, given that raising the larynx with a closed glottis expels all the air from the oral cavity for the singleton version of the ejective fricative. Producing a geminate ejective fricative seems to require a delay in the elevation of the larynx, which suggests that it might be under the speakers’ control. This delay is visible on the audio waveform (Figure 10), which has very low frication noise for about two thirds of the closure duration. Other important differences involve the coordination of glottal and oral gestures. For instance, the VOTs of the plain and ejective velar stops are different. The ejective has a noiseless VOT, which suggests that the glottis is still closed at the release of the oral constriction. A similar coordination happens at the end of the fricatives. There is a glottal lag at the end of the ejective fricatives due to continued glottal closure at constriction release. This can be seen in Figure 7, where there is a drop in Ps at the end of the plain fricatives which is not found in the ejective. A similar effect of the closed glottis can be seen by comparing the starts of plain and ejective fricatives. The drop in Ps at the start of plain fricatives is due to the wider glottal opening necessary to increase the volume velocity of airflow and thus generate the frication noise. This shows up as a drop in Ps simultaneous with an increase in AFo, as seen in Figure 7.
This effect is not seen in ejective fricatives, as the glottis is closed. The comparison confirms that frication in ejective fricatives is produced only with the air available in the oral cavity between the sealed glottis and the constriction.
Phenomena such as these raise fundamental questions about the control and coordination of articulatory gestures, and notably about the kind and degree of control that speakers exert on articulations. These data about the affricates, plain and ejective, confirm LADEFOGED & MADDIESON’s (1996) claims about the unity of geminates. It is specifically the increase in duration of the stop that makes the main difference between these sounds, rather than an increase in the duration of frication noise.
Again, in this last example, details of the speech waveform, as well as the ordering and the timing of the gestures involved, make a crucial difference for two phonological distinctions – geminate vs. singleton, and ejective vs. plain – that are relevant not only to Amharic, but also to other languages.
The data and the data analysis in this paper show that the use of experimental methods allows the generation of hypotheses about phonological categories and primitives, and about the control that speakers have over their articulations. Acoustic and aerodynamic methods show that the emergence of click bursts in Rwanda depends on the overlap of consonantal gestures. Their categorization as stop bursts rather than clicks is a matter of amplitude. The emergence of vocoids in Rwanda’s complex consonants results from a greater separation between two gestures that overlap in other cases. Perception tests show that Karitiana speakers declare a no-vowel’s-land in the high back part of the vowel space. They also show that the intrinsic duration of vowels is an important feature for correctly categorizing central vowels in the language. Amharic data raise questions about the degree of control that speakers have over the coordination of gestures necessary to produce geminated consonants and ejectives. This paper does not delve into the statistical treatment of data, nor does it discuss problems related to the number of speakers needed for such experiments. These concerns are, of course, a fundamental part of the experimental method. However, this paper aims simply to demonstrate that phonological problems and hypotheses, i.e. those involving phonological categories, can be formulated and tested through the experimental method, and not only by ad hoc hypotheses produced by armchair work, as is still too often the case. Falsifiable hypotheses are part of the endless progress of the scientific endeavor, of which the study of language and phonology is undeniably one part.
- Phonetics of the Hottentot Language BEACH D. M. Cambridge: Heffers; 1938.
- An Introduction to mathematical modeling BENDER E. A. Dover 2000.
- Introduction à l’étude de la médecine expérimentale BERNARD C. Paris 1865.
- Papers in Laboratory Phonology V: Language Acquisition and the Lexicon BROE M, PIERREHUMBERT J. Cambridge: University Press; 2000.
- Articulatory gestures as phonological units BROWMAN C, GOLDSTEIN L. Phonology.1989;6:201-251.
- Articulatory Phonology: an overview BROWMAN C, GOLDSTEIN L. Phonetica.1992;49:155-180.
- The sound pattern of English CHOMSKY N, HALLE M. New York: Harper and Row; 1968.
- Laboratory phonology: Past successes and current questions, challenges and goals COHN A. In: Fougeron C, Kühnert B, D’Imperio M, Vallée N, eds. Laboratory Phonology 10. Berlin: Mouton de Gruyter; 2010 .
- Laboratory Phonology 9 COLE J, HUALDE J. I. Berlin: Mouton de Gruyter; 2007.
- Phonology and Phonetic Evidence CONNELL B, ARVANITI A. Cambridge University Press; 1995.
- The emergent paradigm in phonology: Phonological categories and statistical generalizations CROOT K. In: CUTLER, BECKMAN, EDWARDS, FRISCH, BRÉASPHAN, KAPATSINSKI, WALTER. Laboratory Phonology. 2010.
- Le mangbetu: étude phonétique et phonologique DEMOLIN D. 1992.
- The search for primitives in phonology and the explanation of sound patterns: the contribution of fieldwork studies DEMOLIN D. In: GUSSENHOVEN C, WARNER N, eds. Papers in Laboratory Phonology 7. Berlin: Mouton de Gruyter; 2002 .
- Whispery voiced nasal stops in Rwanda DEMOLIN D, DELVAUX V. Proceedings Eurospeech. Aalborg.2001;:651-654.
- Papers in Laboratory Phonology II: Gesture, Segment, Prosody DOCHERTY G. J, LADD D. R. Cambridge: Cambridge University Press; 1992.
- The Phonetics of the Zulu language DOKE C. M.. Bantu studies.1926;vol II(Bantu Studies, Special Issue).
- A Comparative Study of Shona Phonetics DOKE C. M. Johannesburg: The University of Witwatersrand Press; 1931.
- Rethinking Innateness. A connectionist perspective on development ELMAN J, BATES E. A, JOHNSON M. H, KARMILOFF-SMITH A, PARISI D, PLUNKETT K. Cambridge: The MIT Press; 1996.
- Acoustic Theory of Speech Production FANT G. The Hague: Mouton; 1960.
- Laboratory Phonology 10 FOUGERON C, KÜHNERT B, D’IMPERIO M, VALLÉE N. Berlin: Mouton de Gruyter; 2010.
- Historical laboratory phonology: investigating /p/ > /f/ > /h/ changes FOULKES P. Language and Speech.1997;40:249-276.
- Urban Voices: Accent Studies in the British Isles FOULKES P, DOCHERTY G. London: Arnold; 1999.
- Laboratory Phonology 8: varieties in phonological competence GOLDSTEIN L. M, WHALEN D. H, BEST C. T. Berlin: Mouton de Gruyter; 2006.
- Papers in Laboratory Phonology 7 GUSSENHOVEN C, WARNER N. Berlin: Mouton de Gruyter; 2002.
- Phonology HALLE M. In: OSHERSON D. N, LASNIK H, eds. An invitation to cognitive science. Cambridge: The MIT Press; 1990 .
- Phonetic analysis of speech corpora HARRINGTON J. Chichester: Wiley-Blackwell; 2010.
- Inalterability in CV phonology HAYES B. Language.1986;62:321-351.
- Espace vocalique et structuration perceptuelle: Application au Swahili HOMBERT J. M, PUECH G. Pholia .1984;1:199-208.
- The role of speech perception in phonology HUME E, JOHNSON K. New York: Academic Press; 2001.
- The Sound Shape of Language JAKOBSON R, WAUGH L. Bloomington/Indiana: University Press; 1979.
- Phonétique et phonologie des consonnes du kinyarwanda JOUANNET F. In: JOUANNET F, ed. Le kinyarwanda langue bantu du Rwanda. Etudes linguistiques. Paris: SELAF; 1983 .
- Papers in Laboratory Phonology III: Phonological Structure and Phonetic Form KEATING P. Cambridge: Cambridge University Press; 1995.
- Papers in Laboratory Phonology I: Between the Grammar and the Physics of Speech KINGSTON J, BECKMAN M. Cambridge: Cambridge University Press; 1990.
- The timing of geminate consonants LAHIRI A, HANKAMER J. Journal of Phonetics.1988;16:327-328.
- The mental representation of lexical form: A phonological approach to the recognition lexicon LAHIRI A, MARSLEN-WILSON W. Cognition.1991;38:245-294.
- Phonetic Universals in Vowel Systems LINDBLOM B. In: OHALA J. J, JAEGER J. J, eds. Experimental Phonology. Orlando: Academic Press; 1986 .
- Phonetic universals in consonant systems LINDBLOM B, MADDIESON I. In: HYMAN L. M, LI C. N, eds. Language, Speech and Mind: Studies in Honor of Victoria A. Fromkin. New York: Routledge; 1988.
- Papers in Laboratory Phonology VI: Phonetic interpretation LOCAL J, OGDEN R, TEMPLE R. Cambridge: Cambridge University Press; 2003.
- Feedback in speech production: an ecological perspective MACNEILAGE P. F. In: MYERS T, LAVER J, ANDERSON J, eds. The cognitive representation of speech. Amsterdam: North-Holland; 1981 .
- The Frame/Content theory of evolution of speech production MACNEILAGE P. F. Behavioral and Brain Sciences.1998;21:499-548.
- Patterns of Sounds MADDIESON I. Cambridge: Cambridge University Press; 1984.
- Shona velarization; complex segments or complex onsets? MADDIESON I. UCLA Working papers in Phonetics.1990;72:16-34.
- Articulatory phonology and Sukuma ‘aspirated nasals’ MADDIESON I. In: HUBBARD K, ed. Proceedings of the 17th Annual Meeting of the Berkeley Linguistic Society, Special Session on African Language Structures. Berkeley Linguistic Society; 1991 .
- Ikalanga Phonetics and Phonology MATHANGWANE J. T. Stanford: CSLI Publications; 1999.
- Experimental phonology OHALA J. J, JAEGER J. J. Experimental phonology: Academic press; 1986.
- On the use of feedback in speech production PERKELL J. In: MYERS T, LAVER J, ANDERSON J, eds. The cognitive representation of speech. Amsterdam: North-Holland; 1981.
- Stochastic phonology PIERREHUMBERT J. GLOT 5.2001;6:1-13.
- Les modifications phonétiques du langage, étudiées dans le patois d’une famille de Cellefrouin (Charente) ROUSSELOT A. Paris: Welter; 1891.
- Principes de Phonétique expérimentale ROUSSELOT A. Paris: Didier; 1904.
- La phonétique expérimentale. Leçon d’ouverture au collège de France. Revue des cours et conférences ROUSSELOT A. Paris: Boivin & Cie Editeurs; 1923.
- Structures syllabiques et lexicales des langues du monde: données, typologies, tendances universelles et contraintes substantielles ROUSSET I. 2004.
- Acoustic Phonetics STEVENS K. Cambridge: MIT Press; 1998.
- Aspects of a Karitiana grammar STORTO L. 1999.
- Karitiana Phonetics and Phonology STORTO L, DEMOLIN D. Manuscript.
- The Origins of Generativity STUDDERT-KENNEDY M. In: HURFORD J, KNIGHT C, STUDDERT-KENNEDY M, eds. The evolutionary emergence of language. Cambridge: Cambridge University Press; 1998.
- Design and development of a workstation for speech production analysis TESTON B, GALINDO B. Proceedings of VERBA 90.1990;:400-408.
- Phonetic and phonological studies of !Xóõ Bushman TRAILL A. Hamburg: Helmut Buske Verlag; 1985.
- The perception of clicks in !Xóõ TRAILL A. Journal of African languages and linguistics.1994;Vol. 15(nº2):161-174.
- Electropalatography data from Central Arrernte: A comparison of the new articulate palate with the standard Reading palate TABAIN M. Journal of the International Phonetic Association.2011;43(3):343-367.
- Grundzüge der Phonologie. Sl: 1939 TRUBETZKOY N. Berkeley/Paris: University of California Press/Klincksieck; 1969/1972.
- Les systèmes vocaliques: de la typologie aux prédictions VALLÉE N. 1994.