Linguistic and social constraints on the variable palatalization of alveolar stops by derived [i] in a variety of Brazilian Portuguese

Athany Gutierres,
Elisa Battisti


In this paper we analyze the variable regressive palatalization of /t, d/ in a variety of Brazilian Portuguese (BP) in contact with Italian dialects. We follow Coetzee’s (2016) assumptions regarding the distinctive effects of grammatical and non-grammatical factors on the grammar. The aim of the study is to test such claims with Noisy-HG (BOERSMA; PATER, 2008; COETZEE, 2012; 2016; COETZEE; KAWAHARA, 2013). Palatalization in BP affects /t, d/ and is triggered by a following underived /i/ (t/i/jolo ‘brick’, d/i/nheiro ‘money’) or by [i] derived from /e/ in unstressed word positions (t/e/atro → t[i]atro ‘theater’, nád/e/ga → nád[i]ga ‘buttock’). We focus on palatalization triggered by derived [i]. Besides tokens of non-palatalized consonant plus non-raised vowel ([te]atro, ná[de]ga) and of palatalized consonant plus raised vowel ([ʧi]atro, ná[ʤi]ga), the data set comprises tokens of non-palatalized consonant plus raised vowel ([ti]atro, ná[di]ga), an innovation of the analysis regarding prior studies (BATTISTI; DORNELLES FILHO, 2010; GUTIERRES; BATTISTI; DORNELLES FILHO, 2018). The data were extracted from sociolinguistic interviews by Battisti et al. (2007). The linguistic constraints interacting in the grammar of palatalization come from Battisti and Dornelles Filho (2010). The analysis demonstrates that a non-grammatical factor such as ‘place of residence’ works as a scaling factor on faithfulness constraints, moving their weights up or down and so affecting the variable raising of /e/, a process that feeds palatalization.


In most Portuguese varieties spoken in Brazil, the variable palatalization of alveolar stops turns consonants /t, d/ into affricates [ʧ, ʤ] when followed by a high front vowel, either underived /i/ (t/i/jolo → [ʧ]ijolo ‘brick’, d/i/nheiro → [ʤ]inheiro ‘money’) or derived [i] from /e/ in unstressed word positions (lei./te/ → lei.[ʧɪ] ‘milk’; on./de/ → on[ʤɪ] ‘where’). Although the process is virtually categorical in half of the capital cities in the country (CARDOSO et al., 2014[1]), it applies at different rates in the diverse Brazilian speech communities. It affects /t/ more often than /d/, and it is more frequently triggered by the underived high vowel than by the derived one (BATTISTI; DORNELLES FILHO, 2009; 2010[2,3]).

One can find this pattern in the speech community of interest in this paper, Antônio Prado, a small Brazilian city founded by Italian immigrants in Rio Grande do Sul, a state in southern Brazil. According to Battisti et al. (2007[4]), both linguistic and social factors affect the process: Brazilian Portuguese (BP) spoken in Antônio Prado exhibits moderate rates of palatalization, higher in the urban area than in the rural area, where BP is still in contact with Italian dialects1.

Even though variable processes such as palatalization in BP have been conceived as an inherent feature of natural languages (LABOV, 1972[5]), conditioned by both linguistic and social factors, only recently have they gained attention from generative scholars in constraint-based approaches that model variation by means of numerical weights (ANTTILA, 1997[6]; ANTTILA; CHO, 1998[7]; BOERSMA, 1998[8]; BOERSMA; HAYES, 2001[9]; BOERSMA; PATER, 2008[10]; COETZEE, 2012[11]; 2016[12]; COETZEE; KAWAHARA, 2013[13]). Constraint-based approaches assume that grammar is a parallel system of universal constraints of two kinds, faithfulness and markedness constraints, that is, constraints that protect input forms from change and constraints that require change in input forms2, respectively. The way languages are structured is determined by the interaction of faithfulness and markedness constraints in constraint rankings (PRINCE; SMOLENSKY, 2004[14]).

Constraint-based theories, like any formal model of linguistic analysis, face the same challenge when modeling language variation: handling the effects of social factors. That is why those models have focused exclusively on the internal aspects of variable processes. Coetzee and Kawahara (2013[13]) and Coetzee (2012; 2016[11,12]) are relatively recent attempts to approach both internal and non-internal factors with constraint-based models. They propose a grammar-dominant model to analyze the effects of grammatical and non-grammatical factors on linguistic variation: grammatical factors determine the scope of variation, and non-grammatical factors play a role in determining the frequency with which the forms are variably produced. In other words: phonology drives variation and social aspects are responsible for the diffusion of variable forms. One of the aims of the present study is to test such claims with Noisy Harmonic Grammar (BOERSMA; PATER, 2008[10]; COETZEE, 2012; 2016[11,12]; COETZEE; KAWAHARA, 2013[13]), examining palatalization in BP spoken in Antônio Prado (BP-AP).

Another aim of the present paper is to take a step further on a constraint-based approach to palatalization in BP-AP and analyze variable palatalization in the environment of derived [i] only. Besides tokens of non-palatalized consonant plus non-raised vowel ([te]atro ‘theater’, ná[de]ga ‘buttock’) and of palatalized consonant plus raised vowel ([ʧi]atro, ná[ʤi]ga), the data set comprises tokens of non-palatalized consonant plus raised vowel ([ti]atro, ná[di]ga), an innovation of the analysis regarding prior studies (BATTISTI; DORNELLES FILHO, 2010[3]; GUTIERRES; BATTISTI; DORNELLES FILHO, 2018[15]).3 It is a subset of the 26,600 tokens extracted from sociolinguistic interviews by Battisti et al. (2007[4]). The analysis of that specific subset is needed because, according to Battisti and Hermans (2008, p. 281[16]),

Most of the data (17,054) involve unstressed mid vowels candidate to be raised to [i], but the frequency of application of the rule [of palatalization] in that environment is solely 13%. The environment itself does not favor palatalization (0.23 relative weight). Although in many Brazilian speech varieties the reduction is very frequent, thus creating a context for palatalization, vowel reduction is not high in Antônio Prado. And reduced vowels do not always palatalize the consonant. For example, in addition to [i’dade] ‘age’, with [e], a variant with a reduced and devoiced vowel [i] ([i’dadi]) can also occur, but without palatalization. [Translated by the authors]4.

Battisti and Hermans (2008[16]) refer to Roveda (1998[17]) to explain the low rate of raising (reduction) of unstressed /e/ in BP-AP: in most of the communities located in the old Italian immigration area of the state of Rio Grande do Sul, BP is still in contact with Italian dialects, especially in rural areas. The consequence of such contact would be the preservation of /e/ from vowel raising in Portuguese due to the influence of Italian's morphological system (in which the quality of the word-final vowel is distinctive). Besides that, a brief inspection of the Battisti and Hermans (2008[16]) BP-AP data set, considering only tokens of /t, d/ followed by unstressed /e/ to which palatalization did not apply, suggested that the rates of preservation of /e/ (i.e., non-application of the raising rule) should be much higher than the rates of raising of /e/ to [i] in BP-AP.

With that being said, this paper seeks answers to two primary questions:

a) Regarding the comprehensive constraint-based model by Coetzee and Kawahara (2013) and Coetzee (2016), which uses Noisy Harmonic Grammar (Noisy-HG) to bring together social and linguistic factors that co-determine phonological processes: is the model able to express the different contribution of grammatical and non-grammatical factors to the variable palatalization of /t, d/ in BP-AP?

b) Concerning the observed frequencies of palatalization of /t, d/ in the environment of unstressed /e/ from which [i] can derive in BP-AP, how could one model the contribution of a non-grammatical factor such as ‘place of residence’ to the process?

The paper is structured as follows: after this Introduction, section 1 ‘Variation in constraint-based models’ briefly reviews some core ideas of Optimality Theory – OT (PRINCE; SMOLENSKY, 2004[14]) and its developments, and discusses its potential to handle language variation5. Section 2 ‘Palatalization in Brazilian Portuguese’ reviews studies on palatalization of /t, d/ in BP and situates the present analysis. Section 3 ‘Methodological procedures’ explains the steps taken to conduct the study. Section 4 ‘Results and discussion’ presents and interprets the results. ‘Conclusion’ provides a final word considering the objectives of the paper and other related issues in terms of theory adequacy.

1. Variation in constraint-based models

Most phonological theories deal with the Chomskyan notion of I-language: phonology is part of an individual’s language competence. Variationist models, taking the I-language notion either explicitly or implicitly, work with facts about the E-language, that is, how an individual performs socially through language. Approaches that combine internal and external aspects to describe language were uncommon among phonological theories until quite recently.

Anttila (1997[6]) and Anttila and Cho (1998[7]) have analyzed intraspeaker variation by combining partial/stratified ordering with universal constraints and constraint hierarchies. In these models, exemplified by the variable realization of plurals in Finnish and r-deletion in English, variation is expressed by means of partial/stratified grammars, since grammars are seen as a number of constraint strata: the strata are ordered in relation to one another, but the constraints in each stratum are not. In the literature, these models correspond to the ‘multiple grammars’ hypothesis, in which variation takes place depending on which grammar/stratum of constraints the speaker accesses when producing language.

For example, in Figure 1, the constraints FAITH, ONSET and *CODA6 are ordered in three different strata in dialects A, B and C. Each dialect contains three possible partial rankings7, and each partial ranking corresponds to a possible output. Dialect A predicts neither r-insertion nor r-deletion; dialect B predicts only r-deletion and dialect C predicts both r-insertion and r-deletion.

Figure 1. Model of grammar as partial rankings. Source: Anttila and Cho (1998, p. 38[7]).

Anttila and Cho (1998[7]) model variation by combining multiple invariant grammars. Boersma (1998[8]) and Boersma and Hayes (2001[9]), on the other hand, propose an integrated model for variation which conceives grammars no longer as a matter of discrete ordering, but as constraint overlap on a continuous scale. In this model, linguistic constraints are given numerical weights of two kinds: (i) ranking values, more or less fixed weights that determine the range of selection points; and (ii) selection points, which are variable weights that correspond to the spoken realization of speech within the ranking values. Selection points move forwards or backwards within the limits of the ranking values, and their potential overlap in the scale (in a range of 5 points forwards or backwards) indicates variation. This stochastic version of OT operates through a learning algorithm, the Gradual Learning Algorithm (GLA), whose processing is mediated by plasticity (1.0 by default) and by noise (2.0 by default), which is added at each speech event.

In Figure 2, C1 is a constraint and C2 is another one. The numbers below the continuous line indicate the ranking values. The position of C1 and C2 displays their selection points in a particular moment of speech: C1, approximately 87, and C2, approximately 83. The common area between the two constraints (ranking values between 88 and 82) is where variation occurs, that is, the overlap region in the scale determines the range of variation that selection points might take.

Figure 2. Model of grammar as a continuous scale. Source: Boersma and Hayes (2001, p. 49[9]).
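The overlap idea can be illustrated with a small simulation. The following is a minimal sketch, not the implementation used by Boersma and Hayes: it assumes that each selection point is the ranking value plus Gaussian noise, and it treats 87 and 83 as the ranking values of C1 and C2 purely for illustration. The function names are hypothetical.

```python
import random

def sample_selection_point(ranking_value, noise_sd, rng):
    """Draw one selection point: the ranking value plus Gaussian evaluation noise."""
    return rng.gauss(ranking_value, noise_sd)

def reversal_rate(rv_high, rv_low, noise_sd=2.0, trials=20000, seed=0):
    """Estimate how often the lower-ranked constraint ends up above the higher one."""
    rng = random.Random(seed)
    reversals = 0
    for _ in range(trials):
        high = sample_selection_point(rv_high, noise_sd, rng)
        low = sample_selection_point(rv_low, noise_sd, rng)
        if low > high:
            reversals += 1
    return reversals / trials

# Assumed ranking values for C1 and C2 (illustrative only).
rate = reversal_rate(rv_high=87.0, rv_low=83.0)
print(f"C2 outranks C1 in about {rate:.1%} of evaluations")
```

Under these assumptions the two constraints swap positions in a minority of evaluations, which is the sense in which overlapping ranges yield a majority variant and a minority variant.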

Stochastic language processing may also be expressed in Harmonic Grammar modelling. Like OT, Harmonic Grammar - HG8 (LEGENDRE; MIYATA; SMOLENSKY, 1990[18]; COETZEE, 2012; 2016[11,12]; COETZEE; KAWAHARA, 2013[13]) sees language structure as determined by the power of constraints. What differentiates HG from OT is the manner in which variable outputs are selected in the grammar: while in OT some violations are not decisive for the selection of the optimal outputs (some weights do not play a part in determining the range of variation), in HG all violations matter to predict the harmony of each output in the grammar, considering the numerical weights assigned to the constraints. For each candidate, harmony is defined as the negative sum of the weights of the violated constraints, each multiplied by the number of violation marks the candidate incurs on that constraint. The most harmonic output is the one whose harmony is the highest in value (the least negative). Therefore, harmony expresses the relative well-formedness of candidate structures concerning the notion of optimization.

In Figure 3, each constraint is assigned a numerical weight (W-to-S = 0.9 and ALIGN-R = 0.6)9. The number of violations for each candidate is shown as a negative number in the tableau, and it is multiplied by the weight of the constraint the candidate violates. For instance, [bá] incurs one violation of W-to-S (0.9) and two violations of ALIGN-R (0.6 × 2), which gives it the lowest harmony value, -2.1, that is, -(0.9 + 0.6 × 2).

Figure 3. Model of grammar as weighted constraint ranking. Source: Boersma and Pater (2013, p. 03[19]).
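The harmony computation for [bá] can be reproduced in a few lines. This is a minimal sketch using the weights and violation counts given in the text; the function name `harmony` and the dictionary layout are illustrative choices, not part of any cited implementation.

```python
def harmony(weights, violations):
    """Return the HG harmony: the negative weighted sum of violation counts."""
    return -sum(weights[c] * violations.get(c, 0) for c in weights)

# Weights as given for Figure 3.
weights = {"W-to-S": 0.9, "ALIGN-R": 0.6}

# Candidate [bá]: one violation of W-to-S and two of ALIGN-R.
ba_violations = {"W-to-S": 1, "ALIGN-R": 2}

print(round(harmony(weights, ba_violations), 3))
```

A candidate with no violations would receive a harmony of zero, the maximum possible value under this definition.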

The three cited models illustrate existing constraint-based approaches to variable processes by means of constraint interaction. A problem is that the models successfully represent variation taking into account only the grammatical factors that influence observed frequencies, not the non-grammatical factors. The inclusion of non-grammatical factors in the formal representation of grammar is a turning point for the theory, insofar as the theory attempts to provide a more comprehensive framework of phonological variation.

Even though suggestions have been previously made (OOSTENDORP, 1997[20]; BOERSMA; HAYES, 2001[9]) to foster a generative model for variation, it is Coetzee (2009a; 2009b[21,22]) who first attempted to develop a fully operational model of phonological variation in OT, followed by additional refinements in Noisy-HG (COETZEE, 2012; 2016[11,12]; COETZEE; KAWAHARA, 2013[13]). The Noisy-HG account of phonological variation allows both grammatical and non-grammatical factors to contribute to variation, but in different ways: grammatical factors are implemented as universal linguistic constraints, maintaining the basic property of HG, and non-grammatical factors act upon the movement of constraints up or down/forwards or backwards, influencing the frequency with which variable forms are observed, similarly to the Labovian approach (COETZEE, 2016, p. 215[12]).

That is to say that Coetzee (2016[12]) formally demonstrates the contribution of social factors to variable phonological phenomena: the promotion and diffusion of variable processes, while the grammatical constraints direct variation. Oostendorp (2008, p. 13[23]) himself states that this is “the only model that can be interpreted as a model of a variable grammar system”.

This paper follows the latest development of Coetzee (2016[12]) to model the variable palatalization in BP-AP. We assume Coetzee’s (2016[12]) premise that social factors scale the weights of faithfulness constraints in the grammar and so variation is produced. The analysis aims to formally demonstrate the role performed by non-grammatical factors in phonological variation. The Methodological procedures section of the present paper presents the model in more detail.

The following section briefly describes the phenomenon of palatalization and the paths taken up to now in the analysis of the process in BP-AP.

2. Palatalization in Brazilian Portuguese

In terms of structure, there is evidence that palatalization is intrinsically motivated across world languages by the internal structure of the segments (BATTISTI; HERMANS, 2020[24]). Consonants /t, d/ are the most frequent targets of the process, vowel /i/ the most frequent trigger. Both typical trigger and targets present a high degree of constriction (consonantality) and are maximally similar in terms of elements (identical manner and place of articulation). In BP, only full palatalization is licensed, the one that requires maximal identity between trigger and target.

The variable regressive palatalization of /t, d/ in BP has been widely covered in different speech communities.10 Our focus in this section will be on BP-AP. General results of studies in Rio Grande do Sul will be just briefly mentioned.

In Porto Alegre, the capital city of the state of Rio Grande do Sul (RS), the process is virtually categorical (over 90%) (KAMIANECKY, 2003[25]) in both environments, of underived and derived triggering high vowel. In other regions of RS, total proportions of rule application are diverse and not as high as in Porto Alegre (PIRES, 2003[26]; PAULA, 2006[27]; DUTRA, 2007[28]; MATTÉ, 2009[29]). BP-AP conforms to speech patterns other than the Porto Alegre one, as Battisti et al. (2007[4]) verified.

Linguistic research on the variable palatalization of the alveolar stops in BP-AP was first carried out by Battisti et al. (2007[4]). Palatalization in this community displays moderate rates of rule application (about 30%). The authors conducted a variable rule analysis of 26,600 tokens of palatalization. They found that the process tends to be triggered by the underived high front vowel /i/ and that it is constrained by the place of residence of the speakers (the urban area favors palatalization, the rural area inhibits it), among other factors.11

Battisti and Dornelles Filho (2010[3]), inspired by McCarthy (2008[30]), developed a factorial typology to explain palatalization not only in BP-AP, but in the diverse BP varieties as well. From the factorial typology, they derive five patterns of variable rule application of palatalization of alveolar stops by the underived and the derived high front vowel12. The nearly 30% of application of palatalization in BP-AP is not evenly distributed among the patterns, as attested in the literature on palatalization in BP varieties such as in Mauri (2008[31]), Dutra (2007[28]), Paula (2006[27]), Abaurre and Pagotto (2002[32]), among others. The frequencies observed by Battisti and Dornelles Filho (2010[3]) in BP-AP conform mostly to the first pattern (no palatalization whatsoever) and to the third pattern (palatalization restricted to the obstruents /t, d/ followed by non-derived high vowel /i/). These two patterns were then modeled by the authors in Stochastic OT (BOERSMA; HAYES, 2001[9]) in terms of implicational relations among the contexts. The implicational relations to palatalization state that, for example, there would not be a BP speech community in which palatalization manifests in [t] contexts and not in [d] contexts, since it tends to occur more often in contexts whose input is /t/ and less often in contexts whose input is /d/. The authors managed to successfully represent the linguistic constraint interaction in the grammars of palatalization in BP-AP, but not with the non-grammatical effect of a social variable such as “place of residence”, attested as a social conditioning factor in Battisti et al. (2007[4]).

The independence of the effects of grammatical and non-grammatical factors on the variable palatalization in BP-AP was tested by Gutierres, Battisti and Dornelles Filho (2018[15])13 by means of a harmonic algorithm implemented in ORTO14 (DORNELLES FILHO, 2014[33]). The variable “place of residence” (rural or urban area) socially conditions the process of palatalization in the speech community and is implemented by the decrease or increment of numerical weights attributed to faithfulness constraints in the grammar. It has been demonstrated that depending on where the speakers live (in the rural or urban areas), the palatalization process is more or less likely to occur in the community.

Research on palatalization in BP-AP conducted so far provides us with a quite clear picture of the process: it is variable; it is systematically conditioned by linguistic and non-linguistic factors; it appears to resist application when compared to the ways it manifests in other speech communities; its possible realizations in terms of formal modelling have been pointed out through a factorial typology; its internal structure motivation has been made explicit (BATTISTI; HERMANS, 2020[24]); and it has been represented in distinct constraint-based models (Stochastic OT, Harmonic Grammar) capable of coping with variation.

An aspect of the palatalization process in BP-AP that has not been fully covered concerns the occurrence of a “third variant” that can be mapped from /te/ and /de/ inputs. It has a derived [i] that does not trigger palatalization, as shown in bold in Figure 4.

Figure 4. Variable output candidates for /e/ input. Source: The authors.

Figure 4 demonstrates that for each context [te] and [de] there is a faithful input /parte/ (‘part’) and /onde/ (‘where’), and three variable outputs, one faithful and two others unfaithful to the input, from left to right in the Outputs column: (i) no raising and no palatalization; (ii) raising and palatalization; and (iii) raising but no palatalization.

Assuming, then, (a) that palatalization interacts with the raising of /e/ in unstressed position and (b) that the non-grammatical factor of “place of residence” might act on the weights of grammatical constraints, as proposed by Coetzee (2016[12]), Gutierres, Battisti and Dornelles Filho (2018[15]) tested hypotheses (a) and (b) in an analysis with ORTO (DORNELLES FILHO, 2014[33]). The authors demonstrated that there is a difference in the pattern of palatalization between rural and urban areas: the areas are distinctly affected by constraints *t[i] and *d[i] and by the scaling up and down of faithfulness constraints in patterns #1 (no palatalization at all) and #3 (palatalization of t/i/ and d/i/ tokens).15

The analysis of Gutierres, Battisti and Dornelles Filho (2018[15]) confirms what is stated by Coetzee (2016[12]) in terms of the frequency in which the variants are observed in the two areas, resulting in different grammars for the same speech community; that is, it clarifies the role of non-grammatical factors in the grammars of palatalization in BP-AP. However, the analysis has two relevant limitations: (i) “place of residence” was tested by dividing the corpus in two subsets of data, according to the area (urban and rural areas) – that was the way found to test the effect of the non-grammatical factor over grammar using ORTO; (ii) the analysis did not distinguish the third variant in the data set, as exemplified in Figure 4. The present analysis is an attempt to cope with those limitations.

3. Methodological procedures

The speech data here analyzed come from sociolinguistic interviews of BDSer (Banco de Dados da Serra Gaúcha “Gaucha Sierra Database”), a corpus belonging to the University of Caxias do Sul (UCS).

Our database is a subset of the BP-AP sample of palatalization examined by Battisti et al. (2007[4]). The first part of the analysis consisted of listening again to seventeen16 out of the forty-eight (17/48) sociolinguistic interviews from which Battisti et al. (2007[4]) extracted their data17 in order to distinguish between derived [i] tokens that palatalized /t, d/ (totally unfaithful forms) and derived [i] tokens that did not palatalize (partially unfaithful forms) and, among the non-palatalized forms, the ones with [e] vowel (totally faithful forms).

The corpus comprises 4,243 tokens of unstressed /e/ following /t,d/: 1,318 /te/ tokens, 2,925 /de/ tokens. The data were recoded in order to distinguish three possible variant (output) forms of input /te/ and /de/ sequences: tokens in which both /e/ raising and palatalization occur (Outputs 3 in Table 1); tokens in which /e/ raising occurs but palatalization does not (Outputs 2); tokens in which /e/ is preserved and palatalization does not occur (Outputs 1). The observed frequencies of the three possible variants in our sample of BP-AP are shown in Table 1.

Input                  Output       Frequency (%)
/parte/ (N = 1,318)    1 par[te]    86.2
                       2 par[ti]     4.6
                       3 par[ʧɪ]     9.2
/onde/ (N = 2,925)     1 on[de]     96.6
                       2 on[di]      1.4
                       3 on[ʤɪ]      2.0
Total: N = 4,243 tokens
Table 1. Output frequencies for /te, de/ data. Source: The authors.

The frequencies in Table 1 demonstrate that the output forms of /te/ and /de/ are highly faithful to the input in BP-AP: the observed frequencies of the faithful mappings /te/ → [te] and /de/ → [de] are 86.2% and 96.6%, respectively. The community tends to preserve unstressed /e/ from raising. When raising is verified, it occurs more often in the environment of /t/ (13.8%), where 9.2% of the tokens are palatalized and 4.6% are not. Only 3.4% of the /d/ tokens exhibit vowel raising: 2% are palatalized and 1.4% are not.
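The arithmetic behind these figures can be checked directly from Table 1. The sketch below is only a consistency check on the reported percentages; the dictionary keys and function names are illustrative, and the share of raised tokens that are palatalized is a derived figure computed here, not one reported by the authors.

```python
# Percentages copied from Table 1.
# faithful: [e] kept; raised_only: [i] without palatalization; palatalized: [i] with palatalization.
table1 = {
    "/te/": {"n": 1318, "faithful": 86.2, "raised_only": 4.6, "palatalized": 9.2},
    "/de/": {"n": 2925, "faithful": 96.6, "raised_only": 1.4, "palatalized": 2.0},
}

def raising_rate(row):
    """Percentage of tokens with raised /e/, whether or not the stop is palatalized."""
    return round(row["raised_only"] + row["palatalized"], 1)

def palatalized_share_of_raised(row):
    """Percentage of the raised tokens that are also palatalized (derived figure)."""
    return round(100 * row["palatalized"] / raising_rate(row), 1)

for context, row in table1.items():
    print(context, raising_rate(row), palatalized_share_of_raised(row))
```

The raising rates recovered this way match the 13.8% and 3.4% cited in the text, and the token counts add up to the corpus total.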

The frequencies obtained were entered into a script run in Noisy-HG (BOERSMA; PATER, 2008[10]), a stochastic version of HG (LEGENDRE; MIYATA; SMOLENSKY, 1990[18]) available in Praat (BOERSMA; WEENINK, 2020[34]). Noisy-HG shares the architecture of OT, but instead of assigning constraints to rankings, it assigns them numerical weights that are reflected in the harmony values of the output candidates. Candidates that are the most harmonic (i.e., which have the least negative harmony values) and whose harmony values differ by less than 1 are the variant forms produced by the grammar18.

The script also contains the constraints used by Battisti and Dornelles Filho (2010[3]) to model the factorial typology of palatalization that they propose. The seven constraints can be read as follows: *ti, assign a violation mark for each non-palatalized /t/ before /i/; *di, assign a violation mark for each non-palatalized /d/ before /i/; *t[i], assign a violation mark for each non-palatalized /t/ before [i] raised from unstressed /e/; *d[i], assign a violation mark for each non-palatalized /d/ before [i] raised from unstressed /e/; *MID]σ̆, assign a violation mark to each mid vowel in an unstressed syllable; IDENT(anterior), assign a violation mark to each input-output segment pair that does not have identical values for anteriority; IDENT(height), assign a violation mark to each input-output segment pair that does not have identical values for height.

The following step was to provide the script with the input forms, the output candidates and the violations each candidate incurs considering the constraints, as shown in Figure 5:

Figure 5. Violations’ tableau. Source: The authors.

The constraints in the tableau (Figure 5) are not ranked. The purpose of the violations’ tableau is to provide the script with the constraint violations incurred by each output form. Take, for example, the output forms of the input /parte/: candidate par[te] is faithful to the input, so it does not violate faithfulness constraints ID(hei) and ID(ant). It violates only the markedness constraint *MID]σ̆ because this constraint does not allow mid vowel [e] in unstressed position in the output. Candidate par[ti] violates ID(hei) because the vowel in the output is not identical to the vowel in the input; it also violates *t[i] because the consonant was not palatalized by the high front vowel [i] derived from unstressed /e/. Candidate par[ʧɪ] is the least faithful: it violates both faithfulness constraints because both the vowel and the consonant were modified (the vowel was raised and the consonant was palatalized).
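The violation profiles just described for /parte/ can be written down as a small data structure. This is an illustrative sketch only: it encodes the marks stated in the text, and the names `violations_parte` and `faithfulness_marks` are hypothetical, not taken from the authors' Praat script.

```python
# The two faithfulness constraints used in the analysis.
FAITHFULNESS = ("ID(hei)", "ID(ant)")

# Violation marks for the candidates of /parte/, as described in the text.
violations_parte = {
    "par[te]": {"*MID]σ̆": 1},                 # mid vowel kept in unstressed syllable
    "par[ti]": {"ID(hei)": 1, "*t[i]": 1},     # vowel raised, stop not palatalized
    "par[ʧɪ]": {"ID(hei)": 1, "ID(ant)": 1},   # vowel raised and stop palatalized
}

def faithfulness_marks(profile):
    """Count the marks a candidate incurs on faithfulness constraints."""
    return sum(profile.get(c, 0) for c in FAITHFULNESS)

for candidate, profile in violations_parte.items():
    print(candidate, faithfulness_marks(profile))
```

Counting faithfulness marks in this way reproduces the three-way distinction between totally faithful, partially unfaithful and totally unfaithful forms.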

With data frequencies, constraints and violations, the script is ready to be run in Noisy-HG via Praat. Differently from what was done with ORTO (GUTIERRES; BATTISTI; DORNELLES FILHO, 2018[15]), we did not need to split the data into two subsets and run the algorithm twice, once with rural area data and once with urban area data, to check effects of ‘place of residence’. In Noisy-HG, faithfulness constraints are implemented with a scaling factor, which scales only this group of constraints up and down; when they are scaled down, faithfulness constraints contribute less to the harmony scores of unfaithful candidates (COETZEE, 2016, p. 229[12]). Due to its stochastic implementation, the results the program provides are ranking values and selection values of the constraints, and negative harmony values for the candidates, which indicate the most harmonic output (the most frequently observed form) and the less harmonic outputs that may be variably produced with the “winner”.
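How a scaling factor on faithfulness constraints can shift output frequencies may be sketched as follows. This is a toy simulation under stated assumptions, not the authors' Praat script: the scaling factor is assumed to be added to the weights of faithfulness constraints, noise is assumed to be Gaussian and drawn once per constraint per evaluation, and the weights are hypothetical values chosen only to make the effect visible. They are not the ranking values obtained in the study.

```python
import random

FAITHFULNESS = {"ID(hei)", "ID(ant)"}

# Violation marks for the candidates of /parte/, as described in the text.
VIOLATIONS = {
    "par[te]": {"*MID]σ̆": 1},
    "par[ti]": {"ID(hei)": 1, "*t[i]": 1},
    "par[ʧɪ]": {"ID(hei)": 1, "ID(ant)": 1},
}

# Hypothetical weights (illustrative only).
WEIGHTS = {"*MID]σ̆": 10.0, "*t[i]": 6.0, "ID(hei)": 5.0, "ID(ant)": 5.0}

def noisy_weights(weights, scale, noise_sd, rng):
    """Weights for one evaluation: faithfulness shifted by `scale`, all perturbed by noise."""
    return {
        c: w + (scale if c in FAITHFULNESS else 0.0) + rng.gauss(0.0, noise_sd)
        for c, w in weights.items()
    }

def harmony(weights, profile):
    """Negative weighted sum of a candidate's violation marks."""
    return -sum(weights[c] * marks for c, marks in profile.items())

def output_frequencies(weights, scale=0.0, noise_sd=2.0, trials=5000, seed=0):
    """Relative frequency with which each candidate wins across noisy evaluations."""
    rng = random.Random(seed)
    wins = {cand: 0 for cand in VIOLATIONS}
    for _ in range(trials):
        w = noisy_weights(weights, scale, noise_sd, rng)
        winner = max(VIOLATIONS, key=lambda cand: harmony(w, VIOLATIONS[cand]))
        wins[winner] += 1
    return {cand: n / trials for cand, n in wins.items()}

for scale in (-3.0, 0.0, 3.0):
    print(scale, output_frequencies(WEIGHTS, scale=scale))
```

In this sketch a positive scaling factor makes the faithful candidate win more often, and a negative one favors the unfaithful candidates, without the data being split into separate grammars.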

4. Results and discussion

Before presenting and discussing the results of the present analysis (of the data subset in hand, i.e., only /te, de/ input forms), it is important to say some words about the average grammar of BP-AP regarding palatalization of /t, d/. Figure 6 by Battisti and Dornelles Filho (2010[3]) exhibits BP-AP grammar considering the whole database - 26,600 tokens of both palatalized (30%) and non-palatalized output forms (70%) from /ti, di, te, de/ input forms.

Figure 6. BP-AP palatalization grammar. Source: Battisti and Dornelles Filho (2010, p. 84[3]).

In Figure 6, the numbers below the continuous scale indicate the selection points/numerical weights19 of the constraints (the higher the stricter, from left to right): *MID]σ̆ is the highest ranked constraint (selection point: 108) and ID(hei) is the lowest (selection point: 94). The constraints are labeled above the numerical weights: *ti has a weight of 98, *di of 96, and ID(ant), *d[i] and *t[i] all have very close weights, around 100.

One can see that the markedness constraint *MID]σ̆ (mid-vowels must be raised in unstressed positions) plays a crucial role in the community grammar. The positions of *MID]σ̆ and faithfulness constraint ID(hei) (no vowel raising) in the scale show that some change in the input-output mapping of mid vowels is expected in BP-AP. On the other hand, faithfulness constraint ID(ant) (no palatalization) and markedness constraints *d[i] and *t[i] overlap in the scale due to their weights (all around 100), producing variation between palatalized and non-palatalized forms in /te, de/ contexts.

As Table 1 above shows, however, the present analysis verifies that the rates of raising of mid vowel /e/ in BP-AP are not high in either /te/ (13.8% vowel raising) or /de/ (3.4% vowel raising) data. The observed frequencies provided to the Noisy-HG script and the ranking values of the constraints obtained by the program are shown in Figure 7.

Figure 7. Ranking values of the constraints. Source: The authors.

The ranking values (Figure 7) lead us to the BP-AP community grammar of palatalization of /te, de/ forms (Figure 8).

Figure 8. BP-AP palatalization grammar of /te, de/ forms. Source: The authors.

Figure 8 indicates that markedness constraint *MID]σ̆ and faithfulness constraint ID(hei) occupy the same position in the scale as they do in the full average palatalization grammar of BP-AP (Figure 6), which includes /ti, di/ data. Markedness constraints *ti and *di assign violations to output forms with non-palatalized alveolar stops before underived /i/. They favor palatalization but play no role in the present analysis because it does not include data with underived /i/. The constraints in the gray area are responsible for the generation of variable output forms from /te, de/ input forms because their numerical weight difference is <1. So when markedness constraints *d[i] or *t[i] are a bit higher than faithfulness constraint ID(ant), speakers produce palatalization (9.2% for /te/ inputs and 2% for /de/ inputs); when ID(ant) moves up and gets a higher numerical weight than *d[i] and *t[i], speakers produce faithful forms, preserving /e/ and, as a consequence, not palatalizing /t, d/ (86.2% [te] and 96.6% [de]). Besides these two variants (palatalization vs. /e/ preservation) there is a third option: unstressed /e/ raising with no palatalization, which will soon be explained.

The tableau in Figure 9 represents the constraint ranking and the processing of /te/ input forms in BP-AP grammar.

Figure 9. Constraint ranking and processing of /te/ input forms. Source: The authors.

The most harmonic candidate (par[te]) has the highest (closest to zero, since harmony is shown as negative values) numerical value for harmony (-137.625). It is faithful to its input form (no violations in ID constraints) and incurs a single violation in *MID]σ̆, regarding the realization of [e]. The output par[ti] contains violations in the last two constraints to the right, *t[i], which penalizes non-palatalized /t/ before [i] raised from /e/, and ID(hei), which demands the output to be identical to the input in terms of vowel height. The palatalized candidate par[ʧɪ] presents more segmental transformations in the input-output mapping required by markedness constraints (/t/ → [ʧ]; /e/ → [i]), which is why solely faithfulness constraints are violated by this candidate. The [ti] and [ʧɪ] outputs have harmony values whose difference is <1 (-146.531 and -147.180, respectively), so variation is manifested between these two candidates. In terms of a continuous scale, the grammar is illustrated by Figure 10.
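The harmony values discussed above follow from the usual Harmonic Grammar computation, in which harmony is the negative weighted sum of a candidate's violations. The sketch below reproduces the three values for /te/ inputs. The weights are not taken from Figure 7: they are back-computed from the reported harmonies under the assumption that ID(ant) weighs exactly 100 (the text only says "around 100"), and the violation profiles are read off the prose description of the tableau.

```python
# Hypothetical weights, back-computed from the harmony values in the text
# under the ASSUMPTION that w(ID(ant)) = 100.
WEIGHTS = {
    "*MID": 137.625,    # H(par[te]) = -137.625 with a single *MID violation
    "ID(ant)": 100.0,   # assumed value
    "ID(hei)": 47.180,  # 147.180 - 100
    "*t[i]": 99.351,    # 146.531 - 47.180
}

# Violation profiles as described in the prose for /te/ inputs.
VIOLATIONS = {
    "par[te]": {"*MID": 1},
    "par[ti]": {"*t[i]": 1, "ID(hei)": 1},
    "par[ʧɪ]": {"ID(ant)": 1, "ID(hei)": 1},
}

def harmony(violations, weights):
    """Negative weighted sum of constraint violations."""
    return -sum(weights[c] * n for c, n in violations.items())

harmonies = {cand: round(harmony(v, WEIGHTS), 3) for cand, v in VIOLATIONS.items()}
winner = max(harmonies, key=harmonies.get)  # candidate closest to zero

for cand, h in harmonies.items():
    print(cand, h)
print("most harmonic:", winner)
```

Under these assumed weights the gap between par[ti] and par[ʧɪ] stays below 1, which is the condition the text gives for the two candidates to vary, while par[te] remains well ahead of both.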

Figure 10. Candidates and harmony values in the grammar (/te/ input forms). Source: The authors.

Figure 11 is the tableau that represents the constraint ranking and the processing of /de/ input forms in BP-AP grammar.

Figure 11. Constraint ranking and processing of /de/ input forms. Source: The authors.

In Figure 11, the form on[de] is the most harmonic output, with a harmony value of -137.625, violating solely a markedness constraint that does not allow mid vowels to emerge. On[di] has a harmony of -149.810, not close enough (>1) to be variably produced along with on[ʤɪ], whose harmony is -147.180. It violates a markedness constraint that demands palatalization of /d/ (*d[i]) and a faithfulness constraint that requires identical vowel height between input and output forms (ID(hei)). Variable palatalization is less likely to occur with /de/ tokens, and the palatalized output is more harmonic than the output with raised /e/ but non-palatalized alveolar stop. Figure 12 represents, in a continuous scale, the grammar of BP-AP that generates variable output forms from /de/ input forms.

Figure 12. Candidates and harmony values in the grammar (/de/ input forms). Source: The authors.

The strength of unstressed /e/ preservation in BP-AP is verified in the frequencies observed (86.2% /te/ and 96.6% /de/) and represented in the scales in Figures 10 and 12, where the most harmonic outputs par[te] and on[de] lie on the extreme left side of the scale (the highest position), far enough from the other two marked candidates so that they do not engage in variation.

The proximity of par[ti] and par[ʧi], on[di] and on[ʤɪ] in the scales (Figures 10 and 12) indicates that the third variant ([i] raised from /e/) is competing with palatalized forms. This competition is due to the movement of the faithfulness constraint ID(ant), as we see in the following tableaux (Figure 13).

Figure 13. Tableaux for variable (palatalized and non-palatalized) outputs with derived [i]. Source: The authors.

The tableaux in Figure 13 basically present two constraint rankings: one that chooses the candidate that exhibits mid-vowel raising but no palatalization (/te, de/ → [ti, di] ranking), *MID]σ̆ >> {*di, *ti, ID(ant), *d[i], *t[i]} >> ID(hei); and one that chooses the candidate with mid-vowel raising and palatalization (/te, de/ → [ʧi, ʤi] ranking), *MID]σ̆ >> *ti >> *di >> *t[i] >> *d[i] >> ID(ant) >> ID(hei). The constraints between { } in the /te, de/ → [ti, di] ranking are the ones that move up and down the scale and bring the candidates’ harmony values closer together, thus producing variation. The constraints in the /te, de/ → [ʧi, ʤi] ranking show a more stringent architecture, decrementing the value of faithfulness and thus generating the marked palatalized outputs.

The non-strictness observed in the /te, de/ → [ti, di] ranking is due to the fact that the process of palatalization is fed by /e/ raising, since the realization of [i] creates the context for palatalization. We then understand that the production of palatalized forms stands for the ‘completion’ of the phonological process at hand, and that raised [i] outputs work as an ‘intermediate’ process, feeding palatalization. The low rates of palatalized forms and the high rates of /e/ preservation, both in the average grammar and in the subset grammar (/te, de/ corpus), indicate that palatalization is somewhat inhibited in the unstressed /e/ environment for reasons that may not be grammatical, according to our initial hypothesis.

Palatalization in BP-AP is socially constrained by the variable “place of residence” (BATTISTI et al., 2007[4]), where urban speakers tend to palatalize and rural speakers not to. The effect of “place of residence” is evidenced by the movement of ID(ant) in the scale, entailing variation. Faithfulness holds back both unstressed mid vowel raising and palatalization, which is in line with Coetzee’s (2016[12]) hypothesis that non-grammatical factors solely contribute towards the frequency with which certain forms are observed. That being the case, ID(ant) is the faithfulness constraint affected by social conditioning in the variable palatalization in BP-AP: when the ranking value of ID(ant) is incremented, the grammar inhibits palatalization; when it is decremented, the grammar produces palatalized /t, d/. Lower application rates of mid-vowel /e/ raising in the rural area in turn interfere with such scaling effects on palatalization.

The non-grammatical conditioning of palatalization in BP-AP can be modeled as follows (Figure 14). Considering faithfulness constraints (F) as an expression of the social constraint “place of residence”, the diagram in Figure 14 indicates that speakers from the urban area tend to palatalize, and speakers from the rural area tend not to, a pattern that has to do with lower rates of unstressed mid vowel /e/ raising.

Figure 14. Diagram representing the actuation of social constraints in the BP-AP grammar. Source: The authors.

The more faithfulness constraints move down, the more speakers are likely to palatalize /t, d/. The more faithfulness constraints move up, the more likely speakers are to preserve the unstressed mid vowel /e/, which implies non-palatalization. The grammar of variable /e/ raising seems, in a sense, to direct palatalization depending on the non-grammatical factor “place of residence” (urban or rural area), which either restrains or spreads the process.

The analysis has demonstrated that the variable palatalization process in BP-AP is driven by both markedness and faithfulness constraints, where markedness favors palatalization and faithfulness inhibits it. Social effects on the grammar are exerted through faithfulness constraints, among which ID(ant) has been decisive in describing variable palatalization and providing a detailed framework for the variants observed in the speech community.
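The scaling idea can be sketched numerically: a single value is added to the weight of every faithfulness constraint, positive for a factor that inhibits the process and negative for one that promotes it. The code below is an illustration under stated assumptions, not the authors' model: the base weights are back-computed from the harmony values reported for /te/ inputs assuming w(ID(ant)) = 100, and the scaling values of +3 (rural) and -3 (urban) are invented for the example.

```python
# Constraints whose weights are shifted by the (hypothetical) scaling factor.
FAITHFULNESS = {"ID(ant)", "ID(hei)"}

# Hypothetical base weights, consistent with the reported /te/ harmonies
# under the ASSUMPTION that w(ID(ant)) = 100.
WEIGHTS = {"*MID": 137.625, "*t[i]": 99.351, "ID(ant)": 100.0, "ID(hei)": 47.180}

CANDIDATES = {
    "par[te]": {"*MID": 1},
    "par[ti]": {"*t[i]": 1, "ID(hei)": 1},
    "par[ʧɪ]": {"ID(ant)": 1, "ID(hei)": 1},
}

def scaled_harmony(violations, weights, scale):
    """Harmony with `scale` added to the weight of each faithfulness constraint."""
    total = 0.0
    for constraint, count in violations.items():
        shift = scale if constraint in FAITHFULNESS else 0.0
        total += (weights[constraint] + shift) * count
    return -total

def harmonies(scale):
    return {c: scaled_harmony(v, WEIGHTS, scale) for c, v in CANDIDATES.items()}

rural = harmonies(+3.0)  # faithfulness moves up (invented scaling value)
urban = harmonies(-3.0)  # faithfulness moves down (invented scaling value)

for label, result in (("rural", rural), ("urban", urban)):
    print(label, {c: round(h, 3) for c, h in result.items()})
```

With faithfulness scaled down, the palatalized candidate overtakes the raised but non-palatalized one and moves closer to the faithful form; with faithfulness scaled up, it falls further behind both. The faithful candidate violates no faithfulness constraint, so its harmony is unaffected by the scaling.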

5. Conclusion

This study has tackled two issues previously raised (GUTIERRES; BATTISTI; DORNELLES FILHO, 2018[15]): (a) it tested the assumptions of Coetzee and Kawahara (2013[13]) and Coetzee’s (2012, 2016[11,12]) model regarding the role faithfulness constraints play in the grammar, determining the frequency with which linguistic forms are variably produced; (b) it included a third variant in the analysis of the variable palatalization in BP-AP, referring to the environment of unstressed mid-vowel /e/ (with raised [i] but no palatalization), highlighting the tendency of the BP-AP community to preserve the vowel and, therefore, not to palatalize alveolar stops in this environment.

The analysis succeeds in clarifying not only the role of “place of residence” in the grammar of palatalization in BP-AP, but also the effect of the variable elevation of /e/, which feeds the process, in a pattern of language variation and change. As pointed out in previous analyses (BATTISTI; DORNELLES FILHO, 2010[3]; GUTIERRES; BATTISTI; DORNELLES FILHO, 2018[15]), variation emerges from the competition of grammars in the community (the urban and rural area grammars). From a computational perspective, runs of Noisy-HG model variable grammars and explain the effect of non-grammatical factors by means of scaling faithfulness constraints.

Coetzee’s (2016[12]) hypothesis was then confirmed: in a grammar-dominant, constraint-based model, grammar drives variation and non-grammatical factors contribute to the frequency with which variable forms are observed. According to the author, a comprehensive model of phonological variation is a model that can include, in the analysis of a phonological process, both the effects of linguistic constraints and social factors.

Our investigation includes social and linguistic constraints together in the modeling of phonological variation, a field yet to be more deeply investigated. It demonstrates how generative-based models are complying with recent findings of social investigations and the crucial part non-grammatical aspects play in the configuration of grammars. This seems to go along with the established assumption that the potential of a theory relies on its capacity to represent empirical data.

At least two questions are yet to be explored. One has to do with the configuration of the community grammar. We are assuming multiple variable grammars competing in the speech community, in the sense that the two sub-communities (rural and urban) have separate, independent grammars that might converge to a core grammar once the process of palatalization is complete. But a different possibility would be that speakers share a single core production grammar and differ only in social constraints. Looking at perception grammar(s) in future studies may provide different hypotheses to work on, since both rural and urban speakers talk to and understand each other in local social interactions. Another question refers to derived [i] coming from /e/ and its possible effect on palatalization, a case of phonological opacity which most constraint-based models cannot handle due to their non-derivational nature. Both questions are left for future investigation.


We are deeply grateful to Andries W. Coetzee for his valuable comments and insights on issues raised in this manuscript. We wish to thank the reviewers of the paper, Ronaldo Mangueira Lima Jr and Márcia Cristina do Carmo, for their useful comments and questions. We naturally assume full responsibility for any errors that may remain.


ABAURRE, M. B. M.; PAGOTTO, E. G. Palatalização das oclusivas dentais no português do Brasil. In: ABAURRE, M.B.M.; RODRIGUES, A.C.S. (Org.) Gramática do Português Falado Volume VIII: novos estudos descritivos. Campinas, SP: Editora da UNICAMP, 2002. p.557-602.

ALVES, U. K. Teoria da Otimidade Estocástica e Algoritmo de Aprendizagem Gradual: princípios de funcionamento e tutorial para simulação computacional. Revista virtual de estudos da linguagem – REVEL, vol. 15, n. 28, p. 202-234, 2017.

ANTTILA, A. Deriving variation from grammar. In: HINSKENS, F.; HOUT, R. van; WETZELS, L. (Ed). Variation, Change and Phonological Theory. Amsterdam: John Benjamins, 1997. p. 35-68.

ANTTILA, A.; CHO, Y. Y. Variation and change in Optimality Theory. Lingua, n.104, p. 31-56, 1998.

ARCHANGELI, D. Optimality theory: an introduction to linguistics in the 1990s. In: ARCHANGELI, D.; LANGENDOEN, D. T. (Ed.). Optimality theory: an overview. Malden/Oxford: Blackwell, 1997. p.1-32.

BATTISTI, E. et al. Palatalização das oclusivas alveolares e a rede social dos informantes. Revista virtual de estudos da linguagem – REVEL, v. 5, n. 9, p. 1-29, 2007.

BATTISTI, E.; DORNELLES FILHO, A.A. Universais implicacionais e restrições estruturais à variação e mudança fonológica: o caso da palatalização das oclusivas alveolares em português numa comunidade ítalo-brasileira. Cadernos de Pesquisas em Linguística, v. 4, n. 1, p. 80-93, 2009.

BATTISTI, E.; DORNELLES FILHO, A. A. A palatalização variável das oclusivas alveolares num falar de português brasileiro e sua análise pela Teoria da Otimidade. Letras de Hoje, Porto Alegre, v. 45, n. 1, p. 80-86, 2010.

BATTISTI, E.; GUZZO, N. Palatalização das oclusivas alveolares: o caso de Chapecó. In: BISOL, L.; COLLISCHONN, G. (Orgs.) Português do Sul do Brasil: variação fonológica. Porto Alegre: EDIPUCRS, 2010. p.97-118.

BATTISTI, E.; HERMANS, B. A palatalização das oclusivas alveolares: propriedades fixas e variáveis. Alfa – Revista de linguística, v. 52, n.2, p. 279-288, 2008.

BATTISTI, E.; HERMANS, B. The structural motivation of palatalization. Forum linguístico, Florianópolis, v.17, número especial, p. 4596-4611, 2020.

BOERSMA, P. Functional Phonology: formalizing the interaction between articulatory and perceptual drives. The Hague: Holland Academic Graphics. [Doctoral dissertation, University of Amsterdam.] 1998.

BOERSMA, P.; HAYES, B. Empirical tests of the Gradual Learning Algorithm. Linguistic Inquiry, v.32, n.1, p.45-86, 2001.

BOERSMA, P.; PATER, J. Convergence Properties of a Gradual Learner in Harmonic Grammar. Manuscript. Amsterdam and Amherst, MA: Meertens Institute and University of Massachusetts Amherst. [Available on Rutgers Optimality Archive, ROA-970.] 2008.

BOERSMA, P.; PATER, J. Convergence properties of a gradual learning algorithm for Harmonic Grammar. 2013. Retrieved from on 11/15/2019.

BOERSMA, P.; WEENINK, D. Praat: Doing Phonetics by Computer. (Version 6.1.16). [Computer program]. Retrieved on July 11, 2020 from, 2020.

CARDOSO, Suzana. et al. Atlas Linguístico do Brasil. Londrina: Eduel, 2014. v. 2.

COETZEE, A. W. An integrated grammatical/non-grammatical model of phonological variation. In: KANG, Young-Se et al. (Ed.) Current issues in linguistic interfaces. v. 2. Seoul: Hankookmunhwasa, 2009a. p.267–294.

COETZEE, A. W. Phonological variation and lexical frequency. In: SCHARDL, A.; WALKOW, M.; ABDURRAHMAN, M. (eds). NELS 38: Proceedings of the 38th Meeting of the North East Linguistics Society. v. 1. Amherst: GLSA. p. 189-202. [Available on Rutgers Optimality Archive, ROA-952]. 2009b.

COETZEE, A. W. Variation: where laboratory and theoretical phonology meet. In: COHN, A.C.; FOUGERON, C.; HUFFMAN, M. K. (Eds.) The Oxford handbook of laboratory phonology. Oxford: Oxford University Press, 2012. p.62-76.

COETZEE, A. W. Formal phonological models of variation: Optimality Theory 2 – Stochastic Grammars. Handout. University of Michigan, 2014.

COETZEE, A. W. A comprehensive model of phonological variation: grammatical and non grammatical factors in variable nasal place assimilation. Phonology, n. 33, p. 211–246, 2016.

COETZEE, A. W.; KAWAHARA, S. Frequency biases in phonological variation. Natural Language & Linguistic Theory, n. 31, p. 47–89, 2013.

DORNELLES FILHO, A. A. Algoritmo para ordenação de restrições na Teoria da Otimidade. 2014. 69 f. Monografia (Especialização em Métodos Quantitativos: Estatística e Matemática Aplicadas) – Faculdade de Matemática, Pontifícia Universidade Católica do Rio Grande do Sul, Porto Alegre, 2014. Disponível em: Acesso em: 11/07/20.

DUTRA, E. de O. A palatalização das oclusivas dentais /t/ e /d/ no município do Chuí, Rio Grande do Sul. 2007. 131 f. Dissertação (Mestrado em Linguística Aplicada) – Faculdade de Letras, Pontifícia Universidade Católica do Rio Grande do Sul, Porto Alegre, 2007.

GUTIERRES, A.; DORNELLES FILHO, A. A. Formalização da variação fonológica na aquisição da nasal velar em inglês pelo ORTO Ajuste Paramétrico. Revista virtual de estudos da linguagem – REVEL, v. 15, n. 28, p.271-288, 2017.

GUTIERRES, A.; BATTISTI, E.; DORNELLES FILHO, A. A. O efeito de fatores sociais sobre restrições linguísticas na análise fonológica de um processo variável. Diadorim, v. 20, n. 2, p. 255-279, 2018.

HAYES, B. Varieties of noisy harmonic grammar. Proceedings of the 2016 Annual Meeting in Phonology - USC. 2017.

HAYES, B.; TESAR, B.; ZURAW, K. OTSoft 2.1. Computer program. 2003. Disponível em: Acesso em: 27 fev. 2010.

KAGER, R. Optimality theory. Cambridge: Cambridge University Press, 1999.

KAMIANECKY, F. A palatalização das oclusivas dentais /t/ e /d/ nas comunidades de Porto Alegre e Florianópolis: uma análise quantitativa. 2003. 114 f. Dissertação (Mestrado em Lingüística Aplicada) - Faculdade de Letras, Pontifícia Universidade Católica do Rio Grande do Sul, Porto Alegre, 2003.

LABOV, W. Sociolinguistic patterns. Philadelphia: University of Pennsylvania Press, 1972.

LEGENDRE, G.; MIYATA, Y.; SMOLENSKY, P. Harmonic Grammar - a formal multi-level connectionist theory of linguistic well-formedness: theoretical foundations. Computer Science Technical Reports. Paper 447. 1990.

MAURI, C. A palatalização das oclusivas alveolares e práticas sociais em capelas de Forqueta (Caxias do Sul, RS). 2008. 77 f. Dissertação (Mestrado em Letras e Cultura Regional) – Programa de Pós-Graduação em Letras e Cultura Regional, Universidade de Caxias do Sul, Caxias do Sul, 2008.

MATTÉ, G. D. A palatalização variável de /t, d/ em Caxias do Sul. Cadernos do IL, n.38, p.43-55, 2009.

McCARTHY, J. J. Doing Optimality Theory: Applying theory to data. Malden/Oxford: Blackwell Publishing, 2008.

OOSTENDORP, M. V. Style Levels in Conflict Resolution. In: HINSKENS, F.; HOUT, R. V.; WETZELS, L. (eds). Variation, Change and Phonological Theory. Amsterdam: John Benjamins, 1997. p. 207-229.

OOSTENDORP, M. V. Variation in generative grammar. Manuscript. Amsterdam, Meertens Institute, 2008. Retrieved from on 11/15/2019.

PAULA, A. T. de. A palatalização das oclusivas dentais /t/ e /d/ nas comunidades bilíngues de Taquara e de Panambi, RS: análise quantitativa. 2006. 152 f. Dissertação (Mestrado em Estudos da Linguagem) – Instituto de Letras, Universidade Federal do Rio Grande do Sul, Porto Alegre, 2006.

PIRES, L. B. A palatalização das oclusivas dentais em São Borja. 2003. 116 f. Dissertação (Mestrado em Linguística Aplicada) – Faculdade de Letras, Pontifícia Universidade Católica do Rio Grande do Sul, Porto Alegre, 2003.

PRINCE, A.; SMOLENSKY, P. Optimality Theory: constraint interaction in Generative Grammar. Malden/Oxford: Wiley-Blackwell, 2004.

ROVEDA, S. D. Elevação da vogal média átona final em comunidades bilíngues: português e italiano. 1998. 81 f. Dissertação (Mestrado em Letras) – Faculdade de Letras, Pontifícia Universidade Católica do Rio Grande do Sul, Porto Alegre, 1998.