The cognitive infrastructure of language

Josie Helen Siman,
Thiago Oliveira da Motta Sampaio

Abstract

Michael Tomasello, one of the most influential scholars of language evolution and acquisition, argues in his online lecture for Abralin ao vivo that the human capacity for communication precedes language development and is fundamental to explaining our linguistic ability. That is, this capacity, which encompasses a variety of general cognitive skills (such as cooperation, the inference of intentions, and the contextual interpretation of gestures, which in turn requires the recognition of common ground and joint attention), provides the conditions that can create or support a complex human language. In addition, the author demonstrates how gestures (e.g., pointing) are used in a sophisticated way by children, but not by non-human primates. Thus, the author argues that gestures, supported by the human communicative infrastructure, can give rise to grammatical conventions.

Text

What differentiates humans from other animals? One of the most notable differences is the sophisticated human capacity for language: our flexible ability to produce an unlimited number of hierarchically structured sentences and semantic combinations. Other animals do not do that. In fact, the communication of non-human animals is, in general, made up of mostly innate patterns and admits little contextual flexibility (cf. ARBIB et al., 2008[1]). For example, monkeys are born knowing how to produce fixed vocalizations with fixed meanings. That is, they do not learn these vocalizations socially; social experience only teaches them when it is appropriate to use each vocalization (SAPOLSKY, 1996[2]; TOMASELLO, 2019[3]).

Human language, on the other hand, presents variable syntactic structures, arbitrary meanings (based on cultural conventions), variation in the modality of expression (e.g., it can be performed through gestures or vocalizations, often both), variation in how information such as tense, aspect, and modality is encoded, etc. Human language thus differs from animal communication in several ways. It is, in general, more abstract, more structurally complex, and more varied. One of the great challenges of linguistic studies is to understand the (evolutionary and ontogenetic) origin of the human capacity for language. What do humans have that enables them to develop this language?

Michael Tomasello, one of the most influential researchers of language evolution and acquisition, argues in his online lecture for Abralin ao vivo[4] that human communicative capacity, in a broad sense, develops before linguistic ability and is fundamental to explaining it. That is, human communicative and pragmatic capacity, which encompasses a range of cognitive capacities as infrastructure (such as cooperative skills, the ability to infer intentions, and the ability to interpret gestures contextually, which in turn requires the recognition of common ground and joint attention), provides the basic conditions that allow the conventionalization of linguistic structures. Without this infrastructure, there is no human language as we know it.

In his lecture, Tomasello defends the importance of studying human gestures to understand language. This importance can be explained by the fact that, ontogenetically, children use gestures communicatively before using language, and some humans use sign language exclusively (ARBIB et al., 2008[1]). In addition, gestures accompany human speech, which has multimodal properties (MCNEILL, 2006[5]). Moreover, it is important to notice that, among non-human primates, vocalizations are innate and fixed (hardwired), while gestures are learned. However, the gestures of non-human primates are different from the gestures of a prelinguistic child in many ways.

Non-human primates gesticulate in a ritualized way, such as when hitting the ground to get another primate's attention. It is an intentional, deliberate and learned gesture (rather than an innate one). But the gestures of non-human primates are, in general, dyadic (involving only sender and receiver) and directive, that is, these gestures are produced when the primate wants something for itself. Human infants’ gestures are, on the other hand, triadic (involving sender, receiver, and a third entity) and informative. For example, infants point to things in the world simply because they want to show something, without the intention of obtaining that item, or because they want to share information with their interlocutor. These are gestures which have social and cooperative purposes. This means that the cognitive infrastructure that differentiates us from other animals is already present before linguistic fluency, and, according to Tomasello, this is a sine qua non for language development.

Tomasello proceeds to address the importance of the pointing gesture. Pointing is meaningless without human “pragmatic” infrastructure. If we hide food under a can and point to it, signaling to a non-human primate that there is something there for it, the primate will not understand the message (HARE; TOMASELLO, 2005[6]). Children do understand. The difference is that the non-human primate is not able to infer that the human has a cooperative intention. (An open question is whether non-human primates would understand an informative gesture produced by a member of their own group, even though they do not produce such gestures spontaneously.) Non-human primates are able to infer intentions in competitive contexts: for example, if there is a dispute over resources and the human runs towards the can, the non-human primate is able to infer that the human wants that can because there is food in it. But it is not able to make this inference in cooperative contexts, where the human points to the can to inform the primate that there is something there for it. To make this kind of inference, one must be able to understand that "he is pointing at something because it is relevant to me" (recursive inference).

For the pointing gesture to make sense, we need to share a common ground with our interlocutor, that is, common knowledge about the immediate situation or about a culturally shared history. This common ground is used frequently in our everyday conversations. For example, one person asks "do you want to go to the movies tonight?" and another replies "tomorrow I have to work". What is the relationship between that answer and the question? We are only able to infer the relationship because we share prior information, such as that we need to go to sleep early in order to wake up early the next day, that going to the movies late at night could interfere with this, etc. Likewise, a pointing gesture is only understood against a common ground. If a person points to a chair without any previous context, it is not possible to know what this gesture means. But if one person is looking for a place to sit, and the other points at the chair, then the meaning of the gesture becomes “sit there”. If a person is looking for wood to burn, the same gesture means "burn that chair". The gesture simultaneously indicates a fact (there is X) and an intention (know that X, use X for ...).

In an experiment, Liebal et al. (2008[7]) create a context in which there is a common ground between child and researcher, since both are participating in an activity that consists of putting toys away. When the researcher points to a toy, the child fetches it and puts it away. In a second condition, a second researcher, who does not share a common ground with the child, enters the room and points to a toy. In this condition, the child does not know what to do with the toy (she smiles, or hands it over to the second researcher, but does not put it away). Coming from the second researcher, the pointing gesture is not interpreted as relevant to the activity that the child was performing before.

For Tomasello, gestures (such as pointing) and pantomime are key factors in the transition to conventional human communication. In order for these gestures, typically human, to work, we need to have an infrastructure of shared intentionality that has evolved to enable collaborative joint activities. Linguistic structures are built on these typically human cognitive abilities. On shared intentionality, Tomasello (2019, p. 7[3]) states that:

humans’ abilities to cooperate with one another take unique forms because individuals are able to create with one another a shared agent “we,” operating with shared intentions, shared knowledge, and shared sociomoral values. The claim is that these abilities emerged first in human evolution between collaborative partners operating dyadically in acts of joint intentionality, and then later among individuals as members of a cultural group in acts of collective intentionality.

Tomasello assumes that grammar is based on the distinction between new and given information. He then shows how it would be possible to go from gestures to grammatical constructions: a child who is close to his mother and a chair, and away from the table, can point to the chair meaning “I want the chair close to the table” (information complemented by the common ground shared between mother and child). Subsequently, the same child is near the table (i.e., the table is the “given” information) and points to the chair that is far away, to say exactly the same thing.

In both situations, the act of pointing is not grammatical. But the understanding of situations through multiple semantic or participant roles, together with the child's ability to navigate these flexible information frames based on given and new information (which is fundamental to grammatical constructions), is the basis for creating grammatical conventions. In this way, there would be a functional continuity between gestures (pointing) and linguistic constructions, which are based on cognitive capacities specific to our species: cooperative motives (e.g., informing, sharing) and cooperative cognition (e.g., joint attention and common ground).

Tomasello's lecture is of interest to those who want to understand which aspects of human cognition are unique to our species, how prelinguistic children differ from other primates, and how a general cognitive infrastructure (based on cooperation, gestures and pragmatic abilities) can support our sophisticated language skills. Tomasello's latest book, Becoming Human (2019), is also an important complement to the lecture.

References

ARBIB, Michael A. et al. Primate vocalization, gesture, and the evolution of human language. Current anthropology, v. 49, n. 6, p. 1053-1076, 2008. DOI: https://doi.org/10.1086/593015

HARE, Brian; TOMASELLO, Michael. Human-like social skills in dogs? Trends in cognitive sciences, v. 9, n. 9, p. 439-444, 2005. DOI: https://doi.org/10.1016/j.tics.2005.07.003

LIEBAL, Kristin et al. Infants use shared experience to interpret pointing gestures. Developmental science, v. 12, n. 2, p. 264-271, 2008. DOI: https://doi.org/10.1111/j.1467-7687.2008.00758.x

MCNEILL, David. Gesture: a psycholinguistic approach. The encyclopedia of language and linguistics, p. 58-66, 2006.

SAPOLSKY, Robert. Biology and Human Behavior: The Neurological Origins of Individuality. 2nd Edition. The Great Courses. 1996 (available only as an audiobook).

TOMASELLO, Michael. Becoming human: A theory of ontogeny. Belknap Press, 2019.

COMMUNICATION Before Language. Lecture presented by Michael Tomasello [s.l., s.n.], 2020. 1 video (1h 17min 35s). Published by the channel of the Associação Brasileira de Linguística. Available at: https://www.youtube.com/watch?v=46IrwGZpDQ4&t=594s. Accessed on: 13 Jun. 2020.