Comparative constructions in Portuguese and their annotation using Dependency Syntax

Magali Sanches Duran,
Lucelene Lopes,
Thiago Alexandre Salgueiro Pardo,
Maria das Graças Volpe Nunes

Abstract

The mental ability to make comparisons is common to all humans. However, the ways of expressing comparisons vary considerably across languages, and the grammaticalization of some forms gave rise to the so-called “comparative constructions”, which are complex, highly frequent, and versatile structures used to express evaluations about various topics. Currently, due to the large volume of evaluations produced by users on the Web, interest has grown in the automatic processing of comparative constructions, with the goal of inferring, for example, what the comparison topic is, what this topic is being compared to, what the comparison parameter is, and whether the comparison is positive or negative. The first requirement for automatic processing of comparative constructions is to annotate them in a logical and consistent way. With this goal, this paper presents guidelines for the syntactic annotation of comparative structures in Portuguese using the set of labels proposed by the Universal Dependencies annotation guidelines. The studies that formed the basis for this proposal involved a review of works on language typology and of works specific to the Portuguese language. The guidelines were tested on the Porttinari-base corpus (Pardo et al., 2021; Duran et al., 2023) and refined until they reached the stage presented here. The set of 122 annotated sentences is available in the Arborator-Grew-NILC framework for corpus annotation (Miranda & Pardo, 2022).

References

BECK, Sigrid; KRASIKOVA, Sveta; FLEISCHER, Daniel; GERGEL, Remus; HOFSTETTER, Stefan; SAVELSBERG, Christiane; VANDERELST, John; VILLALTA, Elisabeth. Crosslinguistic variation in comparison constructions. Linguistic Variation Yearbook, v. 9(1), p. 1-66, jan. 2009. DOI https://doi.org/10.1075/livy.9.01bec. Acesso em: 13 outubro 2022.

BIBER, Douglas. Corpus-Based and Corpus-driven Analyses of Language Variation and Use. In: HEINE, Bernd; NARROG, Heiko. The Oxford Handbook of Linguistic Analysis. Oxford: Oxford Academic, 2012. DOI https://doi.org/10.1093/oxfordhb/9780199544004.013.0008. Acesso em: 13 outubro 2022.

CASTILHO, Ataliba T. Gramática do Português Brasileiro. São Paulo: Editora Contexto, 2010.

CEGALLA, Domingos Paschoal. Novíssima Gramática da Língua Portuguesa. São Paulo: Companhia Editora Nacional, 2020.

CROFT, W. Typology and Universals (2nd ed., Cambridge Textbooks in Linguistics). Cambridge: Cambridge University Press, 2002. DOI https://doi.org/10.1017/CBO9780511840579. Acesso em: 13 outubro 2022.

CUNHA, Celso Ferreira; LINDLEY CINTRA, Luis Filipe. Nova gramática do Português contemporâneo. Rio de Janeiro: Lexikon Editora Digital, 7a edição, 2017. URL https://ia800706.us.archive.org/12/items/NovaGramticaDoPortugusContemporneo. Acesso em: 13 outubro 2022.

DE MARNEFFE, Marie-Catherine; MANNING, Christopher D.; NIVRE, Joakim; ZEMAN, Daniel. Universal Dependencies. Computational Linguistics, v. 47(2), p. 255-308, 2021. DOI https://doi.org/10.1162/coli_a_00402. Acesso em: 13 outubro 2022.

DIXON, R.M.W. Comparative constructions: A cross-linguistic typology. Studies in Language. International Journal sponsored by the Foundation “Foundations of Language”, v. 32(4), p. 787-817, 2008. DOI https://doi.org/10.1075/sl.32.4.02dix. Acesso em: 13 outubro 2022.

DURAN, M.S. Manual de Anotação de PoS tags: Orientações para anotação de etiquetas morfossintáticas em Língua Portuguesa, seguindo as diretrizes da abordagem Universal Dependencies (UD). Relatório Técnico do ICMC 434. Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo. São Carlos-SP, Setembro, 2021. URL https://drive.google.com/file/d/1BddPswn-_Ioo-A5GsldA1cO1kqbcCahb. Acesso em: 13 outubro 2022.

DURAN, M.S.; NUNES, M.G.V.; LOPES, L.; PARDO, T.A.S. Manual de anotação como recurso de Processamento de Linguagem Natural: o modelo Universal Dependencies em língua portuguesa. Domínios de Lingu@gem, v. 16(4), p. 1608-1643, 2022. DOI https://doi.org/10.14393/DL52-v16n4a2022-13. Acesso em: 13 outubro 2022.

DURAN, M.S. Manual de Anotação de Relações de Dependência - Versão Revisada e Estendida: Orientações para anotação de relações de dependência sintática em Língua Portuguesa, seguindo as diretrizes da abordagem Universal Dependencies (UD). Relatório Técnico do ICMC 440. Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo. São Carlos-SP, Outubro, 2022. URL https://drive.google.com/file/d/1ile8Wfxu1qdrZOmLGqkvVuQ4fXvHgVMo. Acesso em: 13 outubro 2022.

DURAN, Magali Sanches; LOPES, Lucelene; NUNES, Maria das Graças Volpe; PARDO, Thiago Alexandre Salgueiro. The Dawn of the Porttinari Multigenre Treebank: Introducing its Journalistic Portion. In: Simpósio Brasileiro de Tecnologia Da Informação E Da Linguagem Humana (STIL), 14., 2023, Belo Horizonte/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 115-124. DOI https://doi.org/10.5753/stil.2023.233975.

GANAPATHIBHOTLA, Murthy; LIU, Bing. Mining Opinions in Comparative Sentences. In: Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008). Manchester: Coling 2008 Organizing Committee, p. 241-248, 2008. URL https://aclanthology.org/C08-1031. Acesso em: 13 outubro

HASPELMATH, Martin. Equative constructions in world-wide perspective. In: TREIS, Yvonne; VANHOVE, Martine. Similative and equative constructions: A crosslinguistic perspective. Amsterdam: John Benjamins, 2017. DOI https://doi.org/10.1075/tsl.117.02has. Acesso em: 13 outubro 2022.

IMRÉNYI, András; MAZZIOTTA, Nicolas. Chapters of Dependency Grammar: A historical survey from Antiquity to Tesnière. Amsterdam: John Benjamins, 2020. DOI https://doi.org/10.1075/slcs.212. Acesso em: 13 outubro 2022.

KÁNTOR, Gergely; BÁCSKAI-ATKÁRI, Júlia. Elliptical comparatives revisited. Budapest: Research Institute for Linguistics, Hungarian Academy of Sciences, 2012. URL https://www.bacskaiatkari.de/pdf/comparatives_bacskai_atkari_kantor.pdf. Acesso em: 13 outubro 2022.

LECHNER, Winfried. Ellipsis in comparatives. Berlin, New York: De Gruyter Mouton, 2008. DOI https://doi.org/10.1515/9783110197402. Acesso em: 13 outubro 2022.

MCDONALD, Ryan; NIVRE, Joakim; QUIRMBACH-BRUNDAGE, Yvonne; GOLDBERG, Yoav; DAS, Dipanjan; GANCHEV, Kuzman; HALL, Keith; PETROV, Slav; ZHANG, Hao; TÄCKSTRÖM, Oscar; BEDINI, Claudia; BERTOMEU CASTELLÓ, Núria; LEE, Jungmee. Universal Dependency Annotation for Multilingual Parsing. In: Proceedings of ACL, 2013.

MIRANDA, L.G.M.; PARDO, T.A.S. An Improved and Extended Annotation Tool for Universal Dependencies-based Treebank Construction. In: Proceedings of the PROPOR Demonstrations Workshop, 2022, p. 1-3. URL: https://drive.google.com/file/d/1Gz9k3-SU72zXx6v2a0lTrutHVAtpUMUX. Acesso em: 18 maio 2023.

NEVES, Maria Helena de Moura. Gramática de Usos do Português. São Paulo: Editora Unesp, 2000.

NIVRE, Joakim; DE MARNEFFE, Marie-Catherine; GINTER, Filip; HAJIČ, Jan; MANNING, Christopher D.; PYYSALO, Sampo; SCHUSTER, Sebastian; TYERS, Francis; ZEMAN, Daniel. Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection. In: Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC 2020). Marseille: European Language Resources Association, p. 4034-4043, 2020. URL https://aclanthology.org/2020.lrec-1.497. Acesso em: 13 outubro 2022.

OSBORNE, Timothy; GERDES, Kim. The status of function words in dependency grammar: A critique of Universal Dependencies (UD). Glossa: a journal of general linguistics, v. 4(1), 17, 2019. DOI https://doi.org/10.5334/gjgl.537. Acesso em: 13 outubro 2022.

PARDO, T.A.S.; DURAN, M.S.; LOPES, L.; DI FELIPPO, A.; ROMAN, N.T.; NUNES, M.G.V. Porttinari - a large multi-genre-treebank for Brazilian Portuguese. In: Proceedings of the XIV Symposium in Information and Human Language (STIL), São Paulo: SBC, p. 1-10, 2021. DOI https://doi.org/10.5753/stil.2021.17778. Acesso em: 13 outubro 2022.

PEREIRA, Sandra; PINTO, Clara; PRATAS, Fernanda. Construções comparativas em português: porque algumas são mais iguais que outras. In: Textos Selecionados do XXIX Encontro da Associação Portuguesa de Linguística. Lisboa: Associação Portuguesa de Linguística, 2014. URL http://hdl.handle.net/10451/33739. Acesso em: 13 outubro 2022.

PULMAN, S. G. Comparatives and Ellipsis. In: Proceedings of the 5th European Meeting of the Association for Computational Linguistics, Berlin, 1991. URL https://aclanthology.org/E91-1002. Acesso em: 13 outubro 2022.

RANCHHOD, Elisabete. Frozen Adverbs – Comparative Forms Como C in Portuguese. Lingvisticae Investigationes, v. 15(1), p. 141-170, jan. 1991. DOI https://doi.org/10.1075/li.15.1.07ran. Acesso em: 13 outubro 2022.

RANCHHOD, Elisabete; DE GIOIA, Michele. Comparative Romance Syntax. Frozen Adverbs in Italian and in Portuguese. Lingvisticae Investigationes, v. 20(1), p. 33-85, jan. 1996. DOI https://doi.org/10.1075/li.20.1.04ran. Acesso em: 13 outubro 2022.

ROCHA LIMA, Carlos Henrique. Gramática normativa da língua portuguesa. São Paulo: Editora José Olympio, 2010.

STASSEN, Leon. Comparison and Universal Grammar. Oxford: Basil Blackwell, 1985.

STASSEN, Leon. Comparative constructions. In: DRYER, Matthew S.; HASPELMATH, Martin. The World Atlas of Language Structures Online. Leipzig: Max Planck Institute for Evolutionary Anthropology, 2013. URL https://wals.info/chapter/121. Acesso em: 13 outubro 2022.

TESNIÈRE, Lucien. Éléments de Syntaxe Structurale. Paris: Librairie C. Klincksieck, 1959.

TESNIÈRE, Lucien. Elements of Structural Syntax. Tradução de OSBORNE, Timothy; KAHANE, Sylvain. Amsterdam: John Benjamins, 2015.

TOGNINI-BONELLI, Elena. Corpus Linguistics at Work. Computational Linguistics, v. 28(4), p. 583, December 2002. DOI https://doi.org/10.1162/coli.2002.28.4.583a. Acesso em: 13 outubro 2022.

TREIS, Yvonne. Comparative Constructions: An Introduction. Linguistic Discovery, v. 16(1), 2018. DOI https://doi.org/10.1349/PS1.1537-0852.A.492. Acesso em: 13 outubro 2022.

ULTAN, Russell. Some features of basic comparative constructions. Working Papers on Language Universals (Stanford), v. 9, p. 117-162, 1972.

VARATHAN, Kasturi Dewi; GIACHANOU, Anastasia; CRESTANI, Fabio. Comparative opinion mining: A review. Journal of the Association for Information Science and Technology, v. 68(4), p. 811-829, April 2017. DOI https://doi.org/10.1002/asi.23716. Acesso em: 13 outubro 2022.