Natural Language Processing Applied To Annotations Built With DLNotes

Luis Gustavo Saibro da Silva,
Elder Rizzon Santos

Abstract

DLNotes is an annotation tool for digital texts, used in the teaching and learning process of Literature courses. Its adoption over recent years has produced a large collection of annotations suitable for computational processing and knowledge discovery. In this project, natural language processing techniques were used to build a dataset from which knowledge can be extracted. In addition to the DLNotes data, the proposed dataset aggregates external data from the Moodle learning system. As a proof of concept, the resulting dataset was applied to predicting the teacher's evaluation of activities based on the student's annotations. The prediction is also intended to speed up student feedback and support the teacher during the evaluation process. The main contributions of this work are the approach adopted to construct the dataset and the preliminary results of the evaluation prediction.
