The course introduces the fundamental concepts, methodologies, and tools of computational linguistics and natural language processing (NLP). It provides the statistical foundations needed for automatic text analysis and for understanding how machine learning methods and language models work. It also presents the main NLP tasks, generative artificial intelligence tools, and prompting techniques. Upon completion of the course, students are able to perform linguistic tasks on texts and evaluate their performance.
Course Prerequisites
Familiarity with basic notions in general linguistics, which will be reviewed in class at the beginning of the course.
Teaching Methods
Face-to-face interactive lectures. Slides. Lab with group activities on the following topics:
1- Introduction to Python programming and Colab
2- Text preprocessing
3- POS tagging and syntactic parsing
4- Named Entity Recognition
5- Distributional semantics
6- Sentiment Analysis
Assessment Methods
Final oral exam covering material from the entire course. Final assignment (5 pages, including references and excluding tables and figures) reporting the results of an in-depth investigation of a linguistic phenomenon (morphological, syntactic, semantic, lexical, or discourse) or of a social/cultural phenomenon (through linguistic analysis), carried out using the tools introduced in class. The topic must be agreed on in advance during office hours. The text must be sent to elisabetta.jezek@unipv.it 7 days before the exam.
Texts
Readings: Elisabetta Jezek & Rachele Sprugnoli (2023). Linguistica computazionale. Introduzione all’analisi automatica dei testi. Bologna: Il Mulino. Chapter I Definition, goals and historical notes; Chapter III Basics of statistics; Chapter IV Machine learning; Chapter V Distributional semantics and types of vectors; Chapter VI Text annotation. Additional readings will be introduced in class and made available on the KIRO platform.
Contents
The course will cover the following topics:
- Definition, goals, and history of Computational Linguistics and Natural Language Processing
- Basics of Statistics
- Traditional and Neural Machine Learning for Natural Language Processing
- Evaluation of Computational Models
- Annotation of Linguistic Data for Machine Learning
- Language Models and Generative Artificial Intelligence
Course Language
Italian
More information
Material for the course is available on the KIRO platform (access with personal username and password).