ID:
501166
Duration (hours):
36
CFU:
6
SSD:
GLOTTOLOGIA E LINGUISTICA
Year:
2025
Overview
Date/time interval
Second Semester (23/02/2026 - 22/05/2026)
Syllabus
Course Objectives
The course, as a whole, has a methodological and empirical orientation. The expected learning outcomes for this module are as follows:
(a) Students will be familiar with the main methodological issues that must be addressed in any linguistic research and will have the skills necessary to tackle them; they will be able to evaluate the key limitations of the methodologies commonly used to answer research questions about language and languages.
(b) Students will have understood what constitutes linguistic data and will be familiar with the different types of linguistic data; they will have the skills to select, collect, and elicit the most suitable linguistic data for their research purposes and will know how to evaluate, describe, and analyze it.
(c) Students will be able to critically interpret and evaluate the results of linguistic research, both their own and others'.
(d) Students will be able to present qualitative or quantitative linguistic data in scientific prose.
Course Prerequisites
Students must have a basic knowledge of general linguistics, as well as the ability to read and use scholarly literature in English.
Teaching Methods
Lectures
Slides, including interactive ones (created with Wooclap)
Seminar-style classes
Group activities
Case study
Assessment Methods
The exam will consist of the following components:
Preparation of a final paper (up to 15 points)
Oral exam (up to 10 points)
Group work (up to 5 points)
Active participation during the course (honors)
The overall grade for ‘Dati empirici e teorie linguistiche’ will be the average of the grades from the two modules that make it up: ‘Laboratorio di Analisi di Dati linguistici’ and ‘Sintassi e semantica.’
The final paper may involve:
Analysis of a phenomenon within a corpus
Analysis of a phenomenon using one or more tools learned during the course
Data collection/elicitation and the design of a small corpus
Discussion of the issues/perspectives of a linguistic resource of interest
The final paper, which should be 8-12 pages long, must be submitted one week before the exam and may be written in Italian or another language of the student’s choice. The exam concludes with an oral examination, in which students discuss their paper with the instructor and demonstrate their understanding of the course content (1-2 questions). The duration of the oral exam will vary depending on the assessment of the final paper.
Non-attending students must contact the instructor to arrange a substitute program for the group work component.
Texts
Texts and works that will be referred to during the classes are listed below. Please note that the following list is not exhaustive. At the end of each class, the instructor will indicate the parts that have been covered and that will have to be prepared for the exam.
Anthony, L. 2013. A critical look at software tools in corpus linguistics. Linguistic Research 30(2): 141-161.
Dryer, M. S. 2006. Descriptive theories, explanatory theories and Basic Linguistic Theory. In Catching Language: The Art and Craft of Grammar Writing, Evans, N., A. Dench & F. Ameka (eds.), 207-234. Berlin: de Gruyter.
Himmelmann, N. P. 2012. Linguistic Data Types and the Interface between Language Documentation and Description. Language Documentation and Conservation 6: 187-207.
Hovy, E. & J. Lavid. 2010. Towards a “Science” of Corpus Annotation: A New Methodological Challenge for Corpus Linguistics. International Journal of Translation Studies 22(1): 13-36.
Iannàccaro, G. 2000. Per una semantica più puntuale del concetto di “dato linguistico”: un tentativo di sistematizzazione epistemologica. Quaderni di Semantica 21(1): 51-79.
Lehmann, C. 2004. Data in Linguistics. The Linguistic Review 21(3/4): 275-310.
Litosseliti, L. (ed.). 2025. Research Methods in Linguistics. London: Bloomsbury.
Paquot, M. & S. T. Gries (eds.). 2020. A Practical Handbook of Corpus Linguistics. Cham: Springer.
Podesva, R. J. & D. Sharma (eds.). 2014. Research Methods in Linguistics. Cambridge: Cambridge University Press.
Tummers, J., K. Heylen & D. Geeraerts. 2005. Usage-based approaches in Cognitive Linguistics: A technical state of the art. Corpus Linguistics and Linguistic Theory 1(2): 225-261.
Contents
In the first part of the course, we will begin by asking what constitutes linguistic data; we will provide a semiotic definition of linguistic data and focus on its epistemological status. We will then highlight the specific characteristics of linguistic data and distinguish between its different types. We will explain what it means to describe, analyze, and explain linguistic data, and why these tasks should represent distinct phases of linguistic research. Subsequently, we will relate the various conceptions of linguistics and its subject matter to the different methods of data collection and elicitation. We will examine the main techniques for eliciting data that are not already available and briefly touch upon the main software tools for analyzing them.
In the second part of the course, students will become familiar with the guidelines for creating a corpus. We will focus on how to sample collected data based on our research design and corpus principles such as balance, representativeness, scope, and saturation. Later, we will systematize some basic concepts of Corpus Linguistics. After briefly discussing the history and “philosophy” of the field, we will define what a corpus is and introduce different types of corpora, gathered for different users and purposes. We will provide concrete examples of how different corpora can be used to address various research questions. Finally, we will briefly address the purposes for which annotated corpora can be used and the advantages and disadvantages of using them.
In the third part of the course, practical instructions will be given on how to use two kinds of tools for linguistic research: ELAN (for transcription of spoken data) and INCEpTION or MedTator (annotation tools). Afterward, students will be divided into small groups of five or six people to complete a group project. Using linguistic data provided by the instructor (e.g., tweets, newspaper articles, recordings), students will be asked to design an annotation schema for a linguistic phenomenon chosen by the instructor and implement it in INCEpTION/MedTator. The annotation schemas produced by the various groups will then be compared in a dedicated class session.
We will conclude the course with a session devoted to a collective reflection on the characteristics of a good linguistics paper (its content, form, and structure), so that students can follow this model when preparing the final paper.
Some sections of the syllabus may be covered in more or less depth depending on the students' prior knowledge and educational backgrounds. The course may be complemented by a series of optional seminars titled “Discorsi sul metodo”, co-organized by PhD students in Linguistics.
Course Language
Italian
More information
All materials for the course, including the updated list of readings, the lecture slides, and the guidelines for the final assignment, are available on the KIRO platform (access with personal username and password).
Degrees
2 years