The course provides students with the fundamental knowledge required for data analysis and modeling. In particular, it introduces the most commonly used methods of statistical analysis, including simple and multiple linear regression, residual analysis, and model selection techniques. It covers both theoretical foundations and practical applications through numerous examples using real and simulated data. The course also presents more advanced tools, such as decision trees and cluster analysis.
Course Prerequisites
No specific prerequisites are required.
Teaching Methods
Lectures on theory and hands-on programming sessions in R.
Assessment Methods
Written examination assessing the student's grasp of the theoretical foundations, basic programming skills, and ability to correctly interpret the output of a data analysis.
Texts
James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An Introduction to Statistical Learning: with Applications in R (2nd ed.). Springer. ISBN 978-1-0716-1418-1.
Contents
- Univariate exploratory analysis
- Multivariate exploratory analysis
- Simple linear model
- Multiple linear model
- Residual analysis
- Model comparison and selection of the best model
- Decision trees
- Cluster analysis
The methods presented will be implemented using the R programming language.