ID:
509074
Duration (hours):
60
Credits (CFU):
6
Scientific Disciplinary Sector (SSD):
INFORMATION PROCESSING SYSTEMS
Year:
2024
General Information
Activity Period
Second Semester (03/03/2025 - 13/06/2025)
Syllabus
Learning Objectives
This course provides an in-depth exploration of the key theories, solutions and tools for developing data science approaches in contemporary big data scenarios.
At the end of this course the student must be able to: understand data, formulate hypotheses about the data, transform and model data, define a methodology for the analysis, and test and confirm results.
The students will learn the main techniques for data science using machine learning and deep learning solutions.
They will learn the basic concepts of cloud computing, along with the best practices for operating in this context. Moreover, they will investigate the main architectures for processing large datasets, such as MapReduce.
Finally, they will understand how to work with NoSQL databases and other storage solutions to address the main issues in the big data realm.
Prerequisites
Basic knowledge of database management systems, data manipulation, and programming.
Teaching Methods
This course is organized into lectures, laboratory sessions, and cooperative learning.
Lectures present the theoretical concepts and notions covered in this course. During the lectures, students also learn how to apply these notions.
Laboratory sessions allow students to apply the concepts and techniques shown during the lectures to real-world case studies.
Finally, this course leverages cooperative learning, group work, and brainstorming. This fosters many transversal skills, such as teamwork, conflict management, and the ability to gather and build on different ideas from a team.
Assessment
The assessment consists of an oral discussion about a group project in which each student takes part. Each student is in charge of preparing a report about their project work.
During the oral discussion, the report presented by the student will be used as a means to explore in depth the theoretical concepts used therein.
To prepare the report, the student will have to use the tools introduced during lectures to extract knowledge from real-life datasets.
During the assessment the student must demonstrate good knowledge of the main concepts introduced in this course, the ability to handle the lifecycle of a data science project, and familiarity with the main architectures, tools, and NoSQL solutions used in data analytics and Big Data contexts.
The assessment will carefully consider the level of expertise in the use of the tools, the ability of the student to build projects adopting these tools, the level of understanding of the notions taught in this course, the methodological rigor, and the appropriateness of the technical vocabulary.
Textbooks
1. Data Science and Big Data Analytics - Discovering, Analyzing, Visualizing and Presenting Data. Wiley.
2. Big Data Fundamentals – Concepts, Drivers & Techniques. Prentice Hall, 2015.
3. Data Mining - Practical Machine Learning Tools and Techniques. Elsevier.
4. Data Mining - Concepts and Techniques. Elsevier.
5. Notes provided by the professor.
Contents
What is Data Science.
Python libraries for data manipulation and Machine Learning.
The Big Data paradigm and the main issues in this context.
Brief overview of NoSQL Databases and the main Cloud architectures for Big Data.
Hadoop, HDFS, MapReduce, and an overview of Apache Spark.
Fundamental concepts of Explainable AI.
An introduction to PyTorch as a primary library for Deep Learning, and SHAP as an XAI tool for deep learning models.
Text mining, Natural Language Processing (NLP) and Large Language Models.
Data Science in Cybersecurity and Cyber Threat Intelligence.
Language of Instruction
ENGLISH
Degree Programmes
COMPUTER ENGINEERING
Master's Degree (Laurea Magistrale)
2 years