General Information
SSD: ING-INF/05
CFU: 8
Professor: Elena Baralis
Teaching Assistants: Flavio Giobergia, Eliana Pastor, Alkis Koudounas, Lorenzo Vaiani
Exams
Exam rules
The exam rules for the A.Y. 2022/23 are available here.
Written Exams
In this section you will find the results of the written tests — good luck!
- Winter Session # 1
- Winter Session # 2
- Summer Session
- Fall Session
Projects
Exam Session | Assignment | Results | Example Report * |
Winter | text | final scores | |
Summer | text | | |
Fall | text | | |
* Occasionally, we may ask students to publish particularly good reports here, to serve as a reference for their colleagues.
Teaching Material
Data science
This section will contain the slides of the data science course.
- Course introduction (slides)
- Introduction to data science (slides)
- Data preprocessing (slides)
- Association rules (slides)
- Data exploration, feature engineering and data visualization (slides)
- Classification fundamentals (slides)
- Clustering fundamentals (slides)
- Regression analysis (slides)
- Time series analysis (slides)
Python
This section will contain the Python slides of the data science course.
- Introduction to Python (slides)
- Python programming (slides)
- Structuring Python projects (slides)
- NumPy (slides)
- Pandas (slides)
- Matplotlib (slides)
- scikit-learn – classification (slides)
- scikit-learn – regression (slides)
- scikit-learn – clustering (slides)
- scikit-learn – preprocessing (slides)
Exercises
Other material
- GitHub repository with exercises
- Scientific writing – how to write your report (slides)
- ML in production: Automation of ML pipelines with Luigi (slides, link repository)
- Talk – Financial Crime Detection (slides)
Laboratory Material
This section will contain all the material for carrying out the laboratories. Laboratories are not graded, so they do not give additional points toward the final exam.
Material
Introduction to laboratories – pdf
Data Science Lab Environment: link
- Lab 1 – Python basics (text) (solution)
- Lab 2 – Data Preparation (text) (solution)
- Lab 3 – Frequent Itemsets, Association Rules (text) (solution)
- Lab 4 – KNN implementation (text) (solution)
- Lab 5 – Pandas (text) (solution)
- Lab 6 – Tree-based models (text) (solution)
- Lab 7 – Classification * (text) (report example)
- Lab 8 – Modeling time series (text) (solution)
- Lab 9 – Regression * (text) (solution)
- Lab 10 – Clustering (text) (solution)
* During this laboratory, we will set up the Data Science Lab Environment, the online evaluation platform we will use during the leaderboard part of the project.
Team organization
Students will be divided into two teams, Team 1 and Team 2. Team 1 will attend the laboratories on Monday from 13:00 to 16:00. Team 2 will attend on Thursday from 11:30 to 14:30. Both lab sessions will be held in LAIB 3.
You can find the list of student ID – team mappings here. If your student ID is not in the list, please use the following rule:
- Team 1 if your last name starts with a letter from A to K
- Team 2 if your last name starts with a letter from J to Z