Data science lab: process and methods (2020/2021)
General information
SSD: ING-INF/05
CFU: 8
Professor: Elena Baralis
Teaching assistants:
Tania Cerquitelli (Lessons), Andrea Pasini (Python classes)
Giuseppe Attanasio, Flavio Giobergia, Francesco Ventura (Laboratory sessions)
Exam
The following are the key dates for the September 2021 session:
- August 24, 2021 (by end of day): submission platform opens
- September 6, 2021: written exam
- September 14, 2021, 23:59 CEST: submission platform closes
Exam rules: pdf
The report template (adapted from the official IEEE template) is available in LaTeX (you are strongly encouraged to use this one) or Word format.
Make sure you test your software/hardware setup before taking the exam. A simulation of the exam (using Respondus + Lockdown Browser) is available on the Portale della didattica (Remote Exams -> Take the simulation test).
Submission platform: link
Fall call
Past calls
Summer call
Second call of the winter session:
First call of the winter session:
- Assignment: pdf
- Written scores (25/01/2021): pdf
- Final scores (25/01/2021): pdf
- Some of the best-written reports: pdf, pdf, pdf, pdf
Announcements
- 06-10-2020. You can register on Piazza whether you are already enrolled in the course or still waiting for your enrollment. If you do not have an @studenti.polito.it address yet, send an email to giuseppe.attanasio@polito.it including your personal email address and your ID from the Polito website or the Apply procedure (it is in the format FXXXXX). Registration with your personal address is temporary: remember to add your educational address on Piazza as soon as you receive it.
- 28-09-2020. The laboratories will not take place in the first week of the course.
- 28-09-2020. During the semester, we will be using Piazza. Piazza is a collaborative question-and-answer platform that allows students to post their questions to the teaching staff. Please sign up here with your educational email address and use your full name, i.e. Name Surname. We will also post useful notes, suggestions, and the code for the Python exercises on Piazza (under the “Resources” section).
- 25-09-2020. The first Python lesson will be on 02 October 2020. We suggest you bring your own PC with Python 3 and Jupyter installed.
In the “Python” section you can find instructions for installing the necessary software.
Learning material
Data science
This section will contain the slides of the data science course.
- Course introduction (pdf)
- Introduction to data science (pdf)
- Data preprocessing (pdf)
- Association rules (pdf)
- Classification (pdf)
- Regression analysis (pdf)
- Time series analysis (pdf)
- Data exploration, Feature Engineering, Data visualization (pdf)
- Clustering (pdf) – NEW (07/12/2020)
- Clustering (pdf) OLD
Other material
- Use case: Modelling energy efficiency of buildings based on open-data (pdf)
- Use case: Characterising Electricity Consumption Over Time for Residential Consumers through cluster analysis (pdf)
Python
This section will contain the slides and material of the Python classes.
Material
- Exercises on Piazza. Here we will publish the text and solutions of the exercises solved during the Python lectures.
- Python installation tutorial (pdf)
- GitHub tutorial (pdf). GitHub is a useful resource for sharing your code online and managing version control.
Slides
Exam exercises
Exercises for the written exam
- Exercise 1, Text, Solution
- Exercise 2, Text, Solution
- Exercise 3, Text and solution
Laboratory material
This section will contain all the material for the laboratory sessions.
- Laboratory 1 (7-8 October 2020): pdf – Solution: html
- Laboratory 2 (14-15 October): pdf – Solution: html
- Laboratory 3 (21-22 October): pdf (updated 22/10/2020) – Solution: html
- Laboratory 4 (28-29 October): pdf (updated, EDIT on Equation 3) – Solution: html
- Laboratory 5 (4-5 November): pdf – Solution: html
- Laboratory 6 (11-12 November): pdf – Solution: html
- Laboratory 7 (18-19 November): pdf – Solution: pdf
- Laboratory 8 (25-26 November): pdf – Solution: pdf
- Laboratory 9 (2-3 December): pdf – Solution: pdf
- Laboratory 10 (9-10 December): pdf
Research Bites and Seminars
- ML in production – Automation of ML pipelines with Luigi – 11/12/2020 – Eliana Pastor: material
- Image understanding: Tasks and architectures – 14/12/2020 – Andrea Pasini: slides
- Generative Adversarial Networks: Beyond discriminative models – 14/12/2020 – Moreno La Quatra: slides
- A brief (and practical) introduction to word embeddings – 18/12/2020 – Flavio Giobergia: slides (demo)
- From recurrent models to the advent of Attention: a recap – 18/12/2020 – Giuseppe Attanasio: slides
- Explainable Artificial Intelligence: an introduction to current trends – 18/12/2020 – Francesco Ventura: slides
- How to start a start-up – 16/10/2020 – Luca de Alfaro: slides