Data science lab: process and methods (2020/2021)


General information

CFU: 8
Professor: Elena Baralis
Teaching assistants:
Tania Cerquitelli (Lessons), Andrea Pasini (Python classes)
Giuseppe Attanasio, Flavio Giobergia, Francesco Ventura (Laboratory sessions)



  • 06-10-2020. You can register on Piazza whether you are already enrolled in the course or still waiting to be enrolled. If you do not have an educational address yet, send an email including your personal email address and your ID from the Polito website or the Apply procedure (format FXXXXX). Registration with a personal address is temporary: remember to add your educational address on Piazza as soon as you receive it.
  • 28-09-2020. The laboratories will not take place in the first week of the course.
  • 28-09-2020. During the semester, we will be using Piazza, a collaborative question-and-answer platform that allows students to post questions to the teaching staff. Please sign up here with your educational email address and use your full name, i.e. Name Surname. We will also post useful notes, suggestions, and the code for the Python exercises on Piazza (under the “Resources” section).
  • 25-09-2020. The first Python lesson will be on 02 October 2020. We suggest that you bring your own PC with Python 3 and Jupyter installed.
    In the “Python” section you can find instructions for installing the necessary software.

Learning material


Data science

This section will contain the slides of the data science course.

    • Course introduction (pdf)
    • Introduction to data science (pdf)
    • Data preprocessing (pdf)
    • Association rules (pdf)
    • Classification (pdf)
    • Regression analysis (pdf)
    • Time series analysis (pdf)
    • Data exploration, Feature Engineering, Data visualization (pdf)
    • Clustering (pdf)



Python

This section will contain the slides and material of the Python classes.

  • Exercises on Piazza. Here we will publish the text and solutions of the exercises solved during the Python lectures.
  • Python installation tutorial (pdf)
  • GitHub tutorial (pdf). GitHub is a useful platform for sharing your code online and managing version control.
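
After following the installation tutorial, a quick check like the one below may help confirm that the setup is ready before the first lesson. This is only a minimal sketch and not part of the official course material: the function name check_environment and the package names jupyter_core and notebook are assumptions chosen for illustration (the course only asks for "Python 3 and Jupyter"), so adjust them to match your installation.

```python
import importlib.util
import sys

# Assumed import names for common Jupyter components; the course page
# itself only mentions "Python 3 and Jupyter", so adapt this list as needed.
REQUIRED_PACKAGES = ("jupyter_core", "notebook")


def check_environment(packages=REQUIRED_PACKAGES):
    """Return a dict mapping each check name to True (found) or False."""
    # The interpreter running this script must be Python 3.
    report = {"python3": sys.version_info.major == 3}
    for name in packages:
        # find_spec returns None when a top-level package is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for check, ok in check_environment().items():
        print(f"{check}: {'OK' if ok else 'MISSING'}")
```

If a line reports MISSING, the installation tutorial above is the place to look for the corresponding setup step.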


    • Introduction to Python (pdf)
    • Python programming (pdf)
    • Overview of Python libraries and Matplotlib (pdf)
    • Structuring Python projects (pdf)
    • Numpy (pdf)
    • Pandas (pdf)
    • Scikit-learn: classification (pdf)
    • Scikit-learn: regression (pdf)
    • Scikit-learn: clustering (pdf)

Exam exercises

Exercises for the written exam


Laboratory material

This section will contain all the material for the laboratory sessions.

  • Laboratory 1 (7-8 October 2020): pdf – Solutions: html
  • Laboratory 2 (14-15 October 2020): pdf – Solutions: html
  • Laboratory 3 (21-22 October 2020): pdf (updated 22/10/2020)