## General Information

**SSD**: ING-INF/05

**CFU**: 6

**Professor**: Tania Cerquitelli – tania.cerquitelli@polito.it

**Teaching Assistants**: Salvatore Greco – salvatore_greco@polito.it

**Class Schedule**: Tuesday 1:00 pm – 2:30 pm (classroom 1S); Wednesday 11:30 am – 2:30 pm (classroom 13S)

**Tutoring Hours Schedule** (Starting from **15/05/23**): Monday 10:00 am – 11:30 am (classroom 8C); Monday 2:30 pm – 4:00 pm (classroom 5M)

**Join Piazza**: link

### Teaching Material

This section contains the slides of the Data Science and Machine Learning for Engineering Applications course.

- Course introduction*: pdf
- Data science intro*: pdf
- Data Preprocessing: pdf
- Association Rules: pdf
- Clustering: pdf
- Classification: pdf
- Regression: pdf

#### Laboratory Material

This section contains the slides and the lecture notes of the laboratory lectures.

- Python Basics: slides, lecture_notes
- Numpy: slides, lecture_notes
- Matplotlib: slides, lecture_notes
- Pandas: slides, lecture_notes
- Scikit-Learn Clustering: slides
- MLxtend: official documentation
- Scikit-Learn Classification: slides
- Scikit-Learn Regression: slides
- Scikit-Learn and Pandas Pre-Processing: slides

#### Laboratory Exercises

This section contains the texts and the solutions of the laboratory exercises.

- Python installation guide: pdf
- Python Basics (Lab 1):
  - **Text:** zip folder with the Jupyter notebook containing the **text exercises** of lab 1 (zip)
  - **Solutions:** zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the **solutions** to lab 1

- Numpy (Lab 2):
  - **Text:** zip folder with the Jupyter notebook containing the text exercises of lab 2 (zip)
  - **Solutions:** zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the **solutions** to lab 2

- Matplotlib (Lab 3):
  - **Text:** zip folder with the Jupyter notebook containing the text exercises of lab 3 (zip)
  - **Solutions:** zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the **solutions** to lab 3

- Pandas (Lab 4):
  - **Text:** zip folder with the Jupyter notebook containing the text exercises of lab 4 (zip)
  - **Solutions:** zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the **solutions** to lab 4

- Scikit-Learn – Clustering (Lab 5):
  - **Text:** zip folder with the Jupyter notebook containing the text exercises of lab 5 (zip)
  - **Solutions:** zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the **solutions** to lab 5

- MLxtend – Association Rules (Lab 6):
  - **Text:** zip folder with the Jupyter notebook containing the text exercises of lab 6 (zip)
  - **Solutions:** zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the **solutions** to lab 6

- Scikit-Learn – Classification (Lab 7):
  - **Text:** zip folder with the Jupyter notebook containing the text exercises of lab 7 (zip)
  - **Solutions:** zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the **solutions** to lab 7

- Scikit-Learn – Regression (Lab 8):
  - **Text:** zip folder with the Jupyter notebook containing the text exercises of lab 8 (zip)
  - **Solutions:** zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the **solutions** to lab 8

- Scikit-Learn and Pandas – Pre-Processing (Lab 9):
  - **Text:** zip folder with the Jupyter notebook containing the text exercises of lab 9 (zip), data
  - **Solutions:** zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the **solutions** to lab 9

### Homeworks

Each homework consists of a Jupyter notebook containing the exercise texts and the code blocks to be filled in (delimited by comments). Each exercise includes some hints and the expected output.

Please submit your homework through the **teaching portal** under the *“Homework”* section (or *“Elaborati”*). You must submit your **Jupyter notebook** inside a **zip file** named according to the following format: **homeworkN_sXXXXXX.zip**. Replace **N** with the homework number (e.g., 1 for the first homework) and **XXXXXX** with your student identifier. For instance, **homework1_s123456.zip** is a valid file name for the first homework submitted by student s123456.
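The naming rule above can be applied with a short script instead of renaming the archive by hand. The following is only a minimal sketch: the helper names `archive_name` and `package_homework` are hypothetical, and the check assumes that the student identifier has exactly six digits, as in the s123456 example.

```python
import re
import zipfile
from pathlib import Path

# Name pattern taken from the submission rule: homeworkN_sXXXXXX.zip
# Assumption: the student identifier has exactly six digits (as in s123456).
NAME_PATTERN = re.compile(r"homework(\d+)_s(\d{6})\.zip")


def archive_name(homework_number: int, student_id: str) -> str:
    """Build the archive name; student_id may include the leading 's' or not."""
    digits = student_id.lstrip("sS")
    name = f"homework{homework_number}_s{digits}.zip"
    if NAME_PATTERN.fullmatch(name) is None:
        raise ValueError(f"unexpected archive name: {name}")
    return name


def package_homework(notebook: Path, homework_number: int, student_id: str) -> Path:
    """Zip a single notebook next to itself and return the archive path."""
    target = notebook.parent / archive_name(homework_number, student_id)
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # arcname keeps only the file name, so the zip contains no nested folders
        archive.write(notebook, arcname=notebook.name)
    return target
```

For example, `package_homework(Path("homework1.ipynb"), 1, "s123456")` would create `homework1_s123456.zip` next to the notebook; the notebook file name here is only a placeholder.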

Each correctly submitted homework will contribute **0.5 points** toward the final examination grade.

- Homework 1 (**Extended** Deadline: **29/03/2023**): **Text:** zip folder with the Jupyter notebook containing the exercises of homework 1 (zip)

- Homework 2 (Deadline: **19/04/2023**): **Text:** zip folder with the Jupyter notebook containing the exercises of homework 2 (zip)

- Homework 3 (Deadline: **17/05/2023**): **Text:** zip folder with the Jupyter notebook containing the exercises of homework 3 (zip)

- Homework 4 (Deadline: **22/05/2023**): **Text:** zip folder with the Jupyter notebook containing the exercises of homework 4 (zip)

- Homework 5 (Deadline: **29/05/2023**): **Text:** zip folder with the Jupyter notebook containing the exercises of homework 5 (zip)

- Homework 6 (Deadline: **10/06/2023**): **Text:** zip folder with the Jupyter notebook containing the exercises of homework 6 (zip)

### Final Projects

**Project groups**: pdf

- **Clustering**: text, data
- **Binary Classification**: text, data
- **Multi-class Classification**: text, data
- **Regression**: text, data

Projects description: slides

Google form for the definition of the groups: link

### Suggested Books

E. Matthes. 2019. **Python Crash Course, 2nd Edition: A Hands-On, Project-Based Introduction to Programming**. No Starch Press. ISBN: 9781593279288

Jake VanderPlas. 2016. **Python Data Science Handbook: Essential Tools for Working with Data** (1st ed.). O’Reilly Media, Inc.

Wes McKinney. 2017. **Python for Data Analysis** (2nd ed.). O’Reilly Media, Inc.