General Information
SSD: ING-INF/05
CFU: 6
Professor: Tania Cerquitelli – tania.cerquitelli@polito.it
Teaching Assistant: Salvatore Greco – salvatore_greco@polito.it
Class Schedule: Wednesday 10:00 am – 1:00 pm (classroom LGI1 – Laboratorio di GeoInformatica1) – Thursday 8:30 am – 10:00 am (classroom 1P)
Join Piazza: link
Teaching Material
This section contains the slides of the Data Science and Machine Learning for Engineering Applications course.
- Course introduction: pdf
- Data science intro: pdf
- Data Preprocessing: pdf
- Association Rules: pdf
- Clustering: pdf
- Classification: pdf
- Regression: pdf
- Examples of Theory questions*: pdf
- Examples of Theory questions part 2*: pdf
Laboratory Material
This section contains the slides and the lecture notes of the laboratory lectures.
- Python Basics: slides lecture_notes
- Numpy: slides lecture_notes
- Matplotlib: slides lecture_notes
- Pandas: slides lecture_notes
- MLxtend: official documentation
- Scikit-Learn Clustering: slides
- Scikit-Learn Classification: slides example_zip example_pdf
- Scikit-Learn Regression: slides example_reg_pdf example_polynomial_reg_pdf
- Scikit-Learn and Pandas Pre-Processing: slides
- Introduction to Deep Learning and Image Processing: slides
Laboratory Exercises
This section contains the texts and the solutions to the laboratory exercises.
- Python installation guide: pdf
- Python Basics (Lab 1):
- Text: zip folder with the Jupyter notebook containing the text exercises of lab 1 (zip)
- Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 1
- Numpy (Lab 2):
- Text: zip folder with the Jupyter notebook containing the text exercises of lab 2 (zip)
- Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 2
- Matplotlib (Lab 3):
- Text: zip folder with the Jupyter notebook containing the text exercises of lab 3 (zip)
- Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 3
- Pandas (Lab 4):
- Text: zip folder with the Jupyter notebook containing the text exercises of lab 4 (zip)
- Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 4
- MLxtend – Association Rules (Lab 5):
- Text: zip folder with the Jupyter notebook containing the text exercises of lab 5 (zip)
- Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 5
- Scikit-Learn – Clustering (Lab 6):
- Text: zip folder with the Jupyter notebook containing the text exercises of lab 6 (zip)
- Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 6
- Scikit-Learn – Classification (Lab 7):
- Text: zip folder with the Jupyter notebook containing the text exercises of lab 7 (zip)
- Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 7
- Scikit-Learn – Regression (Lab 8):
- Text: zip folder with the Jupyter notebook containing the text exercises of lab 8 (zip)
- Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 8
- Scikit-Learn and Pandas – Pre-Processing (Lab 9):
- Text: zip folder with the Jupyter notebook containing the text exercises of lab 9 (zip)
- Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 9
Homework
Each homework consists of a Jupyter notebook containing the exercise texts and placeholders, delimited by comments, where you insert your code. Each exercise includes some hints and the expected output.
Please submit your homework through the teaching portal, in the “Homework” section (“Elaborati”). Place your Jupyter notebook in a zip file named homeworkN_sXXXXXX.zip, where N is the homework number (e.g., 1 for the first homework) and XXXXXX is your student identifier. For instance, homework1_s123456.zip is a valid file name for the first homework submitted by student s123456.
Each correctly submitted homework will contribute 0.5 points toward the final examination grade.
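The naming rule above can be checked and applied with a few lines of Python. This is only a minimal sketch, not an official submission tool: the function names `submission_name` and `make_submission` are hypothetical, and the pattern assumes that the student identifier is `s` followed by six digits, as in the example s123456.

```python
import re
import zipfile
from pathlib import Path

# Required archive name: homeworkN_sXXXXXX.zip
# Assumption: N is a positive integer and the identifier is "s" plus six digits.
NAME_PATTERN = re.compile(r"homework[1-9]\d*_s\d{6}\.zip")


def submission_name(homework_number: int, student_id: str) -> str:
    """Build the archive name for a homework number and a student identifier."""
    name = f"homework{homework_number}_{student_id}.zip"
    if NAME_PATTERN.fullmatch(name) is None:
        raise ValueError(f"invalid submission name: {name}")
    return name


def make_submission(notebook: Path, homework_number: int, student_id: str) -> Path:
    """Place the notebook in a zip archive named according to the rule above."""
    archive = notebook.parent / submission_name(homework_number, student_id)
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Store only the file name, so the archive has no nested folders.
        zf.write(notebook, arcname=notebook.name)
    return archive
```

For example, assuming a notebook called homework1.ipynb (a hypothetical file name), `make_submission(Path("homework1.ipynb"), 1, "s123456")` would write homework1_s123456.zip next to the notebook.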
- Homework 1 (Python Basics) (Deadline: 05/04/2024):
- Text: Zip folder with the Jupyter notebook containing the exercises of homework 1 (zip)
- Homework 2 (Numpy) (Deadline: 29/04/2024):
- Text: Zip folder with the Jupyter notebook containing the exercises of homework 2 (zip)
- Homework 3 (Scikit-Learn Clustering) (Deadline: 17/05/2024):
- Text: Zip folder with the Jupyter notebook containing the exercises of homework 3 (zip)
- Homework 4 (Scikit-Learn Classification) (Deadline: 24/05/2024):
- Text: Zip folder with the Jupyter notebook containing the exercises of homework 4 (zip)
- Homework 5 (Scikit-Learn Regression) (Deadline: 14/06/2024):
- Text: Zip folder with the Jupyter notebook containing the exercises of homework 5 (zip)
- Homework 6 (Pre-Processing) (Deadline: 14/06/2024):
- Text: Zip folder with the Jupyter notebook containing the exercises of homework 6 (zip)
- Homework 7 (Image Processing) (Deadline: 18/06/2024):
- Text: Zip folder with the Jupyter notebook containing the exercises of homework 7 (zip)
Final Projects
Google form for the definition of the groups: link
Project proposals:
- Binary Classification: description, data
- Multi-class Classification: description, data
- Regression: description, data
Project description: slides
Suggested Books
E. Matthes. 2019. Python Crash Course: A Hands-On, Project-Based Introduction to Programming (2nd ed.). No Starch Press. ISBN: 9781593279288
Jake VanderPlas. 2016. Python Data Science Handbook: Essential Tools for Working with Data (1st ed.). O’Reilly Media, Inc.
Wes McKinney. 2017. Python for Data Analysis (2nd ed.). O’Reilly Media, Inc.