Data Science and Machine Learning for Engineering Applications

General Information

SSD: ING-INF/05

CFU: 6

Professor: Tania Cerquitelli – tania.cerquitelli@polito.it

Teaching Assistant: Salvatore Greco – salvatore_greco@polito.it

Class Schedule: Tuesday 1:00 pm – 2:30 pm (classroom 1S) – Wednesday 11:30 am – 2:30 pm (classroom 13S)

Tutoring Hours Schedule (Starting from 15/05/23): Monday 10:00 am – 11:30 am (classroom 8C); Monday 2:30 pm – 4:00 pm (classroom 5M)

Join Piazza: link


Teaching Material

This section contains the slides of the Data Science and Machine Learning for Engineering Applications course.

  1. Course introduction*: pdf
  2. Data science intro*: pdf
  3. Data Preprocessing: pdf
  4. Association Rules: pdf
  5. Clustering: pdf
  6. Classification: pdf
  7. Regression: pdf


Laboratory Material

This section contains the slides and lecture notes for the laboratory lectures.

  1. Python Basics: slides lecture_notes
  2. Numpy: slides lecture_notes
  3. Matplotlib: slides lecture_notes
  4. Pandas: slides lecture_notes
  5. Scikit-Learn Clustering: slides
  6. MLxtend: official documentation
  7. Scikit-Learn Classification: slides
  8. Scikit-Learn Regression: slides
  9. Scikit-Learn and Pandas Pre-Processing: slides

Laboratory Exercises

This section contains the texts of the laboratory exercises and their solutions.

  1. Python installation guide: pdf
  2. Python Basics (Lab 1):
    • Text: zip folder with the Jupyter notebook containing the text exercises of lab 1 (zip)
    • Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 1
  3. Numpy (Lab 2):
    • Text: zip folder with the Jupyter notebook containing the text exercises of lab 2 (zip)
    • Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 2
  4. Matplotlib (Lab 3):
    • Text: zip folder with the Jupyter notebook containing the text exercises of lab 3 (zip)
    • Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 3
  5. Pandas (Lab 4):
    • Text: zip folder with the Jupyter notebook containing the text exercises of lab 4 (zip)
    • Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 4
  6. Scikit-Learn – Clustering (Lab 5):
    • Text: zip folder with the Jupyter notebook containing the text exercises of lab 5 (zip)
    • Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 5
  7. MLxtend – Association Rules (Lab 6):
    • Text: zip folder with the Jupyter notebook containing the text exercises of lab 6 (zip)
    • Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 6
  8. Scikit-Learn – Classification (Lab 7):
    • Text: zip folder with the Jupyter notebook containing the text exercises of lab 7 (zip)
    • Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 7
  9. Scikit-Learn – Regression (Lab 8):
    • Text: zip folder with the Jupyter notebook containing the text exercises of lab 8 (zip)
    • Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 8
  10. Scikit-Learn and Pandas – Pre-Processing (Lab 9):
    • Text: zip folder with the Jupyter notebook containing the text exercises of lab 9 (zip) – data
    • Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 9

Homework

Each homework consists of a Jupyter notebook with the text exercises and the code blocks to be inserted (delimited by comments). Some hints and expected outputs are included for each exercise.

Please submit your homework through the teaching portal under the “Homework” (or “Elaborati”) section. Place your Jupyter notebook in a zip file named homeworkN_sXXXXXX.zip, where N is the homework number (e.g., 1 for the first homework) and sXXXXXX is your student identifier. For instance, homework1_s123456.zip is a valid file name for the first homework submitted by student s123456.
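The naming rule above can be applied with a few lines of Python. The following is a minimal sketch, not an official course tool: the function name package_homework and its parameters are hypothetical, and the check assumes that student identifiers consist of "s" followed by six digits, as in the s123456 example.

```python
import re
import zipfile
from pathlib import Path

# Naming rule from the course page: homeworkN_sXXXXXX.zip
# Assumption: the identifier is "s" followed by six digits (as in s123456).
ARCHIVE_NAME = re.compile(r"homework\d+_s\d{6}\.zip")


def package_homework(notebook_path, homework_number, student_id, out_dir="."):
    """Zip one notebook as homework<N>_<student_id>.zip and return the archive path.

    Hypothetical helper for illustration; not provided by the course.
    """
    notebook = Path(notebook_path)
    archive = Path(out_dir) / f"homework{homework_number}_{student_id}.zip"

    # Reject names that do not follow the required format before writing anything.
    if not ARCHIVE_NAME.fullmatch(archive.name):
        raise ValueError(f"unexpected archive name: {archive.name}")

    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Store the notebook at the top level of the archive.
        zf.write(notebook, arcname=notebook.name)
    return archive
```

For example, calling package_homework("homework1.ipynb", 1, "s123456") in the folder that contains the notebook would create homework1_s123456.zip there; the notebook file name used here is only a placeholder.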

Each correctly submitted homework will contribute 0.5 points toward the final examination grade.

  • Homework 1 (Extended Deadline: 29/03/2023):
    • Text: Zip folder with the Jupyter notebook containing the exercises of homework 1 (zip)
  • Homework 2 (Deadline: 19/04/2023):
    • Text: Zip folder with the Jupyter notebook containing the exercises of homework 2 (zip)
  • Homework 3 (Deadline: 17/05/2023):
    • Text: Zip folder with the Jupyter notebook containing the exercises of homework 3 (zip)
  • Homework 4 (Deadline: 22/05/2023):
    • Text: Zip folder with the Jupyter notebook containing the exercises of homework 4 (zip)
  • Homework 5 (Deadline: 29/05/2023):
    • Text: Zip folder with the Jupyter notebook containing the exercises of homework 5 (zip)
  • Homework 6 (Deadline: 10/06/2023):
    • Text: Zip folder with the Jupyter notebook containing the exercises of homework 6 (zip)

Final Projects

Project groups: pdf

  1. Clustering: text, data
  2. Binary Classification: text, data
  3. Multi-class Classification: text, data
  4. Regression: text, data

Report templates: Word LaTeX

Project descriptions: slides

Google form for forming the project groups: link


Suggested Books

E. Matthes. Python Crash Course: A Hands-On, Project-Based Introduction to Programming, 2nd Edition. No Starch Press, 2019. ISBN: 9781593279288

J. VanderPlas. Python Data Science Handbook: Essential Tools for Working with Data, 1st Edition. O’Reilly Media, 2016.

W. McKinney. Python for Data Analysis, 2nd Edition. O’Reilly Media, 2017.