Data Science and Machine Learning for Engineering Applications

General Information

SSD: ING-INF/05

CFU: 6

Professor: Tania Cerquitelli – tania.cerquitelli@polito.it

Teaching Assistant: Salvatore Greco – salvatore_greco@polito.it

Class Schedule: Wednesday 10:00 am – 1:00 pm (classroom LGI1 – Laboratorio di GeoInformatica1) – Thursday 8:30 am – 10:00 am (classroom 1P)

Join Piazza: link


Teaching Material

This section contains the slides of the Data Science and Machine Learning for Engineering Applications course.

  1. Course introduction: pdf
  2. Data science intro: pdf
  3. Data Preprocessing: pdf
  4. Association Rules: pdf
  5. Clustering: pdf
  6. Classification: pdf
  7. Regression: pdf
  8. Examples of Theory questions*: pdf
  9. Examples of Theory questions part 2*: pdf


Laboratory Material

This section contains the slides and the lecture notes of the laboratory lectures.

  1. Python Basics: slides lecture_notes
  2. Numpy: slides lecture_notes
  3. Matplotlib: slides lecture_notes
  4. Pandas: slides lecture_notes
  5. MLxtend: official documentation
  6. Scikit-Learn Clustering: slides
  7. Scikit-Learn Classification: slides example_zip example_pdf
  8. Scikit-Learn Regression: slides example_reg_pdf example_polynomial_reg_pdf
  9. Scikit-Learn and Pandas Pre-Processing: slides
  10. Introduction to Deep Learning and Image Processing: slides

Laboratory Exercises

This section contains the texts and the solutions to the laboratory exercises.

  1. Python installation guide: pdf
  2. Python Basics (Lab 1):
    • Text: Zip folder with the Jupyter notebook containing the exercises of lab 1 (zip)
    • Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 1
  3. Numpy (Lab 2):
    • Text: Zip folder with the Jupyter notebook containing the exercises of lab 2 (zip)
    • Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 2
  4. Matplotlib (Lab 3):
    • Text: Zip folder with the Jupyter notebook containing the exercises of lab 3 (zip)
    • Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 3
  5. Pandas (Lab 4):
    • Text: Zip folder with the Jupyter notebook containing the exercises of lab 4 (zip)
    • Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 4
  6. MLxtend – Association Rules (Lab 5):
    • Text: Zip folder with the Jupyter notebook containing the exercises of lab 5 (zip)
    • Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 5
  7. Scikit-Learn – Clustering (Lab 6):
    • Text: Zip folder with the Jupyter notebook containing the exercises of lab 6 (zip)
    • Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 6
  8. Scikit-Learn – Classification (Lab 7):
    • Text: Zip folder with the Jupyter notebook containing the exercises of lab 7 (zip)
    • Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 7
  9. Scikit-Learn – Regression (Lab 8):
    • Text: Zip folder with the Jupyter notebook containing the exercises of lab 8 (zip)
    • Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 8
  10. Scikit-Learn and Pandas – Pre-Processing (Lab 9):
    • Text: Zip folder with the Jupyter notebook containing the exercises of lab 9 (zip)
    • Solutions: Zip folder with the Jupyter notebook (solutions_zip), and PDF (solutions_pdf) containing the solutions to lab 9

Homework

Each homework consists of a Jupyter notebook with the text exercises and the code blocks to be inserted (delimited by comments). Some hints and expected outputs are included for each exercise.

Please submit your homework through the teaching portal under the “Homework” (“Elaborati”) section. Place your Jupyter notebook in a zip file named homeworkN_sXXXXXX.zip, where N is the homework number (e.g., 1 for the first homework) and XXXXXX is your student identifier. For instance, homework1_s123456.zip is a valid file name for the first homework submitted by student s123456.
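
The naming rule above can be sketched in Python as follows. This is only an illustrative helper, not part of the official course material: the function names submission_name and make_submission are hypothetical, and the check that an identifier is the letter s followed by six digits is an assumption based on the s123456 example.

```python
import re
import zipfile
from pathlib import Path


def submission_name(homework_number: int, student_id: str) -> str:
    """Return the archive name in the form homeworkN_sXXXXXX.zip."""
    # Assumption: identifiers look like 's' followed by six digits (e.g. s123456).
    if not re.fullmatch(r"s\d{6}", student_id):
        raise ValueError(f"unexpected student identifier: {student_id!r}")
    if homework_number < 1:
        raise ValueError("homework number must be a positive integer")
    return f"homework{homework_number}_{student_id}.zip"


def make_submission(notebook: Path, homework_number: int, student_id: str) -> Path:
    """Zip a single notebook, next to the notebook itself, under the required name."""
    archive = notebook.parent / submission_name(homework_number, student_id)
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        # Store only the file name so the archive has no nested folders.
        zf.write(notebook, arcname=notebook.name)
    return archive
```

For example, make_submission(Path("homework1.ipynb"), 1, "s123456") would write homework1_s123456.zip in the same folder as the notebook; the notebook file name used here is a placeholder.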

Each correctly submitted homework will contribute 0.5 points toward the final examination grade.

  • Homework 1 (Python Basics) (Deadline: 05/04/2024):
    • Text: Zip folder with the Jupyter notebook containing the exercises of homework 1 (zip)
  • Homework 2 (Numpy) (Deadline: 29/04/2024):
    • Text: Zip folder with the Jupyter notebook containing the exercises of homework 2 (zip)
  • Homework 3 (Scikit-Learn Clustering) (Deadline: 17/05/2024):
    • Text: Zip folder with the Jupyter notebook containing the exercises of homework 3 (zip)
  • Homework 4 (Scikit-Learn Classification) (Deadline: 24/05/2024):
    • Text: Zip folder with the Jupyter notebook containing the exercises of homework 4 (zip)
  • Homework 5 (Scikit-Learn Regression) (Deadline: 14/06/2024):
    • Text: Zip folder with the Jupyter notebook containing the exercises of homework 5 (zip)
  • Homework 6 (Pre-Processing) (Deadline: 14/06/2024):
    • Text: Zip folder with the Jupyter notebook containing the exercises of homework 6 (zip)
  • Homework 7 (Image Processing) (Deadline: 18/06/2024):
    • Text: Zip folder with the Jupyter notebook containing the exercises of homework 7 (zip)

Final Projects

Google form for the definition of the groups: link

Project proposals:

  1. Binary Classification: description, data
  2. Multi-class Classification: description, data
  3. Regression: description, data

Project description: slides

Report template: Word LaTeX



Suggested Books

Eric Matthes. 2019. Python Crash Course: A Hands-On, Project-Based Introduction to Programming (2nd ed.). No Starch Press. ISBN: 9781593279288

Jake VanderPlas. 2016. Python Data Science Handbook: Essential Tools for Working with Data (1st ed.). O’Reilly Media, Inc.

Wes McKinney. 2017. Python for Data Analysis (2nd ed.). O’Reilly Media, Inc.