Data Science And Database Technology (2024/2025)

General Information

SSD: ING-INF/05
CFU: 8
Professor: Silvia Chiusano
Teaching Assistants: Alessandro Fiori, Davide Napolitano

📰 Announcements [dd/mm/yyyy]

  • [18/11/2024] Homework 1 published
  • [10/10/2024] Lab 1 will take place next week (14/10/2024 – 18/10/2024)

📅Weekly schedule (18/11/2024 – 22/11/2024)

Monday | Tuesday | Wednesday | Thursday | Friday

  • 8:30-10:00: [TEAM B: LAIB2B] Lab 4: Data mining with Python (Friday)
  • 10:00-11:30: [ROOM R2]; [ROOM R1]
  • 11:30-13:00: [ROOM R1]
  • 13:00-14:30: [TEAM A: LAIB3] Lab 4: Data mining with Python (Monday)
  • 14:30-16:00: [ROOM R2] Physical access to data
  • 16:00-17:30: [TEAM C: LAIB2B] Lab 4: Data mining with Python (Tuesday)
  • 17:30-19:00

📒Teaching Material

Course Introduction
  • Course introduction (pdf)

Introduction to Data Science
  • Introduction to Data Science (slides)

Data warehouse
  • Data warehouse: introduction (slides)
  • Data warehouse: design (slides)
  • Data warehouse: analysis (slides)
  • Data warehouse: materialized view, physical design, ETL (slides)

Data lakes
  • Data lakes (slides)

Data mining
  • Data mining process (slides)
  • Data preparation (slides)
  • Data mining: association rules (slides)
  • Data mining: classification (slides)
  • Data mining: clustering (slides)

DBMS
  • Introduction to DBMS (slides)
  • Buffer Manager (slides)
  • Physical access to data (slides)
  • Query optimization (slides)
  • Physical Design (slides)
  • Concurrency Control (slides)
  • Reliability management (slides)
  • Oracle: Oracle optimizer (slides), Oracle Hints (slides)
  • Distributed databases (slides)
  • NoSQL, beyond relational databases (slides)
  • Introduction to MongoDB (slides)
  • ElasticSearch (slides)


🗒️Exercise

Extended SQL, materialized view, triggers
  • Extended SQL and materialized view in Oracle (2 slides per page, 6 slides per page)
  • Exercise 1 on extended SQL (text)
  • Materialized views and triggers (text, draft solution)
  • Supporting material: Introduction to triggers (slides)

Data Warehouse
  • Storehouses (text, draft solution)
  • Italian wines (text, draft solution)
  • Remote heating (text, draft solution)
  • Scientific publications (text)
  • Parcels (text)

Query optimization


💻Laboratory

  • Laboratory sessions start in the fourth week.
  • The division into teams may change after the registration period closes.
  • Please attend the session of your assigned team so that the laboratories can run smoothly.
  • Remember to bring your laptop to work on the lab exercises.
LAB TEAMS (division into teams by surname) | WHEN | HOUR | WHERE
TEAM A: from AAA to GZZ | Monday (except Lab 6: Tuesday 17/12/2024, 17:30-19:00, LAIB2B) | 13:00-14:30 | LAIB3
TEAM B: from HAA to OZZ | Friday | 8:30-11:30 | LAIB2B
TEAM C: from PAA to ZZZ | Tuesday | 16:00-17:30 | LAIB2B
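
The surname ranges above can be checked with a small Python helper. This is only a sketch: the function name lab_team is hypothetical, and it assumes that surnames are compared on their first three ASCII letters, case-insensitively, with short surnames padded with "A". The course may resolve edge cases (accented letters, prefixes such as "De") differently.

```python
def lab_team(surname: str) -> str:
    """Return the lab team for a surname, following the ranges in the table above."""
    # Assumption: keep only ASCII letters, take the first three, pad short keys with "A".
    letters = [ch for ch in surname.upper() if ch.isascii() and ch.isalpha()]
    if not letters:
        raise ValueError("surname must contain at least one ASCII letter")
    key = "".join(letters)[:3].ljust(3, "A")

    if key <= "GZZ":   # AAA .. GZZ
        return "TEAM A"
    if key <= "OZZ":   # HAA .. OZZ
        return "TEAM B"
    return "TEAM C"    # PAA .. ZZZ


if __name__ == "__main__":
    for surname in ("Bianchi", "Hu", "Rossi"):
        print(surname, "->", lab_team(surname))
```

Because the final division into teams may change after registration closes, the published list remains the reference.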
SUBJECT | TEAM A | TEAM B | TEAM C | TEXT / SOLUTION / SOFTWARE
Lab 1: Extended SQL | Monday 21/10/2024 | Friday 18/10/2024 | Tuesday 22/10/2024 | Text, DW SQL, Files
Lab 2: Data Studio | Monday 28/10/2024 | Friday 25/10/2024 | Tuesday 29/10/2024 | Text
Lab 3: Materialized views | Monday 11/11/2024 | Friday 15/11/2024 | Tuesday 12/11/2024 | Text, Sol
Lab 4: Data mining with Python | Monday 18/11/2024 | Friday 22/11/2024 | Tuesday 19/11/2024 | Text, Files
Lab 5: Oracle optimizer | Monday 25/11/2024 | Friday 29/11/2024 | Tuesday 26/11/2024 |
Lab 6: MongoDB | Tuesday 17/12/2024 (17:30-19:00, LAIB2B) | Friday 20/12/2024 | Tuesday 17/12/2024 |

📗Homeworks

To obtain the points associated with the homeworks, students must meet the following requirements:

  • Complete all the points of the exercises in the homework text.
  • All exercises must be computer-written (e.g. Conceptual Schema, Logical Schema, SQL queries, Triggers, etc…).
  • Prepare one file in PDF format with the solution of the homework.
  • Name the file HomeworkN_Surname_Name_StudentId.pdf, where
    • Surname, Name and StudentId are replaced with your own data (include all the surnames and names registered in your PoliTo account, separated by underscores)
    • N is replaced with the number of the submitted homework
    • Uploaded files are processed automatically, so a wrong file format or a wrong file name causes the submission to be cancelled.
    • For example, for homework 1, the student with name Luigi Maria, surname Rossi and id s123456 uploads Homework1_Rossi_Luigi_Maria_s123456.pdf
  • Upload the file to the didactic portal (Portale della didattica), in the section Work Submission (Elaborati), before the deadline.
    • Multiple uploads by the same student and/or for the same homework are not allowed.
    • The upload date shown on the didactic portal is the one considered for the evaluation.
    • Uploaded files are processed automatically, so a file uploaded after the deadline causes the submission to be cancelled.
  • The upload procedure asks for a description (“Descrizione”). Enter the file name, built according to the rules above.
  • Only students without access to the course page on the didactic portal may submit by email: send the PDF to the assistant lecturer (davide.napolitano@polito.it) before the deadline.
  • Discuss each positively evaluated homework on the scheduled date (an announcement will be published).
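
Since a wrong file name cancels the submission, it may help to build the name programmatically. The sketch below follows the naming rule and the example above. The function names and the regular expression are hypothetical: the pattern assumes student ids look like "s" followed by digits (as in s123456) and homework numbers 1 to 4, and it does not cover apostrophes or hyphens in names. The portal's actual check is not documented here.

```python
import re


def homework_filename(n: int, surname: str, name: str, student_id: str) -> str:
    """Build HomeworkN_Surname_Name_StudentId.pdf from the student's data."""
    # Multi-word surnames and names are split so every word is joined with "_".
    parts = [*surname.split(), *name.split(), student_id]
    return f"Homework{n}_" + "_".join(parts) + ".pdf"


# Hypothetical pattern: "Homework" + number, one or more letter-only words,
# then an id such as s123456, then the .pdf extension.
FILENAME_PATTERN = re.compile(r"Homework[1-4](_[^\W\d_]+)+_s\d+\.pdf")


def looks_valid(filename: str) -> bool:
    """Rough local check before uploading; not the portal's own validation."""
    return FILENAME_PATTERN.fullmatch(filename) is not None


if __name__ == "__main__":
    filename = homework_filename(1, "Rossi", "Luigi Maria", "s123456")
    print(filename, looks_valid(filename))
```

The same string is what goes into the “Descrizione” field during the upload.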

Homework Info:

  • Homeworks are not mandatory.
  • Each homework is worth at most 0.5 pt, for a maximum total of 2 pt.
  • Each homework is graded between 0 and 30; the grade is then scaled to the [0, 0.5] pt range.
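
As an illustration of the scoring rule above, the following Python sketch converts a grade out of 30 into points. It assumes the scaling is linear (grade / 30 × 0.5), which the text does not state explicitly; the function names are hypothetical.

```python
MAX_GRADE = 30              # each homework is graded between 0 and 30
MAX_POINTS_PER_HOMEWORK = 0.5


def homework_points(grade: float) -> float:
    """Scale a 0-30 grade to the [0, 0.5] pt range (assumed linear)."""
    if not 0 <= grade <= MAX_GRADE:
        raise ValueError("grade must be between 0 and 30")
    return grade / MAX_GRADE * MAX_POINTS_PER_HOMEWORK


def total_bonus(grades: list[float]) -> float:
    """Sum the points of all submitted homeworks (at most 2 pt for four homeworks)."""
    return sum(homework_points(g) for g in grades)


if __name__ == "__main__":
    print(homework_points(30))        # full marks on one homework
    print(total_bonus([30, 30, 30, 30]))
```

Under this assumption, a homework that is not submitted simply contributes 0 pt to the total.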

Homework Discussion:

Homework Schedule:

Homework | Text | Files | Upload | Deadline
Homework1: DW, Extended SQL and MV | text |  | uploaded before the end of November 18th, 2024 | to be delivered by November 27th, 2024 at 11:59 PM (UTC/GMT+1)
Homework2: Data mining
Homework3: The Optimizer
Homework4: MongoDB
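
Because late uploads are cancelled automatically, the Homework 1 deadline can be expressed as a timezone-aware value to compare against an upload time. This is a minimal sketch using Python's standard datetime module. It assumes the deadline is inclusive up to 23:59:00 in UTC+1; how the portal treats the remaining seconds of that minute is not stated. The names below are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Homework 1 deadline from the schedule: November 27th, 2024 at 11:59 PM (UTC+1).
UTC_PLUS_1 = timezone(timedelta(hours=1))
HOMEWORK1_DEADLINE = datetime(2024, 11, 27, 23, 59, tzinfo=UTC_PLUS_1)


def is_on_time(upload_time: datetime, deadline: datetime = HOMEWORK1_DEADLINE) -> bool:
    """Return True if a timezone-aware upload time is not after the deadline."""
    if upload_time.tzinfo is None:
        raise ValueError("upload_time must be timezone-aware")
    # Aware datetimes compare correctly even across different UTC offsets.
    return upload_time <= deadline


if __name__ == "__main__":
    upload = datetime(2024, 11, 27, 22, 30, tzinfo=timezone.utc)  # 23:30 in UTC+1
    print(is_on_time(upload))
```

The upload date shown on the didactic portal remains the one that counts for the evaluation.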