Data Science And Database Technology (2023/2024)

General Information

SSD: ING-INF/05

CFU: 8

Professor: Silvia Chiusano

Teaching Assistants: Alessandro Fiori, Davide Napolitano

Announcements [yyyy-mm-dd]

  • [2023-11-07] – Wednesday, November 8, 2023: Laboratory on Data Studio for TEAMS A and B
  • [2023-10-17] – The laboratory starts on Wednesday 25/10/2023 – The organization of students into teams has been published (more details in the Laboratory section of this page)

Weekly schedule (from 8/1/2024 to 12/1/2024)

Time slots: 10:00-11:30, 13:00-14:30, 14:30-16:00, 17:30-19:00

Monday

  • [Room R4] Exam simulation
  • [Room R4] Solution of exam simulation

Wednesday

  • [LAIB2B] Team A: No Lab
  • [LAIB2B] Team B: No Lab
  • [Room R3] Solution of written exam

Thursday

  • [Room R2] Solution of homework

Teaching Material

  • Course introduction: pdf

Part I

  • Introduction to Data Science (slides)
  • Data warehouse: introduction (slides)
  • Data warehouse: design (slides)
  • Data warehouse: analysis (slides)
  • Data warehouse: materialized view, physical design, ETL (slides)
  • Data lakes (slides)
  • Data mining process (slides)
  • Data preparation (slides)
  • Data mining: association rules (slides)
  • Data mining: classification (slides)
  • Data mining: clustering (slides)

Part II

  • Introduction to DBMS (slides)
  • Buffer Manager (slides)
  • Physical access to data (slides)
  • Query optimization (slides)
  • Physical Design (slides)
  • Concurrency Control (slides)
  • Reliability management (slides)
  • Distributed databases (slides)
  • NoSQL, beyond relational databases (slides)
  • Introduction to MongoDB (slides)
  • ElasticSearch (slides)


Exercise

Extended SQL, materialized view, triggers

  • Extended SQL and materialized view in Oracle (2 slides per page, 6 slides per page)
  • Exercise 1 on extended SQL (text, draft solution)
  • Materialized views and triggers (text, draft solution)
  • Supporting material: Introduction to triggers (slides)

Data Warehouse

  • Storehouses (text, draft solution)
  • Italian wines (text, draft solution)
  • Remote heating (text, draft solution)
  • Scientific publications (text)

Query optimization

  • Fine (text)
  • Students (text, draft solution)
  • Athletes (text, draft solution)
  • Tourist village (text)

Laboratory Material

The laboratory sessions start in the fourth week.

Lab teams (division into two teams by surname)

  • TEAM A: surnames from AAA to LZZ, Wednesday 13:00-14:30, LAIB2B
  • TEAM B: surnames from MAA to ZZZ, Wednesday 14:30-16:30, LAIB2B

NOTE: students are asked to respect the division into teams so that the laboratories can run smoothly.


Lab schedule

  • Lab 1: Extended SQL, Wednesday 25/10/2023
    • Text: Text
    • Solution: Sol_DW, Sol_SQL
    • Software: Files
  • Lab 2: Data Studio, Wednesday 08/11/2023
    • Text: Text
  • Lab 3: Materialized views, Wednesday 15/11/2023
    • Text: Text
    • Solution: Sol
  • Lab 4: Data mining with Rapidminer, Wednesday 22/11/2023
    • Text: Text
    • Solution: Sol
    • Software: Files
  • Lab 5: Oracle optimizer, Wednesday 06/12/2023
    • Text: Text
    • Software: Files
  • Lab 6: MongoDB, Wednesday 20/12/2023
    • Text: Text
    • Software: Files

Homework to be delivered

To obtain the points associated with the homework assignments, students must comply with the following rules:

  • Complete all the points of the exercises in the homework text.
  • All exercises must be typed where possible (e.g. SQL queries, triggers, etc.). Only some exercises are accepted handwritten, such as the conceptual schema in DW design.
  • Prepare one file in PDF format with the solution of the homework.
  • Name the file as: HomeworkN_Surname_Name_StudentId.pdf where
    • StudentId, Surname and Name should be substituted with student information
    • the N character following Homework should be substituted with the number of the submitted homework
    • Since uploaded files are processed automatically, a file in the wrong format or with the wrong name causes the related homework submission to be cancelled.
    • For example, for homework 1, the student Mario Rossi with ID s123456 uploads Homework1_Rossi_Mario_s123456.pdf
  • Upload the file to the didactic portal (Portale della didattica) in the section Work Submission (Elaborati) before the deadline.
    • Multiple uploads by the same student and/or for the same homework are not allowed.
    • The upload date shown on the didactic portal is the one considered for the evaluation.
    • Since uploaded files are processed automatically, an upload after the deadline causes the related homework submission to be cancelled.
  • During the upload procedure a description (“Descrizione”) field is requested. Enter the file name, following the rules described above.
  • Only students who do not have access to the course page on the didactic portal may submit the homework by email, before the deadline, to the assistant lecturer (davide.napolitano@polito.it).
  • Discuss the positively evaluated homework on the scheduled date (an announcement will be published).
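
The naming rule above can be checked locally before uploading. The following is a minimal sketch in Python, not the portal's actual validation: the regular expression is an assumption inferred from the single example on this page (a student ID written as "s" followed by digits, homework numbers 1 to 4), and the helper names build_filename and is_valid_filename are hypothetical.

```python
import re

# Assumed pattern, inferred from the example Homework1_Rossi_Mario_s123456.pdf.
# The real automatic check on the portal is not documented on this page.
FILENAME_RE = re.compile(
    r"Homework(?P<n>[1-4])_(?P<surname>[^_]+)_(?P<name>[^_]+)_(?P<student_id>s\d+)\.pdf"
)


def is_valid_filename(filename: str) -> bool:
    """Return True if the whole file name follows the assumed naming rule."""
    return FILENAME_RE.fullmatch(filename) is not None


def build_filename(n: int, surname: str, name: str, student_id: str) -> str:
    """Compose HomeworkN_Surname_Name_StudentId.pdf and verify the result."""
    filename = f"Homework{n}_{surname}_{name}_{student_id}.pdf"
    if not is_valid_filename(filename):
        raise ValueError(f"file name does not follow the naming rule: {filename}")
    return filename


if __name__ == "__main__":
    # Example taken from the rules above: Mario Rossi, ID s123456, homework 1.
    print(build_filename(1, "Rossi", "Mario", "s123456"))
```

The same string can be pasted into the “Descrizione” field, since the rules ask for the description to match the file name.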

Homework to be delivered:

The solution of each homework will be uploaded after the corresponding deadline.

Homework discussion:

Homework submissions: list of delivered submissions. In case of any inconsistencies or missing submissions, send an email to davide.napolitano@polito.it before 26 January 2024. Emails sent later will not be considered.

On Wednesday 17 January 2024, the following students must come to LAIB2B between 2 PM and 4 PM to discuss their homework:

  • 321510
  • 329268
  • 324533
  • 324739
  • 329511
  • 318642
  • 327752
  • 328807

If you are not available in person, please email me so that we can make other arrangements (for example, a remote discussion). Those who do not show up without notice will lose the points awarded.

Homework schedule

  • Homework #1: Data warehouse and materialized views
    • Text: Text
    • Upload: before the end of November 15th, 2023
    • Deadline: to be delivered by November 28th, 2023 at 11:59 PM (UTC/GMT+1)
  • Homework #2: Data mining
    • Text: Text
    • Files: Dataset
    • Upload: before the end of November 24th, 2023
    • Deadline: to be delivered by December 6th, 2023 at 11:59 PM (UTC/GMT+1)
  • Homework #3: The Optimizer
    • Text: Text
    • Upload: before the end of December 7th, 2023
    • Deadline: to be delivered by December 21st, 2023 at 11:59 PM (UTC/GMT+1)
  • Homework #4: MongoDB
    • Text: Text
    • Files: Dataset
    • Upload: before the end of December 21st, 2023
    • Deadline: to be delivered by January 11th, 2024 at 11:59 PM (UTC/GMT+1)