General Information
SSD: ING-INF/05
CFU: 8
Professor: Silvia Chiusano
Teaching Assistants: Alessandro Fiori, Eliana Pastor
Announcements [dd-mm-yy]
23-01-2022 – The list of the submitted homeworks is available in the “Homework” section.
27-10-2021 – During the week of 1/11/2021-5/11/2021 there will be no laboratory session.
28-09-2021 – Slides and other material used in lessons and practices will be made available here during the semester.
07-11-2021 – The calendar of the laboratory practices has been updated.
Teaching Material
- Course introduction: pdf
Part I
- Introduction to Data Science (slides)
- Data warehouse: introduction (slides)
- Data warehouse: design (slides)
- Data warehouse: analysis (slides)
- Data warehouse: materialized views (slides)
- Data lakes (slides)
- Data mining process (slides)
- Data preparation (slides)
- Data mining: association rules (slides)
- Data mining: classification (slides)
- Data mining: clustering (slides)
Part II
- Introduction to DBMS (slides)
- Buffer Manager (slides)
- Physical access to data (slides)
- Query optimization (slides)
- Physical Design (slides)
- Concurrency control (slides)
- Reliability management (slides)
- Distributed databases (slides)
- NoSQL, beyond relational databases (slides)
- Introduction to MongoDB (slides)
- ElasticSearch (slides)
Oracle
- Extended SQL (2 slides per page, 6 slides per page)
- Oracle optimizer (slides)
- Oracle Hints (slides)
Exercise
- Extended SQL
- Exercise 1 (text, draft solution)
- Data Warehouse
- Storehouses (text, draft solution)
- Italian wines (text, draft solution)
- Remote heating (text)
- Scientific publications (updated text, draft solution)
- Materialized views
- Consulting (text, draft solution)
- Materialized views and triggers
- Optimizer
- Fine (text, draft solution)
- Students (text, draft solution)
- Athletes (text, draft solution)
- Tourist village (text)
- Homeworks
- Data warehouse and materialized views (draft solution)
- Optimizer (draft solution)
Laboratory Material
The laboratory practices will start from the fourth week.
Topic | Team A (5:30-7pm) | Team B (4-5:30pm) | Lab Assistance |
Practice #1: Extended SQL in Oracle | 19/10/2021 | 22/10/2021 | assistant lecturer |
Practice #2: Data warehousing | 26/10/2021 | 29/10/2021 | assistant lecturer |
Practice #3: Materialized views | 9/11/2021 | 12/11/2021 | assistant lecturer |
Practice #4: Data mining with RapidMiner | 16/11/2021 | 19/11/2021 | assistant lecturer |
Lab for Homework #2 on data mining with RapidMiner | 23/11/2021 | 26/11/2021 | scholarship holder |
Practice #5: Oracle optimizer | 30/11/2021 | 3/12/2021 | assistant lecturer |
Practice #6: MongoDB | 14/12/2021 | 17/12/2021 | assistant lecturer |
LAB SCHEDULE.
TEAM A (FROM A TO K) on Tuesday from 5.30pm to 7pm
TEAM B (FROM L TO Z) on Friday from 4pm to 5.30pm
Lab 1: Extended SQL
- Text (pdf)
- Data warehouse tables in csv format (zip)
- SQL Developer is already available at LABINF. If you want to practise at home, you can follow these tutorials
- Installing Oracle Database 18c Express Edition and SQL Developer
- Import Database and Tables: Tutorial
- If you want to practice at home and have problems using Oracle Database and SQL Developer, you can consider Oracle Live SQL.
- You can add tables using SQL scripts (zip)
- A short guide on how to import SQL scripts and query the DB in Oracle Live SQL is available (pdf)
- If you experience issues importing the complete FACT table, you can opt for a “light” version of the table, which contains a sample of the rows (facts_sample.sql).
Draft solution (star schema, queries)
Lab 2: Data warehouse analytics and reporting (Google Data Studio)
- Text (pdf)
Lab 3: Materialized views and triggers
Lab 4: Data mining – Rapid Miner
- Text Practice 4
- Dataset (Users.xls)
- Supporting material
- Rapid Miner 5.0 Community Edition Guide (rapidminer-5.0-manual-english_v1.0)
- Rapid Miner download http://rapidminer.com/products/rapidminer-studio/
- Free Community Edition
- Introduction to RapidMiner (2 slides per page, 3 slides per page, 6 slides per page)
- Examples (download)
Lab 5: The Oracle Optimizer
- Text (pdf)
- Scripts for creating DBs (Lab5Database_OPT)
- Useful scripts
- Draft solution (pdf)
Lab 6: NoSQL in MongoDB
- Text (pdf)
- Collection “restaurants” (txt, zipped json)
Homework to be delivered
To obtain the points associated with the homeworks, students have to observe the following terms:
- Complete all the points of the exercises in the homework text.
- Prepare one file in PDF, DOC or ODT format with the solution of the homework.
- Name the file as: HomeworkN_Surname_Name_StudentId.XXX where
- StudentId, Surname and Name should be substituted with student information
- the N character following Homework should be substituted with the number of the submitted homework
- the filename extension XXX depends on the file type chosen for the submission (PDF, DOC or ODT).
- DOCX format is not supported.
- Since uploaded files are processed automatically, a file with an incorrect name causes the related homework submission to be cancelled.
- For example, for homework 1 and extension pdf, the student with name and surname Mario Rossi and id s123456 will upload Homework1_Rossi_Mario_s123456.pdf
- Upload the file to the didactic portal (Portale della didattica) in the section Work Submission (Elaborati) before the deadline.
- Multiple uploads by the same student and/or for the same homework are not allowed.
- The upload date shown on the didactic portal is considered for the evaluation.
- Since uploaded files are processed automatically, an upload after the deadline causes the related homework submission to be cancelled.
- During the upload procedure a description (“Descrizione”) field is requested. Enter the file name there, following the rules described above.
- Only students without access to the course page on the didactic portal may instead submit the homework by email, before the deadline, to the assistant lecturer (eliana dot pastor at polito dot it).
- Discuss the homework with a positive evaluation on the fixed date (announcement will be published).
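Since a wrongly named file cancels the submission, it can help to build or check the file name programmatically before uploading. The following is a minimal sketch of the naming rule described above, not the check actually performed by the portal. The function names, the regular expression, and the choice of a lowercase extension (as in the Mario Rossi example) are assumptions made for illustration.

```python
import re

# Extensions accepted according to the rules above (DOCX is not supported).
# Lowercase is an assumption, taken from the example file name.
ALLOWED_EXTENSIONS = {"pdf", "doc", "odt"}

# Hypothetical pattern: HomeworkN_Surname_Name_StudentId.XXX
# The real portal may apply a different or stricter check.
FILENAME_PATTERN = re.compile(
    r"Homework(\d+)_([^_]+)_([^_]+)_([^_.]+)\.(pdf|doc|odt)"
)


def homework_filename(number, surname, name, student_id, extension):
    """Build a file name following HomeworkN_Surname_Name_StudentId.XXX."""
    ext = extension.lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Unsupported extension: {extension!r}")
    return f"Homework{number}_{surname}_{name}_{student_id}.{ext}"


def is_valid_filename(filename):
    """Return True if the whole file name matches the assumed pattern."""
    return FILENAME_PATTERN.fullmatch(filename) is not None


if __name__ == "__main__":
    # Example from the rules: homework 1, Mario Rossi, id s123456, PDF file.
    print(homework_filename(1, "Rossi", "Mario", "s123456", "pdf"))
    print(is_valid_filename("Homework1_Rossi_Mario_s123456.docx"))
```

The same string returned by `homework_filename` can be pasted into the description (“Descrizione”) field requested during the upload.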
Homework discussion: Students attending the written exam must bring the following items:
- for Homeworks #1 – #4:
- a hard-copy of the submitted reports
Homework deliveries:
Homework submissions: list of delivered submissions. In case of any inconsistencies or a missing delivery, send an email to eliana.pastor@polito.it.
Homework | Material | Deadline | Homework deliveries |
Homework #1: Data warehouse and materialized views | HW text | to be delivered by Thursday, November 25th, 2021 at 11.59 PM (UTC/GMT+1) |
Homework #2: Data mining | HW text – breast dataset | to be delivered by Thursday, December 2nd, 2021 at 11.59 PM (UTC/GMT+1) |
Homework #3: Optimizer | HW text | to be delivered by Thursday, December 16th, 2021 at 11.59 PM (UTC/GMT+1) |
Homework #4: MongoDB | HW text, bike_stations dataset (updated 18/12/21) | to be delivered by Tuesday, January 11, 2022 at 11.59 PM (UTC/GMT+1) |