Data Science and Database Technology


This page has hierarchy - Parent page: Teaching

Table of content

General information

Announcements


Exams

Exams on June-September 2020

Exam on May 9, 2020

  • Exam rules for exam on May 9, 2020 (ExamRulesMay2020 )
  • Results  are available in the didactic portal, section “Materiale”. All exam grades will be recorded. Students, who received a grade >= 18 in the written exam and want to reject it, have to send an e-mail to Silvia Chiusano (silvia dot chiusano at polito dot it) by June 11, 2020.
  • Notes on Exam DSDBT09-05-2020
  • DSDBT-May2020DraftSolution

Exam (in presence) on February 2020


Material

  • Course introduction (slides)

Part I

    • Introduction to Big Data (slides)
    • Data warehouse: introduction (slides)
    • Data warehouse: design (slides)
    • Data warehouse: analysis (slides)
    • Data mining: introduction (slides)
    • Data mining: data preprocessing (slides)
    • Data mining: Association rules (slides). New, updated on 17/10/2019
    • Data mining: classification (slides, slidesNew)
    • [NEW] Data mining: classification, neural networks (slides)
    • Data mining: clustering (slides)
Oracle
  1. Extended SQL (2 slides per page, 6 slides per page)

Part II

Oracle
  1. Oracle Optimizer
    1. Baseline version (2 slides per page6 slides per page)
    2. Extended version with examples (2 slides per page,6 slides per page)
  2. Hints (2 slides per page6 slides per page)
  3. Documentation

 


Exercises

Data warehouse

Exercise  Text Draft solution
Extended SQL (Customer ) text Draft solution
Extended SQL (Rental ) text Draft solution
Data warehouse design (Italian household) exercise Draft solution
Data warehouse design (SearchingYourHouse) exercise Draft solution
Data warehouse design (Hotel chain) exercise Draft solution
Data warehouse design (Parcels service) exercise

Triggers

Exercise Text Draft solution
Exercise 1 (Athlete) exercise Draft solution
Exercise 2 (Greenhouse) exercise Draft solution
Exercise 3 (Student grant) exercise Draft solution
Exercise 4 (Boat rental) exercise DraftSolBoatRental (timecondition)

 

Optimizer

Exercise  Text Draft solution
Exercise 1 (Fine) 2 slides per page 6 slides per page Draft Solution
Exercise 2 (Students, Projects) 2 slides per page 6 slides per page Draft Solution
Exercise 3 (Discs) exercise Draft Solution
Exercise 4 (Athletes, Members) 2 slides per page 6 slides per page  Draft Solution

 

 

Exercises from written exams

 

AA 2015-2016

Exam Draft solution
Exam (23-02-2016) optimizer, dw, trigger
Exam (2016-01-27) optimizer, dw, trigger

AA 2011-2012

Exam Draft solution
Exam (2012-02-06) optimizer and dwtrigger
Exam (2012-02-28) optimizer and dw
Exam (2012-06-21) optimizer and dwtrigger
Exam (2012-09-07) optimizer and dw

AA 2010-2011

Exam Draft solution
Exam (2011-02-07) optimizer, trigger, dw
Exam (2011-02-22)
Exam (2011-07-08) optimizer, trigger
Exam (2011-09-21) optimizer, trigger

 

 

 

 


Tutoring sessions

  • In tutoring sessions teachers are available to answer questions on homework, exercises and topics presented in lessons.
  • The calendar of the tutoring sessions in December is the following:
    • Thursday 5/12 – 8:30-10:00 – room 8D
    • Friday 6/12 – 17:30-18:00 – room 8D
    • Thursday 12/12 – 8:30-10:00 – room 8D
    • Thursday 19/12 – 8:30-10:00 – room 8D
    • Friday 20/12 – 17:30-18:00 – room 8D
  • The calendar of the tutoring sessions in January will be posted soon

Practices

Topic Team A Team B Lab Assistance
Practice #1: Extended SQL in Oracle  21/10/2019  25/10/2019 assistant lecturer
Practice #2: Data warehousing  28/10/2019  8/11/2019 assistant lecturer
Practice #3: Data mining with Rapidminer 11/11/2019 15/11/2019 assistant lecturer
Lab for Homework #2 on data mining with Rapidminer 18/11/2019 22/11/2019 scholarship holder
Practice #4: Oracle triggers 25/11/2019 29/11/2019 assistant lecturer
Lab for Homework #3 on triggers in Oracle 2/12/2019 6/12/2019 scholarship holder
Practice #5: Oracle optimizer  16/12/2019 [NEW]  13/12/2019 assistant lecturer
Practice #6: MongoDB  13/01/2020 [NEW]  10/01/2020 assistant lecturer

LAB SCHEDULE. To fit LabInf capacity students must attend all the course labs according to the following schedule (students are sorted in alphabetic order by surname):

  • TEAM A (SURNAME STARTING WITH A TO K) on Monday from 8.30 pm to 10pm
  • TEAM B (SURNAME STARTING WITH L TO Z) on Friday from 16pm to 17.30pm

The compliance to the schedule above is necessary to guarantee the feasibility of the lab.

LAB ACCOUNT. Please make sure to have an account on the LABINF PCs before the beginning of the lab practice (the accounts used to log in to the PCs of the other LAIBs are *not* valid). To register an account at LABINF, please visit the Labinf  website for further information.

 

Lab 1: Extended SQL

    • Text (pdf)
    • Data warehouse tables in csv format (zip)
    •        FOR STUDENTS WHO WANT TO PRACTICE AT HOME WITH EXTENDED SQL:
      • to import tables using Oracle SQL Developer: Import data (right click on “Tables” of “Connection” Tab)
      • (alternatively) to import tables from the Oracle XE Web interface: select Home>Utilities>Data Load/Unload>Load>Load Text Data->)
  • Installing Oracle 11g Express Edition at home
  • Draft solution of queries 1-5 and materialized view (pdf) and DW design

Lab 2: Data-warehouse analytics and reporting with Google Data Studio

Lab 3: Data mining – Rapid Miner  

    • Recommendations
        • Rapid Miner is already installed on the LabInf PCs. Please follow the instructions reported in the practice text.
      • Some processes may need a memory allocation higher than the default one (1024 MB). If so, you have to allocate a higher Java Heap memory space to Rapid Miner. Under Windows, run the command shell (cmd), go to the Rapid Miner lib folder (by default, c:\Program Files\Rapid-I\RapidMiner5\lib) and launch Rapid Miner form the shell using the following command: java -Xmx1500m -jar rapidminer.jar  where 1500 is the maximum heap memory space expressed in MB.

Lab 4: Triggers

  • Text (pdf)
  • scripts for creating DBs (create_db scripts)
  • Screenshots of the database after the trigger executions: Results
  • LABINF – Oracle SQL Developer English version: RUN, write %APPDATA%\sqldeveloper, select 19.1.0 version, product.conf, write addVMoption -Duser.language=en
  • Note: If the following error occurs “ORA-00001: unique constraint (SYS.I_PLSCOPE_SIG_IDENTIFIER$) violated“, try to change PLSCOPE_SETTINGS to NONE from menu Tools->Preferences->Database->PL/SQL Compiler->PLScope Identifiers. Alternatively add “ALTER SESSION SET PLSCOPE_SETTINGS = ‘IDENTIFIERS:NONE’; ” before the creation/update of a trigger (CREATE OR REPLACE TRIGGER TriggerName).

Lab 5: The Oracle Optimizer

Lab 6: NoSQL in MongoDB


 

 


Homework to be delivered

To obtain the points associated with the homeworks, students have to observe the following terms:

    • Complete all the points of the exercises in the homework text.
    • Prepare one file in PDF, DOC or ODT format with the solution of the homework.
    • Name the file as: HomeworkN_Surname_Name_StudentId.XXX where
        • StudentId, Surname and Name should be substituted with student information
        • the N character following Homework should be substituted with the number of the submitted homework
        • the filename extension XXX depends on the file type chosen for the submission (PDF, DOC or ODT).
        • DOCX format is not supported.
      • Since uploaded files are automatically processed, naming the file with a wrong name implies the cancellation of the related homework submission.
    • Load the file on the didactic portal (Portale della didattica) in the section Work Submission (Elaborati) before the deadline.
        • Multiple loadings for the same student and/or for the same homework are not allowed.
        • The upload date show on the didactic portal is considered for the evaluation.
      • Since uploaded files are automatically processed, the upload after the deadline implies the cancellation of the related homework submission.
    • During the upload procedure a description (“Descrizione”) field is requested. Insert the same name of the file according to the rules described above.
    • Only the students without the access to the course page on the didactic portal can submit the homework before the deadline by sending an email to the assistant lecturer (eliana dot pastor at polito dot it)
  • Discuss the homework with a positive evaluation on the fixed date (announcement will be published).

Homework discussion: Students attending the written exam must bring the following items:

  • for Homeworks #1 – #4:
    • a hard-copy of the submitted reports

 

 

 

 

Homework Material Deadline Homework deliveries
Homework #1: Data warehouse HW text (pdf) to be delivered by Sunday, November 10th, 2019 at 11.59 PM (UTC/GMT+1)
Homework #2: Data Mining HW text (pdf) – Dataset (breast.xls) to be delivered by Sunday, November 24th, 2019 at 11.59 PM (UTC/GMT+1)
Homework #3: Triggers HW text (pdf) –  scripts. to be delivered by Sunday, December 8th, 2019 at 11.59 PM (UTC/GMT+1)
Homework #4: Query Optimization HW text (pdf) to be delivered by Sunday, January 12th, 2020 at 11.59 PM (UTC/GMT+1)