{"id":3141,"date":"2022-02-21T20:51:14","date_gmt":"2022-02-21T19:51:14","guid":{"rendered":"https:\/\/dbdmg.polito.it\/dbdmg_web\/?p=3141"},"modified":"2023-02-18T18:20:29","modified_gmt":"2023-02-18T17:20:29","slug":"distributed-architectures-for-big-data-processing-and-analytics-2021-2022","status":"publish","type":"post","link":"https:\/\/dbdmg.polito.it\/dbdmg_web\/2022\/distributed-architectures-for-big-data-processing-and-analytics-2021-2022\/","title":{"rendered":"Distributed architectures for big data processing and analytics (2021\/2022)"},"content":{"rendered":"\n<h1 class=\"eplus-wrapper wp-block-heading\"><strong>THIS IS THE WEB PAGE OF THE FORMER YEAR <\/strong><\/h1>\n\n\n\n<h2 class=\"eplus-wrapper wp-block-heading\" id=\"general-information\">General Information<\/h2>\n\n\n\n<p class=\" eplus-wrapper\"><strong>SSD<\/strong>: ING-INF\/05<\/p>\n\n\n\n<p class=\" eplus-wrapper\"><strong>CFU<\/strong>: 8<\/p>\n\n\n\n<p class=\" eplus-wrapper\"><strong>Professor<\/strong>: Paolo Garza<\/p>\n\n\n\n<p class=\" eplus-wrapper\"><strong>Teaching Assistant<\/strong>: Luca Colomba<\/p>\n\n\n\n<div style=\"height:20px\" aria-hidden=\"true\" class=\"wp-block-spacer eplus-wrapper\"><\/div>\n\n\n\n<h3 class=\"eplus-wrapper wp-block-heading\" id=\"announcements\">Announcements<\/h3>\n\n\n<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-0e8684\">\n<li class=\" eplus-wrapper\">04-03- 22: Lab activities<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-75a833\">\n<li class=\" eplus-wrapper\">Team 1: Students from A to G &#8211; Tuesday from 11:30 to 13:00 (First lab activity &#8211; March 8, 2022) &#8211; <a rel=\"noreferrer noopener\" href=\"https:\/\/www.labinf.polito.it\/\" target=\"_blank\">LABINF<\/a> <\/li>\n\n\n\n<li class=\" eplus-wrapper\">Team 2: Students from H to Z &#8211; Friday from 11:30 to 13:00 (First lab activity &#8211; March 11, 2022) &#8211; <a rel=\"noreferrer noopener\" href=\"https:\/\/www.labinf.polito.it\/\" 
target=\"_blank\">LABINF<\/a><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">21-02-22: The first lecture is scheduled for February 28, 2022, at 8:30 in Classroom R2 <\/li>\n<\/ul>\n\n\n<hr class=\"wp-block-separator has-css-opacity eplus-wrapper\"\/>\n\n\n\n<div style=\"height:20px\" aria-hidden=\"true\" class=\"wp-block-spacer eplus-wrapper\"><\/div>\n\n\n\n<h3 class=\"eplus-wrapper wp-block-heading\" id=\"teaching-material\">Teaching Material<\/h3>\n\n\n<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-66e21c\">\n<li class=\" eplus-wrapper\">Introduction to the course content and exam rules (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/02\/00_Intro_DistributedBigData_2122-1.pdf\" target=\"_blank\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Introduction to Big Data (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/02\/\/01_Intro_BigData_BigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Big Data Architectures (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/02\/\/02_Architectures_BigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Hadoop and MapReduce<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-29cae9\">\n<li class=\" eplus-wrapper\">Introduction to Apache Hadoop and the MapReduce programming paradigm (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/02\/\/03_Intro_HadoopAndMapReduce_BigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-8b0249\">\n<li class=\" eplus-wrapper\">Interaction with HDFS and Hadoop by means of the command line (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/02\/\/03b_HDFS_Hadoop_CommandLine_BigDataNB.pdf\" 
target=\"_blank\">pdf<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Hadoop implementation of MapReduce (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/02\/\/04_HadoopImplementationOfMapReduceNB.pdf\" target=\"_blank\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-bcf1a9\">\n<li class=\" eplus-wrapper\">Source code of the Word Count Eclipse project (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2016\/03\/WordCount.zip\" target=\"_blank\">WordCount.zip<\/a>) \u2013 Use the Import Maven Project option to import it into Eclipse<\/li>\n\n\n\n<li class=\" eplus-wrapper\">PDF version of the code (i.e., PDF version of the Java files) (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2016\/03\/WordCountPDF.zip\" target=\"_blank\">WordCountPDF.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">BigData@Polito environment + Jupyter \u2013 How to submit MapReduce jobs on BigData@Polito (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/02\/\/04b_ClusterJupyter_BigDataNB.pdf\">pdf<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">MapReduce \u2013 Design patterns \u2013 Part 1 (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/02\/\/05_MapReduce_Patterns_Part1_BigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">MapReduce and Hadoop \u2013 Advanced Topics: Multiple inputs, Multiple outputs, Distributed cache (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/02\/\/06_AdvancedTopicsMapReduce_BigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">MapReduce \u2013 Design patterns \u2013 Part 2 (<a rel=\"noreferrer noopener\" 
href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/02\/\/07_MapReduce_Patterns_Part2_BigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">MapReduce \u2013 Relational Algebra\/SQL operators (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/02\/08_SQLOperators_BigDataNB.pdf\">pdf<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Spark<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-1fbc6e\">\n<li class=\" eplus-wrapper\">Introduction to Apache Spark (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/10_SparkIntroduction_DistributedBigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-dc9be6\">\n<li class=\" eplus-wrapper\">How to submit Spark applications (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/10b_SparkSubmit_DistributedBigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">How to use Jupyter notebooks for your Spark applications (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/10c_JupyterNotebooks_DistributedBigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-d5b7a3\">\n<li class=\" eplus-wrapper\">A useful online tutorial for those who want to install and run Spark locally on their PCs (tested on Linux)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-3d4795\">\n<li class=\" eplus-wrapper\">\u201cHow to use PySpark on your computer\u201d by Favio V\u00e1zquez (<a rel=\"noreferrer noopener\" href=\"https:\/\/towardsdatascience.com\/how-to-use-pyspark-on-your-computer-9c7180075617\" target=\"_blank\">link<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Download Spark from <a rel=\"noreferrer noopener\" href=\"https:\/\/spark.apache.org\/\" 
target=\"_blank\">https:\/\/spark.apache.org\/<\/a><\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">RDD-based programs<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-48e13f\">\n<li class=\" eplus-wrapper\">RDDs: creation, basic transformations and actions (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/11_SparkRDDBasedProgramming_DistributedBigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Key-value RDDs: transformations and actions on key-value RDDs (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/12_SparkPairRDD_DistributedBigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">DoubleRDDs (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/13_SparkDoubleRDD_DistributedBigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Advanced Topics: Cache, accumulators, broadcast variables, custom partitioners, broadcast join (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/14_SparkRDDBasedProgramming_AdvancedTopics_DistributedBigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-f025a0\">\n<li class=\" eplus-wrapper\">RDD partition examples (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/RDDPartitionsExamples.zip\" target=\"_blank\">RDDPartitionsExamples.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Example: PageRank &#8220;naive&#8221; implementation (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/RDDPageRank.zip\" target=\"_blank\">RDDPageRank.zip<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-70079b\">\n<li class=\" 
eplus-wrapper\">Introduction to PageRank (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/15b_SparkIntroPageRankNB.pdf\" target=\"_blank\">pdf<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Spark SQL and DataFrames<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-93e795\">\n<li class=\" eplus-wrapper\">Spark SQL (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/16_SparkSQL_DistributedBigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-90998b\">\n<li class=\" eplus-wrapper\">Simple examples \u2013 Jupyter notebook (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/SparkSQLSimpleExamples.zip\" target=\"_blank\">SparkSQLSimpleExamples.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Spark SQL join examples \u2013 Jupyter notebook (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/ExamplesSparkSQLJoins.zip\" target=\"_blank\">ExamplesSparkSQLJoins.zip<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Data mining and Machine learning algorithms with Spark MLlib<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-82b9dd\">\n<li class=\" eplus-wrapper\">Introduction and Preprocessing (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/18a_SparkMLlib_DistributedBigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Classification (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/18b_SparkMLlib_DistributedBigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-cd1461\">\n<li class=\" eplus-wrapper\">Classification examples \u2013 
Jupyter notebooks and sample data (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/ExampleClassificationMLlib.zip\">ExampleClassificationMLlib.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Clustering (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/18c_SparkMLlib_DistributedBigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-fad48c\">\n<li class=\" eplus-wrapper\">Clustering example \u2013 Jupyter notebook and sample data (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/ExampleClusteringMLlib.zip\">ExampleClusteringMLlib.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Regression (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/18d_SparkMLlib_DistributedBigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-84f5c2\">\n<li class=\" eplus-wrapper\">Regression example \u2013 Jupyter notebook and sample data (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/ExampleRegressionMLlib.zip\">ExampleRegressionMLlib.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Itemset and Association rule mining (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/18e_SparkMLlib_DistributedBigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-ece1f3\">\n<li class=\" eplus-wrapper\">Itemset and Association rule mining example \u2013 Jupyter notebook and sample data (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/ExampleItemsetMLlib.zip\">ExampleItemsetMLlib.zip<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">GraphX\/GraphFrames<ul class=\"eplus-wrapper wp-block-list 
eplus-styles-uid-61ec4f\">\n<li class=\" eplus-wrapper\">Introduction to GraphX and GraphFrames (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/19_SparkGraphFrame_PartI_DistributedBigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Graph Algorithms with GraphFrames (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/20_SparkGraphFrame_Algorithms_DistributedBigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-a9cbd6\">\n<li class=\" eplus-wrapper\">Simple example \u2013 Jupyter notebook (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/GraphFrameExamples.zip\" target=\"_blank\">GraphFrameExamples.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Select the GraphFrames (Yarn) kernel to run it on jupyter.polito.it<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Run \u201cpyspark --packages graphframes:graphframes:0.8.1-spark3.0-s_2.12 --repositories https:\/\/repos.spark-packages.org\u201d to run it locally on your PC<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-3ec81b\">\n<li class=\" eplus-wrapper\">Use package graphframes:graphframes:0.8.0-spark2.4-s_2.11 if you installed Spark 2 locally instead of Spark 3<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Streaming data analytics<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-f7e0bc\">\n<li class=\" eplus-wrapper\">Spark Streaming (DStreams) (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/21_SparkStreaming_DistributedBigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-f92ad3\">\n<li class=\" eplus-wrapper\">Simple examples \u2013 Jupyter notebooks (<a 
href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/SparkSteamingExamples.zip\" target=\"_blank\" rel=\"noreferrer noopener\">SparkSteamingExamples.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Structured Streaming (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/22_SparkStructuredStreaming_DistributedBigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-08f652\">\n<li class=\" eplus-wrapper\">Simple examples \u2013 Jupyter notebooks (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/ExampleStructutedStreaming.zip\" target=\"_blank\" rel=\"noreferrer noopener\">SparkStructutedStreamingExamples.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Introduction to other big data stream processing frameworks: Apache Storm, Apache Flink, ... (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/23_StreamingFrameworks_DistributedBigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul>\n\n\n<h4 class=\"eplus-wrapper wp-block-heading\" id=\"exercise\">Exercise<\/h4>\n\n\n\n<p class=\" eplus-wrapper\"><mark style=\"background-color:rgba(0, 0, 0, 0);color:#fc0303\" class=\"has-inline-color\">If you use your own PC to write and run your code, import the Maven-based projects (they can be run locally).<br>If you use the PCs available in the lab, import the Eclipse projects with libraries (they cannot be run locally; export the project's jar file and run it on the cluster).<\/mark><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-accent-2-color\"> <\/mark><\/p>\n\n\n<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-900e2a\">\n<li class=\" eplus-wrapper\">MapReduce exercises (<a 
href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/01_MapReduce_Exercises_BigData_NewStyle.pdf\">slides<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-0471be\">\n<li class=\" eplus-wrapper\">Solutions of Exercises 1-29 (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/09\/SolutionsExMapReduce.zip\" target=\"_blank\">SolutionsExMapReduce.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Basic MapReduce project<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-2584ac\">\n<li class=\" eplus-wrapper\">Linux and macOS<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-e27c2b\">\n<li class=\" eplus-wrapper\">Basic Eclipse project for MapReduce applications (with libraries) (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/09\/WordCountLibraries.zip\" target=\"_blank\">MapReduceBasicProjectWithLibraries.zip<\/a>) \u2013 Import using Import\/General\/Existing Projects into Workspace<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Basic Eclipse project for MapReduce applications (based on Maven) (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/09\/MapReduceBasicProject.zip\" target=\"_blank\">MapReduceBasicProject.zip<\/a>) \u2013 Import it using Import\/Maven\/Existing Maven Projects<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Windows<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-1419bd\">\n<li class=\" eplus-wrapper\">Basic Eclipse project for MapReduce applications (with libraries) (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/09\/WordCountLibraries.zip\" target=\"_blank\">MapReduceBasicProjectWithLibraries.zip<\/a>) \u2013 Import using Import\/General\/Existing Projects into Workspace<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Setup instructions for running MapReduce applications locally inside Eclipse (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/09\/ConfigureWindowsEnviroment.pdf\" target=\"_blank\">ConfigureWindowsEnviroment.pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-9d62ed\">\n<li class=\" eplus-wrapper\">You must also install <strong>JDK 1.8<\/strong> and select it for the imported project in Eclipse. If you already have a JDK installed but its version is later than 1.8, you must still install JDK 1.8 as well.<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Winutils executable (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/09\/winutils.zip\" target=\"_blank\">winutils.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Basic Eclipse project for MapReduce applications (based on Maven) (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/09\/MapReduceBasicProjectWindows.zip\" target=\"_blank\">MapReduceBasicProjectWindows.zip<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Spark<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-39d588\">\n<li class=\" eplus-wrapper\">Spark exercises (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2021\/03\/02_Spark_Exercises_BigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-e9ab5e\">\n<li class=\" eplus-wrapper\">Example data \u2013 One folder with (few) data for each exercise (<a rel=\"noreferrer noopener\" 
href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/ExSparkData30_46.zip\" target=\"_blank\">ExSparkData.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">RDD-based solutions of Exercises 30-46 \u2013 Jupyter notebooks (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/SolutionsExSpark30_46.zip\">SparkNotebooksSol30_46.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Spark SQL exercises (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/02_Spark_ExerciseSparkSQLNB.pdf\" target=\"_blank\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-0e2227\">\n<li class=\" eplus-wrapper\">Example data \u2013 One folder with (few) data for each exercise (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/ExSparkSQLData.zip\" target=\"_blank\">ExSparkSQLData.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Solutions of Exercises 47-50 \u2013\u00a0 Jupyter notebooks (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/SparkNotebooksSol47_50.zip\" target=\"_blank\">SparkNotebooksSol47_50.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Spark MLlib exercises (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/03_MLlib_Exercises_BigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-470cba\">\n<li class=\" eplus-wrapper\">Example data \u2013 One folder with (few) data for each exercise (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/ExampleMLlibData.zip\" target=\"_blank\">ExampleMLlibData.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Solutions of Exercise 51 (<a rel=\"noreferrer noopener\" 
href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/SparkNotebooksSol51.zip\" target=\"_blank\">SparkNotebooksSol51.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">GraphFrame exercises (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/04_GraphFrame_Exercises_BigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-2f7ac1\">\n<li class=\" eplus-wrapper\">Example data \u2013 One folder with (few) data for each exercise (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/ExampleGraphFrameData.zip\" target=\"_blank\">ExampleGraphFrameData.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Solutions of Exercises 52-57b \u2013 Jupyter notebooks (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/05\/SparkNotebooksSol52_57b.zip\">SparkNotebooksSol52_57b.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Spark streaming exercises (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/05_SparkStreaming_Exercises_BigDataNB.pdf\" target=\"_blank\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-cae7a3\">\n<li class=\" eplus-wrapper\">Example data \u2013 One folder with (few) data for each exercise (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/ExampleSparkStreamingData-1.zip\" target=\"_blank\">ExampleSparkStreamingData.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Solutions of Exercises 58-65 \u2013 Jupyter notebooks (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/SparkNotebooksSol58_65.zip\" target=\"_blank\">SparkNotebooksSol58_65.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Spark structured streaming and MLlib exercise (<a 
rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/06_SparkStructuredStreamingAndMLlib_ExercisesNB.pdf\" target=\"_blank\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-d01904\">\n<li class=\" eplus-wrapper\">Example data \u2013 One folder with (few) data for each exercise (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/ExampleSparkStructuredMLlibData.zip\" target=\"_blank\">ExampleSparkStructuredMLlibData.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Solution of Exercise 66 \u2013 Jupyter notebooks (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/SparkNotebooksSol66.zip\" target=\"_blank\">SparkNotebooksSol66.zip<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n<\/ul>\n\n\n<hr class=\"wp-block-separator has-css-opacity eplus-wrapper\"\/>\n\n\n\n<div style=\"height:20px\" aria-hidden=\"true\" class=\"wp-block-spacer eplus-wrapper\"><\/div>\n\n\n\n<h3 class=\"eplus-wrapper wp-block-heading\" id=\"laboratory-material\">Laboratory Material<\/h3>\n\n\n\n<p class=\" eplus-wrapper\"><strong><mark style=\"background-color:rgba(0, 0, 0, 0);color:#fd0202\" class=\"has-inline-color\">No lab activities during the first week of the course<\/mark><\/strong><\/p>\n\n\n\n<p class=\" eplus-wrapper\">Team 1: Students from A to G &#8211; Tuesday from 11:30 to 13:00 (First lab activity &#8211; March 8, 2022) &#8211; <a href=\"https:\/\/www.labinf.polito.it\/\" target=\"_blank\" rel=\"noreferrer noopener\">LABINF<\/a> <\/p>\n\n\n\n<p class=\" eplus-wrapper\">Team 2: Students from H to Z &#8211; Friday from 11:30 to 13:00 (First lab activity &#8211; March 11, 2022) &#8211; <a rel=\"noreferrer noopener\" href=\"https:\/\/www.labinf.polito.it\/\" target=\"_blank\">LABINF<\/a><\/p>\n\n\n<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-013fb7\">\n<li class=\" eplus-wrapper\">Lab1: Hadoop and 
MapReduce<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-56f305\">\n<li class=\" eplus-wrapper\">Problem specification (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab1_DBD.pdf\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Basic project and small example data set (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab1_DBD_with_libraries.zip\">Lab1_DBD_with_libraries.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Basic project based on Maven &#8211; Use this version of the project to run the MapReduce application locally on your own PC (<mark style=\"background-color:rgba(0, 0, 0, 0);color:#f80404\" class=\"has-inline-color\"><strong>DO NOT USE IT ON THE LABINF PCs<\/strong><\/mark>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-b9dd94\">\n<li class=\" eplus-wrapper\">Import it using Import\/Maven\/Existing Maven Projects<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-49af75\">\n<li class=\" eplus-wrapper\">Linux and macOS (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab1_DBD_mvn.zip\">Lab1_DBD_mvn<\/a><a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/10\/Lab1.zip\" target=\"_blank\">.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Windows (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab1_DBD_Windows_mvn.zip\" target=\"_blank\">Lab1_DBD_Windows_mvn.zip<\/a>) <\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Bigger data set: finefoods_text.txt (<a rel=\"noreferrer noopener\" href=\"https:\/\/www.dropbox.com\/s\/fswdiblx15mhmyo\/finefoods_text.zip?dl=0\" target=\"_blank\">zip<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-40b154\">\n<li class=\" eplus-wrapper\">You can use it to test your application locally on your own PC if you are using Maven<\/li>\n<\/ul><\/li>\n\n\n\n<li 
class=\" eplus-wrapper\">Solution Bonus track<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-f7619a\">\n<li class=\" eplus-wrapper\"><a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/10\/Lab1_SolBonusMvn.zip\" target=\"_blank\">Lab1_SolBonusMvn.zip<\/a> &#8211; The project is based on mvn<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n<\/ul>\n\n<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-e1d8be\">\n<li class=\" eplus-wrapper\">Lab2: Filter with Hadoop MapReduce<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-ef46fd\">\n<li class=\" eplus-wrapper\">Problem specification (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab2_DBD.pdf\" target=\"_blank\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Skeleton Eclipse project Hadoop \u2013 MapReduce (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab2_DBD_with_libraries.zip\" target=\"_blank\">Lab2_DBD_with_libraries.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Basic project based on Maven \u2013 Use this version of the project to run the MapReduce application locally on your own PC (<strong><mark style=\"background-color:rgba(0, 0, 0, 0);color:#f80404\" class=\"has-inline-color\"><strong>DO NOT USE IT ON THE LABINF PCs<\/strong><\/mark><\/strong>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-42d88f\">\n<li class=\" eplus-wrapper\">Import it using Import\/Maven\/Existing Maven Projects<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-b93ef7\">\n<li class=\" eplus-wrapper\">Linux and macOS (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab2_DBD_mvn.zip\" target=\"_blank\">Lab2_DBD_mvn.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Windows (<a rel=\"noreferrer noopener\" 
href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab2_DBD_Windows_mvn.zip\" target=\"_blank\">Lab2_DBD_Windows_mvn.zip<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Outputs of the first lab<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-9d411d\">\n<li class=\" eplus-wrapper\"><a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/OutputFolderLab1.zip\" target=\"_blank\">OutputFolderLab1.zip<\/a><\/li>\n\n\n\n<li class=\" eplus-wrapper\"><a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/OutputFolderLab1BonusTrack.zip\" target=\"_blank\">OutputFolderLab1BonusTrack.zip<\/a><\/li>\n\n\n\n<li class=\" eplus-wrapper\">You can use them to test your application locally on your own PC if you are using Maven<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-29cf2f\">\n<li class=\" eplus-wrapper\"><a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab2_DBD_Sol.zip\" target=\"_blank\">Lab2_DBD_Sol.zip<\/a> \u2013 This project is based on mvn<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Solution Bonus track<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-29e100\">\n<li class=\" eplus-wrapper\"><a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab2_DBD_SolBonuszip.zip\" target=\"_blank\">Lab2_SolBonus.zip<\/a> \u2013 This project is based on mvn<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n<\/ul>\n\n<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-425d17\">\n<li class=\" eplus-wrapper\">Lab3: Frequently bought\/reviewed together application with Hadoop MapReduce<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-75ad9c\">\n<li class=\" eplus-wrapper\">Problem specification (<a rel=\"noreferrer noopener\" 
href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab3_DBD.pdf\" target=\"_blank\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Skeleton Eclipse project Hadoop \u2013 MapReduce (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab3_DBD_with_libraries.zip\" target=\"_blank\">Lab3_DBD_with_libraries.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Basic project based on Maven \u2013 Use this version of the project to run the MapReduce application locally on your own PC (<strong><mark style=\"background-color:rgba(0, 0, 0, 0);color:#f80404\" class=\"has-inline-color\"><strong>DO NOT USE IT ON THE LABINF PCs<\/strong><\/mark><\/strong>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-e2e6c0\">\n<li class=\" eplus-wrapper\">Import it using Import\/Maven\/Existing Maven Projects<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-e0a642\">\n<li class=\" eplus-wrapper\">Linux and macOS (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab3_DBD_mvn.zip\" target=\"_blank\">Lab3_DBD_mvn.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Windows (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab3_DBD_Windows_mvn.zip\" target=\"_blank\">Lab3_DBD_Windows_mvn.zip<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Sample file (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/AmazonTransposedDataset_Sample.txt\" target=\"_blank\">AmazonTransposedDataset_Sample.txt<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-745f3a\">\n<li class=\" eplus-wrapper\">You can use it to test your application locally on your own PC if you are using Maven<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list 
eplus-styles-uid-868a94\">\n<li class=\" eplus-wrapper\"><a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab3_DBD_Sol.zip\" target=\"_blank\">Lab3_DBD_Sol.zip<\/a> \u2013 The project is based on mvn<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Comments on the three uploaded solutions (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab3_DraftSolution_BigData_NewStyle.pdf\" target=\"_blank\">slides<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n<\/ul>\n\n<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-cfbb9b\">\n<li class=\" eplus-wrapper\">Lab4: Normalized ratings for product recommendations with Hadoop MapReduce<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-81e7e6\">\n<li class=\" eplus-wrapper\">Problem specification (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab4_DBD.pdf\" target=\"_blank\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Skeleton Eclipse project Hadoop \u2013 MapReduce (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab4_DBD_with_libraries.zip\" target=\"_blank\">Lab4_DBD_with_libraries.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Basic project based on Maven \u2013 Use this version of the project to run the MapReduce application locally on your own PC (<strong><mark style=\"background-color:rgba(0, 0, 0, 0);color:#f80404\" class=\"has-inline-color\"><strong>DO NOT USE IT ON THE LABINF PCs<\/strong><\/mark><\/strong>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-0b32ad\">\n<li class=\" eplus-wrapper\">Import it using Import\/Maven\/Existing Maven Projects<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-e02c8a\">\n<li class=\" eplus-wrapper\">Linux and macOS (<a rel=\"noreferrer noopener\" 
href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab4_DBD_mvn.zip\" target=\"_blank\">Lab4_DBD_mvn.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Windows (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab4_DBD_Windows_mvn.zip\" target=\"_blank\">Lab4_DBD_Windows_mvn.zip<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Sample file (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/ReviewsSample.csv\" target=\"_blank\">ReviewsSample.csv<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-36641b\">\n<li class=\" eplus-wrapper\"><a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab4_DBD_Sol.zip\" target=\"_blank\">Lab4_DBD_Sol.zip<\/a> \u2013 This project is based on mvn<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Lab5: Filter data and compute basic statistics with Apache Spark<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-b534a1\">\n<li class=\" eplus-wrapper\">Problem specification (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab5_DBD.pdf\" target=\"_blank\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Sample file (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/SampleLocalFile.csv\" target=\"_blank\">SampleLocalFile.csv<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-835dc9\">\n<li class=\" eplus-wrapper\"> <a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab5_DBD_Sol.zip\" target=\"_blank\">Lab5_DBD_Sol.zip<\/a> \u2013 Jupyter notebook (Lab5_Sol.ipynb) and Python script 
(Lab5_Sol.py)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Lab6: Frequently bought\/reviewed together application with Apache Spark<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-2bef0e\">\n<li class=\" eplus-wrapper\">Problem specification (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/Lab6_DBD.pdf\" target=\"_blank\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Sample dataset (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/ReviewsSample.csv\" target=\"_blank\">ReviewsSample.csv<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-233fb4\">\n<li class=\" eplus-wrapper\"><a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/Lab6_DBD_Sol.zip\" target=\"_blank\">Lab6_DBD_Sol.zip<\/a> \u2013 Jupyter notebook (Lab6_Sol.ipynb) and Python script (Lab6_Sol.py)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Lab7: Bike sharing data analysis<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-36d739\">\n<li class=\" eplus-wrapper\">Problem specification (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/Lab7_DBD.pdf\" target=\"_blank\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Sample data (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/sampleData.zip\" target=\"_blank\">zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Example KML file (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/example.zip\" target=\"_blank\">zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">KML file containing the result of the analysis setting the threshold to 0.6 and running the program on the HDFS files (<a 
rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/resultTh0.6.zip\" target=\"_blank\">zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-4fe786\">\n<li class=\" eplus-wrapper\"><a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/Lab7_DBD_Sol.zip\" target=\"_blank\">Lab7_DBD_Sol.zip<\/a> \u2013 Jupyter notebook (Lab7_Sol.ipynb) and Python script (Lab7_Sol.py)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Lab8: Bike sharing data analysis based on Spark SQL<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-22bd5f\">\n<li class=\" eplus-wrapper\">Problem specification (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/Lab8_DBD.pdf\" target=\"_blank\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Sample data (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/sampleData.zip\" target=\"_blank\">zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-757f0b\">\n<li class=\" eplus-wrapper\"><a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/Lab8_DBD_Sol.zip\" target=\"_blank\">Lab8_DBD_Sol.zip<\/a> \u2013 Jupyter notebooks and Python scripts<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Lab9: A classification pipeline with MLlib + SparkSQL<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-2f16f2\">\n<li class=\" eplus-wrapper\">Problem specification (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/05\/Lab9_DBD.pdf\" target=\"_blank\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Lab9_template.zip (<a rel=\"noreferrer noopener\" 
href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/05\/Lab9_template.zip\" target=\"_blank\">zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-38f086\">\n<li class=\" eplus-wrapper\"><a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/05\/Lab9_DBD_Sol.zip\" target=\"_blank\">Lab9_DBD_Sol.zip<\/a> \u2013 Jupyter notebooks<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Lab10: GraphFrame<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-713f94\">\n<li class=\" eplus-wrapper\">Problem specification (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/05\/Lab10_DBD.pdf\" target=\"_blank\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Data (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/05\/Lab10Data.zip\" target=\"_blank\">Lab10Data.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-d4b134\">\n<li class=\" eplus-wrapper\"><a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/05\/Lab10_DBD_Sol.zip\" target=\"_blank\">Lab10_DBD_Sol.zip<\/a> \u2013 Jupyter notebook<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Lab11: Tweet analysis \u2013 Spark streaming<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-94f6bf\">\n<li class=\" eplus-wrapper\">Problem specification (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/05\/Lab11_DBD.pdf\" target=\"_blank\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Example files \u2013 tweets (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/05\/Lab11Data.zip\" 
target=\"_blank\">Lab11Data.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-32aa15\">\n<li class=\" eplus-wrapper\"><a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/05\/Lab11_DBD_Sol.zip\" target=\"_blank\">Lab11_DBD_Sol.zip<\/a> \u2013 Jupyter notebooks<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Lab12: Classification with MLlib + Spark streaming<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-77f9cf\">\n<li class=\" eplus-wrapper\">Problem specification (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/06\/Lab12_DBD.pdf\" target=\"_blank\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Template (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/06\/Lab12_DBD_templates.zip\" target=\"_blank\">Lab12_DBD_templates.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Streaming only data (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/06\/Lab12_DBD_streaming_data.zip\" target=\"_blank\">Lab12_DBD_streaming_data.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">All data \u2013 train, test, and streaming (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/06\/Lab12_DBD_all_data.zip\" target=\"_blank\">Lab12_DBD_all_data.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-d111e0\">\n<li class=\" eplus-wrapper\"><a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/06\/Lab12_DBD_Sol.zip\" target=\"_blank\">Lab12_DBD_Sol.zip<\/a> \u2013 Jupyter notebooks<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n<\/ul>\n\n\n<h3 class=\"eplus-wrapper wp-block-heading\">Previous exams<\/h3>\n\n\n<ul 
class=\"eplus-wrapper wp-block-list eplus-styles-uid-23c5d3\">\n<li class=\" eplus-wrapper\">Exam September 1, 2022 (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/09\/DBD_Exam20220901.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-9e27c6\">\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-64e43e\">\n<li class=\" eplus-wrapper\">Question 1: (b)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Question 2: (d)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/09\/DBD_Exam20220901Sol.zip\" target=\"_blank\" rel=\"noreferrer noopener\">DBD_Exam20220901Sol.zip<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Exam July 18, 2022 (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/07\/DBD_Exam20220718.pdf\" target=\"_blank\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-c59174\">\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-0d7e39\">\n<li class=\" eplus-wrapper\">Question 1: (b)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Question 2: (b)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/07\/DraftSolution20220718.zip\">DBD_Exam20220718Sol.zip<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Exam June 27, 2022 (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/06\/DBD_Exam20220627.pdf\" target=\"_blank\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-a531c7\">\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-21d59f\">\n<li class=\" eplus-wrapper\">Question 1: (c)<\/li>\n\n\n\n<li 
class=\" eplus-wrapper\">Question 2: (a)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">MapReduce and Spark (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/06\/DBD_Exam20220627Sol.zip\" target=\"_blank\">DBD_Exam20220627Sol.zip<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Exam February 10, 2022 (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/DBD_Exam20220210.pdf\" target=\"_blank\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-2e7705\">\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-872594\">\n<li class=\" eplus-wrapper\">Question 1: (a)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Question 2: (b)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">MapReduce and Spark (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/06\/DBD_Exam20220210Sol.zip\" target=\"_blank\">DBD_Exam20220210Sol.zip<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Exam September 17, 2021 (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/DBD_Exam20210917.pdf\" target=\"_blank\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-cd462c\">\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-98944b\">\n<li class=\" eplus-wrapper\">Question 1: (b)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Question 2: (a)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">MapReduce and Spark (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/06\/DraftSolution20210917.zip\" target=\"_blank\">DBD_Exam20210917.zip<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Exam July 5, 2021 (<a 
href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2021\/07\/DBD_Exam20210705.pdf\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-c016c9\">\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-54d1ef\">\n<li class=\" eplus-wrapper\">Question 1: (c)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Question 2: (a)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2021\/07\/DBD_Exam20210705Sol.zip\">DBD_Exam20210705Sol.zip<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Exam June 21, 2021 (<a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2021\/06\/DBD_Exam20210621.pdf\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-464446\">\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-006e83\">\n<li class=\" eplus-wrapper\">Question 1: (b)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Question 2: (a)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2021\/06\/DraftSolutionExam_20210621.zip\">DBD_Exam20210621Sol.zip<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Exam January 22, 2021 (<a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2021\/06\/DBD_Exam20210122.pdf\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-fda7a2\">\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-0af96b\">\n<li class=\" eplus-wrapper\">Question 1: (c)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Question 2: (c)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2021\/06\/DBD_Exam20210122Sol.zip\">DBD_Exam20210122Sol.zip<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" 
eplus-wrapper\">Exam September 14, 2020 (<a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2020\/09\/DBD_Exam20200914.pdf\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-650d80\">\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-f36d5d\">\n<li class=\" eplus-wrapper\">Question 1: (d)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Question 2: (c)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2020\/09\/DBD_Exam20200914Sol.zip\">DBD_Exam20200914Sol.zip<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Exam July 20, 2020 (<a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2020\/07\/DBD_Exam20200720.pdf\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-26e602\">\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-17b695\">\n<li class=\" eplus-wrapper\">Question 1: (d)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Question 2: (b) \u2013 Note that there are three actions and hence the input file is read three times.<\/li>\n\n\n\n<li class=\" eplus-wrapper\">MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2020\/07\/DBD_Exam20200720Sol.zip\">DBD_Exam20200720Sol.zip<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Exam June 27, 2020 (<a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2020\/06\/DBD_Exam20200627.pdf\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-0e5a5b\">\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-40f60b\">\n<li class=\" eplus-wrapper\">Question 1: (b)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Question 2: (a)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">MapReduce and Spark (<a 
href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2020\/06\/DBD_Exam20200627Sol.zip\">DBD_Exam20200627Sol.zip<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n<\/ul>\n\n\n<h3 class=\"eplus-wrapper wp-block-heading\" id=\"exam-examples\">Exam examples and multiple choice questions<\/h3>\n\n\n<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-166ccf\">\n<li class=\" eplus-wrapper\">Some more examples of multiple choice questions (<a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2021\/06\/ExamplesMultipleChoiceQuestions.pdf\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-731eb2\">\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-0e853d\">\n<li class=\" eplus-wrapper\">Question 1: (c)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Question 2: (d)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Question 3: (d)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Question 4: (d)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Question 5: (b)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Question 6: (d)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">GraphFrame &#8211; Examples of multiple choice questions (<a rel=\"noreferrer noopener\" href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/05\/ExamplesMultipleChoiceQuestionsGraphFrame.pdf\" target=\"_blank\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-48d93f\">\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-df67c2\">\n<li class=\" eplus-wrapper\">Question 1: (d)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Question 2: (c)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Exam Example #1 (<a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2020\/03\/DistrBD_ExamExample1.pdf\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-ad7ef2\">\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper 
wp-block-list eplus-styles-uid-9c4898\">\n<li class=\" eplus-wrapper\">Question 1: (d)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Question 2: (c)<\/li>\n\n\n\n<li class=\" eplus-wrapper\"><a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2020\/06\/SolutionExamExample1.zip\">SolutionExamExample1.zip<\/a><\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Exam Example #2 (<a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2020\/03\/DistrBD_ExamExample2.pdf\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-091c21\">\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-21b3eb\">\n<li class=\" eplus-wrapper\">Question 1: (d)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Question 2: (c)<\/li>\n\n\n\n<li class=\" eplus-wrapper\"><a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2020\/06\/SolutionExamExample2.zip\">SolutionExamExample2.zip<\/a><\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Exam Example #3 (<a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2020\/06\/DistrBD_ExamExample3.pdf\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-87f99c\">\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-41a98d\">\n<li class=\" eplus-wrapper\">Question 1: (c)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Question 2: (c)<\/li>\n\n\n\n<li class=\" eplus-wrapper\"><a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2020\/06\/SolutionExamExample3.zip\">SolutionExamExample3.zip<\/a><\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Exam Example #4 (<a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2020\/06\/DistrBD_ExamExample4.pdf\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-042a13\">\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list 
eplus-styles-uid-ec165b\">\n<li class=\" eplus-wrapper\">Question 1: (d)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Question 2: (c)<\/li>\n\n\n\n<li class=\" eplus-wrapper\"><a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2020\/06\/SolutionExamExample4.zip\">SolutionExamExample4.zip<\/a><\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Exam Example #5 (<a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2020\/06\/DistrBD_ExamExample5.pdf\">pdf<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-e68c5f\">\n<li class=\" eplus-wrapper\">Solution<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-908954\">\n<li class=\" eplus-wrapper\">Question 1: (b)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Question 2: (b)<\/li>\n\n\n\n<li class=\" eplus-wrapper\"><a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2020\/06\/SolutionExamExample5.zip\">SolutionExamExample5.zip<\/a><\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n<\/ul>\n\n\n<hr class=\"wp-block-separator has-css-opacity eplus-wrapper\"\/>\n\n\n\n<h3 class=\"eplus-wrapper wp-block-heading\" id=\"additional-material\">Additional material<\/h3>\n\n\n<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-8f5326\">\n<li class=\" eplus-wrapper\">Slides and screencasts about Java (kindly provided by prof. 
Torchiano) (<a href=\"http:\/\/dbdmg.polito.it\/~paolo\/JavaMaterials\/02JEY%20-%20Object%20Oriented%20Programming.html\">link<\/a>)<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-c913cb\">\n<li class=\" eplus-wrapper\">Suggested slides\/lectures for those students who have never used Java<ul class=\"eplus-wrapper wp-block-list eplus-styles-uid-7f089a\">\n<li class=\" eplus-wrapper\">OO Paradigm and UML (The UML part is not mandatory)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">The Java Environment<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Java Basic Features<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Java Inheritance<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n<\/ul>\n\n\n<div class=\"wp-block-buttons eplus-wrapper is-layout-flex wp-block-buttons-is-layout-flex\"><\/div>\n","protected":false},"excerpt":{"rendered":"<p>THIS IS THE WEB PAGE OF THE FORMER YEAR General Information SSD: ING-INF\/05 CFU: 8 Professor: Paolo Garza Teaching Assistant: Luca Colomba Announcements Teaching Material Exercise If you use your PC to write and run your code, import the projects based on Maven (those projects can be run locally). If you 
&hellip;<\/p>\n","protected":false},"author":5,"featured_media":3290,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"editor_plus_copied_stylings":"{}","footnotes":""},"categories":[37],"tags":[],"class_list":["post-3141","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-courses"],"_links":{"self":[{"href":"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-json\/wp\/v2\/posts\/3141","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-json\/wp\/v2\/comments?post=3141"}],"version-history":[{"count":105,"href":"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-json\/wp\/v2\/posts\/3141\/revisions"}],"predecessor-version":[{"id":5672,"href":"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-json\/wp\/v2\/posts\/3141\/revisions\/5672"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-json\/wp\/v2\/media\/3290"}],"wp:attachment":[{"href":"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-json\/wp\/v2\/media?parent=3141"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-json\/wp\/v2\/categories?post=3141"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-json\/wp\/v2\/tags?post=3141"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}