{"id":13575,"date":"2026-02-19T18:34:40","date_gmt":"2026-02-19T17:34:40","guid":{"rendered":"https:\/\/dbdmg.polito.it\/dbdmg_web\/?p=13575"},"modified":"2026-04-20T13:12:55","modified_gmt":"2026-04-20T11:12:55","slug":"distributed-architectures-for-big-data-processing-and-analytics-2025-2026","status":"publish","type":"post","link":"https:\/\/dbdmg.polito.it\/dbdmg_web\/2026\/distributed-architectures-for-big-data-processing-and-analytics-2025-2026\/","title":{"rendered":"Distributed architectures for big data processing and analytics (2025\/2026)"},"content":{"rendered":"\n<h2 class=\" wp-block-heading eplus-wrapper\">General Information<\/h2>\n\n\n\n<p class=\" eplus-wrapper\"><strong>SSD<\/strong>: ING-INF\/05<\/p>\n\n\n\n<p class=\" eplus-wrapper\"><strong>CFU<\/strong>: 8<\/p>\n\n\n\n<p class=\" eplus-wrapper\"><strong>Professor<\/strong>: Paolo Garza<\/p>\n\n\n\n<p class=\" eplus-wrapper\"><strong>Teaching Assistants<\/strong>: Simone Papicchio<\/p>\n\n\n\n<hr class=\" wp-block-separator has-css-opacity eplus-wrapper\"\/>\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n<h2 class=\" wp-block-heading eplus-wrapper\">Teaching material<\/h2>\n\n\n\n<h3 class=\" wp-block-heading eplus-wrapper\">Introduction<\/h3>\n\n\n<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-3c80db\">\n<li class=\" eplus-wrapper\">Introduction to the course content and exam rules (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2026\/02\/00_Intro_DistributedBigData_2526.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>) &#8211; This slide deck contains the exam rules, which are also provided in the course description (<a href=\"https:\/\/didattica.polito.it\/pls\/portal30\/gap.pkg_guide.viewGap?p_cod_ins=01TUYWS&amp;p_a_acc=2026&amp;p_header=S&amp;p_lang=IT&amp;multi=N\">link<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Introduction to Big Data (<a 
href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/02\/01_Intro_BigData_BigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Big Data Architectures (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/02\/02_Architectures_BigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/li>\n<\/ul>\n\n\n<h3 class=\" wp-block-heading eplus-wrapper\">Hadoop and MapReduce<\/h3>\n\n\n<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-9ceee1\">\n<li class=\" eplus-wrapper\">Introduction to Apache Hadoop and the MapReduce programming paradigm (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/02\/03_Intro_HadoopAndMapReduce_BigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Hadoop implementation of MapReduce (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/02\/04_HadoopImplementationOfMapReduceNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-cca65b\">\n<li class=\" eplus-wrapper\">BigData@Polito environment + Jupyter \u2013 How to submit MapReduce jobs on BigData@Polito (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2026\/02\/04b_ClusterJupyter_BigData_NewStyle.pdf\">slides<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">MapReduce \u2013 Design patterns \u2013 Part 1 (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/02\/05_MapReduce_Patterns_Part1_BigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">MapReduce and Hadoop \u2013 Advanced Topics: Multiple inputs, Multiple outputs, Distributed cache (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/02\/06_AdvancedTopicsMapReduce_BigDataNB.pdf\" target=\"_blank\" 
rel=\"noreferrer noopener\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">MapReduce \u2013 Design patterns \u2013 Part 2 (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/02\/07_MapReduce_Patterns_Part2_BigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">MapReduce \u2013 Relational Algebra\/SQL operators (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/02\/08_SQLOperators_BigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/li>\n<\/ul>\n\n\n<h3 class=\" wp-block-heading eplus-wrapper\">Spark<\/h3>\n\n\n<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-6e1f54\">\n<li class=\" eplus-wrapper\">Introduction to Apache Spark (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/10_SparkIntroduction_DistributedBigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-2c096c\">\n<li class=\" eplus-wrapper\">How to submit Spark applications (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/10b_SparkSubmit_DistributedBigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">How to use Jupyter Notebooks for your Spark applications (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/10c_JupyterNotebooks_DistributedBigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">You can install PySpark and JupyterLab using\u00a0<strong>Conda\/Miniconda\/pip<\/strong>\u00a0(<a href=\"https:\/\/github.com\/dbdmg\/pyspark-install\" target=\"_blank\" rel=\"noreferrer noopener\">instructions here<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">RDD-based programs<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-9a1e36\">\n<li class=\" eplus-wrapper\">RDDs: 
creation, basic transformations, and actions (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/11_SparkRDDBasedProgramming_DistributedBigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-7aa7c9\">\n<li class=\" eplus-wrapper\">Some examples (partially selected from the slides): Examples \u2013 Notebook (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2026\/03\/ExamplesSlides.zip\">ExamplesSlides.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Key-value RDDs: transformations and actions on key-value RDDs (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2026\/03\/12_SparkPairRDD_BigData_NewStyle.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-d4718c\">\n<li class=\" eplus-wrapper\">Inner join, left outer join, right outer join, full outer join, and \u201cNOT IN\u201d with PairRDDs: Examples \u2013 Notebook (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/04\/JoinsRDD.zip\" target=\"_blank\" rel=\"noreferrer noopener\">JoinsRDD.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">DoubleRDDs (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/13_SparkDoubleRDD_DistributedBigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Advanced Topics: Cache, accumulators, broadcast variables, custom partitioners, broadcast join (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/14_SparkRDDBasedProgramming_AdvancedTopics_DistributedBigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>) &#8211; Notebooks with some examples (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/11\/ExamplesAccumulatorPython.zip\">ExamplesAccumulatorPython.zip<\/a>)<ul 
class=\" wp-block-list eplus-wrapper eplus-styles-uid-b6ae87\">\n<li class=\" eplus-wrapper\">RDD partition examples (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/RDDPartitionsExamples.zip\" target=\"_blank\" rel=\"noreferrer noopener\">RDDPartitionsExamples.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Introduction to PageRank (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/15b_SparkIntroPageRankNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>) \u2013 Example: PageRank \u201cnaive\u201d implementation (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/RDDPageRank.zip\" target=\"_blank\" rel=\"noreferrer noopener\">RDDPageRank.zip<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Spark SQL and DataFrames<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-c00a0d\">\n<li class=\" eplus-wrapper\">Spark SQL (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/16_SparkSQL_DistributedBigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-4407a4\">\n<li class=\" eplus-wrapper\">Simple examples \u2013 Jupyter notebook (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/SparkSQLSimpleExamples.zip\" target=\"_blank\" rel=\"noreferrer noopener\">SparkSQLSimpleExamples.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Spark SQL join examples \u2013 Jupyter notebook (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/ExamplesSparkSQLJoins.zip\" target=\"_blank\" rel=\"noreferrer noopener\">ExamplesSparkSQLJoins.zip<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Data mining and Machine learning algorithms with Spark MLlib<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-c1e635\">\n<li class=\" eplus-wrapper\">Introduction and Preprocessing 
(<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/18a_SparkMLlib_DistributedBigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Classification (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/18b_SparkMLlib_DistributedBigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-0604fb\">\n<li class=\" eplus-wrapper\">Classification examples \u2013 Jupyter notebooks and sample data (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/ExampleClassificationMLlib.zip\" target=\"_blank\" rel=\"noreferrer noopener\">ExampleClassificationMLlib.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Clustering (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/18c_SparkMLlib_DistributedBigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-84144d\">\n<li class=\" eplus-wrapper\">Clustering example \u2013 Jupyter notebook and sample data (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/ExampleClusteringMLlib.zip\" target=\"_blank\" rel=\"noreferrer noopener\">ExampleClusteringMLlib.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Regression (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/18d_SparkMLlib_DistributedBigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-6220af\">\n<li class=\" eplus-wrapper\">Regression example \u2013 Jupyter notebook and sample data (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/ExampleRegressionMLlib.zip\" target=\"_blank\" rel=\"noreferrer noopener\">ExampleRegressionMLlib.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" 
eplus-wrapper\">Itemset and Association rule mining (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/18e_SparkMLlib_DistributedBigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-0f77f3\">\n<li class=\" eplus-wrapper\">Itemset and Association rule mining example \u2013 Jupyter notebook and sample data (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/ExampleItemsetMLlib.zip\" target=\"_blank\" rel=\"noreferrer noopener\">ExampleItemsetMLlib.zip<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">GraphX\/GraphFrames<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-9fd212\">\n<li class=\" eplus-wrapper\">Introduction to GraphX and GraphFrames (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/19_SparkGraphFrame_PartI_DistributedBigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Graph Algorithms with GraphFrames (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/20_SparkGraphFrame_Algorithms_DistributedBigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-6f2512\">\n<li class=\" eplus-wrapper\">Simple example \u2013 Jupyter notebook (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/GraphFrameExamples.zip\" target=\"_blank\" rel=\"noreferrer noopener\">GraphFrameExamples.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Select kernel GraphFrames (Yarn) to run it on jupyter.polito.it<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Run \u201cpyspark --packages graphframes:graphframes:0.8.1-spark3.0-s_2.12 --repositories https:\/\/repos.spark-packages.org\u201d to run it locally on your PC \u2013 Use package graphframes:graphframes:0.8.0-spark2.4-s_2.11 if you locally installed 
Spark 2 instead of Spark 3<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Streaming data analytics<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-bcb159\">\n<li class=\" eplus-wrapper\">Spark Streaming (DStreams) (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/21_SparkStreaming_DistributedBigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-2bc52d\">\n<li class=\" eplus-wrapper\">Simple examples \u2013 Jupyter notebooks (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/SparkSteamingExamples.zip\" target=\"_blank\" rel=\"noreferrer noopener\">SparkSteamingExamples.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Structured Streaming (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/22_SparkStructuredStreaming_DistributedBigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-852635\">\n<li class=\" eplus-wrapper\">Simple examples \u2013 Jupyter notebooks (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/ExampleStructutedStreaming.zip\" target=\"_blank\" rel=\"noreferrer noopener\">SparkStructutedStreamingExamples.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Introduction to other big stream processing frameworks: Apache Storm, Apache Flink, \u2026 (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/23_StreamingFrameworks_DistributedBigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul>\n\n\n<h2 class=\" wp-block-heading eplus-wrapper\">Exercises<\/h2>\n\n\n\n<h3 class=\" wp-block-heading eplus-wrapper\">MapReduce<\/h3>\n\n\n<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-eb4a2d\">\n<li class=\" eplus-wrapper\">MapReduce Exercises (<a 
href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/01_MapReduce_Exercises_BigData_NewStyle.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">slides<\/a>)<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-7c9ecf\">\n<li class=\" eplus-wrapper\">Solutions of Exercises 1-29 (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/09\/SolutionsExMapReduce.zip\" target=\"_blank\" rel=\"noreferrer noopener\">SolutionsExMapReduce.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\"has-sitetext-color has-text-color has-link-color eplus-wrapper wp-elements-c82186e8431014074925478d40a9fc09\"><mark style=\"background-color:#ece801\" class=\"has-inline-color has-sitetext-color\">How to Write and Compile your Java Application using VSCode<\/mark> (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2023\/10\/BigData_labs-VSCode_guide.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Linux or Mac: Basic project for MapReduce applications (<strong>based on Maven<\/strong>) (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/09\/MapReduceBasicProject.zip\" target=\"_blank\" rel=\"noreferrer noopener\">MapReduceBasicProject.zip<\/a>) (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2023\/10\/BigData_labs-VSCode_guide.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Windows: Basic project for MapReduce applications (<strong>based on Maven<\/strong>) (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/09\/MapReduceBasicProjectWindows.zip\" target=\"_blank\" rel=\"noreferrer noopener\">MapReduceBasicProjectWindows.zip<\/a>)<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-6c66fe\">\n<li class=\" eplus-wrapper\">How to configure the Windows environment to run MapReduce applications locally on your PC (<a 
href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/09\/ConfigureWindowsEnviroment.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">ConfigureWindowsEnviroment.pdf<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\"><strong>You must also install<\/strong>\u00a0<strong>JDK 1.8<\/strong>\u00a0and select it for the imported project inside the IDE. If you have already installed the JDK environment but the version is greater than JDK 1.8, you must also install<strong>\u00a0JDK 1.8<\/strong>.<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Winutils executable (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/09\/winutils.zip\" target=\"_blank\" rel=\"noreferrer noopener\">winutils.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n<li class=\" eplus-wrapper eplus-styles-uid-2689a9\">If you use your PC to write and run your code locally, use the projects based on Maven (those projects can be run locally).<\/li>\n\n<li class=\" eplus-wrapper eplus-styles-uid-2689a9\">If you use the PC available in the LAB, import the projects with libraries as reported in the first lab (those projects cannot be run locally, but only on the cluster by exporting the project jar file).<\/li><\/ul>\n\n\n<h3 class=\" wp-block-heading eplus-wrapper\">Spark<\/h3>\n\n\n<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-68e647\">\n<li class=\" eplus-wrapper\">Spark RDD-based exercises (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/02_Spark_Exercises_BigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-48af94\">\n<li class=\" eplus-wrapper\">Example data \u2013 One folder with (few) data for each exercise (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/ExSparkData30_46.zip\" target=\"_blank\" rel=\"noreferrer noopener\">ExSparkData.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">RDD-based solutions of Exercises 30-46 \u2013 
Jupyter notebooks (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/SolutionsExSpark30_46.zip\" target=\"_blank\" rel=\"noreferrer noopener\">SparkNotebooksSol30_46.zip<\/a>)<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-01a557\">\n<li class=\" eplus-wrapper\">Solution of Exercise 44 based on Left Outer Join (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/04\/ex44LeftOuterJoin.zip\">ex44LeftOuterJoin.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Solution of Exercise 46 based on Spark SQL APIs + RDD.groupByKey() \u2013 Example to show how to create and manage \u201cstatic windows\u201d with almost only Spark SQL APIs (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/06\/ex46_DF.zip\" target=\"_blank\" rel=\"noreferrer noopener\">ex46_DF.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">PySpark Installation Guide<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-a9a152\">\n<li class=\" eplus-wrapper\">How to run PySpark applications on your PC or Google Colab: You can install PySpark and JupyterLab using\u00a0<strong>Conda\/Miniconda\/pip<\/strong>\u00a0(<a href=\"https:\/\/github.com\/dbdmg\/pyspark-install\" target=\"_blank\" rel=\"noreferrer noopener\">instructions here<\/a>)<\/li>\n<\/ul><\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Spark SQL exercises (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/02_Spark_ExerciseSparkSQLNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-e6d56d\">\n<li class=\" eplus-wrapper\">Example data \u2013 One folder with (few) data for each exercise (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/ExSparkSQLData.zip\" target=\"_blank\" rel=\"noreferrer noopener\">ExSparkSQLData.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Solutions of Exercises 47-50 \u2013 Jupyter 
notebooks (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/SparkNotebooksSol47_50.zip\" target=\"_blank\" rel=\"noreferrer noopener\">SparkNotebooksSol47_50.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Spark MLlib exercises (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/03_MLlib_Exercises_BigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-577fba\">\n<li class=\" eplus-wrapper\">Example data \u2013 One folder with (few) data for each exercise (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/ExampleMLlibData.zip\" target=\"_blank\" rel=\"noreferrer noopener\">ExampleMLlibData.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Solutions of Exercise 51 (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/SparkNotebooksSol51.zip\" target=\"_blank\" rel=\"noreferrer noopener\">SparkNotebooksSol51.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">GraphFrame exercises (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/04_GraphFrame_Exercises_BigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-4168c3\">\n<li class=\" eplus-wrapper\">Example data \u2013 One folder with (few) data for each exercise (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/ExampleGraphFrameData.zip\" target=\"_blank\" rel=\"noreferrer noopener\">ExampleGraphFrameData.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Solutions of Exercises 52-57b \u2013 Jupyter notebooks (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/05\/SparkNotebooksSol52_57b.zip\" target=\"_blank\" rel=\"noreferrer noopener\">SparkNotebooksSol52_57b.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Spark streaming 
exercises (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/05_SparkStreaming_Exercises_BigDataNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-7845a9\">\n<li class=\" eplus-wrapper\">Example data \u2013 One folder with (few) data for each exercise (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/ExampleSparkStreamingData-1.zip\" target=\"_blank\" rel=\"noreferrer noopener\">ExampleSparkStreamingData.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Solutions of Exercises 58-65 \u2013 Jupyter notebooks (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/SparkNotebooksSol58_65.zip\" target=\"_blank\" rel=\"noreferrer noopener\">SparkNotebooksSol58_65.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Spark structured streaming and MLlib exercise (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/06_SparkStructuredStreamingAndMLlib_ExercisesNB.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<ul class=\" wp-block-list eplus-wrapper eplus-styles-uid-fb0f13\">\n<li class=\" eplus-wrapper\">Example data \u2013 One folder with (few) data for each exercise (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/ExampleSparkStructuredMLlibData.zip\" target=\"_blank\" rel=\"noreferrer noopener\">ExampleSparkStructuredMLlibData.zip<\/a>)<\/li>\n\n\n\n<li class=\" eplus-wrapper\">Solution of Exercise 66 \u2013 Jupyter notebooks (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/SparkNotebooksSol66.zip\" target=\"_blank\" rel=\"noreferrer noopener\">SparkNotebooksSol66.zip<\/a>)<\/li>\n<\/ul><\/li>\n\n\n\n<li class=\" eplus-wrapper\">Further exercises focused on Spark (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2025\/06\/SparkExercisesGruppoStudioHKN.zip\" target=\"_blank\" 
rel=\"noreferrer noopener\">zip<\/a>) shared by Nunzio Licalzi. These exercises have been used during the Study Group organized by IEEE Eta Kappa Nu.<\/li>\n<\/ul>\n\n\n<h2 class=\" wp-block-heading eplus-wrapper\">Laboratory Material<\/h2>\n\n\n<p class=\" eplus-wrapper eplus-styles-uid-2689a9\">No lab activities during the first week.<\/p>\n\n\n<p class=\" eplus-wrapper\">Team 1: Students from A to D \u2013 Tuesday from 11:30 to 13:00 (First lab activity \u2013 March 3, 2026) @&nbsp;<a href=\"https:\/\/www.labinf.polito.it\/\">LABINF<\/a><br>Team 2: Students from E to M \u2013 Friday from 11:30 to 13:00 (First lab activity \u2013 March 6, 2026) @&nbsp;<a href=\"https:\/\/www.labinf.polito.it\/\" target=\"_blank\" rel=\"noreferrer noopener\">LABINF<\/a><br>Team 3: Students from N to Z \u2013 Friday from 16:00 to 17:30 (First lab activity \u2013 March 6, 2026) @&nbsp;<a href=\"https:\/\/www.labinf.polito.it\/\" target=\"_blank\" rel=\"noreferrer noopener\">LABINF<\/a><\/p>\n\n\n<div class=\"wp-block-columns eplus-wrapper is-layout-flex wp-container-core-columns-is-layout-9d6595d7 wp-block-columns-is-layout-flex eplus-styles-uid-913d32\">\n<div class=\"wp-block-column eplus-wrapper is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:100%\"><figure class=\" wp-block-table eplus-wrapper eplus-styles-uid-af32dd\"><table><tbody><tr><td>Schedule<\/td><td>Problem specification and input data<\/td><td>Solution (Maven-based for Java)<\/td><\/tr><tr><td>Team 1: March 3, 2026 &#8211; 11:30-13:00<br>Team 2: March 6, 2026 &#8211; 11:30-13:00<br>Team 3: March 6, 2026 &#8211; 16:00-17:30<\/td><td><strong>Lab 1<\/strong>: Hadoop and MapReduce<br>Problem specification (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2025\/10\/Lab1_BigData_vscode.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<br>&#8211; Basic project with libraries and a small example dataset (<a 
href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2023\/04\/Lab1_BigData_with_libraries_vscode.zip\" target=\"_blank\" rel=\"noreferrer noopener\">Lab1_BigData_with_libraries_vscode.zip<\/a>)<br>&#8211; Basic project based on Maven \u2013 Use this version to run the MapReduce application locally on your own PC (<strong><mark>DO NOT USE IT AT LABINF<\/mark><\/strong>)<br>\u2014 Linux and macOS Maven-based project (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/10\/Lab1.zip\" target=\"_blank\" rel=\"noreferrer noopener\">Lab1.zip<\/a>)<br>\u2014 Windows Maven-based project (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/10\/Lab1Windows.zip\" target=\"_blank\" rel=\"noreferrer noopener\">Lab1_Windows.zip<\/a>)<br>Bigger dataset: finefoods_text.txt (<a href=\"https:\/\/www.dropbox.com\/s\/fswdiblx15mhmyo\/finefoods_text.zip?dl=0\" target=\"_blank\" rel=\"noreferrer noopener\">zip<\/a>)<\/td><td>Solution: <a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/10\/Lab1_SolBonusMvn.zip\" target=\"_blank\" rel=\"noreferrer noopener\">Bonus track Lab1_SolBonusMvn.zip<\/a><\/td><\/tr><tr><td>Team 1: March 10, 2026 &#8211; 11:30-13:00<br>Team 2: March 13, 2026 &#8211; 11:30-13:00<br>Team 3: March 13, 2026 &#8211; 16:00-17:30<\/td><td><strong>Lab 2<\/strong>: Filter with Hadoop MapReduce<br>Problem specification (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2026\/03\/Lab2_DBD_2025_2026.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<br>&#8211; Skeleton project Hadoop \u2014 MapReduce with libraries (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2023\/10\/Lab2_Skeleton_with_libraries_vscode.zip\" target=\"_blank\" rel=\"noreferrer noopener\">lib<\/a>)<br>&#8211; Basic Maven project (<strong><strong><strong><mark>DO NOT USE IT AT LABINF<\/mark><\/strong><\/strong><\/strong>)<br>\u2014 Linux and macOS Maven-based project (<a 
href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/10\/Lab2_Skeleton.zip\" target=\"_blank\" rel=\"noreferrer noopener\">Lab2_Skeleton.zip<\/a>)<br>\u2014 Windows Maven-based project (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/10\/Lab2Windows_Skeleton.zip\" target=\"_blank\" rel=\"noreferrer noopener\">Lab2Windows_Skeleton.zip<\/a>)<br>Outputs of the first lab (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/10\/OutputFolderLab1.zip\" target=\"_blank\" rel=\"noreferrer noopener\">OutputFolderLab1.zip<\/a>) (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/10\/OutputFolderLab1BonusTrack.zip\" target=\"_blank\" rel=\"noreferrer noopener\">OutputFolderLab1BonusTrack.zip<\/a>). You can use them to test your application locally on your PC.<\/td><td>Solution: <a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/10\/Lab2_Sol.zip\" target=\"_blank\" rel=\"noreferrer noopener\">Lab2_Sol.zip<\/a><br>Solution Bonus track: <a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/10\/Lab2_SolBonus.zip\" target=\"_blank\" rel=\"noreferrer noopener\">Lab2_SolBonus.zip<\/a><\/td><\/tr><tr><td>Team 1: March 17, 2026 &#8211; 11:30-13:00<br>Team 2: March 20, 2026 &#8211; 11:30-13:00<br>Team 3: March 20, 2026 &#8211; 16:00-17:30<\/td><td><strong>Lab 3<\/strong>: Frequently bought\/reviewed together with Hadoop and MapReduce<br>Problem specification (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2026\/03\/Lab3_DBD_2025_2026.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<br>&#8211; Skeleton project Hadoop \u2014 MapReduce with libraries (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2023\/10\/Lab3_Skeleton_with_libraries_vscode.zip\" target=\"_blank\" rel=\"noreferrer noopener\">lib<\/a>)<br>&#8211; Basic Maven project (<strong><strong><strong><mark>DO NOT USE IT AT 
LABINF<\/mark><\/strong><\/strong><\/strong>)<br>\u2014 Linux and macOS Maven-based project (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/10\/Lab3_Skeleton.zip\" target=\"_blank\" rel=\"noreferrer noopener\">Lab3_Skeleton.zip<\/a>)<br>\u2014 Windows Maven-based project (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/10\/Lab3Windows_Skeleton.zip\" target=\"_blank\" rel=\"noreferrer noopener\">Lab3Windows_Skeleton.zip<\/a>)<br>Sample data (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/10\/AmazonTransposedDataset_Sample.txt\" target=\"_blank\" rel=\"noreferrer noopener\">AmazonTransposedDataset_Sample.txt<\/a>)<\/td><td>Solution:\u00a0<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab3_DBD_Sol.zip\" target=\"_blank\" rel=\"noreferrer noopener\">Lab3_Sol.zip<\/a><br>\u2014 Comments on the three uploaded solutions (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/Lab3_DraftSolution_BigData_NewStyle.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<br>\u2014 <strong>The second solution MUST NOT BE USED<\/strong> &#8211; It is highly inefficient<\/td><\/tr><tr><td>Team 2 and Team 3: March 23, 2026 &#8211; 14:30-16:00 &#8211; <a href=\"https:\/\/www.polito.it\/mappe?bl_id=TO_CEN03&amp;fl_id=XP02&amp;rm_id=H008\" target=\"_blank\" rel=\"noreferrer noopener\">LAIB 1<br><\/a>Team 3: March 23, 2026 &#8211; 16:00-17:30 &#8211; LABINF<\/td><td><strong>Lab 4<\/strong>: Normalized ratings for product recommendations with Hadoop MapReduce<br>Problem specification (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2026\/03\/Lab4_DBD_2025_2026.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<br>&#8211; Skeleton project Hadoop \u2013 MapReduce with libraries (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2023\/10\/Lab4_Skeleton_with_libraries_vscode.zip\" target=\"_blank\" 
rel=\"noreferrer noopener\">Lab4_Skeleton_with_libraries_vscode.zip<\/a>)<br>&#8211; Basic Maven project (<strong><strong><strong><mark>DO NOT USE IT AT LABINF<\/mark><\/strong><\/strong><\/strong>)<br>\u2014 Linux and macOS Maven-based project (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/10\/Lab4_Skeleton.zip\" target=\"_blank\" rel=\"noreferrer noopener\">Lab4_Skeleton.zip<\/a>)<br>\u2014 Windows Maven-based project (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/10\/Lab4Windows_Skeleton.zip\" target=\"_blank\" rel=\"noreferrer noopener\">Lab4Windows_Skeleton.zip<\/a>)<br>Sample file (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/10\/ReviewsSample.csv\" target=\"_blank\" rel=\"noreferrer noopener\">ReviewsSample.csv<\/a>)<br>Large file (<a href=\"https:\/\/drive.google.com\/file\/d\/1CY8KURQQcZULENtx65vhCBX0zV5R7hQQ\/view?usp=sharing\" target=\"_blank\" rel=\"noreferrer noopener\">Reviews_cleaned.csv<\/a>)<\/td><td>Solution: <a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/11\/Lab4_Sol.zip\" target=\"_blank\" rel=\"noreferrer noopener\">Lab4_Sol.zip<\/a><\/td><\/tr><tr><td>Team 1: March 31, 2026 &#8211; 11:30-13:00<br>Teams 2 and 3: April 10, 2026 &#8211; 16:00-17:30<\/td><td><strong>Lab 5<\/strong>: Filter data and compute basic statistics with Apache Spark<br>Problem specification (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2026\/03\/Lab5_DBD_2025_2026.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<br>Sample file (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/03\/SampleLocalFile.csv\" target=\"_blank\" rel=\"noreferrer noopener\">SampleLocalFile.csv<\/a>)<br>Larger file (<a 
href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2026\/03\/OutputFolderLab1.zip\">OutputFolderLab1.zip<\/a>)<br>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;<br>You can install PySpark and JupyterLab on your own PC using\u00a0<strong>Conda\/Miniconda\/pip<\/strong>\u00a0(<a href=\"https:\/\/github.com\/dbdmg\/pyspark-install\" target=\"_blank\" rel=\"noreferrer noopener\">instructions here<\/a>)<br>You can also run PySpark on Google Colab (<a href=\"https:\/\/github.com\/dbdmg\/pyspark-install\" target=\"_blank\" rel=\"noreferrer noopener\">instructions here<\/a>)<\/td><td>Solution: <a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2026\/04\/Lab5_Sol.zip\" target=\"_blank\" rel=\"noreferrer noopener\">Lab5_Sol.zip<\/a><\/td><\/tr><tr><td>Team 1: April 14, 2026 &#8211; 11:30-13:00<br>Teams 2 and 3: April 17, 2026 &#8211; 16:00-17:30<\/td><td><strong>Lab 6<\/strong>: Frequently bought\/reviewed together application with Apache Spark<br>Problem specification (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2026\/04\/Lab6_DBD_2025_2026.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<br>Sample dataset (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/ReviewsSample.csv\" target=\"_blank\" rel=\"noreferrer noopener\">ReviewsSample.csv<\/a>)<br>Larger file (Reviews.csv &#8211; <a href=\"https:\/\/www.dropbox.com\/scl\/fi\/thsye3w2tbd49h3k90yie\/Reviews.zip?rlkey=9rzl7msmxh50emi5woznjaj4l&amp;st=ms09r9tn&amp;dl=0\" target=\"_blank\" rel=\"noreferrer noopener\">zip version<\/a>)<br>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;<br>You can install PySpark and JupyterLab on your own 
PC using\u00a0<strong>Conda\/Miniconda\/pip<\/strong>\u00a0(<a href=\"https:\/\/github.com\/dbdmg\/pyspark-install\" target=\"_blank\" rel=\"noreferrer noopener\">instructions here<\/a>)<br>You can also run PySpark on Google Colab (<a href=\"https:\/\/github.com\/dbdmg\/pyspark-install\" target=\"_blank\" rel=\"noreferrer noopener\">instructions here<\/a>)<\/td><td>Solution:\u00a0<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2026\/04\/Lab6_DBD_Sol.zip\" target=\"_blank\" rel=\"noreferrer noopener\">Lab6_Sol.zip<\/a><\/td><\/tr><tr><td>Team 1: April 21, 2026 &#8211; 11:30-13:00<br>Teams 2 and 3: April 24, 2026 &#8211; 16:00-17:30<\/td><td><strong>Lab 7<\/strong>: Bike sharing data analysis<br>Problem specification (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2026\/04\/Lab7_DBD_2025_2026.pdf\">pdf<\/a>)<br>Sample data (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/11\/sampleData.zip\" target=\"_blank\" rel=\"noreferrer noopener\">sampleData.zip<\/a>)<br>Example KML file (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/11\/exampleKML.zip\" target=\"_blank\" rel=\"noreferrer noopener\">exampleKML.zip<\/a>)<br><br>Complete\/whole data (<a href=\"https:\/\/www.dropbox.com\/scl\/fi\/ms3qghbnes6697x1e982n\/datiCompleti.zip?rlkey=gcu73fwvdjy97mr6to5bzzfd8&amp;st=xzgslb3j&amp;dl=0\">datiCompleti.zip<\/a>)<br><br><strong>Expected output<\/strong><br>\u2014 Execution on sample data (sampleData\/registerSample.csv and sampleData\/stations.csv) with minimum criticality threshold = 0.4 (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/11\/resSampleData0.4-1.txt\" target=\"_blank\" rel=\"noreferrer noopener\">part-00000<\/a>)<br>\u2014 Execution on complete data (\/share\/students\/bigdata\/Dati\/Lab7\/datiCompleti\/register.csv and \/share\/students\/bigdata\/Dati\/Lab7\/datiCompleti\/stations.csv) with minimum criticality threshold = 0.6 (<a 
href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2021\/11\/resAllData0.6-1.txt\" target=\"_blank\" rel=\"noreferrer noopener\">part-00000<\/a>)<br>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;<br>You can install PySpark and JupyterLab on your own PC using\u00a0<strong>Conda\/Miniconda\/pip<\/strong>\u00a0(<a href=\"https:\/\/github.com\/dbdmg\/pyspark-install\" target=\"_blank\" rel=\"noreferrer noopener\">instructions here<\/a>)<br>You can also run PySpark on Google Colab (<a href=\"https:\/\/github.com\/dbdmg\/pyspark-install\" target=\"_blank\" rel=\"noreferrer noopener\">instructions here<\/a>)<\/td><td><\/td><\/tr><\/tbody><\/table><\/figure><\/div>\n<\/div>\n\n\n<p class=\" eplus-wrapper\"><\/p>\n\n\n\n\n\n<p class=\" eplus-wrapper\"><\/p>\n\n\n\n<h2 class=\" wp-block-heading eplus-wrapper\">Previous exam examples<\/h2>\n\n\n\n<figure class=\" wp-block-table eplus-wrapper\"><table><tbody><tr><td>Exams<\/td><td>Solutions<\/td><\/tr><tr><td>Exam July 11, 2025 (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2025\/07\/DBD_Exam_2025_07_11_v2.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/td><td>Question 1: (c)<br>Question 2: (c)<br>MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2025\/07\/dbd_20250711.zip\">DBD_Exam20250711Sol.zip<\/a>)<\/td><\/tr><tr><td>Exam June 27, 2025 (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2025\/07\/DBD_Exam_2025_06_27.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/td><td>Question 1: (a)<br>Question 2: (b)<br>MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2025\/07\/dbd_20250627.zip\" target=\"_blank\" rel=\"noreferrer noopener\">DBD_Exam20250627Sol.zip<\/a>)<\/td><\/tr><tr><td>Exam February 10, 2025 (<a 
href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2025\/06\/DBD_Exam_2025_02_10.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/td><td>Question 1: (b)<br>Question 2: (a)<\/td><\/tr><tr><td>Exam September 6, 2024 (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/09\/DBD_Exam_2024_09_06.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/td><td>Question 1: (a) &#8211; The three code snippets are equivalent because they are based on commutative functions\/methods.<br>Question 2: (a) &#8211; The map phase emits 3 distinct keys. Hence, the reduce method is invoked 3 times. It follows that the sum of the values of the three instances of numCitiesD is 3.<br>MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/09\/dbd_20240906.zip\" target=\"_blank\" rel=\"noreferrer noopener\">DBD_Exam20240906Sol.zip<\/a>)<\/td><\/tr><tr><td>Exam July 19, 2024 (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/07\/DBD_Exam_2024_07_19.pdf\">pdf<\/a>)<\/td><td>Question 1: (b) &#8211; 2 times &#8211; Three actions are based on the content of the input file, but highTempRDD is cached. Hence, the input file is read once to compute the value of the count action applied to tempRDD and then one more time to compute the content of highTempRDD, which is then used to calculate the results of the actions count and reduce applied to highTempRDD. Overall, because highTempRDD is cached, the input file is read twice. 
<br>Question 2: (d) &#8211; 6 &#8211; There are 6 input lines =&gt; the map method is invoked, overall, 6 times.<br>MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/07\/dbd_20240719.zip\" target=\"_blank\" rel=\"noreferrer noopener\">DBD_Exam20240719Sol.zip<\/a>)<\/td><\/tr><tr><td>Exam July 5, 2024 (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/07\/DBD_Exam_2024_07_05.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/td><td>Question 1: (c) &#8211; Application B is not equivalent to A and C because .reduce(lambda v1,v2: min(v1, v2) ).filter(lambda value : value&gt;5) is not equivalent to .filter(lambda value : value&gt;5).reduce(lambda v1,v2: min(v1, v2) ). The two functions are not commutative.<br>Question 2: (a) &#8211; Considering all instances of the reducer class, the reduce method is invoked 3 times overall (2 + 1 + 0).<br>MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/07\/dbd_20240705.zip\" target=\"_blank\" rel=\"noreferrer noopener\">DBD_Exam20240705Sol.zip<\/a>) &#8211; <strong>A more efficient solution based on one single job has been uploaded &#8211; June 3, 2025<\/strong><br>Sketch of a solution based on SQL (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/07\/DraftSQLBased.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">SQLBasedSolution.pdf<\/a>)<\/td><\/tr><tr><td>Exam February 20, 2024 (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/06\/DBD_Exam_2024_02_20.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/td><td>Question 1: (a), Question 2: (b)<br>MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/06\/DraftSolution20240220DBD.zip\" target=\"_blank\" rel=\"noreferrer noopener\">DBD_Exam20240220Sol.zip<\/a>)<\/td><\/tr><tr><td>Exam September 18, 2023 (<a 
href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/06\/DBD_Exam20230918.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/td><td>Question 1: (c), Question 2: (c)<br>MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/07\/Draft_DBD_EXAM_20230918.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Paper-based sketch of the solution &#8211; No code_ Exam20230918.pdf<\/a>)<\/td><\/tr><tr><td>Exam July 19, 2023 (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2023\/07\/DBD_Exam20230719.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/td><td>Question 1: (a), Question 2: (b)<br>MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2023\/07\/dbd_20230719.zip\" target=\"_blank\" rel=\"noreferrer noopener\">DBD_Exam20230719Sol.zip<\/a>) &#8211; with an SQL-based solution and some example data &#8211; <strong>Updated on June 2, 2025<\/strong><\/td><\/tr><tr><td>Exam June 26, 2023 (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2023\/07\/DBD_Exam20230626.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/td><td>Question 1: (b), Question 2: (c)<br>MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2023\/07\/DBD_Exam20230626Sol.zip\" target=\"_blank\" rel=\"noreferrer noopener\">DBD_Exam20230626Sol.zip<\/a>) &#8211; with an SQL-based solution and some example data<\/td><\/tr><tr><td>Exam September 1, 2022 (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/09\/DBD_Exam20220901.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/td><td>Question 1: (b), Question 2: (d)<br>MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/09\/DBD_Exam20220901Sol.zip\" target=\"_blank\" rel=\"noreferrer noopener\">DBD_Exam20220901Sol.zip<\/a>)<\/td><\/tr><tr><td>Exam July 18, 2022 (<a 
href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/07\/DBD_Exam20220718.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/td><td>Question 1: (b), Question 2: (b)<br>MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/07\/DraftSolution20220718.zip\" target=\"_blank\" rel=\"noreferrer noopener\">DBD_Exam20220718Sol.zip<\/a>) &#8211; with an SQL-based solution &#8211; Example related to &#8220;static windows&#8221; and how to manage them with either the RDD or the Spark SQL APIs<\/td><\/tr><tr><td>Exam June 27, 2022 (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/06\/DBD_Exam20220627.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/td><td>Question 1: (c), Question 2: (a)<br>MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/06\/DBD_Exam20220627Sol.zip\" target=\"_blank\" rel=\"noreferrer noopener\">DBD_Exam20220627Sol.zip<\/a>) &#8211; with an SQL-based solution and some example data<\/td><\/tr><tr><td>Exam February 10, 2022 (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/DBD_Exam20220210.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/td><td>Question 1: (a), Question 2: (b)<br>MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/06\/DBD_Exam20220210Sol.zip\" target=\"_blank\" rel=\"noreferrer noopener\">DBD_Exam20220210Sol.zip<\/a>)<\/td><\/tr><tr><td>Exam September 17, 2021 (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/04\/DBD_Exam20210917.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/td><td>Question 1: (b), Question 2: (a)<br>MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/06\/DraftSolution20210917.zip\" target=\"_blank\" rel=\"noreferrer noopener\">DBD_Exam20210917.zip<\/a>)<\/td><\/tr><tr><td>Exam July 5, 2021 (<a 
href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2025\/04\/DBD_Exam20210705.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/td><td>Question 1: (c), Question 2: (a)<br>MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/05\/DBD_Exam20210705Sol.zip\" target=\"_blank\" rel=\"noreferrer noopener\">DBD_Exam20210705Sol.zip<\/a>) &#8211; with an SQL-based solution<\/td><\/tr><tr><td>Exam June 21, 2021 (<a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2021\/06\/DBD_Exam20210621.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/td><td>Question 1: (b), Question 2: (a)<br>MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2021\/06\/DraftSolutionExam_20210621.zip\">DBD_Exam20210621Sol.zip<\/a>)<\/td><\/tr><tr><td>Exam July 20, 2020 (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/04\/DBD_Exam20200720.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/td><td>Question 1: (d), Question 2: (b)<br>Question 2 \u2013 Note that there are three actions. 
Hence, the input file is read three times.<br>MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/05\/DBD_Exam20200720Sol.zip\" target=\"_blank\" rel=\"noreferrer noopener\">DBD_Exam20200720Sol.zip<\/a>)<\/td><\/tr><tr><td>Exam June 27, 2020 (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2024\/04\/DBD_Exam20200627.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/td><td>Question 1: (b), Question 2: (a)<br>MapReduce and Spark (<a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2020\/06\/DBD_Exam20200627Sol.zip\">DBD_Exam20200627Sol.zip<\/a>)<\/td><\/tr><tr><td>More examples of multiple choice questions (<a href=\"https:\/\/dbdmg.polito.it\/wordpress\/wp-content\/uploads\/2021\/06\/ExamplesMultipleChoiceQuestions.pdf\">pdf<\/a>)<br><\/td><td>Question 1: (c)<br>Question 2: (d)<br>Question 3: (d)<br>Question 4: (d)<br>Question 5: (b)<br>Question 6: (d)<\/td><\/tr><tr><td>GraphFrame \u2013 Examples of multiple choice questions (<a href=\"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-content\/uploads\/2022\/05\/ExamplesMultipleChoiceQuestionsGraphFrame.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">pdf<\/a>)<\/td><td>Question 1: (d)<br>Question 2: (c)<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p class=\" eplus-wrapper\"><\/p>\n\n\n\n<h3 class=\" wp-block-heading eplus-wrapper\">Additional material<\/h3>\n\n\n\n<p class=\" eplus-wrapper\">Slides and screencasts about Java (kindly provided by Prof. 
Torchiano) (<a href=\"http:\/\/dbdmg.polito.it\/~paolo\/JavaMaterials\/02JEY%20-%20Object%20Oriented%20Programming.html\">link<\/a>)<br>Focus on the following subset of slides\/lectures (for students who have never used Java):<br>&#8212; OO Paradigm and UML (The UML part is not mandatory)<br>&#8212; The Java Environment<br>&#8212;  Java Basic Features<br>&#8212; Java Inheritance<\/p>\n\n\n\n<p class=\" eplus-wrapper\">Other material about Java (<a href=\"https:\/\/softeng.polito.it\/torchiano\/JavaBook\/\" target=\"_blank\" rel=\"noreferrer noopener\">link<\/a>)<\/p>\n","protected":false},"excerpt":{"rendered":"<p>General Information SSD: ING-INF\/05 CFU: 8 Professor: Paolo Garza Teaching Assistants: Simone Papicchio Teaching material Introduction Hadoop and MapReduce Spark Exercises MapReduce Spark Laboratory Material No lab activities during the first week. Team 1: Students from A to D \u2013 Tuesday from 11:30 to 13:00 (First lab activity \u2013 March &hellip;<\/p>\n","protected":false},"author":5,"featured_media":3290,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"editor_plus_copied_stylings":"{}","footnotes":""},"categories":[37],"tags":[],"class_list":["post-13575","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-courses"],"_links":{"self":[{"href":"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-json\/wp\/v2\/posts\/13575","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-json\/wp\/v2\/comments?post=13575"}],"version-history":[{"count":81,"href":"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-json\/wp\/v2\/posts\/13575\/revisions"}],"p
redecessor-version":[{"id":14108,"href":"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-json\/wp\/v2\/posts\/13575\/revisions\/14108"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-json\/wp\/v2\/media\/3290"}],"wp:attachment":[{"href":"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-json\/wp\/v2\/media?parent=13575"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-json\/wp\/v2\/categories?post=13575"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dbdmg.polito.it\/dbdmg_web\/wp-json\/wp\/v2\/tags?post=13575"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}