DataBase and Data Mining Group

Paolo Garza

Paolo Garza, Associate professor

I have been an associate professor at the Dipartimento di Automatica e Informatica, Politecnico di Torino since December 2018. Before that, I spent three years as an assistant professor at Politecnico di Milano. I received my master’s and Ph.D. degrees in computer engineering from Politecnico di Torino.
I have coauthored more than 100 papers in the areas of data mining and machine learning. My research interests are in the fields of data mining, database systems, and big data analytics.

Research Interests

Big Data, Data Mining, Associative Classification, Textual Data Summarization, Itemset Mining, Clustering.

  • Deep Natural Language Processing
    • Yihao Ding, Lorenzo Vaiani, Soyeon Caren Han, Jean Lee, Paolo Garza, Josiah Poon, Luca Cagliero: 3MVRD: Multimodal Multi-task Multi-teacher Visually-Rich Form Document Understanding. ACL (Findings) 2024: 15233-15244 (2024) (link)
    • Daniele Rege Cambrin, Giuseppe Gallipoli, Irene Benedetto, Luca Cagliero, Paolo Garza: Beyond Accuracy Optimization: Computer Vision Losses for Large Language Model Fine-Tuning. EMNLP (Findings) 2024: 12060-12079 (2024) (link)
  • Textual Data Summarization
    • Daniele Rege Cambrin, Luca Cagliero, Paolo Garza: DQNC2S: DQN-Based Cross-Stream Crisis Event Summarizer. ECIR (3) 2024: 422-430 (2024) (link)
    • Daniele Rege Cambrin, Paolo Garza: Paraphrase Loss for Abstractive Summarization. Tiny Papers @ ICLR 2024 (2024) (link)
    • Luca Cagliero, Paolo Garza, Elena Baralis: ELSA: A Multilingual Document Summarization Algorithm Based on Frequent Itemsets and Latent Semantic Analysis. ACM Trans. Inf. Syst. 37(2): 21:1-21:33 (2019) (link)
    • Elena Baralis, Luca Cagliero, Alessandro Fiori, Paolo Garza: MWI-Sum: A Multilingual Summarizer Based on Frequent Weighted Itemsets. ACM Trans. Inf. Syst. 34(1): 5:1-5:35 (2015) (link)
  • Big Data
    • Matteo Corain, Paolo Garza, Abolfazl Asudeh: DBSCOUT: A Density-based Method for Scalable Outlier Detection in Very Large Datasets. ICDE 2021: 37-48 (2021) (link)
    • Daniele Apiletti, Elena Baralis, Tania Cerquitelli, Paolo Garza, Fabio Pulvirenti, Pietro Michiardi: A Parallel MapReduce Algorithm to Efficiently Support Itemset Mining on High Dimensional Data. Big Data Res. 10: 53-69 (2017) (link)
  • Clustering
    • Luca Colomba, Luca Cagliero, Paolo Garza: Density-Based Clustering by Means of Bridge Point Identification. IEEE Trans. Knowl. Data Eng. 35(11): 11274-11287 (2023) (link)
  • Associative Classification
    • Elena Baralis, Luca Cagliero, Paolo Garza: EnBay: A Novel Pattern-Based Bayesian Classifier. IEEE Trans. Knowl. Data Eng. 25(12): 2780-2795 (2013) (link)
    • Elena Baralis, Silvia Chiusano, Paolo Garza: A Lazy Approach to Associative Classification. IEEE Trans. Knowl. Data Eng. 20(2): 156-171 (2008) (link)
    • Elena Baralis, Paolo Garza: Majority Classification by Means of Association Rules. PKDD 2003: 35-46 (2003) (link)
    • Elena Baralis, Paolo Garza: A Lazy Approach to Pruning Classification Rules. ICDM 2002: 35-42 (2002) (link)
  • Itemset Mining
    • Luca Cagliero, Paolo Garza: Infrequent Weighted Itemset Mining Using Frequent Pattern Growth. IEEE Trans. Knowl. Data Eng. 26(4): 903-915 (2014) (link)
    • Elena Baralis, Luca Cagliero, Tania Cerquitelli, Paolo Garza: Generalized association rule mining with constraints. Inf. Sci. 194: 68-84 (2012) (link)
  • Principal investigator/Co-Principal investigator
    • Machine learning for network supervision and fault management, (2021-2021) – Funded by TIM S.P.A. – Principal investigator
    • ML4QoE (Machine Learning for QoE): re-enabling QoE for multiparty real time, (2019-2022) – Funded by Cisco Systems Inc. (USA) – Co-Principal investigator
  • Participant
    • (I-REACT) Improving Resilience to Emergencies through Advanced Cyber Technologies (2016-2019) – H2020 European project – Funded by the European Community – Data Protection Officer and Privacy and Security Manager, Task Leader
    • (ONTIC) Online Traffic Network Characterization (2014-2017) – FP7 European project – Funded by the European Community
  • Conference General Co-Chair
  • Workshop Chair
  • Local conference chair
    • The 18th IEEE International Conference on Application of Information and Communication Technologies, IEEE AICT 2024
    • 26th European Conference on Advances in Databases and Information Systems, ADBIS 2022
  • AnalytiCup Co-chair
  • Challenge Organizer
    • ChaBuD: Change detection for Burned area Delineation, Challenge co-located with ECML/PKDD 2023
  • Program Committee Member (selected conferences)
    • ACM SIGMOD 2025
    • VLDB 2025
    • EDBT 2025
    • ECML/PKDD 2025 – Journal track
    • PAKDD 2025
    • IEEE ICDM 2024 – Area Chair
    • ACM CIKM 2024
    • ECML/PKDD 2024 – Journal track
    • PAKDD 2024
    • ACM RecSys 2024 – Reproducibility Track and Demo Track
    • IEEE ICDM 2023
    • ACM SIGMOD 2023
    • ACM CIKM 2023 – I am proud to be on the list of Distinguished Referees for CIKM’23
    • ECML/PKDD 2023 – Journal track
    • ACM RecSys 2023 – Reproducibility Track
    • DSAA 2023
    • IEEE ICDM 2022
    • ACM CIKM 2022 – Short Paper Track
    • ECML/PKDD 2022 – Journal track
    • ACM CIKM 2021 – Resource Track
    • ACM KDD 2020 – Applied Data Science
    • ACM CIKM 2020 – Resource Track
  • Editorial Role for International Journals
    • Associate Editor: Knowledge and Information Systems (KAIS), Springer – Since December 2024
    • Associate Editor: Expert Systems with Applications (ESWA), Elsevier – Since August 2022
  • Reviewer for International Journals (selected journals)
    • IEEE TKDE
    • ACM TOIS
  • External reviewer
    • ACM KDD
    • IEEE ICDM