DataBase and Data Mining Group

Paolo Garza

Paolo Garza, Associate Professor

I have been an associate professor at the Dipartimento di Automatica e Informatica, Politecnico di Torino since December 2018. Before that, I spent three years as an assistant professor at Politecnico di Milano. I received my master’s and Ph.D. degrees in computer engineering from Politecnico di Torino.
I have coauthored more than 100 papers in the areas of data mining and machine learning. My research interests are in the fields of data mining, database systems, and big data analytics.

Research Interests

Big Data, Data Mining, Associative Classification, Textual Data Summarization, Itemset Mining, Clustering.

  • Deep Natural Language Processing
    • Lorenzo Vaiani, Yihao Ding, Luca Cagliero, Jean Lee, Paolo Garza, Josiah Poon, Soyeon Caren Han: KIEPrompter: Leveraging Lightweight Models’ Predictions for Cost-Effective Key Information Extraction using Vision LLMs. CIKM 2025: 2925-2934 (2025) (link)
    • Yihao Ding, Lorenzo Vaiani, Soyeon Caren Han, Jean Lee, Paolo Garza, Josiah Poon, Luca Cagliero: 3MVRD: Multimodal Multi-task Multi-teacher Visually-Rich Form Document Understanding. ACL (Findings) 2024: 15233-15244 (2024) (link)
    • Daniele Rege Cambrin, Giuseppe Gallipoli, Irene Benedetto, Luca Cagliero, Paolo Garza: Beyond Accuracy Optimization: Computer Vision Losses for Large Language Model Fine-Tuning. EMNLP (Findings) 2024: 12060-12079 (2024) (link)
  • Textual Data Summarization
    • Daniele Rege Cambrin, Luca Cagliero, Paolo Garza: DQNC2S: DQN-Based Cross-Stream Crisis Event Summarizer. ECIR (3) 2024: 422-430 (2024) (link)
    • Daniele Rege Cambrin, Paolo Garza: Paraphrase Loss for Abstractive Summarization. Tiny Papers @ ICLR 2024 (2024) (link)
    • Luca Cagliero, Paolo Garza, Elena Baralis: ELSA: A Multilingual Document Summarization Algorithm Based on Frequent Itemsets and Latent Semantic Analysis. ACM Trans. Inf. Syst. 37(2): 21:1-21:33 (2019) (link)
    • Elena Baralis, Luca Cagliero, Alessandro Fiori, Paolo Garza: MWI-Sum: A Multilingual Summarizer Based on Frequent Weighted Itemsets. ACM Trans. Inf. Syst. 34(1): 5:1-5:35 (2015) (link)
  • Machine learning applied to Earth Observation
    • Daniele Rege Cambrin, Eleonora Poeta, Eliana Pastor, Isaac Corley, Tania Cerquitelli, Elena Baralis, Paolo Garza: HydroChronos: Forecasting Decades of Surface Water Change. SIGSPATIAL/GIS 2025: in press (2025) (link) – Among the Best Paper Candidates (link)
    • Daniele Rege Cambrin, Luca Colomba, Paolo Garza: Magnifier: A Multigrained Neural Network-Based Architecture for Burned Area Delineation. IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens. 18: 12263-12277 (2025) (link)
    • Daniele Rege Cambrin, Luca Colomba, Paolo Garza: CaBuAr: California Burned Areas dataset for delineation [Software and Data Sets]. IEEE Geoscience and Remote Sensing Magazine, vol. 11, no. 3: 106-113 (2023) (link)
    • Luca Colomba, Alessandro Farasin, Simone Monaco, Salvatore Greco, Paolo Garza, Daniele Apiletti, Elena Baralis, Tania Cerquitelli: A Dataset for Burned Area Delineation and Severity Estimation from Satellite Imagery. CIKM 2022: 3893-3897 (2022) (link)
  • Big Data
    • Matteo Corain, Paolo Garza, Abolfazl Asudeh: DBSCOUT: A Density-based Method for Scalable Outlier Detection in Very Large Datasets. ICDE 2021: 37-48 (2021) (link)
    • Daniele Apiletti, Elena Baralis, Tania Cerquitelli, Paolo Garza, Fabio Pulvirenti, Pietro Michiardi: A Parallel MapReduce Algorithm to Efficiently Support Itemset Mining on High Dimensional Data. Big Data Res. 10: 53-69 (2017) (link)
  • Clustering
    • Luca Colomba, Luca Cagliero, Paolo Garza: Density-Based Clustering by Means of Bridge Point Identification. IEEE Trans. Knowl. Data Eng. 35(11): 11274-11287 (2023) (link)
  • Associative Classification
    • Elena Baralis, Luca Cagliero, Paolo Garza: EnBay: A Novel Pattern-Based Bayesian Classifier. IEEE Trans. Knowl. Data Eng. 25(12): 2780-2795 (2013) (link)
    • Elena Baralis, Silvia Chiusano, Paolo Garza: A Lazy Approach to Associative Classification. IEEE Trans. Knowl. Data Eng. 20(2): 156-171 (2008) (link)
    • Elena Baralis, Paolo Garza: Majority Classification by Means of Association Rules. PKDD 2003: 35-46 (2003) (link)
    • Elena Baralis, Paolo Garza: A Lazy Approach to Pruning Classification Rules. ICDM 2002: 35-42 (2002) (link)
  • Pattern Mining
    • Luca Colomba, Luca Cagliero, Paolo Garza: Discovering SpatioTemporally Invariant Event Patterns From Mobility Data. IEEE Trans. Intell. Transp. Syst. 26(10): 15309-15322 (2025) (link)
    • Luca Colomba, Luca Cagliero, Paolo Garza: Mining spatiotemporally invariant patterns. SIGSPATIAL/GIS 2022: 63:1-63:4 (2022) (link)
    • Luca Cagliero, Paolo Garza: Infrequent Weighted Itemset Mining Using Frequent Pattern Growth. IEEE Trans. Knowl. Data Eng. 26(4): 903-915 (2014) (link)
    • Elena Baralis, Luca Cagliero, Tania Cerquitelli, Paolo Garza: Generalized association rule mining with constraints. Inf. Sci. 194: 68-84 (2012) (link)
  • Principal investigator/Co-Principal investigator
    • Machine learning for network supervision and fault management (2021-2021) – Funded by TIM S.P.A. – Principal investigator
    • ML4QoE (Machine Learning for QoE): re-enabling QoE for multiparty real time (2019-2022) – Funded by Cisco Systems Inc. (USA) – Co-Principal investigator
  • Participant
    • (I-REACT) Improving Resilience to Emergencies through Advanced Cyber Technologies (2016-2019) – H2020 European project – Funded by the European Community – Data Protection Officer and Privacy and Security Manager, Task Leader
    • (ONTIC) Online Traffic Network Characterization (2014-2017) – FP7 European project – Funded by the European Community
  • Conference General Co-Chair
  • Workshop Chair
  • Local conference chair
    • The 18th IEEE International Conference on Application of Information and Communication Technologies, IEEE AICT 2024
    • 26th European Conference on Advances in Databases and Information Systems, ADBIS 2022
  • AnalytiCup Co-chair
  • Challenge Organizer
    • ChaBuD: Change detection for Burned area Delineation, Challenge co-located with ECML/PKDD 2023
  • Program Committee Member (selected conferences)
    • VLDB 2027, VLDB 2026, VLDB 2025
    • ACM SIGMOD 2027, ACM SIGMOD 2025, ACM SIGMOD 2023
    • IEEE ICDE 2026
    • WWW 2026 (Track User Modeling, Personalization and Recommendation)
    • ECML/PKDD 2026 (Journal track), ECML/PKDD 2025 (Journal track), ECML/PKDD 2024 (Journal track), ECML/PKDD 2023 (Journal track), ECML/PKDD 2022 (Journal track)
    • PAKDD 2026, PAKDD 2025, PAKDD 2024
    • IEEE ICDM 2025, IEEE ICDM 2024 (Area Chair), IEEE ICDM 2023, IEEE ICDM 2022
    • ACM CIKM 2025, ACM CIKM 2024, ACM CIKM 2023 (I am proud to be included on the Distinguished Referees list for CIKM 2023), ACM CIKM 2022 (Short Paper Track), ACM CIKM 2021 (Resource Track), ACM CIKM 2020 (Resource Track)
    • EDBT 2025 (I am proud to be included on the Program Committee Honorable Mentions list for EDBT 2025)
    • ACM RecSys 2025 (Reproducibility Track), ACM RecSys 2024 (Reproducibility Track and Demo Track), ACM RecSys 2023 (Reproducibility Track)
    • ACM KDD 2020 (Applied Data Science)
  • Editorial Role for International Journals
    • Associate Editor: Knowledge and Information Systems (KAIS), Springer – Since December 2024
    • Associate Editor: Expert Systems with Applications (ESWA), Elsevier – Since August 2022
  • Reviewer for International Journals (selected journals)
    • IEEE TKDE
    • ACM TOIS