Large-scale itemset mining

Itemset mining focuses on extracting useful knowledge from huge quantities of data. Many domains must deal with ever-growing amounts of gathered data (e.g., biological data, network traffic data, textual data, streams of sensor network data, spatio-temporal data). Traditional in-core mining algorithms do not scale well to large volumes of data and are hindered by critical issues such as main-memory exhaustion and long execution times. Scalable alternative approaches have to be devised to mine data efficiently at large scale. This research activity investigates innovative approaches that exploit disk-based data structures and memory-efficient algorithms to extract frequent itemsets.
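To make the notion of a frequent itemset concrete, the following is a minimal sketch of a level-wise (Apriori-style) search that keeps all transactions in main memory. It is not the approach investigated in this research activity: it only illustrates the kind of in-core algorithm whose memory use and repeated data scans become a problem on large datasets. The function name `frequent_itemsets`, the `min_support` parameter (an absolute count), and the toy transactions are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for itemsets found in at least
    min_support transactions (illustrative in-memory sketch)."""
    # The whole dataset is held in memory: this is the scalability limit.
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = Counter(frozenset([item]) for t in transactions for item in t)
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)

    k = 2
    while current:
        # Join frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune candidates that have an infrequent (k-1)-subset.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        # One full pass over the data per level to count support.
        counts = Counter()
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        current = {s: n for s, n in counts.items() if n >= min_support}
        result.update(current)
        k += 1
    return result


# Toy example (hypothetical data): five transactions over items a, b, c.
toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
found = frequent_itemsets(toy, min_support=3)
```

Each level requires a complete scan of the transactions and an in-memory candidate set, which is why disk-based structures and memory-efficient algorithms matter once the data no longer fits in main memory.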

Technical reports

Datasets

Real datasets

Synthetic datasets

 

Publications

Elena Baralis, Tania Cerquitelli, Silvia Chiusano: A persistent HY-Tree to efficiently support itemset mining on large datasets. SAC 2010: 1060-1064

Master thesis

Alberto Grand. Master Thesis. Index support for itemset mining. Joint double-degree program between Politecnico di Torino and University of Illinois at Chicago. Master of Science in Electrical and Computer Engineering. November 2009 (pdf)


 © 2024 - DataBase and Data Mining Group