Tania Cerquitelli
Born in Atri (Italy), on 04/08/1978
Telephone: (+39) 011 090 7178
E-mail: tania.cerquitelli @ polito.it
Professional
Post-doctoral researcher in “Information and System Engineering” at the Dipartimento di Automatica e Informatica of the Politecnico di Torino since January 2007.
Education
Research activity
My research focuses on large-scale data mining, specifically on innovative disk-based data structures and data retrieval algorithms, novel data mining algorithms, and compact disk-based representations of the extracted knowledge that efficiently support and speed up large-scale data mining.
During my PhD I worked on fully integrating the IMine index into the PostgreSQL DBMS kernel. IMine is a novel data structure that provides a compact and complete representation of transactional data and supports efficient itemset extraction from a relational DBMS. It is a general structure that can be exploited efficiently by different itemset extraction algorithms (e.g., FP-growth, LCM v.2). To reduce the I/O cost, data accessed together during the same extraction phase are clustered on the same disk block. Furthermore, data access functions have been devised to load the index data into memory efficiently. The main research results have been published in IEEE ICDE’05 and IEEE TKDE vol. 21, no. 4, 2009.
I am currently working on the design and development of new strategies for efficient large-scale data mining. To this end, I am studying innovative approaches that exploit disk-based data structures and novel data mining algorithms. My first research result is a new disk-based structure, called HY-Tree, which compactly represents very large databases (i.e., databases not manageable by state-of-the-art approaches). It is characterized by a hybrid structure that easily adapts to different data distributions. HY-Tree effectively supports the data retrieval step of the itemset mining process by reducing both the I/O cost and the memory required for data loading across different algorithms (e.g., LCM v.2, nonordFP). HY-Tree has been described in a paper accepted at the ACM SAC ’10 data mining track.
Another research area I have addressed is data mining on sensor readings. Two different issues have been investigated: (i) analysis of clinical data to detect dangerous situations, and (ii) analysis of sensor readings to reduce the query collection cost, in terms of both energy and bandwidth consumption. The main research results have been published in IEEE Transactions on Information Technology in Biomedicine, vol. 13, 2009, and in a chapter of the IGI Global book entitled “Intelligent Techniques for Warehousing and Mining Sensor Network Data”, 2009.
I have also devoted part of my research activity to the design and implementation of the NetMine framework, which characterizes network traffic. NetMine performs the complete knowledge discovery process, from network traffic capture to in-depth data analysis. It is able to perform (i) on-line stream analysis to aggregate and filter network traffic, (ii) refinement analysis to discover relationships among captured data, and (iii) rule classification into different semantic groups. The main research results have been published in the Computer Networks journal, vol. 53, issue 6, 2009.
Finally, I designed and implemented the CAS-MINE framework, which provides personalized services in context-aware applications by means of generalized rules. The rule generalization process is driven by a lazy evaluation of user-provided taxonomies on different attributes (e.g., a geographic hierarchy on spatial coordinates, a classification of provided services). Extracted rules are exploited to tailor service supply both to the user and to the situation in which the user is involved. The main research results have been presented in a paper submitted to the KAIS journal Special Issue on Context-aware Data Mining.
The list of publications can be found here.
Scholarly Service
Chair for:
- Data management and exploitation track in the IEEE Mexican International Conference on Computer Science (ENC’09)
Reviewer for:
- Data & Knowledge Engineering Journal (2010)
- Performance Evaluation journal, Elsevier (2010)
- Special Issue of Computer Networks on "Traffic classification and its applications to modern networks". Elsevier (2008)
- Post-Mining of Association Rules: Techniques for Effective Knowledge Extraction. IGI Global book (2008)
Program committee memberships of:
- First International Workshop on Data Warehousing and Knowledge Discovery from Sensors and Streams (DKSS 2009)
- International Workshop on Data and Services Management in Mobile Environments (DS2ME’08) - ICDE Workshop 2008
- Mexican International Conference on Computer Science (ENC'08)
- International Applied Computing Conference 2006
External program committee memberships for:
- ACM Symposium on Applied Computing (SAC'10, SAC'09, SAC'07, SAC'06)
- European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD'05)
- European Conference on Parallel and Distributed Computing (Euro-Par'07)
- International Conference on Data Warehousing and Knowledge Discovery (DaWaK'09, DaWaK'07, DaWaK'06, DaWaK'05)
- International Conference on Advanced Data Mining and Applications (ADMA'05)
- Italian Symposium on Advanced Database Systems (SEBD’09, SEBD’07, SEBD’06, SEBD’05)
Teaching Experience
- Teaching Assistant - “Database design and tuning” – curriculum in Computer Engineering, since academic year 2004-2005, Politecnico di Torino (in English), Turin.
- Teaching Assistant - “Database system technology” – curriculum in Computer Engineering, since academic year 2006-2007 – Politecnico di Torino, Turin.
- Teaching Assistant - “Data Warehousing and Data Mining” – curriculum in Computer Engineering, academic year 2007-2008 – Politecnico di Torino, Turin.
- Teaching Assistant - “Database management system” – curriculum in Management Engineering, since academic year 2005-2006 – Politecnico di Torino, Turin.
- Teaching Assistant - “Database” – Master in E-Business and ICT for Management, since academic year 2006-2007 – Politecnico di Torino (in English), Turin.
- Teaching Assistant - “Computer networks I” – distance learning curriculum in Computer Engineering, since academic year 2004-2005 – Ce.Te.M, Politecnico di Torino, Turin.
- Teaching Assistant - “Database systems” – distance learning curriculum in Computer Engineering and Management Engineering, since academic year 2007-2008 – Università Telematica Internazionale, Uninettuno, Rome.
- Teaching Assistant - “Information systems” – distance learning curriculum in Computer Engineering and Management Engineering, since academic year 2007-2008 – Università Telematica Internazionale, Uninettuno, Rome.
- Teaching Assistant - “Software engineering” – distance learning curriculum in Computer Engineering and Management Engineering, since academic year 2008-2009 – Università Telematica Internazionale, Uninettuno, Rome.
- Teaching Assistant - “Data mining: Concepts and algorithms” – PhD School in Computer Engineering, since academic year 2007-2008 – Politecnico di Torino (in English), Turin.
- Teaching Assistant - “Index Support for Rule Extraction” – 5th Franco-Mexican Summer School on Distributed Systems. Ensenada, Baja California, August 2006 (in English)
Honors / Awards
Doctorate Research Fellowship. Granted by the Politecnico di Torino, Italy. 2004 – 2006
Best Student in the Master in Computer Sciences. Granted by the Universidad de las Américas Puebla. July 2003
International Exchange Fellowship. Granted by the Regione Piemonte, Italy: exchange student at the Universidad de las Américas Puebla. July 2002
Tutoring scholarship. Politecnico di Torino, Italy. 1999 – 2002
Undergraduate and Graduate Scholarships. Granted by the Regione Piemonte, Italy. 1997 – 2003
Background / Interests
Interests include economics and finance, game theory, travelling, reading, Latin dancing, swimming, and skiing.