
Thesis Defense - Kurgan

Meta Mining System for Supervised Learning
Lukasz Kurgan
Computer Science PhD Candidate
5/8/2003
3:00pm-5:00pm

Supervised inductive machine learning is one of several powerful methodologies that can be used to perform a Data Mining task. Data Mining aims to find previously unknown, implicit patterns that are hidden in large data sets. These patterns describe potentially valuable knowledge.

Data Mining techniques have focused on finding knowledge, often expressed in terms of rules, directly from data. More recently, a new Data Mining concept, called Meta Mining, was introduced. It generates knowledge using a two-step procedure: first, meta-data is generated from the input data; next, the meta-data is used to generate meta-rules that constitute the final data model.
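
The two-step procedure can be illustrated with a minimal Python sketch. This is a toy illustration only, not the algorithm developed in the dissertation: the function names (induce_rules, meta_mine), the rule format, the min_support parameter, and the sample data are all assumptions made for the example. Here the meta-data is a set of simple rules induced separately from each subset of the input data, and the meta-rules are the rules that recur in enough subsets.

```python
from collections import Counter


def induce_rules(rows, label_key="label"):
    """Toy rule inducer (hypothetical): emit (attribute, value, label)
    whenever every row with that attribute value carries the same label."""
    labels_seen = {}
    for row in rows:
        for attr, value in row.items():
            if attr == label_key:
                continue
            labels_seen.setdefault((attr, value), set()).add(row[label_key])
    return {
        (attr, value, next(iter(labels)))
        for (attr, value), labels in labels_seen.items()
        if len(labels) == 1
    }


def meta_mine(subsets, min_support=2):
    """Two-step sketch: build meta-data, then derive meta-rules from it."""
    # Step 1: generate meta-data, here one rule set per subset of the input.
    meta_data = [induce_rules(subset) for subset in subsets]
    # Step 2: generate meta-rules, here the rules found in at least
    # min_support subsets (an assumed, simplistic selection criterion).
    counts = Counter(rule for rules in meta_data for rule in rules)
    return sorted(rule for rule, n in counts.items() if n >= min_support)


# Made-up supervised data, already split into two subsets.
subsets = [
    [
        {"outlook": "sunny", "windy": "yes", "label": "play"},
        {"outlook": "rainy", "windy": "yes", "label": "stay"},
    ],
    [
        {"outlook": "sunny", "windy": "no", "label": "play"},
        {"outlook": "rainy", "windy": "no", "label": "stay"},
    ],
]

meta_rules = meta_mine(subsets)
for attr, value, label in meta_rules:
    print(f"IF {attr} = {value} THEN {label}")
```

In this sketch the final model contains only the rules on "outlook", because the "windy" attribute never separates the classes within a subset. The point of the example is the shape of the pipeline (data, then meta-data, then meta-rules), not the particular rule inducer.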

In this dissertation we examine a new approach to knowledge generation that combines supervised inductive learning methodologies with Meta Mining. We propose a novel data mining system, called MetaSqueezer, for extracting useful patterns that carry new information about the input supervised data set. The major contribution of this thesis is the design and development of this system, supported by extensive benchmarking results. Two key advantages of the system are its scalability, which results from its linear complexity, and the high compactness of the user-friendly data models that it generates. These two features make it suitable for applications that use megabytes, or even gigabytes, of data.

The usefulness of the system is evaluated both theoretically and empirically, through thorough testing. The results show that the system generates very compact data models. They also confirm the linear complexity of the system, which makes it highly applicable to real data. Results of applying the system to cystic fibrosis data are also provided; the domain experts evaluated these results as very useful.

Committee: Krzysztof Cios, University of Colorado at Denver (Chair)
Andrzej Ehrenfeucht, Professor
Clayton Lewis, Professor
Dennis Lezotte, CU School of Medicine
James Martin, Associate Professor
Department of Computer Science
University of Colorado Boulder
Boulder, CO 80309-0430 USA