Seminars - Abstract

Department of Computer Science & Computer Engineering

Topic:   Robustness for Evaluating Rule's Generalization Capability in Data Mining
Speaker:   Dr. Justin Wang, Department of Computer Science & Computer Engineering
Date:   17-05-2004
Time:   3:00 PM
Venue:   HS2 223
Abstract:   The evaluation of production rules generated by different data mining algorithms currently depends upon the data set used; thus, their generalization capability cannot be estimated independently of that data. We propose a method for estimating it, consisting of three steps. Firstly, we take a set of rules, copy these rules into a population of rules, and then perturb the parameters of individuals in this population. Secondly, the maximum robustness bounds for the rules are found using genetic algorithms, where the performance of each individual is measured with respect to the training data. Finally, the relationship between maximum robustness bounds and generalization capability is constructed using statistical analysis for a large number of rules. The significance of this relationship is that it allows the algorithms that mine rules to be compared in terms of robustness bounds, independent of the test data. This technique is applied in a case study to a protein sequence classification problem.
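
The first two steps of the abstract can be illustrated with a small sketch. It is not the speaker's implementation: the abstract does not define the rule representation, the perturbation scheme, or the robustness bound, so everything below is an assumption made for illustration. A rule is taken to be a single interval test (lo <= x <= hi), the robustness bound is taken to be the size of the smallest parameter perturbation that lowers training accuracy by more than a tolerance, and the names rule_accuracy, perturbation_size, and estimate_robustness_bound are hypothetical. The third step (the statistical analysis relating bounds to generalization) is not covered.

```python
import random


def rule_accuracy(rule, data):
    """Fraction of (x, label) pairs that the interval rule classifies correctly."""
    lo, hi = rule
    correct = sum((lo <= x <= hi) == label for x, label in data)
    return correct / len(data)


def perturbation_size(rule, base):
    """Largest absolute shift of any parameter relative to the base rule."""
    return max(abs(r - b) for r, b in zip(rule, base))


def estimate_robustness_bound(base, data, tolerance=0.0, pop_size=30,
                              generations=40, max_shift=2.0, step=0.1, seed=0):
    """Estimate the smallest perturbation that degrades training accuracy.

    Assumed definition (not taken from the abstract): a perturbed rule
    "breaks" when its training accuracy falls more than `tolerance` below
    that of the base rule. The result is an upper estimate of the bound.
    """
    rng = random.Random(seed)
    base_acc = rule_accuracy(base, data)

    def cost(rule):
        # Lower is better: we want the smallest perturbation that breaks the rule.
        breaks = rule_accuracy(rule, data) < base_acc - tolerance
        return perturbation_size(rule, base) if breaks else float("inf")

    def mutate(rule, scale):
        return tuple(p + rng.uniform(-scale, scale) for p in rule)

    # Step 1: copy the rule into a population and perturb each copy.
    population = [mutate(base, max_shift) for _ in range(pop_size)]

    # Step 2: a simple genetic algorithm with elitism, uniform crossover,
    # and small mutations, scored against the training data only.
    for _ in range(generations):
        population.sort(key=cost)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = tuple(rng.choice(pair) for pair in zip(a, b))
            children.append(mutate(child, step))
        population = parents + children

    return min(cost(rule) for rule in population)


# Toy training data (illustrative only): x is positive when 3 <= x <= 6.
data = [(x, 3 <= x <= 6) for x in range(10)]
base_rule = (2.5, 6.5)
bound = estimate_robustness_bound(base_rule, data)
```

In this toy setting each threshold can move by just under 0.5 before a training point is misclassified, so the estimate returned cannot be smaller than 0.5. A larger estimate for one rule than for another would, under the assumptions above, indicate the more robust rule without reference to any test data.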
Last Updated: 14 October, 2009