Computer Science PhD Candidate

9/30/1996

9:30am-11:30am

When many possible input variables to a statistical model exist, removing unimportant inputs can improve the model's performance significantly. A new method for selecting input variables is proposed. Components for the proposed method include:

Mutual information as a relevance measure

Kernel density estimation for estimating probabilities

Forward selection as an input variable search method

Analysis of mutual information shows that it is a natural measure of input variable relevance. It is a more general measure of input variable relevance than expected conditional variance. Under certain conditions, the two measures order the relevance of input variable subsets in precisely the same manner, but these conditions do not generally hold. An unbiased approximation to mutual information exists, but it is unbiased only if the underlying probabilities are exact.
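For reference, the quantities discussed above can be written out as follows. The first line is the standard definition of mutual information between an input subset X and a target Y. The second line is one well-known case in which mutual information and conditional variance give the same ordering, namely jointly Gaussian (X, Y); it is offered only as an illustration, since this abstract does not state the thesis's own conditions. The third line is a sample-average approximation over N data points, which is unbiased when the densities are exact; whether this is the approximation analyzed in the thesis is an assumption.

```latex
% Standard definition, with joint density p(x, y) and marginals p(x), p(y)
I(X;Y) = \iint p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}\,dx\,dy

% Jointly Gaussian case: Var(Y | X) does not depend on x, and
I(X;Y) = \frac{1}{2}\,\log\frac{\operatorname{Var}(Y)}{\operatorname{Var}(Y \mid X)}

% Sample-average approximation over data points (x_i, y_i), i = 1..N
\hat{I}(X;Y) = \frac{1}{N}\sum_{i=1}^{N}\log\frac{p(x_i,y_i)}{p(x_i)\,p(y_i)}
```

In the Gaussian case, a subset with larger mutual information is exactly a subset with smaller conditional variance, so the two rankings coincide. In practice the densities in the third line must themselves be estimated, which is where the bias enters.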

Analysis of kernel density estimation shows that the accuracy of mutual information estimates depends directly on how densely the data points populate the input space. However, for a range of explored problems, the relative ordering of mutual information estimates remains correct, despite inaccuracies in individual estimates.
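As a rough illustration of how kernel density estimates can feed a mutual information estimate, the following Python sketch averages the log density ratio over the data points. The Gaussian kernel, the single fixed `bandwidth`, the standardization step, and all function names are assumptions made for this example, not details taken from the thesis.

```python
import math


def standardize(values):
    """Rescale to zero mean and unit variance so one bandwidth suits every axis."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]


def kde_log_density(points, bandwidth):
    """Log of a Gaussian-kernel density estimate, evaluated at each sample point.

    `points` is a list of equal-length tuples. The kernel is an isotropic
    Gaussian whose standard deviation is `bandwidth` (an assumed, fixed value).
    """
    n, dim = len(points), len(points[0])
    log_norm = -0.5 * dim * math.log(2.0 * math.pi * bandwidth ** 2)
    result = []
    for p in points:
        total = 0.0
        for q in points:
            sq_dist = sum((a - b) ** 2 for a, b in zip(p, q))
            total += math.exp(-0.5 * sq_dist / bandwidth ** 2)
        # total >= 1 because each point contributes its own kernel, so log is safe.
        result.append(math.log(total / n) + log_norm)
    return result


def mutual_information(x, y, bandwidth=0.3):
    """Estimate I(X;Y) in nats as the sample mean of log p(x,y) / (p(x) p(y))."""
    x, y = standardize(x), standardize(y)
    log_xy = kde_log_density(list(zip(x, y)), bandwidth)
    log_x = kde_log_density([(v,) for v in x], bandwidth)
    log_y = kde_log_density([(v,) for v in y], bandwidth)
    return sum(a - b - c for a, b, c in zip(log_xy, log_x, log_y)) / len(x)
```

Because every density is estimated from the same finite sample and smoothed by the kernel, individual values returned by `mutual_information` are inexact. A strongly relevant input should nevertheless score clearly higher than an irrelevant one, which is the ordering property described above.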

Analysis of forward selection explores how much data is needed to select a given number of relevant input variables. It is shown that the amount of required data increases roughly exponentially with the number of relevant input variables to be selected. It is also shown that bootstrapping the data reduces the chances of forward selection ending up in a local minimum.
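The greedy search itself can be sketched as below: starting from an empty set, each step adds the candidate input that most increases the estimated mutual information between the selected inputs and the target. This is a minimal sketch under stated assumptions, not the thesis's implementation. The fixed-bandwidth Gaussian kernel estimator, the `min_gain` stopping threshold, and all names are hypothetical, inputs and target are assumed to be standardized already, and the bootstrapping step is not shown because the abstract does not describe it.

```python
import math


def kde_log_density(points, bandwidth):
    """Log Gaussian-kernel density estimate at each sample point (list of tuples)."""
    n, dim = len(points), len(points[0])
    log_norm = -0.5 * dim * math.log(2.0 * math.pi * bandwidth ** 2)
    result = []
    for p in points:
        total = 0.0
        for q in points:
            sq_dist = sum((a - b) ** 2 for a, b in zip(p, q))
            total += math.exp(-0.5 * sq_dist / bandwidth ** 2)
        result.append(math.log(total / n) + log_norm)
    return result


def mutual_information(rows, target, bandwidth=0.3):
    """Estimate I(X;Y) in nats; `rows` holds one tuple of input values per sample."""
    joint = [tuple(r) + (t,) for r, t in zip(rows, target)]
    log_xy = kde_log_density(joint, bandwidth)
    log_x = kde_log_density([tuple(r) for r in rows], bandwidth)
    log_y = kde_log_density([(t,) for t in target], bandwidth)
    return sum(a - b - c for a, b, c in zip(log_xy, log_x, log_y)) / len(target)


def forward_selection(columns, target, max_inputs, min_gain=0.05):
    """Greedily pick column indices that maximize estimated mutual information.

    `columns[j]` is the list of values of candidate input j. The search stops
    after `max_inputs` picks, or earlier if the best candidate improves the
    score by less than `min_gain` (a hypothetical stopping rule).
    """
    selected, best_score = [], 0.0
    remaining = list(range(len(columns)))
    while remaining and len(selected) < max_inputs:
        scored = []
        for j in remaining:
            # Build one row per sample from the already selected inputs plus j.
            rows = list(zip(*(columns[k] for k in selected + [j])))
            scored.append((mutual_information(rows, target), j))
        score, choice = max(scored)
        if score - best_score < min_gain:
            break
        selected.append(choice)
        remaining.remove(choice)
        best_score = score
    return selected
```

Each added input raises the dimension of the density being estimated, so the same number of data points covers the space more thinly. This is consistent with the data requirement discussed above, and it is why the sketch caps the subset size with `max_inputs`.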

Finally, the method is compared to two connectionist methods for input variable selection: Sensitivity Based Pruning and Automatic Relevance Determination. It is shown that the new method outperforms these two when the number of independent candidate input variables is large. However, the method requires the number of relevant input variables to be relatively small. These results are confirmed on a number of real-world prediction problems, including the prediction of energy consumption in a building, the prediction of heart rate in a patient with sleep apnea, and the prediction of wind force in a wind turbine.

Committee:

Andreas Weigend, Assistant Professor (Chair)

Michael Mozer, Associate Professor

Clayton Lewis, Professor

Kelvin Wagner, Department of Electrical and Computer Engineering

Richard Holley, Department of Mathematics

Department of Computer Science

University of Colorado Boulder

Boulder, CO 80309-0430 USA

webmaster@cs.colorado.edu
