Department of Computer Science University of Colorado Boulder
home · events · colloquia · 2003-2004 · 

Colloquium - Grudic

ECCR 265

Regression and Classification Models with Probabilistic Confidence Estimates
Gregory Z. Grudic
Department of Computer Science

Most regression and classification models output only a predicted target value or class and make no attempt to give a probabilistic confidence estimate for that output. In many applications, however, such estimates are important. Consider a medical diagnosis problem where the goal is to predict whether a patient has a disease given a set of symptoms or test results. Here a YES or NO decision is not nearly as useful as a probabilistic estimate of how likely it is that the patient has the disease: if it is highly likely that the patient has cancer, perhaps she should be treated immediately; otherwise, more tests might be in order. In fact, the general framework of reasoning under uncertainty using utility functions depends on having good estimates of such conditional classification probabilities. Similarly, in regression, it is much more useful to know the probability that future values fall within some interval than to have a single predicted value.
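To illustrate why a probability is more useful than a bare YES or NO, here is a minimal sketch of choosing an action by expected utility. It is not taken from the talk: the action names, the utility values, and the functions `expected_utilities` and `best_action` are all hypothetical, chosen only to show the mechanics.

```python
# Hypothetical utilities, invented for illustration only.
# Each action maps to (utility if disease present, utility if disease absent).
UTILITIES = {
    "treat_now": (100.0, -20.0),
    "more_tests": (40.0, -2.0),
}


def expected_utilities(p_disease, utilities):
    """Return the expected utility of each action given P(disease)."""
    if not 0.0 <= p_disease <= 1.0:
        raise ValueError("p_disease must lie in [0, 1]")
    return {
        action: p_disease * u_present + (1.0 - p_disease) * u_absent
        for action, (u_present, u_absent) in utilities.items()
    }


def best_action(p_disease, utilities):
    """Pick the action with the highest expected utility."""
    scores = expected_utilities(p_disease, utilities)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        print(p, best_action(p, UTILITIES))
```

With these made-up numbers, a high estimated probability favors immediate treatment and a low one favors further testing. A classifier that returns only a label gives the decision rule nothing to weigh.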


We propose a probabilistic regression and classification framework for basis function models, which include widely used kernel methods such as Support Vector Machines. In the case of regression, we present a theoretical framework for obtaining point-specific estimates of the probability that the true output is within some user-specified range. In the case of classification, our framework estimates the point-specific probability that the predicted class is the true class. We make minimal distributional assumptions; in particular, no specific distribution (Gaussian or otherwise) is assumed. We show that, under appropriate assumptions, the probability estimates approach the true values as the number of training examples increases. Experimental results show that our framework can give better probability estimates than those obtained with algorithms that make specific distributional assumptions, such as Gaussian Process Regression and Support Vector Machine classification with probabilistic outputs.
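The abstract does not describe how the estimates are computed, so the following is only a generic sketch of what a point-specific, distribution-free interval probability could look like. It is not the speaker's framework. It is a simple nearest-neighbor frequency estimate, and the function `interval_probability`, its parameters, and the one-dimensional inputs are assumptions made for illustration.

```python
def interval_probability(x_query, calib_x, calib_residuals, epsilon, k=20):
    """Estimate P(|y - f(x_query)| <= epsilon) without assuming a distribution.

    calib_x         -- 1-D inputs of held-out calibration points
    calib_residuals -- y - f(x) for those same points, for some fitted model f
    epsilon         -- half-width of the user-specified range around f(x_query)
    k               -- number of nearest calibration points to use

    Illustrative only: the fraction of the k nearest held-out residuals
    that fall inside the range.
    """
    if not calib_x or len(calib_x) != len(calib_residuals):
        raise ValueError("calibration inputs and residuals must align")
    if k < 1:
        raise ValueError("k must be at least 1")
    k = min(k, len(calib_x))
    # Indices of the k calibration points closest to the query point.
    nearest = sorted(range(len(calib_x)),
                     key=lambda i: abs(calib_x[i] - x_query))[:k]
    hits = sum(1 for i in nearest if abs(calib_residuals[i]) <= epsilon)
    return hits / k


if __name__ == "__main__":
    xs = [0, 1, 2, 3, 10, 11, 12, 13]
    res = [0.1, -0.2, 0.1, 0.0, 2.0, -3.0, 0.5, 4.0]
    print(interval_probability(1, xs, res, epsilon=1.0, k=4))
    print(interval_probability(12, xs, res, epsilon=1.0, k=4))
```

In this toy data the model's errors are small near x = 1 and large near x = 12, so the estimated probability of landing within the range differs by query point. That is the sense in which an estimate is point-specific rather than a single global error bar.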

Finally, we outline two important new applications of probabilistic models. The first is based on recently gathered clinical data, where the goal is to use machine learning algorithms to identify the presence of serious heart disease in children. We show that probabilistic classification models can be used to give accurate estimates of the probability that a child has congenital heart disease, using only an electronic stethoscope sensor. In the second application, we outline the use of probabilistic regression and classification models for end-to-end learning of complex robotic tasks.

The Department holds colloquia throughout the Fall and Spring semesters. These colloquia, open to the public, are typically held on Thursday afternoons, but sometimes occur at other times as well. If you would like to receive email notification of upcoming colloquia, subscribe to our Colloquia Mailing List. If you would like to schedule a colloquium, see Colloquium Scheduling.

Sign language interpreters are available upon request. Please contact Stephanie Morris at least five days prior to the colloquium.

May 5, 2012 (13:29)