CSCI 5622 Project (worth 25% of your mark)
11/6/2001
Instructor: Professor Grudic
Due date: December 18, 2002
You have two choices for the project:
OR
Bayes Nets:
Probabilistic Clustering in Relational Data, B. Taskar, E. Segal, and D. Koller. Seventeenth International Joint Conference on Artificial Intelligence, Seattle, Washington, August 2001, pages 870--876.
Active Learning for Structure in Bayesian Networks, S. Tong and D. Koller. Seventeenth International Joint Conference on Artificial Intelligence.
Exact Inference in Networks with Discrete Children of Continuous Parents, U. Lerner, E. Segal, and D. Koller. Seventeenth Annual Conference on Uncertainty in Artificial Intelligence (UAI), Seattle, Washington, August 2001, pages 319--328.
Ensemble Algorithms:
Friedman, J. H. "Tutorial: Getting Started with MART in Splus." (Sept. 1999) (software)
Friedman, J. H. "Stochastic Gradient Boosting." (March 1999b) (software)
Friedman, J. H. "Greedy Function Approximation: A Gradient Boosting Machine." (Feb. 1999a) (software)
Friedman, J. H., Hastie, T. and Tibshirani, R. "Additive Logistic Regression: a Statistical View of Boosting." (Aug. 1998)
Random Forests (Leo Breiman)
Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest.
A technical report is available explaining the theory and implementation of random forests.
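To make the definition above concrete, here is a minimal sketch in Python. It is not Breiman's implementation. It assumes one-split trees (decision stumps) in place of full trees, and it takes each tree's "random vector" to be a bootstrap sample of the training rows plus one randomly chosen feature. The names fit_stump, fit_forest, and predict are hypothetical and chosen only for this illustration.

```python
import random
from collections import Counter


def fit_stump(X, y, feature):
    """Fit a one-split tree on a single feature by minimizing training errors."""
    best = None
    for t in sorted({row[feature] for row in X}):
        left = [label for row, label in zip(X, y) if row[feature] <= t]
        right = [label for row, label in zip(X, y) if row[feature] > t]
        left_label = Counter(left).most_common(1)[0][0]
        # If nothing falls to the right, reuse the left label.
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(l != left_label for l in left)
        errors += sum(r != right_label for r in right)
        if best is None or errors < best[0]:
            best = (errors, t, left_label, right_label)
    _, t, left_label, right_label = best
    return feature, t, left_label, right_label


def fit_forest(X, y, n_trees=25, seed=0):
    """Build n_trees stumps, each driven by its own independent random draw."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        # Assumption: the per-tree random vector is (bootstrap indices, feature),
        # drawn independently and from the same distribution for every tree.
        idx = [rng.randrange(n) for _ in range(n)]
        feature = rng.randrange(d)
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return forest


def predict(forest, row):
    """Classify one row by majority vote over all trees in the forest."""
    votes = Counter(
        left if row[f] <= t else right for f, t, left, right in forest
    )
    return votes.most_common(1)[0][0]
```

A call such as predict(fit_forest(X, y), row) returns the majority vote for row, where X is a list of numeric feature lists and y is the matching list of class labels. Full trees with a random feature subset at each split would be closer to the method described in the technical report; stumps are used here only to keep the sketch short.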
Software for Random Forests
For standalone use, the following are available:
Support Vector Machines:
Support Vector Machine Active Learning with Applications to Text Classification, S. Tong and D. Koller. Machine Learning Journal, 2002, to appear.
Feature Selection for SVMs -- Jason Weston, Sayan Mukherjee, Olivier Chapelle, Massimiliano Pontil, Tomaso Poggio, Vladimir Vapnik