Topical Hidden Markov Models for Skill Discovery in Tutorial Data
Jose P. 
joseg@cs.cmu.edu 
Language Technologies Institute, Carnegie Mellon University Pittsburgh, PA 15213 USA 

Jack Mostow 
mostow@cs.cmu.edu 
Language Technologies Institute, Carnegie Mellon University Pittsburgh, PA 15213 USA
Abstract
The first step for an Intelligent Tutoring System to adapt teaching is inferring students' understanding of the subject matter (VanLehn, 1988). Existing automatic approaches for inferring students' knowledge require a cognitive model: the mapping between the tutor problems and the set of skills required. This is a very expensive requirement, since it often depends on expert domain knowledge (Beck, 2007).
The success of previous methods for automatic construction of cognitive models has been limited (Desmarais, 2011). Previous work on inferring students' knowledge from temporal data has relied on expert annotators to find the grouping of problems into skills.
Our proposed model, Topical Hidden Markov Model (HMM), uses an input sequence to model the order in which students solve problems, and an output sequence to model students' performance on the problems they solve. We propose a Gibbs Sampling algorithm that infers a factorization of problems into skills, and estimates the student knowledge of the skills across time. We validate our approach with data collected with the Bridge to Algebra Cognitive Tutor® (Koedinger et al., 2010).
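To make the input/output structure concrete, the following is a minimal generative sketch, not the model specified in this paper. It assumes (hypothetically) that each problem maps to exactly one skill, that knowledge of each skill is a binary hidden state that can only move from unknown to known, and that correctness depends only on the state of the practiced skill. The function name `simulate_student` and the parameters `p_init`, `p_learn`, and `p_correct` are illustrative assumptions. In the proposed approach the problem-to-skill mapping would be inferred by Gibbs sampling; here it is simply given.

```python
import random


def simulate_student(problem_seq, skill_of, p_init, p_learn, p_correct, rng):
    """Sample one student's performance sequence (illustrative assumptions only).

    problem_seq : problem ids in the order solved (the input sequence)
    skill_of    : dict mapping problem id -> skill index (assumed known here)
    p_init      : per-skill probability that the skill is known at the start
    p_learn     : per-skill probability of learning the skill after one practice
    p_correct   : p_correct[k] = P(correct | knowledge state k), k in {0, 1}
    rng         : a random.Random instance
    Returns a list of 0/1 outcomes (the output sequence).
    """
    n_skills = len(p_init)
    # Hidden knowledge state of every skill before the first problem.
    known = [rng.random() < p_init[s] for s in range(n_skills)]
    outcomes = []
    for problem in problem_seq:
        s = skill_of[problem]
        # Emit performance given the current state of the practiced skill.
        outcomes.append(int(rng.random() < p_correct[int(known[s])]))
        # Only the practiced skill may transition from unknown to known.
        if not known[s] and rng.random() < p_learn[s]:
            known[s] = True
    return outcomes


rng = random.Random(0)
problems = ["p1", "p2", "p1", "p3", "p2"]    # input sequence
skill_of = {"p1": 0, "p2": 1, "p3": 0}       # hypothetical factorization
outcomes = simulate_student(
    problems, skill_of,
    p_init=[0.2, 0.5], p_learn=[0.3, 0.1], p_correct=[0.2, 0.9], rng=rng,
)
print(list(zip(problems, outcomes)))         # (problem, correct?) pairs
```

Under these assumptions, inference would run in the opposite direction: given many students' problem and outcome sequences, estimate both `skill_of` and the hidden knowledge states.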
References
Barnes, T. The
Beck, J. Difficulties in inferring student knowledge from observations (and why you should care). In Educational Data Mining: Supplementary Proceedings of the 13th International Conference of Artificial Intelligence in Education, pp. 21-30, Marina del Rey, CA, 2007.
Cen, Hao, Koedinger, Kenneth, and Junker, Brian. Learning factors analysis: A general method for cognitive model evaluation and improvement. In Ikeda, Mitsuru, Ashley, Kevin, and Chan,
Desmarais, Michel. Conditions for Effectively Deriving a
J. (ed.), Proceedings of the 5th International Conference on Educational Data Mining, pp. 49-56, 2012. URL http://educationaldatamining.org/EDM2012/uploads/procs/Full_Papers/edm2012_full_7.pdf.
Koedinger, K.R., Baker, R.S.J., Cunningham, K., Skogsholm, A., Leber, B., and Stamper, J. A data repository for the community: The PSLC DataShop. CRC Press, Boca Raton, FL, 2010.
VanLehn, Kurt. Student Modeling. In Polson, M. C. and Richardson, J. J. (eds.), Foundations of intelligent tutoring systems, Hillsdale, NJ, USA, 1988. L. Erlbaum Associates Inc. ISBN
Winters, T., Shelton, C., Payne, T., and Mei, G. Topic extraction from