Topical Hidden Markov Models for Skill Discovery in Tutorial Data

Jose P. Gonzalez-Brenes

joseg@cs.cmu.edu

Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA 15213 USA

Jack Mostow

mostow@cs.cmu.edu

Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA 15213 USA

Abstract

The first step for an Intelligent Tutoring System to adapt teaching is inferring students' understanding of the subject matter (VanLehn, 1988). Existing automatic approaches for inferring students' knowledge require a cognitive model: the mapping between the tutor problems and the set of skills required. This is a very expensive requirement, since it often depends on expert domain knowledge (Beck, 2007).

The success of previous methods for automatic construction of cognitive models has been limited (Desmarais, 2011). Previous work on inferring students' knowledge from temporal data has relied on expert annotators to find the grouping of problems into skills (Gonzalez-Brenes & Mostow, 2012). For example, matrix-based methods (Winters et al., 2005), such as Principal Component Analysis, Non-Negative Matrix Factorization and the Q-Matrix Method (Barnes, 2005), ignore the temporal dimension of the data. On the other hand, Learning Factors Analysis (Cen et al., 2006) is designed for temporal data, but still requires expert annotations. Our goal is a data-driven approach to model students' time-varying knowledge, without requiring expert annotation of the skills needed by the students.

Our proposed model, Topical Hidden Markov Model (HMM), uses an input sequence to model the order in which students solve problems, and an output sequence to model students' performance on the problems they solve. We propose a Gibbs Sampling algorithm that infers a factorization of problems into skills, and estimates the student knowledge of the skills across time. We validate our approach with data collected with the Bridge to Algebra Cognitive Tutor® (Koedinger et al., 2010).
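The input/output structure described above can be illustrated with a minimal generative sketch. This is not the authors' implementation: it assumes binary per-skill knowledge, exactly one skill per problem, no forgetting, and a fixed problem-to-skill mapping (in the proposed model that mapping is what the Gibbs sampler would infer). All names and parameter values (`P_INIT_KNOWN`, `P_LEARN`, `P_CORRECT`, `simulate_student`) are hypothetical, and the sampler itself is not shown.

```python
import random

# Hypothetical parameters, chosen only for illustration.
N_SKILLS = 2
P_INIT_KNOWN = 0.2                     # chance a skill is known before any practice
P_LEARN = 0.3                          # chance an unknown skill is learned after practice
P_CORRECT = {False: 0.25, True: 0.9}   # P(correct answer | skill unknown / known)


def simulate_student(problem_sequence, problem_to_skill, rng):
    """Simulate one student's answers on an ordered sequence of problems.

    problem_sequence: the input sequence (order in which problems are solved).
    problem_to_skill: assumed fixed mapping from each problem to a single skill.
    Returns (outputs, trace): one bool per problem (the output sequence) and
    the hidden knowledge state held just before each problem is answered.
    """
    known = [rng.random() < P_INIT_KNOWN for _ in range(N_SKILLS)]
    outputs, trace = [], []
    for problem in problem_sequence:
        skill = problem_to_skill[problem]
        trace.append(tuple(known))
        # Performance depends only on whether the required skill is known.
        outputs.append(rng.random() < P_CORRECT[known[skill]])
        # Practice may move the skill from unknown to known (assumed: never back).
        if not known[skill] and rng.random() < P_LEARN:
            known[skill] = True
    return outputs, trace


if __name__ == "__main__":
    mapping = {"p1": 0, "p2": 0, "p3": 1}          # hypothetical problems
    sequence = ["p1", "p2", "p3", "p1", "p3"]
    answers, states = simulate_student(sequence, mapping, random.Random(0))
    for problem, answer, state in zip(sequence, answers, states):
        print(problem, "correct" if answer else "wrong", state)
```

Skill discovery runs in the opposite direction: given only the problem order and the observed answers, it would recover both the mapping and the hidden knowledge trace.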

References

Barnes, T. The Q-matrix method: Mining student response data for knowledge. In Beck, J. (ed.), Proceedings of AAAI 2005: Educational Data Mining Workshop, pp. 978-980, Pittsburgh, PA, 2005.

Beck, J. Difficulties in inferring student knowledge from observations (and why you should care). In Educational Data Mining: Supplementary Proceedings of the 13th International Conference of Artificial Intelligence in Education, pp. 21-30, Marina del Rey, CA, 2007.

Cen, Hao, Koedinger, Kenneth, and Junker, Brian. Learning factors analysis: A general method for cognitive model evaluation and improvement. In Ikeda, Mitsuru, Ashley, Kevin, and Chan, Tak-Wai (eds.), Intelligent Tutoring Systems, volume 4053 of Lecture Notes in Computer Science, pp. 164-175. Springer Berlin / Heidelberg, 2006. URL http://dx.doi.org/10.1007/11774303_17.

Desmarais, Michel. Conditions for Effectively Deriving a Q-Matrix from Data with Non-negative Matrix Factorization. Best Paper Award. In M. Pechenizkiy and T. Calders and C. Conati and S. Ventura and C. Romero and J. Stamper (eds.), Proceedings of the 4th International Conference on Educational Data Mining, pp. 169-178, 2011.

Gonzalez-Brenes, Jose P. and Mostow, Jack. Dynamic Cognitive Tracing: Towards Unified Discovery of Student and Cognitive Models. In Yacef, K., Zaïane, O., Hershkovitz, H., Yudelson, M., and Stamper, J. (eds.), Proceedings of the 5th International Conference on Educational Data Mining, pp. 49-56, 2012. URL http://educationaldatamining.org/EDM2012/uploads/procs/Full_Papers/edm2012_full_7.pdf.

Koedinger, K.R., Baker, R.S.J., Cunningham, K., Skogsholm, A., Leber, B., and Stamper, J. A data repository for the community: The PSLC DataShop. CRC Press, Boca Raton, FL, 2010.

VanLehn, Kurt. Student Modeling. In Polson, M. C. and Richardson, J. J. (eds.), Foundations of intelligent tutoring systems, Hillsdale, NJ, USA, 1988. L. Erlbaum Associates Inc. ISBN 0-805-80053-0.

Winters, T., Shelton, C., Payne, T., and Mei, G. Topic extraction from item-level grades. In Beck, J. (ed.), American Association for Artificial Intelligence 2005 Workshop on Educational Datamining, Pittsburgh, PA, 2005.
