Modeling Student Strategy Usage

with Mixed Membership Models

April Galyardt

Educational Psychology and Instructional Technology

University of Georgia

Athens, GA 30602

The strategies that students use to complete a task or solve a problem are critical indicators of their developing expertise. Experts know more strategies and use different strategies than novices, but growth toward effective expert strategies is not straightforward. Individuals use multiple strategies, and will switch strategies between subsequent tasks. As children learn, the mixture of strategies changes; they use efficient strategies more frequently, but will still fall back on the more rudimentary strategies (Pellegrino et al., 2001). In order to estimate a student’s expertise, we need to estimate the mixture of strategies that each student uses.

The majority of current psychometric models do not model strategies at all. A few models, such as mixture IRT models, allow for the existence of different strategies, but assume that each student uses a single strategy consistently (Junker, 1999; Pellegrino et al., 2001). We need to develop assessment models that not only allow for the existence of multiple strategies, but allow for students to switch strategies.

Mixed membership models, which have recently become popular in the machine learning community, address this problem. Latent Dirichlet allocation (Blei et al., 2003) and the admixture model (Pritchard et al., 2000) are the two most common versions of the larger class of mixed membership models (Erosheva, 2002; Erosheva et al., 2004).

The basic idea of mixed membership is that each individual can have membership in multiple classes. For example, topic models, such as latent Dirichlet allocation, allow documents to be about multiple topics. The admixture model was built to describe populations where individuals may have genetic heritage from multiple sub-populations. In the context of modeling student knowledge, each class represents a possible solution strategy. Students then have membership in each class according to how much they use that strategy.

The mixed membership model framework also allows us to combine different types of observed student data. For each student i who attempts task j, we observe a feature vector Xij. Features may include the student’s response, the response time, eye tracking information, or any other observed aspect of behavior. The joint distribution of the features Xij depends on the strategy that the student uses to solve the problem.
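As a rough illustration of the generative structure described above, the following Python sketch simulates data from one possible mixed membership strategy model. Every distributional choice in it is an assumption made for illustration and is not specified in this abstract: the Dirichlet prior on membership (borrowed from latent Dirichlet allocation), the two features (correctness and log response time), and all names and parameter values are hypothetical.

```python
import numpy as np


def simulate_strategy_usage(n_students=5, n_tasks=8, n_strategies=3,
                            alpha=0.5, seed=0):
    """Simulate features from a hypothetical mixed membership strategy model."""
    rng = np.random.default_rng(seed)

    # theta[i, k]: share of tasks on which student i uses strategy k.
    # Assumption: memberships follow a symmetric Dirichlet(alpha) prior.
    theta = rng.dirichlet(np.full(n_strategies, alpha), size=n_students)

    # Assumed strategy-specific feature distributions (illustrative values):
    # later strategies are both more accurate and faster.
    p_correct = np.linspace(0.4, 0.9, n_strategies)
    mean_log_time = np.linspace(3.0, 1.5, n_strategies)

    # z[i, j]: strategy that student i uses on task j, drawn from theta[i].
    # The student may switch strategies from one task to the next.
    z = np.empty((n_students, n_tasks), dtype=int)
    for i in range(n_students):
        z[i] = rng.choice(n_strategies, size=n_tasks, p=theta[i])

    # Feature vector X_ij = (correct, log_time); its distribution depends
    # only on the strategy z[i, j] used for that task.
    correct = rng.random((n_students, n_tasks)) < p_correct[z]
    log_time = rng.normal(mean_log_time[z], 0.3)
    return theta, z, correct, log_time


theta, z, correct, log_time = simulate_strategy_usage()
```

In this sketch each row of theta plays the role of a student's mixture of strategies, while z records the task-by-task switching that a mixture IRT model would rule out. Estimating theta from observed features would require an inference procedure that is not shown here.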

This mixed membership strategy model creates a model of individual student knowledge that is consistent with current cognitive psychology research indicating that strategy choice is a critical component of expertise.


David Blei, Andrew Y. Ng, and Michael I. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993–1022, 2003. ISSN 1533-7928.

Elena Erosheva. Grade of Membership and Latent Structure Models With Application to Disability Survey Data. PhD thesis, Carnegie Mellon University, Pittsburgh, PA 15213, August 2002.

Elena Erosheva, Stephen Fienberg, and John Lafferty. Mixed-membership models of scientific publications. Proceedings of the National Academy of Sciences of the United States of America, 101(Suppl 1):5220–5227, 2004. doi: 10.1073/pnas.0307760101.

Brian W. Junker. Some statistical models and computational methods that may be useful for cognitively-relevant assessment. Technical report, Committee on the Foundations of Assessment, National Research Council, November 1999.

James W. Pellegrino, Naomi Chudowsky, and Robert Glaser, editors. Knowing What Students Know: The Science and Design of Educational Assessment. National Academy Press, Washington, DC, 2001.

Jonathan K. Pritchard, Matthew Stephens, and Peter Donnelly. Inference of population structure using multilocus genotype data. Genetics, 155:945–959, 2000.