Probabilistic Models of Human and Machine Intelligence
CSCI 7222
Fall 2012
Assigned 11/8/2012
Due 11/17/2012
Goal
To further develop the model we discussed in class that infers a
student's latent learning state from their performance over a sequence
of problems. The relevant material is contained in the lecture of
11/6. If you need additional background, see the optional paper in
the syllabus for 11/6.
Baker's data is available here. In addition to the data, there is a
file called notes.txt that explains the various data files and
contains some summary statistics, and a file called process.m
containing MATLAB code I wrote to read the Excel spreadsheets and pull
out the data that we care about.
Task 1
Make a concrete suggestion for computing
P(T_s | {X_{s,i}}, hyperparameters) in the graphical model we came up
with in class (slide 14 of the lecture notes). This posterior is the
probability distribution over the (discrete) time at which student s
learns the concept, given not only the data from that particular
student but also the data from the entire population. It's almost
certainly the case that the posterior can't be computed analytically,
but if you can do that,
fabulous! Otherwise, you will have to suggest a sampling scheme. The
variables that need to be jointly sampled are α_0, α_1, ρ_0, ρ_1, λ,
γ, and T. You should write out equations (as relevant), describe
proposal distributions (as relevant), and possibly even present
pseudocode.
I don't have a good answer to this problem. Some of the variables can
be efficiently sampled via Gibbs sampling. To do Gibbs sampling,
you'll have to determine the Markov blanket of each variable and
compute its probability conditioned on its Markov blanket. I suspect
that other variables cannot be sampled via Gibbs sampling, so you'll
need an inner loop (with Gibbs sampling in the outer loop) to do
sampling for the nasty variables.
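
To make the outer-Gibbs, inner-loop structure concrete, here is a
minimal Python sketch on a stand-in model, not the model from slide
14. Everything in it is an assumption made for illustration: each
student has a single change point T (the number of trials answered
before learning), responses are correct with probability p0 before T
and p1 afterward, T has a Poisson(lam) prior truncated to the
sequence length, p0 and p1 have Beta(1, 1) priors, and lam has a
Gamma(2, 0.5) prior. Under those assumptions T, p0, and p1 have full
conditionals that can be sampled directly, while the truncation breaks
conjugacy for lam, so lam gets a few random-walk Metropolis steps
inside each Gibbs sweep.

```python
import math
import random

def clamp(p, eps=1e-9):
    """Keep a probability away from 0 and 1 so its log is finite."""
    return min(max(p, eps), 1.0 - eps)

def log_pois(k, lam):
    """Log pmf of an (untruncated) Poisson(lam) at k."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def log_trunc_norm(lam, n):
    """Log normalizer of Poisson(lam) truncated to {0, ..., n}."""
    logs = [log_pois(k, lam) for k in range(n + 1)]
    m = max(logs)
    return m + math.log(sum(math.exp(v - m) for v in logs))

def log_lik(x, t, p0, p1):
    """Log likelihood of 0/1 responses x when the first t trials are pre-learning."""
    total = 0.0
    for i, xi in enumerate(x):
        p = p0 if i < t else p1
        total += math.log(p if xi else 1.0 - p)
    return total

def gibbs_T(x, lam, p0, p1, rng):
    """Gibbs step: sample T from its full conditional by enumerating 0..n."""
    n = len(x)
    # The truncation normalizer is constant in t, so it cancels here.
    logw = [log_pois(t, lam) + log_lik(x, t, p0, p1) for t in range(n + 1)]
    m = max(logw)
    weights = [math.exp(v - m) for v in logw]
    return rng.choices(range(n + 1), weights=weights)[0]

def gibbs_p(data, T, rng, a=1.0, b=1.0):
    """Gibbs step: conjugate Beta updates for p0 and p1 (assumed Beta(a, b) priors)."""
    c0 = n0 = c1 = n1 = 0
    for x, t in zip(data, T):
        c0 += sum(x[:t]); n0 += t
        c1 += sum(x[t:]); n1 += len(x) - t
    p0 = clamp(rng.betavariate(a + c0, b + n0 - c0))
    p1 = clamp(rng.betavariate(a + c1, b + n1 - c1))
    return p0, p1

def log_target_lam(lam, T, lengths, shape=2.0, rate=0.5):
    """Unnormalized log conditional of lam: Gamma prior times truncated-Poisson terms."""
    lp = (shape - 1.0) * math.log(lam) - rate * lam
    return lp + sum(log_pois(t, lam) - log_trunc_norm(lam, n)
                    for t, n in zip(T, lengths))

def metropolis_lam(lam, T, lengths, rng, steps=5, scale=0.3):
    """Inner loop: random-walk Metropolis on log(lam)."""
    cur = log_target_lam(lam, T, lengths)
    for _ in range(steps):
        prop = lam * math.exp(scale * rng.gauss(0.0, 1.0))
        new = log_target_lam(prop, T, lengths)
        # log(prop) - log(lam) is the Jacobian for proposing in log space.
        log_accept = new - cur + math.log(prop) - math.log(lam)
        if rng.random() < math.exp(min(0.0, log_accept)):
            lam, cur = prop, new
    return lam

def sample(data, iters=500, burn=100, seed=0):
    """Return, per student, the estimated posterior over T as a list of probabilities."""
    rng = random.Random(seed)
    lengths = [len(x) for x in data]
    lam, p0, p1 = 2.0, 0.3, 0.8          # arbitrary starting values
    counts = [[0] * (n + 1) for n in lengths]
    for it in range(iters):
        T = [gibbs_T(x, lam, p0, p1, rng) for x in data]
        p0, p1 = gibbs_p(data, T, rng)
        lam = metropolis_lam(lam, T, lengths, rng)
        if it >= burn:
            for s, t in enumerate(T):
                counts[s][t] += 1
    kept = iters - burn
    return [[c / kept for c in row] for row in counts]
```

Calling sample(data) on a list of 0/1 response sequences returns, for
each student, the fraction of retained sweeps spent at each value of
T, which is a Monte Carlo estimate of that student's posterior. The
sketch does not enforce p0 < p1, and the step size, number of inner
steps, and burn-in are untuned placeholders.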
Task 2
We have a lot of hyperparameters that have to be set by hand. Based
on your understanding of the Gamma, Poisson, and Beta distributions
(i.e., their means and variances), what values would you assign to
these hyperparameters? Briefly justify your choices.
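
One way to reason about these choices is to go back and forth between
hyperparameters and the mean and variance they imply. The helpers
below are a small Python sketch of the standard moment formulas; the
function names are made up for illustration, the Gamma is assumed to
use the shape/rate parameterization, and the numbers in the usage
lines are arbitrary examples rather than recommended settings.

```python
def gamma_moments(shape, rate):
    """Mean and variance of a Gamma(shape, rate) distribution."""
    return shape / rate, shape / rate ** 2

def poisson_moments(lam):
    """Mean and variance of a Poisson(lam) distribution (they are equal)."""
    return lam, lam

def beta_moments(a, b):
    """Mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var

def gamma_from_moments(mean, var):
    """Shape and rate of the Gamma with the requested mean and variance."""
    rate = mean / var
    return mean * rate, rate

def beta_from_moments(mean, var):
    """Beta parameters with the requested mean and variance.

    Requires 0 < mean < 1 and var < mean * (1 - mean).
    """
    if not (0.0 < mean < 1.0 and 0.0 < var < mean * (1.0 - mean)):
        raise ValueError("no Beta distribution has these moments")
    nu = mean * (1.0 - mean) / var - 1.0   # nu = a + b
    return mean * nu, (1.0 - mean) * nu

# Arbitrary illustrative targets, not suggested answers:
shape, rate = gamma_from_moments(mean=4.0, var=8.0)
a, b = beta_from_moments(mean=0.2, var=0.01)
```

Writing down the mean and spread you believe in first, and then
solving for the hyperparameters, also gives you the justification the
task asks for.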
Optional
If you have ideas for an improved version of the model, propose them
here. The changes can be small or large. For example, Nicole (I
believe) suggested augmenting the model to incorporate multiple
learning tasks. As another example, we talked about alternative prior
distributions, such as a Geometric distribution instead of a Poisson.
Finally, we're covering sequential models in class now, and a model
that takes advantage of sequences might be useful. Sequence models
such as HMMs or linear dynamical systems will probably be useful only
if we allow for more than 2 states of knowledge (don't know and
know), but extending the model to allow for graded states of
knowledge would be an interesting direction to move in. We can use
the likelihood of a test set to compare our models.