Probabilistic Models of
Human and Machine Learning
The goal of this assignment is to get some first-hand appreciation of the
Chinese restaurant process, in particular, the variant that draws
samples from a Pitman-Yor process.
I've included the pseudocode in my lecture notes
(pp. 13-15), though
please make sure you understand the algorithm because there's always
the possibility that I screwed up some notation.
Regenerate the first two graphs in Teh's
Figure 1 (reproduced in my lecture notes on p. 16).
Plot the number of word types generated as a function of the number of
word tokens drawn, using a log-log plot, for
(a) d = .5 and alpha = 1, 10, and 100; and for (b) alpha = 10 and d =
0, .5, and .9.
Assume that G0 is a discrete, uniform distribution over 10^5 word
types. To be clear about the meaning of 'word type' and 'word
token': in the sentence "The big boy
picked up the big ball", there are 8 word tokens but only 6 word
types, because the words "the" and "big" are repeated.
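To give a sense of the shape of the simulation, here is a minimal Python sketch of the Pitman-Yor restaurant construction and the type-growth bookkeeping. It is my own rendering, not the pseudocode from the lecture notes, so check it against pp. 13-15; all function and variable names are invented for illustration.

```python
import random

def pitman_yor_sample(n_tokens, alpha, d, vocab_size=10**5, rng=None):
    """Draw n_tokens word tokens from a Pitman-Yor process (concentration
    alpha, discount d; d = 0 reduces to a Dirichlet process) via the
    Chinese restaurant construction, with G0 uniform over vocab_size
    word types. Returns the list of tokens drawn."""
    rng = rng or random.Random()
    table_counts = []   # number of customers seated at each table
    table_dishes = []   # word type (a draw from G0) served at each table
    tokens = []
    for n in range(n_tokens):
        # Customer n+1 joins existing table k with probability
        # (c_k - d)/(n + alpha), or starts a new table with probability
        # (alpha + d*K)/(n + alpha), where K is the number of tables.
        u = rng.random() * (n + alpha)
        cum = 0.0
        for k, c in enumerate(table_counts):
            cum += c - d
            if u < cum:
                table_counts[k] += 1
                tokens.append(table_dishes[k])
                break
        else:
            dish = rng.randrange(vocab_size)  # new table: draw its dish from G0
            table_counts.append(1)
            table_dishes.append(dish)
            tokens.append(dish)
    return tokens

def type_growth(tokens):
    """Number of distinct word types seen after each successive token."""
    seen, growth = set(), []
    for t in tokens:
        seen.add(t)
        growth.append(len(seen))
    return growth
```

For the plots, you would call pitman_yor_sample once per (alpha, d) setting and plot type_growth(tokens) against the token index on log-log axes (e.g., matplotlib's plt.loglog).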
Generate a plot like the Zipf's law plot on p. 17
of my lecture notes. Rank the word types by number of tokens
drawn, and plot the number of tokens as a function of rank on a log-log
scale. If Zipf's law holds for the Pitman-Yor process, you should
see a straightish line. Try different parameters, but my guess is
that parameters that yield a fairly straight line in Task 1 are most
likely to produce Zipf's law. For example, you might try alpha =
1 and d = .5.
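The rank-frequency computation itself is only a few lines. A sketch (the function name is my own; it operates on any list of tokens, such as the output of a Pitman-Yor sampler):

```python
from collections import Counter

def zipf_points(tokens):
    """Rank word types by the number of tokens drawn; return
    (rank, count) pairs with rank 1 the most frequent type.
    Plotting count against rank on log-log axes should be roughly
    linear if Zipf's law holds."""
    counts = sorted(Counter(tokens).values(), reverse=True)
    return list(enumerate(counts, start=1))
```

With matplotlib you would then do something like plt.loglog([r for r, _ in pts], [c for _, c in pts], '.').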
Using the simpler Dirichlet process (d = 0), pick a value of alpha and
generate, say, 1000 samples from a DP, each with, say, 100 customers.
For each sample, determine
the total number of tables that are occupied (= number of components in
the DP mixture model). Make a histogram showing number of tables
on the x-axis (which will vary from 1 to 100) and relative frequency on
the y-axis. Repeat this
process for a different value of alpha, such that the two histograms