Assignment 6
Probabilistic Models of
Human and Machine Intelligence

CSCI 5822

Assigned Thu March 15, 2018
Part I Due Tue March 20, 2018
Part II Due Thu Mar 22, 2018

Goal

The goal of this assignment is to introduce you to probabilistic programming languages.  These languages allow you to specify graphical models with random variables and to perform inference by sampling.

PART I

I want you to investigate probabilistic programming languages and identify one that you want to work with.  Install the software, and run through one or more tutorial examples to convince yourself that you understand, at least at a basic level, how the language works. I list a bunch of options on the course home page, and there are even more at probabilistic-programming.org.

I have no expertise in these languages, but after spending a few days looking at the options, there are five I'd suggest investigating further. The first three are likely to be the most valuable in the future, because they allow for the integration of Bayesian methods and neural networks. The field is definitely headed in this direction. Each of these three languages is built on top of a gradient-based optimization library, with efficient GPU operations for multidimensional arrays. The five languages I'd recommend, roughly ordered from strongest to weakest recommendation, are:
For Part I, there is nothing to hand in.

PART II

Perform either exact or approximate inference to obtain answers to Part III of Assignment 4.  You solved this inference problem exactly, and the answers should be P(G1=2 | X2=50) = 0.1054 and P(X3=50 | X2=50) = 0.1024.  If you're going to use Edward, I wasn't able to get any of the sampling-based inference procedures (Metropolis-Hastings, Gibbs, hybrid Monte Carlo) to work on discrete RVs; however, KLpq does seem to find a solution, as long as you include the argument n_samples=100 or larger. Because there aren't any good examples of discrete RVs in Edward, we found this implementation of the sprinkler/rain graphical model to be helpful. Read the description of KLpq carefully: it searches over Gaussian RVs, so you need to constrain a variable if you want it to be nonnegative or binary. We also found that, for estimating P(X3=50 | X2=50), the distribution needs to be initialized in the right neighborhood.

If you get really stuck and can't get this example to run, implement the burglar alarm network from class instead and show some inference results. The burglar alarm should be a straightforward extension of the sprinkler/rain net. We will give a maximum of 80% credit for this fallback model.
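If you take the fallback route, it helps to know the exact answer you are trying to match. Here is a minimal sketch of exact inference in the burglar alarm network by brute-force enumeration, assuming the standard textbook CPTs (substitute the numbers from class if they differ):

```python
from itertools import product

# CPTs for the classic burglar alarm network (standard textbook numbers;
# swap in the values from lecture if they differ).
P_B = 0.001                           # P(Burglary)
P_E = 0.002                           # P(Earthquake)
P_A = {(1, 1): 0.95, (1, 0): 0.94,    # P(Alarm | Burglary, Earthquake)
       (0, 1): 0.29, (0, 0): 0.001}
P_J = {1: 0.90, 0: 0.05}              # P(JohnCalls | Alarm)
P_M = {1: 0.70, 0: 0.01}              # P(MaryCalls | Alarm)

def joint(b, e, a, j, m):
    """Joint probability of one full assignment (each variable 0 or 1)."""
    p = (P_B if b else 1 - P_B) * (P_E if e else 1 - P_E)
    p *= P_A[(b, e)] if a else 1 - P_A[(b, e)]
    p *= P_J[a] if j else 1 - P_J[a]
    p *= P_M[a] if m else 1 - P_M[a]
    return p

# P(Burglary = 1 | JohnCalls = 1, MaryCalls = 1), summing out E and A.
num = sum(joint(1, e, a, 1, 1) for e, a in product((0, 1), repeat=2))
den = sum(joint(b, e, a, 1, 1) for b, e, a in product((0, 1), repeat=3))
print(num / den)   # about 0.284 with these CPTs
```

Your sampling-based answer from whatever language you choose should land near this exact value.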

As I mentioned on Piazza, one student has had success with PyMC3, and the code produced was quite sensible and readable.

For Part II, we would like you to hand in your code along with the runs that produce the two answers.