Assignment 1
Probabilistic Models of
Human and Machine Intelligence

CSCI 5822
Spring 2018

Assigned Jan 25
Due Feb 1

Goal

The goal of this assignment is to give you a concrete understanding of the Tenenbaum (1999) work by implementing and experimenting with a small-scale version of the model applied to learning concepts in a two-dimensional space.  A further goal is to get hands-on experience representing and manipulating probability distributions and using Bayes' rule.

Simplified Model

Consider a two-dimensional input space with features in the range [-10, 10]. We will consider only square concepts centered on the origin (0, 0), and only a discrete set of concepts, H = {h_i, i = 1...10}, where h_i is the concept with lower-left corner (-i, -i) and upper-right corner (+i, +i), i.e., a square with side length 2i.

You will have to define a discrete prior distribution over the 10 hypotheses, and you will have to do prediction by marginalizing over the hypothesis space.  Use Tenenbaum's expected-size prior.  Because the expected-size prior is defined as a density over a continuous space of hypotheses, you will need to evaluate the prior at each of the 10 hypotheses and renormalize the resulting values so that the prior distribution sums to 1.  (You don't actually need to do this renormalization, because the normalization factor cancels out in the Bayesian calculations, but do it anyhow, just to have a clean representation of the prior.)
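
As a starting point, here is a minimal sketch in Python (NumPy) of how the prior could be computed and renormalized. It assumes the Erlang form of the expected-size prior, p(s) proportional to s * exp(-s/𝜎) in each dimension, evaluated at side lengths s1 = s2 = 2i; the function name expected_size_prior is just illustrative, and you should substitute the exact functional form from Tenenbaum (1999) or the lecture slides if it differs.

```python
import numpy as np

SIDES = 2.0 * np.arange(1, 11)   # side length of h_i is 2i, for i = 1..10

def expected_size_prior(sigma1, sigma2):
    """Expected-size prior over the 10 square hypotheses.

    Assumes the Erlang form p(s) ~ s * exp(-s / sigma) in each dimension,
    evaluated at s1 = s2 = 2i, then renormalized to sum to 1.
    """
    s = SIDES
    p = (s * np.exp(-s / sigma1)) * (s * np.exp(-s / sigma2))
    return p / p.sum()
```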

Task 1

Make a bar graph of the prior distribution, P(H), for 𝜎1 = 𝜎2 = 6.  Make a second bar graph of the prior distribution for 𝜎1 = 𝜎2 = 12.
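
A plotting sketch, reusing the (hypothetical) expected_size_prior helper above with matplotlib; any plotting package is fine.

```python
import numpy as np
import matplotlib.pyplot as plt

for sigma in (6, 12):
    plt.figure()
    plt.bar(np.arange(1, 11), expected_size_prior(sigma, sigma))
    plt.xlabel('hypothesis index i')
    plt.ylabel('P(h_i)')
    plt.title('Prior, sigma1 = sigma2 = %d' % sigma)
plt.show()
```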

Task 2

Given one observation, X = {(1.5, 0.5)}, compute the posterior P(H|X) with 𝜎1 = 𝜎2 = 12. You will get one probability for each possible hypothesis. Display your result either as a bar graph or a list of probabilities. Use Tenenbaum's Size Principle as the likelihood function.
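
A sketch of one way to combine the size-principle likelihood with the prior, again reusing the expected_size_prior helper above; the helper names are illustrative, not required.

```python
def size_principle_likelihood(X, i):
    """P(X | h_i) = 1 / |h_i|^n if every example lies inside h_i, else 0,
    where |h_i| = (2i)^2 is the area of the square hypothesis."""
    inside = all(abs(x) <= i and abs(y) <= i for (x, y) in X)
    return (1.0 / (2 * i) ** 2) ** len(X) if inside else 0.0

def posterior(X, sigma1, sigma2):
    """P(h_i | X) proportional to P(X | h_i) * P(h_i), renormalized."""
    pri = expected_size_prior(sigma1, sigma2)
    lik = np.array([size_principle_likelihood(X, i) for i in range(1, 11)])
    post = lik * pri
    return post / post.sum()

# Task 2: one observation, sigma1 = sigma2 = 12
print(posterior([(1.5, 0.5)], 12, 12))
```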

Task 3

Using the same approach as in Task 2, compute generalization predictions, P(y ∈ concept | X), over the whole space of possible generalization points, y, for X = {(1.5, 0.5)} and 𝜎1 = 𝜎2 = 10. (I used the notation P(Q|X) in the lecture slides.) The input space should span the region from (-10, -10) to (+10, +10). Display your result as a contour map in 2D space, where the coloring represents the probability that an input at that point belongs to the concept. (If the probabilities span a wide dynamic range, you can plot the logarithm of the probability instead. If you do this, be sure to label the graph accordingly.)
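
One possible way to compute and display the generalization surface, building on the posterior sketch above; the grid resolution and plotting details are up to you.

```python
def generalization(X, sigma1, sigma2, n_grid=201):
    """P(y in concept | X) = sum_i P(h_i | X) * 1[y inside h_i],
    evaluated on a grid spanning [-10, 10] x [-10, 10]."""
    post = posterior(X, sigma1, sigma2)
    coords = np.linspace(-10, 10, n_grid)
    xx, yy = np.meshgrid(coords, coords)
    p = np.zeros_like(xx)
    for i in range(1, 11):
        p += post[i - 1] * ((np.abs(xx) <= i) & (np.abs(yy) <= i))
    return xx, yy, p

# Task 3: X = {(1.5, 0.5)}, sigma1 = sigma2 = 10
xx, yy, p = generalization([(1.5, 0.5)], 10, 10)
plt.contourf(xx, yy, p, levels=20)
plt.colorbar(label='P(y in concept | X)')
plt.title('Generalization, X = {(1.5, 0.5)}, sigma1 = sigma2 = 10')
plt.show()
```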

Task 4

Repeat Task 3 for X = {(4.5, 2.5)}.

Task 5

Compute generalization predictions, P(y ∈ concept | X), over the whole input space for 𝜎1 = 𝜎2 = 30 and three different sets of input examples: X = {(2.2, -.2)}, X = {(2.2, -.2), (.5, .5)}, and X = {(2.2, -.2), (.5, .5), (1.5, 1)}. Describe how the posterior changes as new examples are added, and explain why this occurs.
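
If you have helpers along the lines of the sketches above, the three conditions can be run in a short loop, for example:

```python
example_sets = [
    [(2.2, -0.2)],
    [(2.2, -0.2), (0.5, 0.5)],
    [(2.2, -0.2), (0.5, 0.5), (1.5, 1.0)],
]
for X in example_sets:
    xx, yy, p = generalization(X, 30, 30)
    plt.figure()
    plt.contourf(xx, yy, p, levels=20)
    plt.plot([x for x, _ in X], [y for _, y in X], 'k+')  # mark the examples
    plt.title('%d example(s), sigma1 = sigma2 = 30' % len(X))
plt.show()
```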

Task 6 (Optional)

Do some other interesting experiment with the model.  One possibility would be to extend the model to accommodate negative as well as positive examples.  Another possibility would be to compare generalization surfaces with and without the size principle, and with an uninformative prior (here, uniform would work) compared to the expected-size prior.