In this assignment you will experiment
with discrete probability distributions, and use these distributions to
make predictions about the environment.
Part 1
When the Titanic struck an iceberg and sank, there were 2201 people on
board. Some survived, some died. How does survival relate
to other attributes of the individuals? We explore this question
with a probabilistic approach. We consider a sample space of
individuals who are characterized by four random variables:
- Class: What status did the individual have on the ship? (1st, 2nd, or 3rd class passenger; crew member)
- Age: What age was the individual? (child, adult)
- Gender: What gender was the individual? (male, female)
- Survival: Did the individual survive the shipwreck? (yes, no)
For example, the sample point (Class=1st, Age=adult, Gender=male,
Survival=yes) characterizes a subset of individuals. The data set
of 2201 individuals is available from
http://www.cs.colorado.edu/~mozer/courses/3202/titanic.dat.
(a) Using the data set, compute the full joint distribution, i.e.,
P(Class, Age, Gender,
Survival). This distribution has 4x2x2x2 = 32
probabilities. Display the distribution as follows:
P(Class, Age, Gender, Survival)

           |                 Survival=yes                  |                  Survival=no                  |
           |      Gender=male      |     Gender=female     |      Gender=male      |     Gender=female     |
           | Age=child | Age=adult | Age=child | Age=adult | Age=child | Age=adult | Age=child | Age=adult |
Class=1st  |           |           |           |           |           |           |           |           |
Class=2nd  |           |           |           |           |           |           |           |           |
Class=3rd  |           |           |           |           |           |           |           |           |
Class=crew |           |           |           |           |           |           |           |           |

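As a starting point for part (a), the following is a minimal sketch of how the joint distribution could be tabulated by counting. It assumes the records have already been read into a list of (class, age, gender, survival) tuples with the string labels shown; the format of titanic.dat is not described here, so the parsing step is left out, and the `toy` records are hypothetical, not the real data.

```python
from collections import Counter
from itertools import product

CLASSES = ["1st", "2nd", "3rd", "crew"]
AGES = ["child", "adult"]
GENDERS = ["male", "female"]
SURVIVALS = ["yes", "no"]

def joint_distribution(records):
    """Return P(Class, Age, Gender, Survival) as a dict keyed by 4-tuples."""
    counts = Counter(records)
    total = len(records)
    # Include all 4x2x2x2 = 32 cells, even those that never occur (probability 0).
    return {cell: counts[cell] / total
            for cell in product(CLASSES, AGES, GENDERS, SURVIVALS)}

# Hypothetical toy records, NOT the real data set.
toy = [
    ("1st", "adult", "male", "yes"),
    ("1st", "adult", "male", "no"),
    ("3rd", "child", "female", "yes"),
    ("crew", "adult", "male", "no"),
]
joint = joint_distribution(toy)
print(len(joint))                                  # 32
print(joint[("1st", "adult", "male", "yes")])      # 0.25
```

Keying the dictionary by the full 4-tuple keeps every cell of the table addressable, which makes it easy to print the distribution in the layout requested above.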
(b) Using the data set and the joint distribution, compute
P(Survival=yes | Class, Age, Gender). Warning: Be alert to the
possibility of a cell whose value is undefined. Display the
distribution as follows:
P(Survival=yes | Class, Age, Gender)

           |      Gender=male      |     Gender=female     |
           | Age=child | Age=adult | Age=child | Age=adult |
Class=1st  |           |           |           |           |
Class=2nd  |           |           |           |           |
Class=3rd  |           |           |           |           |
Class=crew |           |           |           |           |

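One way to handle the warning in part (b) is sketched below. The conditional probability is the joint probability of survival divided by P(Class, Age, Gender), and it is undefined when that denominator is zero. The function name `p_survive_given`, the choice of `None` to mark an undefined cell, and the `toy_joint` values are all illustrative assumptions, not part of the assignment.

```python
def p_survive_given(joint, cls, age, gender):
    """Return P(Survival=yes | cls, age, gender), or None if undefined."""
    p_yes = joint.get((cls, age, gender, "yes"), 0.0)
    p_no = joint.get((cls, age, gender, "no"), 0.0)
    denom = p_yes + p_no  # this is P(Class, Age, Gender)
    if denom == 0:
        return None  # nobody matches the conditioning values
    return p_yes / denom

# Hypothetical toy joint distribution, NOT computed from the real data.
toy_joint = {
    ("1st", "adult", "male", "yes"): 0.25,
    ("1st", "adult", "male", "no"): 0.25,
    ("3rd", "child", "female", "yes"): 0.25,
    ("crew", "adult", "male", "no"): 0.25,
}
print(p_survive_given(toy_joint, "1st", "adult", "male"))     # 0.5
print(p_survive_given(toy_joint, "crew", "child", "female"))  # None
```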
(c) Construct the unconditional distribution
P(Survival).
(d) Construct the conditional distributions
P(Gender | Survival),
P(Age | Survival), and
P(Class | Survival).
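Parts (c) and (d) can both be obtained by summing entries of the joint distribution. The sketch below assumes the joint is stored as a dict keyed by (class, age, gender, survival) tuples; the function names and the `toy_joint` values are hypothetical.

```python
from collections import defaultdict

def marginal_survival(joint):
    """Return P(Survival) by summing out Class, Age, and Gender."""
    p = defaultdict(float)
    for (_cls, _age, _gender, surv), prob in joint.items():
        p[surv] += prob
    return dict(p)

def conditional_on_survival(joint, index):
    """Return P(X | Survival), where X is the variable at tuple position
    `index` (0 = Class, 1 = Age, 2 = Gender), keyed by (x, survival)."""
    p_surv = marginal_survival(joint)
    pair = defaultdict(float)
    for cell, prob in joint.items():
        pair[(cell[index], cell[3])] += prob  # accumulates P(X, Survival)
    return {(x, s): p / p_surv[s]
            for (x, s), p in pair.items() if p_surv[s] > 0}

# Hypothetical toy joint distribution, NOT computed from the real data.
toy_joint = {
    ("1st", "adult", "male", "yes"): 0.25,
    ("1st", "adult", "male", "no"): 0.25,
    ("3rd", "child", "female", "yes"): 0.25,
    ("crew", "adult", "male", "no"): 0.25,
}
print(marginal_survival(toy_joint))                           # {'yes': 0.5, 'no': 0.5}
print(conditional_on_survival(toy_joint, 2)[("male", "yes")])  # 0.5
```

Only (x, survival) pairs that appear as keys of the joint show up in the result, so a joint that lists all 32 cells yields complete conditional tables.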
(e) Using the distributions you computed in parts (c) and (d), estimate
P(Survival=yes | Class, Age, Gender) under the Naive Bayes
assumption. See the text and class notes for a description of
Naive Bayes. It boils down to this equation:
P(Survival | Class, Age, Gender) = alpha P(Class | Survival) P(Age | Survival) P(Gender | Survival) P(Survival)
where alpha is the normalization constant that makes the values for
Survival=yes and Survival=no sum to 1.
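The Naive Bayes equation can be evaluated by computing the unnormalized product for each value of Survival and then dividing by their sum, which plays the role of 1/alpha. The following is a sketch under the assumption that each conditional table is a dict keyed by (value, survival); the function name and all numbers in the example are hypothetical.

```python
def naive_bayes_survival(p_surv, p_class, p_age, p_gender, cls, age, gender):
    """Return P(Survival | cls, age, gender) under the Naive Bayes
    assumption as {"yes": ..., "no": ...}, or None if undefined."""
    unnorm = {}
    for s in ("yes", "no"):
        unnorm[s] = (p_class.get((cls, s), 0.0)
                     * p_age.get((age, s), 0.0)
                     * p_gender.get((gender, s), 0.0)
                     * p_surv.get(s, 0.0))
    z = sum(unnorm.values())  # alpha = 1 / z
    if z == 0:
        return None
    return {s: v / z for s, v in unnorm.items()}

# Hypothetical tables, NOT computed from the real data.
p_surv = {"yes": 0.4, "no": 0.6}
p_class = {("1st", "yes"): 0.5, ("1st", "no"): 0.25}
p_age = {("adult", "yes"): 0.5, ("adult", "no"): 1.0}
p_gender = {("male", "yes"): 0.5, ("male", "no"): 0.5}

result = naive_bayes_survival(p_surv, p_class, p_age, p_gender,
                              "1st", "adult", "male")
print(result)  # approximately {'yes': 0.4, 'no': 0.6}
```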
(f) How well does the Naive Bayes assumption do in matching the
probabilities you obtained in (b)? Are there any advantages of
estimating the conditional probability using the Naive Bayes assumption?
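For the comparison in part (f), one simple option is to look at the absolute difference between the two estimates in each cell, skipping any cell that is undefined. The sketch below assumes both tables are dicts keyed by (class, age, gender) with `None` marking an undefined cell; the helper name and the example values are hypothetical.

```python
def compare(exact, approx):
    """Return per-cell absolute differences between two probability
    tables, skipping cells that are undefined (None) in either one."""
    diffs = {}
    for cell, p in exact.items():
        q = approx.get(cell)
        if p is None or q is None:
            continue
        diffs[cell] = abs(p - q)
    return diffs

# Hypothetical values, NOT computed from the real data.
exact = {("1st", "adult", "male"): 0.5,
         ("crew", "child", "female"): None,
         ("3rd", "child", "female"): 1.0}
approx = {("1st", "adult", "male"): 0.25,
          ("crew", "child", "female"): 0.3,
          ("3rd", "child", "female"): 0.75}

diffs = compare(exact, approx)
print(max(diffs.values()))                # 0.25
print(sum(diffs.values()) / len(diffs))   # 0.25
```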
Part 2
In this portion of the assignment, you are to do an analysis of pit
probabilities in the Wumpus World, analogous to the analysis that was
done in the text and in class. The particular situation you
should consider is as follows:





[Figure: Wumpus World grid. Eight rooms are labeled "OK"; two of these are also labeled "breeze". The remaining rooms are unlabeled.]

The rooms labeled "OK" have been visited and contain no wumpus or
pit. In the two rooms labeled "breeze", the agent sensed a
breeze. Estimate the probability of a pit in each of the
remaining rooms. Show the logic of your work.
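The logic of this kind of analysis can be checked by brute-force enumeration. The sketch below is not the assignment's grid: it uses the smaller situation from the text (rooms (1,1), (1,2), and (2,1) visited, with a breeze in (1,2) and (2,1)) and assumes the prior pit probability of 0.2 used there. The function names and the coordinate convention are illustrative assumptions.

```python
from itertools import product

PIT_PRIOR = 0.2  # assumption: prior pit probability used in the text's example

def neighbors(room, size):
    """Return the rooms horizontally or vertically adjacent to `room`."""
    x, y = room
    cand = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(a, b) for a, b in cand if 1 <= a <= size and 1 <= b <= size]

def pit_probabilities(visited, breezy, size=4, prior=PIT_PRIOR):
    """Return P(pit | observations) for each frontier room.

    visited: set of pit-free rooms; breezy: the visited rooms with a breeze.
    """
    frontier = sorted({n for v in visited for n in neighbors(v, size)}
                      - set(visited))
    weight_true = {room: 0.0 for room in frontier}
    total = 0.0
    for assignment in product([False, True], repeat=len(frontier)):
        pits = {r for r, has_pit in zip(frontier, assignment) if has_pit}
        # A visited room is breezy exactly when some neighbor holds a pit.
        consistent = all(
            (v in breezy) == any(n in pits for n in neighbors(v, size))
            for v in visited)
        if not consistent:
            continue
        w = 1.0
        for has_pit in assignment:
            w *= prior if has_pit else 1 - prior
        total += w
        for room in pits:
            weight_true[room] += w
    if total == 0:
        raise ValueError("observations are inconsistent")
    return {room: weight_true[room] / total for room in frontier}

visited = {(1, 1), (1, 2), (2, 1)}
breezy = {(1, 2), (2, 1)}
probs = pit_probabilities(visited, breezy)
for room, p in probs.items():
    print(room, round(p, 2))  # (1, 3) 0.31, (2, 2) 0.86, (3, 1) 0.31
```

Only rooms adjacent to a visited room are enumerated; under this model, any other unvisited room is unaffected by the observations and keeps its prior probability.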