Syllabus
Probabilistic Models of Human and Machine Intelligence
CSCI 7222
Fall 2012
Tu, Th 14:00-15:15
ECCR 1B51
Instructor
Professor Michael Mozer
Department of Computer Science
Engineering Center Office Tower 741
(303) 492-4103
Office Hours: Tu 15:30-16:30, Th 13:00-13:45
Course Objectives
A new paradigm has emerged in cognitive science and artificial intelligence that views the mind as a computer extraordinarily tuned to the statistics of the environment in which it operates, and views learning and adaptation in terms of changes to these statistics over time. The goal of the course is to understand the latest advances in theory in cognitive science and artificial intelligence that take a statistical and probabilistic perspective.
One virtue of probabilistic models is that they straddle the gap between cognitive science, artificial intelligence, and machine learning. The same methodology is useful for both understanding the brain and building intelligent computer systems. Indeed, for much of the research we'll discuss, the models contribute both to machine learning and to cognitive science. Whether your primary interest is in engineering applications of machine learning or in cognitive modeling, you'll see that there's a lot of interplay between the two fields.
The course participants are likely to be a diverse group of students, some with primarily an engineering/CS focus and others primarily interested in cognitive modeling (building computer simulations and mathematical models to explain human perception, thought, and learning).
Prerequisites
The course is open to any students who have some background in cognitive science or artificial intelligence and who have taken an introductory probability/statistics course. If your background in probability/statistics is weak, you'll have to do some catching up with the text.
Course Readings
We will be using a new text by Kevin Murphy (Machine Learning: A Probabilistic Perspective, MIT Press, 2012). Unfortunately, the text is so new that it will not arrive from the printer until early September. I chose this book because it is accessible and includes MATLAB software that students can download and run (follow the book link to grab the software).
For additional references, Wikipedia is often a useful resource. The pages on various probability distributions are great references. If you want additional reading, I recommend the following two texts:
We will also be reading research articles from the literature, which
can be downloaded from the links on the class-by-class syllabus below.
Course Requirements
Readings
In the style of graduate seminars, you will be responsible for reading chapters from the text and research articles before class and for coming to class prepared to discuss the material (asking clarification questions, working through the math, relating papers to each other, critiquing the papers, presenting original ideas related to the paper).
Homework Assignments
We can all delude ourselves into believing we understand some math or algorithm by reading, but implementing and experimenting with the algorithm is both fun and valuable for obtaining a true understanding. Students will implement small-scale versions of as many of the models we discuss as possible. I will give 5-10 homework assignments that involve implementation over the semester, details to be determined. My preference is for you to work in MATLAB, both because you can leverage software available with the Murphy text, and because MATLAB has become the de facto workhorse in cognitive modeling and machine learning.
Written Commentaries
For some of the research articles, I'll ask you to write a commentary on the paper. The commentary consists of approximately one page of comments, questions, or critiques of the assigned reading(s) for that class. This page will be due the day of class, and can include one or more of the following:
- a summary of what you think the main or most interesting ideas are behind the reading(s).
- questions about the material for further discussion, either clarification questions or points of disagreement with the authors ("I don't see how such and such will work as the author claims...").
- comments on how the assigned reading relates to other course readings, or, if you feel ambitious and want to track down some related work in the field, how the assigned reading compares to this other work.
- a critique of the work. What are the flaws in the ideas presented? What are the limitations? Do the authors place their work in the appropriate theoretical perspective? Do the authors overstate their results? In what direction might the work be extended?
These commentaries are intended to promote careful thought about a paper before the session in which it is discussed. The point is not to give you more busy work, but rather to encourage you to jot down notes and questions as you read the papers. They will not be accepted after the class in which the paper is discussed.
Semester Grades
Semester grades will be based 5% on class attendance and participation and 95% on the homework assignments and commentaries. I will weight the assignments and commentaries in proportion to their difficulty, with each counting for 5% to 10% of the course grade. Students with backgrounds in the area and specific expertise may wish to do in-class presentations for extra credit.
Class-By-Class Plan and Course Readings
This schedule is tentative and will be adjusted as the semester goes on. The number of homework assignments is subject to change. The assignments are listed on the date I expect to hand them out, and they will typically be due 1 week later.
| Date | Activity | Required Reading | Optional Reading | Lecture Notes | Assignments |
| Aug 28 | introductory meeting | Murphy 1.1-1.4 | Chater, Tenenbaum, & Yuille (2006) | lecture | |
| Aug 30 | basic probability, Bayes rule | Murphy 2.1-2.3, 3.5 | Griffiths & Yuille (2006) | lecture | Assignment 0 (OPTIONAL) |
| Sep 4 | concept learning, Bayesian Occam's razor | Murphy 3.1-3.2, 5.3 | Tenenbaum (1999); Jefferys & Berger (1991) | lecture | Assignment 1 |
| Sep 6 | continuous distributions | Murphy 2.4-2.6 | | lecture | |
| Sep 11 | Gaussians, intro to motion illusions | Murphy 4.1-4.6, 5.4 | | motion demo 1; motion demos 2 | |
| Sep 13 | motion illusions as optimal percepts | Weiss, Simoncelli, & Adelson (2002); Murphy 5.1-5.2 | | lecture | Commentary due on Weiss et al. |
| Sep 18 | Bayesian statistics (conjugate priors, hierarchical Bayes) | Murphy 3.3-3.4, 5.5-5.6 (5.7 for fun) | | lecture | Assignment 2 |
| Sep 20 | Bayes nets: Representation | Murphy 10.1-10.5 | Cowell (1999); Jordan & Weiss (2002) | lecture | |
| Sep 25 | Bayes nets: Inference | Murphy 20.1-20.5 | optional: Huang & Darwiche (1994) | lecture | |
| Sep 27 | | | | | Assignment 3 |
| Oct 2 | Bayes nets: Approximate inference | Murphy 23.1-23.6, 24.1-24.3 | Murphy 21.1-22.6, 21.1-21.8; Andrieu et al. (2003) | ppt, pdf | |
| Oct 4 | | | | | |
| Oct 9 | | | | | Assignment 4 |
| Oct 11 | Bayes nets: Learning | | Murphy 26.1-26.7; Heckerman (1995) | ppt, pdf | |
| Oct 16 | text mining: latent Dirichlet allocation | Murphy 27.1-27.3 | Griffiths, Steyvers, & Tenenbaum (2007); Blei, Ng, & Jordan (2003); video tutorial on Dirichlet Processes by Teh or Teh introductory paper | pdf | |
| Oct 18 | text mining: inferring social networks | Murphy 27.4 | McCallum, Corrado-Emmanuel, & Wang (2005) | ppt, pdf | Assignment 5 |
| Oct 23 | text mining: nonparametric Bayes | Murphy 25.1-25.2 | Orbanz & Teh (2010) | ppt, pdf | |
| Oct 25 | text mining: hierarchical models | Teh (2006) | | lecture | |
| Oct 30 | vision/attention: search | Mozer & Baldwin (2008) | | lecture | Assignment 6 |
| Nov 1 | vision/attention: search | Najemnik & Geisler (2005); supplemental material | | ppt, pdf | |
| Nov 6 | project discussion (PLEASE ATTEND) | | Baker, Goldstein, & Heffernan (2012) | lecture | |
| Nov 8 | sequential models: hidden Markov models | Murphy 17.1-17.6 | Ghahramani (2001) | ppt, pdf, pdf2 | Assignment 7 |
| Nov 13 | sequential models: Kalman filters | Murphy 18.1-18.3; Koerding, Tenenbaum, & Shadmehr (2007) | | ppt, pdf | |
| Nov 15 | sequential models: conditional random fields | Murphy 19.1-19.6 | Sutton & McCallum; Mozer et al. (2010); Lafferty, McCallum, & Pereira (2001) | pdf | |
| Nov 27 | sequential models: particle filters, changepoint detection | Adams & MacKay (2008) | optional: Wagle & Frew (2010) | ppt, pdf | |
| Nov 29 | sequential models: sequential dependencies | Yu & Cohen (2009) | Wilder, Jones, & Mozer (2010) | pdf | Assignment 8 |
| Dec 4 | sequential models: implementing Bayesian sampling [Matt Wilder, guest lecturer] | | | lecture | |
| Dec 6 | NO CLASS | | | | |
| Dec 11 | Gaussian processes | Murphy 15.1-15.6 | | part 1 (pdf), part 2 (ppt) | |
| Dec 13 | Deep learning; student presenter: Karl Ridgeway (deep learning for acoustic modeling) | Murphy 28.1-28.5 | | part 1 (pptx), part 2 (pptx) | Assignment 9 |
Queue
Poon & Domingos (2011). Sum-Product Networks: A new deep architecture.
Gens & Domingos (2012). Discriminative learning of sum-product networks.
Ullman, T.D., Baker, C.L., Macindoe, O., Evans, O., Goodman, N.D., & Tenenbaum, J.B. (2010). Help or hinder: Bayesian models of social goal inference. Advances in Neural Information Processing Systems (Vol. 22, pp. 1874-1882).
Baker, C.L., Saxe, R., & Tenenbaum, J.B. (2009). Action Understanding as Inverse Planning. Cognition, 113, 329-349. [Supplementary material].
Kemp & Tenenbaum. Discovery of Structural Form. PNAS.
Welinder, P., Branson, S., Belongie, S., & Perona, P. The Multidimensional Wisdom of Crowds.
Steyvers, M., Lee, M., Miller, B., & Hemmer, P. (2009). The Wisdom of Crowds in the Recollection of Order Information.
Interesting Links