Syllabus
Probabilistic Models of Human and Machine Intelligence
CSCI 7222
Fall 2013
Tu, Th 14:00-15:15
ECCR 151
Instructor
Professor Michael Mozer
Department of Computer Science
Engineering Center Office Tower 741
(303) 492-4103
Office Hours: Tu 15:30-16:30, Th 13:00-13:45
Course Objectives
A new paradigm has emerged in cognitive science and artificial intelligence that views the mind as a computer extraordinarily tuned to the statistics of the environment in which it operates, and views learning and adaptation in terms of changes to these statistics over time. The goal of the course is to understand the latest theoretical advances in cognitive science and artificial intelligence that take a statistical and probabilistic perspective.
One virtue of probabilistic models is that they straddle the gap between cognitive science, artificial intelligence, and machine learning. The same methodology is useful both for understanding the brain and for building intelligent computer systems. Indeed, for much of the research we'll discuss, the models contribute both to machine learning and to cognitive science. Whether your primary interest is in engineering applications of machine learning or in cognitive modeling, you'll see that there's a lot of interplay between the two fields.
The course participants are likely to be a diverse group of students, some with primarily an engineering/CS focus and others primarily interested in cognitive modeling (building computer simulation and mathematical models to explain human perception, thought, and learning).
Prerequisites
The course is open to any student who has some background in cognitive science or artificial intelligence and who has taken an introductory probability/statistics course. If your background in probability/statistics is weak, you'll have to do some catching up with the text.
Course Readings
We will be using the text by David Barber, Bayesian Reasoning and Machine Learning (Cambridge University Press, 2012). The author has made an electronic version of the text available. Note that the electronic version is a 2013 revision.
For additional references, Wikipedia is often a useful resource; its pages on the various probability distributions are great references. If you want additional reading, I recommend the following texts:
We will also be reading research articles from the literature, which
can be downloaded from the links on the class-by-class syllabus below.
Course Discussions
We will use Piazza for class discussion.
Rather than emailing me, I encourage you to post your questions on Piazza. This is my first experience with Piazza, but I will strive to respond quickly. If I do not, please email me personally. The
Piazza class page is:
https://piazza.com/colorado/fall2013/csci7222/home
Course Requirements
Readings
In the style of graduate seminars, you will be responsible for reading chapters from the text and research articles before class, and for coming to class prepared to discuss the material (asking clarification questions, working through the math, relating papers to one another, critiquing the papers, presenting original ideas related to a paper).
Homework Assignments
We can all delude ourselves into believing we understand some math or algorithm by reading about it, but implementing and experimenting with the algorithm is both fun and valuable for obtaining a true understanding. Students will implement small-scale versions of as many of the models we discuss as possible. I will give about 10 homework assignments involving implementation over the semester, details to be determined. My preference is for you to work in MATLAB, both because you can leverage the software available with the Barber text and because MATLAB has become the de facto workhorse in machine learning. For one or two assignments, I'll ask you to write a one-page commentary on a research article.
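To give a concrete sense of the flavor of these exercises, here is a minimal sketch in MATLAB (illustrative only, not an actual assignment; the example and variable names are my own) that applies Bayes rule on a discrete grid to infer a coin's bias from flip data:

    % Illustrative sketch: grid-based Bayesian inference of a coin's bias.
    theta  = linspace(0, 1, 101);                % candidate bias values
    prior  = ones(size(theta)) / numel(theta);   % uniform prior over theta
    nHeads = 7;  nFlips = 10;                    % hypothetical observed data
    like   = theta.^nHeads .* (1-theta).^(nFlips-nHeads);  % Bernoulli likelihood
    post   = prior .* like;                      % Bayes rule: prior times likelihood...
    post   = post / sum(post);                   % ...normalized over the grid
    plot(theta, post);
    xlabel('coin bias \theta'); ylabel('posterior probability');

The same recipe, prior times likelihood followed by normalization, underlies many of the models we'll cover, just with richer priors and likelihoods.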
Semester Grades
Semester grades will be based 5% on class attendance and participation and 95% on the homework assignments. I will weight the assignments in proportion to their difficulty, with each assignment counting for 5% to 10% of the course grade. Students with backgrounds in the area and specific expertise may wish to do in-class presentations for extra credit.
Class-By-Class Plan and Course Readings
The greyed-out portion of this schedule is tentative and will be adjusted as the semester goes on. I may adjust assignments, assignment dates, and lecture topics based on the class's interests.
Date | Activity | Required Reading (Section numbers refer to Barber) | Optional Reading | Lecture Notes | Assignments
Aug 27 | introductory meeting | 29.1 (Appendix in hardcopy edition), 13.1-13.3 | Chater, Tenenbaum, & Yuille (2006) | lecture | Assignment 0
Aug 29 | basic probability, Bayes rule | 1.1-1.4 | Griffiths & Yuille (2006) | lecture |
Sep 3 | continuous distributions | 8.1-8.3 | | lecture |
Sep 5 | concept learning, Bayesian Occam's razor | 12.1-12.3 (requires a bit of probability we haven't talked about, so don't sweat the details) | Tenenbaum (1999); Jefferys & Berger (1991) | lecture | Assignment 1
Sep 10 | Gaussians | 8.4-8.7 | useful reference: Murphy (2007) | lecture |
Sep 12 | UNIVERSITY CLOSED: STAY DRY | | | |
Sep 17 | motion illusions as optimal percepts | Weiss, Simoncelli, & Adelson (2002) | motion demo 1; motion demo 2 | lecture | Assignment 2
Sep 19 | Bayesian statistics (conjugate priors, hierarchical Bayes) | 9.1 | | lecture |
Sep 24 | Bayes nets: representation | 2.1-2.3, 3.1-3.5 | Cowell (1999); Jordan & Weiss (2002); 4.1-4.6 | lecture | Assignment 3
Sep 26 | Bayes nets: exact inference | 5.1-5.5 | Huang & Darwiche (1994) | lecture |
Oct 1 | | | | | Assignment 4
Oct 3 | Bayes nets: approximate inference | 27.1-27.6 | Andrieu et al. (2003) | lecture |
Oct 8 | | | | |
Oct 10 | learning I: parameter learning | 9.2-9.4 | Heckerman (1995); 9.5 | lecture | Assignment 5
Oct 15 | learning II: missing data, latent variables, EM, GMM | 11.1-11.5, 20.2-20.3 | | lecture |
Oct 17 | text mining: latent Dirichlet allocation | 20.6 | Griffiths, Steyvers, & Tenenbaum (2007); Blei, Ng, & Jordan (2003); video tutorial on Dirichlet processes by Teh, or Teh's introductory paper | lecture |
Oct 22 | text mining: inferring social networks | McCallum, Corrada-Emmanuel, & Wang (2005) | | lecture | Assignment 6
Oct 24 | text mining: nonparametric Bayes | Orbanz & Teh (2010) | | lecture |
Oct 29 | text mining: hierarchical models | Teh (2006) | | lecture |
Oct 31 | catch-up day | | | |
Nov 5 | sequential models: hidden Markov models | 23.1-23.3 | Ghahramani (2001) | lecture | Assignment 7
Nov 7 | sequential models: conditional random fields | 23.4-23.5 | Sutton & McCallum; Mozer et al. (2010); Lafferty, McCallum, & Pereira (2001) | lecture |
Nov 12 | final project | 21.1-21.2, 22.1-22.2 | | lecture | Assignments 8 and 9
Nov 14 | sequential models: sequential dependencies (Matt Wilder, guest lecturer) | Yu & Cohen (2009) | Wilder, Jones, & Mozer (2010) | |
Nov 19 | sequential models: exact and approximate inference (particle filters, changepoint detection) [Janeen presents] | 27.6; Adams & MacKay (2008) | | ppt, pdf |
Nov 21 | sequential models: Kalman filters [Ian, David, Matt present] | 24.1-24.4 | Koerding, Tenenbaum, & Shadmehr (2007); 24.5 | lecture |
Dec 3 | Gaussian processes | 19.1-19.5 | | lecture 1, lecture 2 |
Dec 5 | vision/attention: search [Arafat presents] | Mozer & Baldwin (2008); Najemnik & Geisler (2005) | supplemental material for Najemnik & Geisler | lecture, lecture |
Dec 10 | NO CLASS [Mozer at NIPS conference] | | | |
Dec 12 | deep learning | | | part 1 pptx |
Dec 14, 13:30-16:00 | final project presentations | | | |
Queue
Poon & Domingos (2011). Sum-product networks: A new deep architecture.
Gens & Domingos (2012). Discriminative learning of sum-product networks.
Ullman, T.D., Baker, C.L., Macindoe, O., Evans, O., Goodman, N.D., & Tenenbaum, J.B. (2010). Help or hinder: Bayesian models of social goal inference. Advances in Neural Information Processing Systems (Vol. 22, pp. 1874-1882).
Baker, C.L., Saxe, R., & Tenenbaum, J.B. (2009). Action understanding as inverse planning. Cognition, 113, 329-349. [Supplementary material]
Kemp, C., & Tenenbaum, J.B. Discovery of structural form. PNAS.
Welinder, P., Branson, S., Belongie, S., & Perona, P. The multidimensional wisdom of crowds.
Steyvers, M., Lee, M., Miller, B., & Hemmer, P. (2009). The wisdom of crowds in the recollection of order information.
Interesting
Links