Assignment 5
Neural Networks and Deep Learning
CSCI 7222
Spring 2015
Assigned Mar 4
Due mid April
Assignment submission
Submit all assignments via CU's Desire2Learn system.
Submission instructions are as follows:
- Go to the Desire2Learn web site and enter identikey and password, or access through mycuinfo -> Student -> Course information -> Website
- Once on the Desire2Learn web site, select our course, which is labeled "Tpcs Nonsymbolic Ai"
- Select the "assessments" tab in the upper right corner of the page
- Select "dropbox" from the drop-down menu
- Select "assignment 5" and then upload the file
Goals
The goal of this assignment is to hit the big
time: to explore a state-of-the-art data set using a state-of-the-art
neural net model. For this assignment, we will switch over to using a
simulator package of your choice.
Data Set
The data set we'll use is the ongoing Data Science Bowl set available
on Kaggle. This data set is a collection of labeled images of plankton
and sea creatures. There are about 30k examples in the training set and
121 classes. The classes include various sorts of "unknown" objects.
Some of the classes have relatively little data, and you may decide to
simply ignore these classes.
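If you do decide to ignore the sparse classes, a first step is to count the examples per class and pick a cutoff. Here is a minimal sketch, assuming the training images are unpacked as one directory per class (train_dir/<class_name>/<image>); the function names and the cutoff are hypothetical, so adjust them to however your copy of the data is laid out.

```python
from collections import Counter
from pathlib import Path


def list_examples(train_dir):
    """Return (image_path, class_name) pairs.

    Assumes a layout of train_dir/<class_name>/<image file>.
    """
    root = Path(train_dir)
    return [
        (img, cls_dir.name)
        for cls_dir in sorted(root.iterdir()) if cls_dir.is_dir()
        for img in sorted(cls_dir.iterdir()) if img.is_file()
    ]


def drop_rare_classes(examples, min_count):
    """Split examples into those whose class has at least min_count items
    and a sorted list of the class names that were dropped."""
    counts = Counter(label for _, label in examples)
    kept = [(x, label) for x, label in examples if counts[label] >= min_count]
    dropped = sorted(name for name, n in counts.items() if n < min_count)
    return kept, dropped
```

Something like `kept, dropped = drop_rare_classes(list_examples("train"), 50)` would then give you the reduced training set and the names of the classes you are ignoring; the value 50 is only a placeholder.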
Simulators
Although you're welcome to use your own code for
these images, I suspect you won't do very well unless you have a deep
convolutional net. I recommend switching to a neural net simulator. I
list a variety of simulators on the course page, but I don't have any
experience with them. To me, the best bet looks like torch7.
Hopefully, as you gain experience, you'll share it with your colleagues
and help them avoid choosing the wrong simulator.
Most of the simulators let you do GPU computing. The CSEL has a machine
with a high-powered GPU, and all of the machines in CSEL appear to have
reasonable GPUs.
The Assignment
Build the best classifier you can. You can use the
Kaggle competition submission mechanism to estimate your test set
performance, and once the Kaggle competition is over, we can use the
remainder of the test set to rank submissions within the class (and to
compare to the competition entrants).
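Between Kaggle submissions, you may want a local estimate of performance from a held-out slice of the training set. The sketch below assumes the competition scores submissions by multi-class log loss (check the Kaggle evaluation page to confirm) and that class labels are integer indices into each row of predicted probabilities. The function names, the 10% hold-out fraction, and the clipping constant are my own choices, not anything Kaggle specifies.

```python
import math
import random


def train_val_split(n_examples, val_fraction=0.1, seed=0):
    """Shuffle example indices and return (train_indices, val_indices)."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    n_val = int(round(n_examples * val_fraction))
    return indices[n_val:], indices[:n_val]


def multiclass_log_loss(true_labels, predicted_probs, eps=1e-15):
    """Mean negative log probability assigned to the correct class.

    true_labels: list of integer class indices.
    predicted_probs: list of per-class probability lists, one per example.
    Probabilities are clipped away from 0 and 1, then renormalized per row.
    """
    total = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        clipped = [min(max(p, eps), 1.0 - eps) for p in probs]
        total -= math.log(clipped[label] / sum(clipped))
    return total / len(true_labels)
```

A useful sanity check: a classifier that outputs a uniform distribution over all 121 classes scores log(121), about 4.8, so any trained model should come in below that.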
Insights
I suspect that it may serve you well to do some image preprocessing to
put the images in a more canonical form (e.g., aligned with a central
axis). Just my guess. You may also think about nontraditional and
neurobiologically motivated approaches based on multiple fixations or
analysis of parts (see Kanan 2013).
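One way to find a central axis, offered only as a sketch, is to use second-order image moments to estimate the orientation of the object's major axis. This assumes a 2-D grayscale array in which larger values mark the object; if your images have a bright background, invert them first. The function name is hypothetical, and the angle is only defined up to a half turn.

```python
import numpy as np


def principal_axis_angle(weights):
    """Estimate the major-axis orientation of a 2-D intensity array.

    Returns an angle in radians, measured from the column (x) axis toward
    the row (y) axis in image coordinates (rows increase downward).
    The axis has no direction, so the angle is defined modulo pi.
    """
    w = np.asarray(weights, dtype=float)
    total = w.sum()
    if total <= 0:
        return 0.0  # empty image: no meaningful orientation
    rows, cols = np.indices(w.shape)
    # Intensity-weighted centroid.
    x_bar = (cols * w).sum() / total
    y_bar = (rows * w).sum() / total
    dx = cols - x_bar
    dy = rows - y_bar
    # Second-order central moments.
    mu20 = (w * dx * dx).sum()
    mu02 = (w * dy * dy).sum()
    mu11 = (w * dx * dy).sum()
    return float(0.5 * np.arctan2(2.0 * mu11, mu20 - mu02))
```

Rotating each image by the negative of this angle, with whatever rotation routine your image library provides, should bring the major axis to horizontal. Nearly round objects have no stable major axis, so the estimate will be noisy for them.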