Assignment 4
Neural Networks and Deep Learning
CSCI 7222
Spring 2015
Assigned Feb 17
Due Mar 4, before class
Assignment submission
Submit all assignments via CU's Desire2Learn system.
Submission instructions are as follows:
- Go to the Desire2Learn web site and enter your identikey and password, or access it through mycuinfo -> Student -> Course information -> Website
- Once on the Desire2Learn web site, select our course, which is labeled "Tpcs Nonsymbolic Ai"
- Select the "assessments" tab in the upper right corner of the page
- Select "dropbox" from the drop down menu
- Select "assignment 4" and then upload the file
Goals
The goals of this assignment are: (1) to gain
experience on a second data set, one which differs from the handprinted
digits in that it has more input features, more output classes, but
fewer total training examples; (2) to explore modern techniques
for preventing neural network overfitting; and (3) to use validation or
cross validation methods for selecting model complexity, architecture,
regularizers, etc.
Data Set
The data set comes from the UCI Machine Learning Repository.
ISOLET is a famous early data set used for speech recognition. The data set
consists of 150 speakers producing the letters A-Z. Each letter is
spoken twice. Thus, there are roughly 150x26x2 = 7800 examples in the
data set. (Three examples are missing, so the total is actually
7797.) We'll split the data into training and test sets in the
way researchers have done in the past: all tokens from 120 speakers are
part of the training set, and all tokens from the remaining 30 speakers
are part of the test set. Thus, test set performance indicates how well
the model generalizes to new speakers.
Even though you will have access to the test set,
you are not to peek at the test set or use
it in any way until you get to Part 4 of the assignment. All
development for Parts 1-3 should be based solely on the training set.
For model development, you will want to set aside some data from the
training set to use as a validation set to determine model
architecture, hyperparameters, regularizers, etc.
You may wish to form the validation set by setting aside certain
speakers, so that the validation set -- like the test set -- evaluates
performance on new speakers. You may instead wish to do cross
validation with the training set, pulling aside one speaker at a time
and training the model on the remaining speakers, then using the
held-out speaker to estimate model performance.
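Both speaker-based options above can be sketched in a few lines. This is a minimal illustration, not a required implementation: the function names are made up for this example, and it assumes you can build a list `speaker_ids` giving the speaker of each training example. The data files may not list speakers explicitly, so how you obtain those IDs is left open.

```python
from collections import defaultdict


def holdout_by_speaker(speaker_ids, val_speakers):
    """Single split: every token from val_speakers goes to validation."""
    val_speakers = set(val_speakers)
    train = [i for i, s in enumerate(speaker_ids) if s not in val_speakers]
    val = [i for i, s in enumerate(speaker_ids) if s in val_speakers]
    return train, val


def speaker_folds(speaker_ids):
    """Leave-one-speaker-out cross validation.

    Yields (held_out_speaker, train_indices, val_indices), one fold per
    distinct speaker. speaker_ids[i] is the (assumed known) speaker of
    training example i.
    """
    by_speaker = defaultdict(list)
    for idx, spk in enumerate(speaker_ids):
        by_speaker[spk].append(idx)
    speakers = sorted(by_speaker)
    for spk in speakers:
        val = by_speaker[spk]
        train = [i for s in speakers if s != spk for i in by_speaker[s]]
        yield spk, train, val
```

With 120 training speakers, leave-one-speaker-out means 120 training runs per candidate model, so holding out a fixed group of speakers is the cheaper option.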
The auditory input has been processed and reduced to a set of 617
features. The features include spectral coefficients, contour features,
sonorant features, pre-sonorant features, and post-sonorant features.
The output represents one of the 26 letters and is coded by an integer
ranging from 1 to 26.
I've formatted the data in three ways for you:
- matlab data structures containing train and test sets
- MS Excel spreadsheets containing train and test sets
- CSV text files containing train and test sets
In the matlab structures, the input features and output classes are
placed in separate arrays. In the Excel and CSV files, each row
contains 618 columns, the first 617 of which are input features and the
last is the output class (1-26).
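As an illustration of the CSV layout just described, here is a minimal sketch that separates the 617 features from the class label in each row. The function names and the file path passed to `load_isolet_csv` are hypothetical. Shifting the label from 1-26 to 0-25 is a convenience for indexing output units, not something the assignment requires.

```python
import csv

N_FEATURES = 617
N_CLASSES = 26


def parse_isolet_rows(rows):
    """Split rows of 618 values into feature vectors and 0-based labels."""
    features, labels = [], []
    for row in rows:
        if not row:
            continue  # skip blank lines
        if len(row) != N_FEATURES + 1:
            raise ValueError("expected 618 columns, got %d" % len(row))
        features.append([float(v) for v in row[:N_FEATURES]])
        # The class may be written as "26" or "26."; map 1..26 to 0..25.
        labels.append(int(float(row[-1])) - 1)
    return features, labels


def one_hot(label, n_classes=N_CLASSES):
    """Target vector with a 1.0 at the position of the 0-based label."""
    vec = [0.0] * n_classes
    vec[label] = 1.0
    return vec


def load_isolet_csv(path):
    """Read one of the CSV files; the path is supplied by the caller."""
    with open(path, newline="") as f:
        return parse_isolet_rows(csv.reader(f))
```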
Part 1
Using the code you wrote for Assignment 3, build a
network using the training set. You can choose the output unit
activation function (e.g., logistic, normalized exponential), loss
function (squared error, cross entropy), number of hidden units, weight
initialization scheme, etc. Use logistic or tanh hidden units.
("tanh" is a sigmoid-shaped function that maps every input to the
range (-1, +1).)
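For reference, the activation and loss options named above can be written as follows. This is a plain-Python sketch for a single example, with function names chosen for illustration; your Assignment 3 code presumably has its own, likely vectorized, versions.

```python
import math


def logistic(x):
    """Logistic sigmoid, output in (0, 1); branches to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def tanh(x):
    """Hyperbolic tangent, output in (-1, +1)."""
    return math.tanh(x)


def softmax(net):
    """Normalized exponential over a list of output-unit net inputs."""
    m = max(net)  # subtracting the max leaves the result unchanged
    exps = [math.exp(v - m) for v in net]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target_index):
    """Negative log probability assigned to the correct class."""
    return -math.log(max(probs[target_index], 1e-12))


def squared_error(outputs, targets):
    """Half the sum of squared differences between outputs and targets."""
    return 0.5 * sum((o - t) ** 2 for o, t in zip(outputs, targets))
```

Note that tanh(x) = 2 * logistic(2x) - 1, so the two hidden unit types differ only by a shift and rescaling.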
Using the validation or cross validation method, decide on the specific
architecture, etc.
Once you've chosen the architecture, train a final network on the
entire training set (including the portion you may have set aside for
validation).
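The select-then-retrain procedure described above can be sketched generically. The callbacks `fit` and `score` are hypothetical stand-ins for your own training and evaluation routines, and the data splits are assumed to be Python lists so that `+` concatenates them.

```python
def select_and_refit(candidates, fit, score, train_part, val_part):
    """Pick the candidate with the best validation score, then retrain.

    candidates: iterable of configurations (e.g., numbers of hidden units)
    fit(cfg, data) -> model       (hypothetical training routine)
    score(model, data) -> float   (higher is better, e.g., accuracy)
    """
    best_cfg, best_score = None, float("-inf")
    for cfg in candidates:
        model = fit(cfg, train_part)
        s = score(model, val_part)
        if s > best_score:
            best_cfg, best_score = cfg, s
    if best_cfg is None:
        raise ValueError("no candidate configurations were given")
    # Final model: retrain on the training portion plus the validation
    # portion, using the configuration chosen above.
    final_model = fit(best_cfg, train_part + val_part)
    return best_cfg, best_score, final_model
```

The test set plays no role here; the validation score is the only estimate used to choose among candidates.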
Part 2
Now incorporate one or more of the regularization
techniques we talked about in class, such as dropout (strongly
recommended), soft weight constraints, or a hard weight constraint
(which prevents the weight vector length from exceeding an upper
threshold).
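A minimal sketch of the three regularizers just listed, written for a single layer's activations or a single unit's incoming weight vector. Two choices here are assumptions rather than requirements: the dropout shown is the "inverted" variant, which rescales surviving units during training so that nothing changes at test time, and the soft constraint shown is L2 weight decay. Function names are illustrative.

```python
import math
import random


def dropout(activations, keep_prob, rng=random):
    """Inverted dropout: zero each unit with probability 1 - keep_prob
    and scale the survivors by 1 / keep_prob. Training time only."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


def clip_weight_norm(weights, max_norm):
    """Hard constraint: rescale a weight vector whose Euclidean length
    exceeds max_norm so that its length equals max_norm."""
    norm = math.sqrt(sum(w * w for w in weights))
    if norm <= max_norm:
        return list(weights)
    scale = max_norm / norm
    return [w * scale for w in weights]


def l2_penalty_gradient(weights, decay):
    """Gradient of the soft constraint (decay / 2) * sum(w ** 2),
    to be added to the gradient of the loss."""
    return [decay * w for w in weights]
```

The hard constraint is typically applied after each weight update, whereas the soft constraint changes the gradient itself.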
Using the validation or cross validation method, decide on the specific
regularizers you want to include in your model.
Once you've chosen the regularizer, train a final network on the entire
training set (including the portion you may have set aside for validation).
Part 3
Now create a version of your model that uses
rectified linear (ReLU) hidden units. The change to your code should be
pretty minor. Using a validation set, tweak the model to get the best
estimated performance.
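To illustrate why the change is small: only the hidden activation function and its derivative need to be swapped. The sketch below assumes a design in which the hidden layer looks up both from a table; the names `ACTIVATIONS`, `hidden_forward`, and `hidden_backward` are hypothetical and your own code may be organized differently.

```python
import math


def relu(net):
    """Rectified linear unit: max(0, net)."""
    return net if net > 0.0 else 0.0


def relu_grad(net):
    """Derivative of ReLU; taken to be 0 at net == 0 by convention."""
    return 1.0 if net > 0.0 else 0.0


def tanh_grad(net):
    """Derivative of tanh with respect to its net input."""
    return 1.0 - math.tanh(net) ** 2


# Each entry pairs an activation function with its derivative.
ACTIVATIONS = {
    "tanh": (math.tanh, tanh_grad),
    "relu": (relu, relu_grad),
}


def hidden_forward(net_inputs, kind):
    """Apply the chosen activation to each hidden unit's net input."""
    f, _ = ACTIVATIONS[kind]
    return [f(n) for n in net_inputs]


def hidden_backward(net_inputs, upstream_grads, kind):
    """Multiply upstream gradients by the activation derivative."""
    _, df = ACTIVATIONS[kind]
    return [g * df(n) for n, g in zip(net_inputs, upstream_grads)]
```

Because ReLU outputs are unbounded, the weight initialization and learning rate that worked for logistic or tanh units may need retuning on the validation set.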
Once you've chosen the final model architecture and regularizers, train
a final network on the entire training set (including the portion you may
have set aside for validation).
Part 4
Now, and only now, use the test set to
evaluate performance of the models you built in Parts 1-3. Report
the performance of each of the 3 models.
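One way to report performance is classification accuracy on the test set, taking the most active output unit as the predicted letter. This is a sketch under two assumptions: each model produces one list of 26 output activations per example, and labels have been shifted to 0-25. The function names are illustrative.

```python
def predict_class(outputs):
    """Index of the most active output unit (first one on ties)."""
    return max(range(len(outputs)), key=outputs.__getitem__)


def accuracy(output_rows, labels):
    """Fraction of examples whose predicted class equals the label."""
    if len(output_rows) != len(labels):
        raise ValueError("need exactly one label per output row")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for out, y in zip(output_rows, labels)
                  if predict_class(out) == y)
    return correct / len(labels)
```

For example, `accuracy([[0.1, 0.9], [0.8, 0.2]], [1, 1])` evaluates to 0.5, since only the first example is classified correctly.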