Assignment 4
Neural Networks and Deep Learning

CSCI 7222
Spring 2015

Assigned Feb 17
Due Mar 4, before class

Assignment submission

Submit all assignments via CU's desire2learn system.  Submission instructions are as follows:
  1. Go to the Desire2learn web site and enter your identikey and password, or access it through mycuinfo -> Student -> Course information -> Website
  2. Once on the desire2learn web site, select our course, which is labeled "Tpcs Nonsymbolic Ai"
  3. Select the "assessments" tab in the upper right corner of the page
  4. Select "dropbox" from the drop down menu
  5. Select "assignment 4" and then upload the file

Goals

The goals of this assignment are: (1) to gain experience on a second data set, one which differs from the handprinted digits in that it has more input features, more output classes, but fewer total training examples; (2) to explore modern techniques for preventing neural network overfitting; and (3) to use validation or cross validation methods for selecting model complexity, architecture, regularizers, etc.

Data Set

The data set comes from the UCI Machine Learning Repository. ISOLET is a famous early data set used for speech recognition. The data set consists of 150 speakers producing the letters A-Z. Each letter is spoken twice. Thus, there are roughly 150x26x2 = 7800 examples in the data set.  (Three examples are missing, so the total is actually 7797.)  We'll split the data into training and test sets in the way researchers have done in the past: all tokens from 120 speakers are part of the training set, and all tokens from the remaining 30 speakers are part of the test set. Thus, test set performance indicates how well the model generalizes to new speakers.

Even though you will have access to the test set, you are not to peek at the test set or use it in any way until you get to Part 4 of the assignment. All development for Parts 1-3 should be based solely on the training set. For model development, you will want to set aside some data from the training set to use as a validation set to determine model architecture, hyperparameters, regularizers, etc. You may wish to form the validation set by setting aside certain speakers, so that the validation set -- like the test set -- evaluates performance on new speakers. You may instead wish to do cross validation with the training set, pulling aside one speaker at a time and training the model on the remaining speakers, then using the held-out speaker to estimate model performance.
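
The two speaker-based validation schemes described above could be sketched as follows. This is only an illustration, not required code. It assumes you have built an array `speaker_ids` (a hypothetical name) giving the speaker of each training row; the data files as described contain only features and class labels, so recovering speaker identity is left to you. The function names are also made up for this sketch.

```python
import numpy as np

def speaker_holdout_split(speaker_ids, n_val_speakers, seed=0):
    """Return boolean masks (train_mask, val_mask) such that every token
    from a given speaker falls entirely on one side of the split."""
    speaker_ids = np.asarray(speaker_ids)
    rng = np.random.default_rng(seed)
    speakers = np.unique(speaker_ids)
    # Pick the validation speakers at random, without repeats.
    val_speakers = rng.choice(speakers, size=n_val_speakers, replace=False)
    val_mask = np.isin(speaker_ids, val_speakers)
    return ~val_mask, val_mask

def leave_one_speaker_out(speaker_ids):
    """Yield (held_out_speaker, train_mask, val_mask), one fold per speaker."""
    speaker_ids = np.asarray(speaker_ids)
    for speaker in np.unique(speaker_ids):
        val_mask = speaker_ids == speaker
        yield speaker, ~val_mask, val_mask
```

With masks in hand, `X[train_mask]` and `X[val_mask]` select the corresponding rows. Leave-one-speaker-out requires one training run per speaker, so a single held-out group of speakers is the cheaper option.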

The auditory input has been processed and reduced to a set of 617 features. The features include spectral coefficients, contour features, sonorant features, pre-sonorant features, and post-sonorant features.

The output represents one of the 26 letters and is coded by an integer ranging from 1 to 26.

I've formatted the data in three ways for you:
  - matlab data structures containing train and test sets
  - MS Excel spreadsheets containing train and test sets
  - CSV text files containing train and test sets

In the matlab structures, the input features and output classes are placed in separate arrays. In the Excel and CSV files, each row contains 618 columns, the first 617 of which are input features and the last is the output class (1-26).
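
If you work from the CSV files, the 618-column layout can be unpacked along these lines. This is a minimal sketch: the file name in the commented usage is a placeholder (not the actual name of the file), the function names are hypothetical, and shifting the labels to 0-25 is just a convenience for array indexing.

```python
import numpy as np

def split_features_labels(table):
    """Split an (n, 618) array into an (n, 617) feature matrix and an
    (n,) vector of integer classes shifted from 1..26 to 0..25."""
    table = np.asarray(table, dtype=float)
    X = table[:, :-1]                    # first 617 columns: input features
    y = table[:, -1].astype(int) - 1     # last column: class 1..26 -> 0..25
    return X, y

def one_hot(y, n_classes=26):
    """Turn integer classes 0..n_classes-1 into one-hot target rows."""
    targets = np.zeros((y.shape[0], n_classes))
    targets[np.arange(y.shape[0]), y] = 1.0
    return targets

# Hypothetical usage; replace the placeholder with the real file name:
# table = np.loadtxt("train.csv", delimiter=",")
# X, y = split_features_labels(table)
# T = one_hot(y)
```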

Part 1

Using the code you wrote for Assignment 3, build a network using the training set. You can choose the output unit activation function (e.g., logistic, normalized exponential), loss function (squared error, cross entropy), number of hidden units, weight initialization scheme, etc. Use logistic or tanh hidden units.  ("tanh" is the sigmoid-shaped function that maps every input into the interval (-1, +1).)
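
For reference, one of the allowed combinations (tanh or logistic hidden units, normalized exponential outputs, cross entropy loss) might look like the sketch below. The single hidden layer, the parameter names `W1`, `b1`, `W2`, `b2`, and the small constant inside the log are assumptions made for this illustration; your Assignment 3 code may be organized differently.

```python
import numpy as np

def logistic(z):
    """Logistic sigmoid, output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Normalized exponential over each row, shifted for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, targets):
    """Mean cross entropy between predicted rows and one-hot target rows."""
    return -np.mean(np.sum(targets * np.log(probs + 1e-12), axis=1))

def forward(X, W1, b1, W2, b2, hidden="tanh"):
    """One hidden layer followed by a softmax output layer.
    Returns (hidden activations, output probabilities)."""
    a = X @ W1 + b1
    h = np.tanh(a) if hidden == "tanh" else logistic(a)
    return h, softmax(h @ W2 + b2)
```

With all weights at zero, every class gets probability 1/26, so the loss starts near log(26), about 3.26. This is a handy sanity check before training.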

Using the validation or cross validation method, decide on the specific architecture, etc.

Once you've chosen the architecture, train a final network on the entire training set (including the portion you may have set aside for validation).

Part 2

Now incorporate one or more of the regularization techniques we talked about in class, such as dropout (strongly recommended), soft weight constraints, or a hard weight constraint (which prevents the weight vector length from exceeding an upper threshold).
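
The three regularizers could be sketched as below. Several choices here are assumptions rather than the versions presented in class: the dropout shown is the "inverted" variant (activations are rescaled at training time so nothing changes at test time), the hard constraint is applied to each unit's incoming weight vector (the columns of `W`), and the soft constraint is taken to be L2 weight decay. All function names are hypothetical.

```python
import numpy as np

def dropout(h, keep_prob, rng, train=True):
    """Inverted dropout: zero each activation with probability 1 - keep_prob
    and scale the survivors by 1 / keep_prob. Returns (output, mask); the
    mask is needed again in the backward pass. At test time h is unchanged."""
    if not train:
        return h, None
    mask = (rng.random(h.shape) < keep_prob) / keep_prob
    return h * mask, mask

def apply_max_norm(W, max_norm):
    """Hard constraint: rescale any column of W (one unit's incoming weights)
    whose L2 length exceeds max_norm. Call after each weight update."""
    norms = np.linalg.norm(W, axis=0, keepdims=True)
    scale = np.minimum(1.0, max_norm / np.maximum(norms, 1e-12))
    return W * scale

def l2_penalty_grad(W, weight_decay):
    """Soft constraint: gradient of (weight_decay / 2) * sum(W ** 2),
    to be added to the loss gradient for W."""
    return weight_decay * W
```

In the backward pass, multiply the gradient arriving at the dropped layer by the same `mask`, so that dropped units receive no error signal.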

Using the validation or cross validation method, decide on the specific regularizers you want to include in your model.

Once you've chosen the regularizer, train a final network on the entire training set (including the portion you may have set aside for validation).

Part 3

Now create a version of your model that uses rectified linear (ReLU) hidden units. The change to your code should be pretty minor. Using a validation set, tweak the model to get the best estimated performance.
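
To show how small the change is, here is a sketch of the ReLU activation and its derivative, which would stand in for tanh and its derivative in the forward and backward passes. The function names are hypothetical, and setting the derivative at exactly zero to 0 is a common convention, not the only valid one.

```python
import numpy as np

def relu(a):
    """Rectified linear unit: max(0, a), elementwise."""
    return np.maximum(0.0, a)

def relu_grad(a):
    """Derivative of relu with respect to its input a (taken as 0 at a == 0)."""
    return (a > 0).astype(a.dtype)
```

ReLU outputs are unbounded above, so the learning rate and weight initialization that worked for tanh may need to be retuned on the validation set.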

Once you've chosen the final model architecture and regularizers, train a final network on the entire training set (including the portion you may have set aside for validation).

Part 4

Now, and only now, use the test set to evaluate the performance of the models you built in Parts 1-3.  Report the performance of each of the three models.
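
One reasonable way to report performance is the classification error rate. The sketch below assumes the network outputs one row of 26 scores per example and that labels have been shifted to 0-25; the assignment does not mandate this particular metric, and the function name is hypothetical.

```python
import numpy as np

def error_rate(outputs, y):
    """Fraction of examples whose highest-scoring output unit is not the
    true class. outputs: (n, n_classes) array; y: (n,) integer classes."""
    predictions = np.argmax(outputs, axis=1)
    return float(np.mean(predictions != y))
```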