Assignment 4
Neural Networks and Deep Learning

CSCI 5922
Fall 2017

Assigned Thu Oct 12, 2017
Due: Thu Oct 26, 2017

Assignment submission

Please read instructions here.

Goal

The goal of this assignment is to use TensorFlow to explore convolutional nets and to optimize a network for a challenging visual classification task.

Data Set

You will experiment with CIFAR-10, a data set of tiny (32x32) color images in 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck. This task is nontrivial: an error rate of 15-25% is likely.

The data set is divided into 50,000 training images and 10,000 test images. For Part 1 of the assignment, do not peek at the test images; instead, work only with the 50k training examples, which you will split into a training set and a validation set as described in (1a).

TensorFlow has a very nice tutorial using the CIFAR-10 data, with a sample architecture and code for reading the data, annealing the learning rate, and so on. It is fair to start with this demo and modify it to optimize your architecture.

Part 1

(1a) Split your 50k training examples into a training set and a validation set, deciding how many examples to use for each, and describe your split. (You may wish to do k-fold cross validation instead of just a single fold of validation, or many-fold validation in which you resample the training and validation sets each time. Simply describe the strategy you have selected.)
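As a point of reference, here is a minimal sketch of a single-fold split using NumPy and the CIFAR-10 loader bundled with TensorFlow's Keras API. The 5,000-example validation size is a hypothetical choice, not a requirement; pick and justify your own.

    import numpy as np
    import tensorflow as tf

    # Load CIFAR-10; x_test/y_test must stay untouched until Part 2.
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
    x_train = x_train.astype('float32') / 255.0  # scale pixels to [0, 1]
    x_test = x_test.astype('float32') / 255.0

    # Shuffle with a fixed seed so the split is reproducible across runs.
    rng = np.random.RandomState(0)
    perm = rng.permutation(len(x_train))

    n_val = 5000  # hypothetical choice; decide and justify your own
    x_val, y_val = x_train[perm[:n_val]], y_train[perm[:n_val]]
    x_tr, y_tr = x_train[perm[n_val:]], y_train[perm[n_val:]]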

(1b) Use TensorFlow to build a convolutional neural net for this task. You can use any of the tricks we've discussed in the course, including pooling, max-pooling, batch normalization, data augmentation, residual networks, dropout, etc. Of course there is a huge literature on CIFAR-10 on the web (including results from various approaches), but I ask you not to pay much attention to this literature and to try to invent your own architecture from scratch. You will want to experiment with different variations of your architecture and the combination of tricks you use. I want you to report the history of experiments you've performed, presenting for each variant a description of the architecture and tricks used along with its training and validation performance.
You will undoubtedly play with some minor variations (e.g., changing learning rates, loss functions, etc.). You needn't report every such tweak. My goal is for you to convey what you believe are the most important architectural manipulations needed to get good performance on this task.
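To make the starting point concrete, here is a minimal sketch of one such combination (convolution + batch normalization + max-pooling + dropout), written with the Keras API bundled in TensorFlow and continuing from the split in (1a). It is an illustration to vary, not a recommended architecture.

    import tensorflow as tf
    from tensorflow.keras import layers, models

    model = models.Sequential([
        layers.Conv2D(32, 3, padding='same', activation='relu',
                      input_shape=(32, 32, 3)),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),                 # 32x32 -> 16x16
        layers.Conv2D(64, 3, padding='same', activation='relu'),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),                 # 16x16 -> 8x8
        layers.Flatten(),
        layers.Dense(256, activation='relu'),
        layers.Dropout(0.5),                   # dropout on the dense layer
        layers.Dense(10, activation='softmax'),
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(x_tr, y_tr, epochs=20, batch_size=128,
              validation_data=(x_val, y_val))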

Part 2

(2a) For the variation that obtained the best performance in Part 1, compute error on the test set (the 10k examples you had set aside until now).
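A short sketch of this final evaluation, assuming the model and preprocessing from Part 1:

    # Evaluate exactly once on the held-out test set.
    test_loss, test_acc = model.evaluate(x_test, y_test, batch_size=128)
    print('test error rate: %.2f%%' % (100.0 * (1.0 - test_acc)))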

Part 3 (Extra Credit)

It might be interesting to pull some example images from the 80 Million Tiny Images data set belonging to classes other than the 10 classes you trained on, e.g., buildings or goats or mailboxes. Run these images through your trained network and see whether you can use the output entropy (or some similar measure) to determine whether a test image can be rejected as not belonging to any trained object category. If you pick an entropy threshold, you will not only reject some of the out-of-class examples, but you will also falsely reject some in-class examples.

(3a) Describe the rejection criterion you use, and for that criterion report the percentage of out-of-class examples you correctly rejected, as well as the percentage of in-class examples (from the test set) that you erroneously rejected.
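One possible criterion is a threshold on the entropy of the softmax output. The sketch below computes it; x_out is a hypothetical array standing in for the out-of-class images you gather yourself, and the threshold value is a placeholder you would tune.

    import numpy as np

    def entropy_reject(model, images, threshold):
        # Reject images whose softmax output entropy exceeds the threshold.
        probs = model.predict(images)                         # shape (N, 10)
        ent = -np.sum(probs * np.log(probs + 1e-12), axis=1)  # entropy in nats
        return ent > threshold                                # True = rejected

    threshold = 1.5  # hypothetical value; tune on validation data
    rejected_out = entropy_reject(model, x_out, threshold)   # out-of-class images
    rejected_in = entropy_reject(model, x_test, threshold)   # in-class test images
    print('correctly rejected out-of-class: %.1f%%' % (100 * rejected_out.mean()))
    print('erroneously rejected in-class:   %.1f%%' % (100 * rejected_in.mean()))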