CSCI 5622 Final Exam (worth 20% of your final mark)
Your goal is to build the best model you can using the training data contained in the text file FinalTrainData.txt. This data file contains 501 examples, each having 200 input features and one classification output (which is either 0 or 1). The file is organized as follows:

Where $x_{ij}$ is feature $j$ of training example $i$, and $y_i$ is the classification of training example $i$. You are free to use any learning algorithm to generate a
model based on this data (you are not limited to the algorithms we studied or
those you implemented for homework assignments). Once you have built your best
model, you will use it to predict the class outputs for the inputs contained in
this data file FinalTestInputData.txt.
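
As a rough illustration of the workflow described above, the sketch below parses the data and estimates a model's error rate by k-fold cross-validation, which is one common way to compare candidate models when only 501 labelled examples are available. The file layout assumed here (one example per line, whitespace-separated values, class label in the last column) is a guess and should be checked against the real file. The names `parse_rows`, `fit_majority`, `predict_majority`, and `cv_error` are hypothetical helpers, and the majority-class "model" is only a placeholder baseline, not a recommended method.

```python
import random
from collections import Counter

def parse_rows(text, has_label=True):
    """Parse whitespace-separated numeric rows.

    ASSUMPTION: one example per line, class label in the last column.
    """
    X, y = [], []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if has_label:
            X.append([float(v) for v in fields[:-1]])
            y.append(int(float(fields[-1])))
        else:
            X.append([float(v) for v in fields])
    return X, y

def fit_majority(X, y):
    """Placeholder baseline: remember the most common training label."""
    return Counter(y).most_common(1)[0][0]

def predict_majority(model, X):
    """Predict the remembered label for every input row."""
    return [model for _ in X]

def cv_error(X, y, fit, predict, k=5, seed=0):
    """Estimate the misclassification rate by k-fold cross-validation."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # each index lands in exactly one fold
    wrong = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train])
        preds = predict(model, [X[i] for i in fold])
        wrong += sum(p != y[i] for p, i in zip(preds, fold))
    return wrong / len(X)
```

Any real learner can be swapped in by passing its own `fit` and `predict` functions to `cv_error`; the text of the training file would be read with something like `open("FinalTrainData.txt").read()` before calling `parse_rows`.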
This file contains 502 examples of input features with no classifications. Its
format is as follows:

You will use your best model to generate a set of predictions, one for each of the 502 test examples, and write them to a file called TestOutput.txt that has the following format:
![]()
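
A minimal sketch of writing the prediction file is shown below. The required layout of TestOutput.txt is not reproduced in this text, so the format used here (one 0/1 prediction per line, in the same order as the test inputs) is purely an assumption and must be adjusted to the format actually specified. `format_predictions` and `write_predictions` are hypothetical helper names.

```python
def format_predictions(preds):
    """Render predictions as text.

    ASSUMPTION: one 0/1 prediction per line, in test-file order.
    """
    for p in preds:
        if p not in (0, 1):
            raise ValueError(f"prediction must be 0 or 1, got {p!r}")
    return "".join(f"{p}\n" for p in preds)

def write_predictions(preds, path="TestOutput.txt", expected=502):
    """Check the prediction count, then write the file."""
    if len(preds) != expected:
        raise ValueError(f"expected {expected} predictions, got {len(preds)}")
    with open(path, "w") as f:
        f.write(format_predictions(preds))
```

Checking the count before writing guards against silently submitting a file that is missing predictions for some of the 502 test examples.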
Note that I have the actual outputs associated with the inputs in FinalTestInputData.txt (but you don’t!).
You will email me the following (by Dec 18 please!):
![]()
where the error rates are calculated as
![]()
and
![]()
Therefore, if you produce a model that is better than mine, you can get better than 100% on the exam! But I’m not telling you what my model’s error rate is until everyone has handed in the final exam.
Best of luck!