News

News concerning the class will be posted here.

Assignment 3 Test Set

The test set for the final assignment is available now.  Run this test through your system and send me the results. 

Some of you have had problems with the blank lines in these files.  If you pass the rU option to “open”, these should go away (as in open("filename", 'rU') ). This normalizes the various Windows, Linux and Mac conventions for newlines.
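As a rough illustration of the newline normalization described above, here is a minimal sketch for Python 3, where the 'U' flag is no longer needed (it was deprecated and then removed in Python 3.11) because text mode already translates all newline conventions by default. The helper name read_lines and the demo file contents are made up for this example; they are not part of the assignment data.

```python
import os
import tempfile

def read_lines(path):
    """Return the lines of a text file with line endings removed."""
    # newline=None is the Python 3 default for text mode: on read,
    # '\r\n' (Windows) and '\r' (old Mac) are both translated to '\n'.
    with open(path, "r", newline=None) as f:
        return [line.rstrip("\n") for line in f]

# Hypothetical demo file that mixes all three newline conventions.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"first\r\nsecond\rthird\n")
    demo_path = tmp.name

demo_lines = read_lines(demo_path)
os.remove(demo_path)
print(demo_lines)
```

If you are still on Python 2, the 'rU' form above is the way to get the same behavior.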


Assignment 3

If you’re happy with the results you’re getting with the third assignment, let me know and I’ll send you the test set.  Run it through your system and send me the resulting tags.  Only do this if you really don’t want to do any more work on this one.  I’ll post the test set for everyone later next week.

Eval script

I’ve posted an evaluation script for HW3 that gives recall, precision and F1 measures.  Call it from a shell with a gold standard file and a system output file.  Note that it’s very simple and doesn’t check for inconsistencies in your system output file.  
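For reference, the three measures can be sketched as follows. This is not the posted script, only an illustration of the arithmetic; the function name precision_recall_f1 and the representation of answers as (position, label) pairs are assumptions made for this example.

```python
def precision_recall_f1(gold, system):
    """Compute precision, recall and F1 over two collections of items.

    Each item is assumed (for illustration) to be a hashable answer such as
    a (position, label) pair; an item counts as correct only if it appears
    in both the gold standard and the system output.
    """
    gold, system = set(gold), set(system)
    correct = len(gold & system)
    # Guard against empty inputs so we never divide by zero.
    precision = correct / len(system) if system else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical toy data: four gold answers, three system answers, two correct.
gold_items = {(0, "A"), (3, "B"), (5, "A"), (7, "B")}
system_items = {(0, "A"), (3, "A"), (5, "A")}
p, r, f = precision_recall_f1(gold_items, system_items)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Here precision is 2/3 (two of three system answers are right), recall is 2/4, and F1 is their harmonic mean, 4/7.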

Slides

All the slides should be posted correctly now.  The extra lecture on Information Extraction (Ch. 22) is listed as Lecture 27. It’s now available on the caete site under that name as well.

Assignment Three: Expectations

You should expect your performance on the new HW to be much, much lower than on Assignment 2.  If you reach an F-measure of .40 you’re doing really well. 


Assignment Three: training data format bug

I had a formatting bug in the training data (tags were separated by a space and a tab instead of just a tab).  Depending on how you’re reading the data, it may not matter to you. Nevertheless, you should download the new data and work from that.
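Whether the stray space matters depends on how you split each line. A minimal sketch of a reader that tolerates both the old and the corrected separator is below; the helper name split_fields and the sample lines are hypothetical, and the only thing assumed about the real data is that fields are tab-separated.

```python
def split_fields(line):
    """Split one line of tab-separated data into whitespace-trimmed fields.

    Returns an empty list for a blank line.
    """
    if not line.strip():
        return []
    # Split on the tab only, then trim each field so that a stray space
    # before or after the tab does not end up attached to a field.
    return [field.strip() for field in line.rstrip("\r\n").split("\t")]

# Hypothetical lines: the first has the space-plus-tab bug, the second is clean.
buggy = split_fields("word \tTAG\n")
clean = split_fields("word\tTAG\n")
print(buggy, clean)
```

A reader that splits on the exact string "\t" without trimming would see "word " (with a trailing space) in the buggy file, which is the kind of difference that can quietly change your feature counts.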

Assignment 3

The third assignment has been posted.

Quiz 2 Readings

Here are the relevant readings for next week's quiz.

Chapter 12: Skip 12.7.2, 12.8, and 12.9

Chapter 13: Skip 13.4.2, 13.4.3

Chapter 14: Skip 14.6.2, 14.8, 14.9, 14.10

Chapter 17: Skip 17.4.2, 17.5, 17.6

Chapter 18: Skip 18.3.2, 18.4, 18.5 and 18.6

For dependency parsing: Kubler et al. pages 1 to 34.



Summly article (Summarization app)

WSJ did an article on the young founder of the Summly startup and its acquisition by Yahoo! This link may only work if you’re on campus, or via the VPN.


PBS Nova Show on Watson

PBS’s Nova did an hour-long show on IBM’s Watson system that competed on Jeopardy! several years ago. It’s quite entertaining. 

Assignment 2 Test File

Use the following data as your test set for assignment 2.  Run this file through your system and mail me the resulting file. The results format should be the same as with our training data. 

Assignment 2

The second assignment has been posted.


Test File

You can use a new ASCII-only test file for the first assignment if you desire.

Old Quizzes and Exams

I've posted a collection of old quizzes and exams from past classes.  Note that the readings and schedules differ from year to year so there may be material on some of the Quiz/Exam 1's that will not be on your quiz.

Assignment 1: Test File

Please use this article from the New York Times as your test set.

Extension

I've granted extensions to a number of folks due to the recent flooding.  I'll post the test file for everyone later this week.  Go ahead and submit your homework without the test results today.


Assignment 1

The first assignment has been posted.  It is due next Tuesday (9/17).