| This class covers a range of topics that fall broadly under the heading of text-based information retrieval. We'll begin with the basic techniques used to index and query large collections of documents. From there we'll go on to study more advanced topics, including web crawling, graph-based retrieval algorithms such as PageRank, digital libraries, document categorization and clustering, text-based social network analysis, and sentiment analysis. Nothing beyond a typical undergraduate computer science background is required for this course, though familiarity with natural language processing, machine learning, probability, and linear algebra will be helpful. |