Thesis Defense - Kireyev

Applications of Distributional Vector Space Models to Modeling of Psycholinguistic Phenomena
Kirill Kireyev
Computer Science PhD Candidate

Distributional vector-space models (DVSMs) are unsupervised statistical models of the meaning of words and text documents. They derive their representations by analyzing word occurrence patterns in large collections of natural language text. Distributional models provide an efficient and robust way to represent the semantics of words and text, making them useful in various natural language processing applications.

This work will focus on applying distributional vector-space models to three new areas. These areas are interesting both theoretically, for the insights they provide into psycholinguistic properties of language, and practically, for applications such as information retrieval, automated language tutoring, and cognitive accessibility.

The first area is word specificity: the notion that some words are more precise and carry more semantic content than other, vaguer words. The second is the meaning maturity of words and texts: estimates of how well we would expect certain words to be known, or text passages to be understood, by typical language learners at a particular level of language exposure. The third is word importance: the relative contribution of individual words to constructing the meaning of particular text passages or to shaping the development of the meaning of other words.

Finally, I briefly describe ongoing work in personalized vocabulary instruction that uses all three of these aspects as part of a sophisticated educational software system.

Committee: James Martin, Professor (Chair)
Martha Palmer, Department of Linguistics
Eliana Colunga, Department of Psychology
Elizabeth Jessup, Professor
Thomas Landauer, Department of Psychology
Walter Kintsch, Department of Psychology
Department of Computer Science
University of Colorado Boulder
Boulder, CO 80309-0430 USA
May 5, 2012 (14:20)