
Colloquium - Erk

Semantic Space Models for Word Meaning in Context
University of Texas at Austin

Semantic spaces are a popular framework for the representation of word meaning. They encode the meaning of words as high-dimensional vectors, with dimensions representing context elements, for example other words, or documents in which the target word has appeared. Semantic space models can be induced automatically from text. They have been used very successfully in natural language processing, in particular information retrieval and ontology learning. They have also been popular in cognitive science, where they have been used for modeling experimental results on synonymy, lexical priming, and similarity judgments.
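As a minimal sketch of the idea (not code from the talk), a count-based semantic space can be induced from raw text by recording which words co-occur within a fixed window, with similarity measured as the cosine between the resulting vectors; the corpus here is invented for illustration:

```python
import math
from collections import Counter, defaultdict

def cooccurrence_vectors(sentences, window=2):
    """Build a simple count-based semantic space: one sparse vector per
    word, with dimensions given by context words within a fixed window."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, target in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[target][tokens[j]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0

# Tiny invented corpus: words with similar meanings share contexts.
corpus = [
    "the cat chased the mouse".split(),
    "the dog chased the cat".split(),
    "the dog bit the mailman".split(),
]
space = cooccurrence_vectors(corpus)
# "cat" and "dog" share the context "chased", so they end up closer
# in the space than "cat" and "mailman".
print(cosine(space["cat"], space["dog"]) > cosine(space["cat"], space["mailman"]))
```

Real systems apply weighting schemes (e.g. PMI) and dimensionality reduction on top of raw counts, but the geometry is the same: meaning as position, similarity as proximity.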

This talk will focus on the use of semantic spaces for representing word meaning in context. The meaning of a word changes according to the context in which it is used: for example, the meaning of "bat" in "The bats flew out of the cave" differs from its meaning in "He hit the ball with his bat". The task of characterizing word meaning in context is usually phrased as one of word sense disambiguation (WSD), choosing the best-fitting sense out of a list of dictionary senses. But this task has turned out to be very hard for humans as well as machines. This may be due to the underlying model: WSD frames word meaning as a list of distinct dictionary senses, and the task as a classification task. However, research on the psychology of concepts has shown that concepts in the human mind do not work like sets with clear-cut boundaries; they show graded membership, and there are typical cases and borderline cases.

We discuss an alternative model that represents word meaning in context without recourse to dictionary senses, as points in semantic space, which immediately yields a model of semantic similarity as distance in space. We present a semantic space model of word meaning that explicitly represents argument structure and selectional preferences and that can be integrated modularly with existing syntactic representations. The model is a first step towards a compositional account of word meaning based on semantic space models.
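One simple way to place a word *occurrence*, rather than a word type, in semantic space (a generic sketch, not necessarily the speaker's model) is to combine the target word's vector with the vectors of its context words, here by componentwise multiplication, so that the context filters the target's out-of-context meaning; all vectors below are invented:

```python
from collections import Counter

def contextualize(target_vec, context_vecs):
    """In-context vector: componentwise product of the target vector
    with the summed context vectors. Dimensions absent from either
    side drop out, so the context selects the relevant part of the
    target's meaning."""
    context_sum = Counter()
    for v in context_vecs:
        context_sum.update(v)
    return Counter({dim: cnt * context_sum[dim]
                    for dim, cnt in target_vec.items() if dim in context_sum})

# Invented vectors: out of context, "bat" mixes the animal sense
# (cave, fly) with the sports sense (ball, hit).
bat = Counter({"cave": 4, "fly": 3, "ball": 3, "hit": 2})

cave_context = [Counter({"cave": 2, "fly": 1})]   # "...flew out of the cave"
ball_context = [Counter({"ball": 2, "hit": 2})]   # "...hit the ball with his bat"

bat_in_cave = contextualize(bat, cave_context)  # animal dimensions survive
bat_in_game = contextualize(bat, ball_context)  # sports dimensions survive
print(sorted(bat_in_cave), sorted(bat_in_game))
```

Because each occurrence gets its own point in the space, similarity between usages falls out as distance, with no fixed sense inventory involved.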

The adoption of a graded, semantic space based model immediately raises the question of usability: Traditional, dictionary-based models of word meaning in context yield sense labels that can easily be integrated in processing pipelines; how would one use semantic space models in applications? This can be framed as a question of performing inferences (in the widest sense) based on graded representations of word meaning. We propose viewing inference in this setting as driven by attachment points in semantic space. Each inference rule is associated with attachment points, and a rule is triggered by an occurrence that is sufficiently close to its attachment point.
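The triggering mechanism can be sketched as a similarity threshold: a rule fires when the occurrence vector lies close enough to the rule's attachment point. The rules, vectors, and threshold below are all hypothetical illustrations, not material from the talk:

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors (dicts)."""
    dot = sum(u[k] * v.get(k, 0.0) for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def triggered_rules(occurrence_vec, rules, threshold=0.8):
    """Fire each inference rule whose attachment point (a vector in
    semantic space) is sufficiently close to the occurrence vector."""
    return [name for name, attachment in rules.items()
            if cosine(occurrence_vec, attachment) >= threshold]

# Hypothetical rules, each anchored at an invented attachment point.
rules = {
    "bat(x) -> animal(x)":   {"cave": 1.0, "fly": 1.0},
    "bat(x) -> artifact(x)": {"ball": 1.0, "hit": 1.0},
}

# In-context vector for "bat" in "The bats flew out of the cave".
occurrence = {"cave": 0.9, "fly": 0.7, "ball": 0.1}
print(triggered_rules(occurrence, rules))
```

The graded representation is thus never collapsed into a discrete sense label; only the downstream rules commit, and only when the geometry licenses it.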

Katrin Erk is an assistant professor in the Department of Linguistics at the University of Texas at Austin. She completed her dissertation on tree description languages and ellipsis at Saarland University in 2002, under the supervision of Gert Smolka. From 2002 to 2006, she held a researcher position in the Salsa project at Saarland University, working on manual and automatic meaning analysis of natural language text. Her current research focuses on computational models for word meaning and the automatic acquisition of lexical information from text corpora.

Department of Computer Science
University of Colorado Boulder
Boulder, CO 80309-0430 USA
May 5, 2012 (14:13)