
Thesis Defense - Coccaro

Latent Semantic Analysis as a Tool to Improve Automatic Speech Recognition Performance
Noah Coccaro
Computer Science PhD Candidate
RL6 C181

This thesis explores the use of Latent Semantic Analysis (LSA) to augment an N-gram language model in order to improve the accuracy of a large-vocabulary speech recognition system. It discusses possible solutions to three problems that arise when integrating LSA with an N-gram model.

First, two approaches to deriving a probability from a semantic distance are examined. Numerous parameters are introduced, and optimal values for them are found. Second, because the N-gram and LSA models have different strengths, it is necessary to develop confidence metrics that indicate when to rely more strongly on a particular model. Several such metrics are developed and used. Lastly, the problem of combining the two probability models is explored. Several approaches to combining the models, including a geometric mean and a decision tree, are evaluated. Compared with a standard trigram model, experimental results show a reduction in perplexity of approximately 14% and a significant reduction in the word error rate of a speech recognizer of 0.5% absolute (3.0% relative).
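As a rough illustration of the first and third problems, the sketch below turns semantic similarities into a probability distribution and then combines it with an N-gram distribution by a weighted geometric mean. It is a minimal sketch under stated assumptions, not the method used in the thesis: the function names, the shift-and-exponent mapping, the parameters `gamma` and `lam`, and the toy numbers are all hypothetical.

```python
def cosine_to_probs(cosines, gamma=1.0):
    """Map cosine similarities in [-1, 1] to a distribution over candidate words.

    Assumed mapping (illustrative only): shift each cosine into [0, 1],
    raise it to the power gamma, then normalize over the candidates.
    """
    scores = {w: ((c + 1.0) / 2.0) ** gamma for w, c in cosines.items()}
    total = sum(scores.values())
    if total == 0.0:
        raise ValueError("all candidates have zero score")
    return {w: s / total for w, s in scores.items()}


def geometric_combine(p_ngram, p_lsa, lam=0.5):
    """Combine two distributions by a weighted geometric mean, renormalized.

    lam is a hypothetical weight on the LSA model: lam = 0 returns the
    N-gram distribution unchanged, lam = 1 returns the LSA distribution.
    """
    raw = {w: (p_ngram[w] ** (1.0 - lam)) * (p_lsa[w] ** lam) for w in p_ngram}
    total = sum(raw.values())
    if total == 0.0:
        raise ValueError("combined scores are all zero")
    return {w: v / total for w, v in raw.items()}


# Toy example: made-up probabilities and cosines for three candidate words.
p_ngram = {"bank": 0.5, "river": 0.3, "loan": 0.2}
cosines = {"bank": 0.8, "river": -0.2, "loan": 0.9}

p_lsa = cosine_to_probs(cosines, gamma=2.0)
combined = geometric_combine(p_ngram, p_lsa, lam=0.5)
for word, prob in sorted(combined.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {prob:.3f}")
```

A confidence metric of the kind described above could, in a sketch like this, set `lam` per word or per context instead of holding it fixed.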

Committee: James Martin, Associate Professor (Chair)
Daniel Jurafsky, Stanford University
Clayton Lewis, Professor
Wayne Ward, Center for Spoken Language Research
Thomas Landauer, Department of Psychology
Department of Computer Science
University of Colorado Boulder
Boulder, CO 80309-0430 USA
May 5, 2012 (14:20)