Thesis Defense - White

Pattern-Based Recovery of Argumentation from Scientific Text
Computer Science PhD Candidate

As the number of publications in the biomedical field continues its exponential increase, techniques for automatically summarizing information from this body of literature have become more diverse. In addition, the targets of summarization have become more subtle: initial work focused on extracting the factual assertions from full-text papers, but more recently, interest has shifted to recovering speculations and agreements or disagreements with other research. Scientific writing is rife with such argumentation, and the premises, evidence, conjectures, objections and rebuttals that writers use to persuade the reader represent a rich vein of expert knowledge for summarization.

Agreement, disagreement, and conjecture are often expressed in highly scripted ways; likewise, the higher-order discourse structures that underpin multisentence arguments tend to assume particular forms into which claims and evidence can be nested. These features make such arguments readily recoverable by pattern-based search. Here, I present PARROT, which uses OpenDMAP patterns in combination with a Protégé ontology. PARROT first matches simple argumentative claims using a set of concepts relevant to scientific discourse and then exploits discourse cues and inference to combine these claims recursively into higher-order argument trees. PARROT outperforms an SVM classifier system in identifying statements of support and conflict at the sentence level. Additionally, PARROT provides a graphical representation of the arguments it finds, which makes it a valuable tool for summarizing the reasoning behind scientists' conclusions and identifying areas of consensus and contention.
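The two-stage idea described above (match simple claims, then nest them using discourse cues) can be illustrated with a toy sketch. This is not PARROT's implementation: it substitutes plain regular expressions for OpenDMAP patterns and uses no ontology or inference, and every name in it (`CLAIM_PATTERNS`, `DISCOURSE_CUES`, `Node`, `classify`, `build_tree`) along with the cue phrases themselves is a hypothetical assumption chosen for illustration.

```python
import re
from dataclasses import dataclass, field

# Hypothetical cue phrases; illustrative only, not PARROT's OpenDMAP patterns.
CLAIM_PATTERNS = [
    ("conflict", re.compile(
        r"\b(in contrast to|contradicts?|disagrees? with|inconsistent with)\b", re.I)),
    ("support", re.compile(
        r"\b(consistent with|in agreement with|confirms?|supports?)\b", re.I)),
    ("conjecture", re.compile(
        r"\b(may|might|suggests? that|we hypothesize)\b", re.I)),
]

# Hypothetical sentence-initial discourse cues and the node type each introduces.
DISCOURSE_CUES = {"however": "rebuttal", "therefore": "conclusion"}


@dataclass
class Node:
    label: str                      # claim type or higher-order relation
    text: str = ""                  # sentence text (empty for relation nodes)
    children: list = field(default_factory=list)


def classify(sentence):
    """Return the first claim type whose pattern matches, or None."""
    for label, pattern in CLAIM_PATTERNS:
        if pattern.search(sentence):
            return label
    return None


def build_tree(sentences):
    """Fold matched claims left to right into a nested argument tree.

    A claim that opens with a discourse cue wraps the tree built so far,
    together with the new claim, under a higher-order node.
    """
    tree = None
    for sentence in sentences:
        label = classify(sentence)
        if label is None:
            continue                # not an argumentative claim; skip it
        claim = Node(label, sentence)
        first_word = sentence.split()[0].strip(",").lower()
        cue = DISCOURSE_CUES.get(first_word)
        if tree is None:
            tree = claim
        else:
            tree = Node(cue or "sequence", children=[tree, claim])
    return tree


sentences = [
    "Our results are consistent with Smith et al.",
    "However, this finding contradicts the earlier model.",
    "Samples were stored at 4 C.",
    "Therefore, the protein may bind DNA.",
]
tree = build_tree(sentences)
```

In this example the third sentence matches no pattern and is skipped, and the remaining claims nest into a `conclusion` node whose first child is a `rebuttal` over the support and conflict claims. A real system would need far richer patterns and some means of resolving which earlier claim a cue refers to.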

Committee: Lawrence Hunter, University of Colorado School of Medicine (Co-Chair)
Elizabeth Bradley, Professor (Co-Chair)
James Martin, Professor
Debra Goldberg, Assistant Professor
Richard Osborne, University of Colorado Denver
Andrzej Ehrenfeucht, Distinguished Professor
Department of Computer Science
University of Colorado Boulder
Boulder, CO 80309-0430 USA
May 5, 2012 (14:20)