home · mobile · calendar · defenses · 1998-1999 · 

Thesis Defense - Cole

Optimizing Dynamic Query Evaluation Plans
Richard Cole
Computer Science PhD Candidate

Traditional query optimizers assume accurate knowledge of run-time parameters such as predicate selectivities and the availability of resources such as memory, processors, or internet sites at compile-time. In reality, this assumption is often not justified. Thus, the static query evaluation plans produced by traditional optimizers may not be optimal for many of their actual run-time invocations. Instead, we propose a novel optimization model that assigns the bulk of the optimization effort to compile-time, and delays carefully selected optimization decisions until run-time.

Previous work defined the run-time primitives -- dynamic plans using choose-plan operators -- for executing such delayed decisions, but did not solve the problem of constructing dynamic plans at compile-time. This thesis introduces techniques that solve this problem, thereby enabling the creation of two types of dynamic plans: start-up-time-dynamic plans, defined by optimization decisions delayed until the start of run-time, and run-time-dynamic plans. Start-up-time-dynamic plans are appropriate when cost-model parameters are less uncertain at query start-up than at compile-time. Run-time-dynamic plans go beyond start-up-time by combining query optimization and evaluation during run-time.

Since the classic optimization work in System R, query optimization has completely preceded query evaluation. Unfortunately, errors in cost model parameters such as selectivity estimation compromise the optimality of query evaluation plans optimized at compile-time. The only promising remedy is to interleave strategy selection and data access using run-time-dynamic plans. Our technique can employ all known query evaluation algorithms, avoids repetitive decisions, and is not limited by schema type. Most importantly, it performs careful query analysis and prepares alternative query evaluation plans at compile-time, delaying relatively few, selected decisions until run-time. Based on the statistical principles of decision theory, our technique materializes only those intermediate results for which materialization costs are expected to be less than the benefits from the improved decision quality. Experiments with a working prototype (using the Volcano optimizer generator) demonstrate that our approach significantly improves the performance of complex database queries.

Committee: Goetz Graefe, Assistant Professor (Chair)
José Blakeley
David Maier, OGI
James Martin, Associate Professor
Richard Byrd, Professor
Benjamin Zorn, Associate Professor
Department of Computer Science
University of Colorado Boulder
Boulder, CO 80309-0430 USA
May 5, 2012 (14:20)