Thesis Defense - Humphries

An Infrastructure to Generate Experimental Workloads for Persistent Object System Performance Evaluation
Computer Science PhD Candidate

Performance evaluation of persistent object system implementations requires the use and evaluation of experimental workloads. Such workloads include a schema describing how the data are related, and application behaviors that capture how the data are manipulated over time. Existing tools for generating these experimental workloads are inadequate.

Although trace-driven simulation has been used for several years to effectively evaluate the performance of memory management systems, few researchers have used this approach to evaluate the performance of persistent object systems. Building on the work of the POSSE group in the area of trace-driven simulation, this dissertation contributes an instrumentation infrastructure for generating and sharing experimental workloads to be used in evaluating the performance of persistent object system implementations. The infrastructure consists of a toolkit that aids the analyst in modeling and instrumenting experimental workloads, and a trace format that allows the analyst to easily reuse and share the workloads.

The POSSE Trace Format (PTF) is a general-purpose trace format that specifies a set of events characterizing application operations on persistent object stores. PTF is novel in that the trace events preserve the semantics of the higher-level application (e.g., the notion of an object is captured in the trace events). It also captures information about an application that is not specific to any particular object system implementation.
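
To make the idea of object-level trace events concrete, here is a minimal sketch in Python. It is not the PTF specification: the abstract does not list PTF's actual events or fields, so the names `EventKind`, `TraceEvent`, and `objects_touched` and the four event kinds are assumptions chosen for illustration only. The point is that each event names an object and an operation rather than a page or address, so the record does not depend on any one object system implementation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional, Set


class EventKind(Enum):
    # Hypothetical event kinds; PTF's real event set is not given in the abstract.
    BEGIN_TXN = "begin_txn"
    READ_OBJECT = "read_object"
    WRITE_OBJECT = "write_object"
    COMMIT_TXN = "commit_txn"


@dataclass(frozen=True)
class TraceEvent:
    """One application-level operation on a persistent store (illustrative)."""
    client_id: int
    kind: EventKind
    object_id: Optional[str] = None  # None for events that name no object


def objects_touched(trace: Iterable[TraceEvent]) -> Set[str]:
    """Return the identifiers of all objects referenced in a trace."""
    return {e.object_id for e in trace if e.object_id is not None}


trace = [
    TraceEvent(1, EventKind.BEGIN_TXN),
    TraceEvent(1, EventKind.READ_OBJECT, "o1"),
    TraceEvent(1, EventKind.WRITE_OBJECT, "o2"),
    TraceEvent(1, EventKind.COMMIT_TXN),
]
print(sorted(objects_touched(trace)))
```

Because the events refer to objects by identifier, a trace like this could in principle be replayed against different store implementations.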

The instrumentation infrastructure designed in this dissertation also captures the behaviors of a multi-user workload. By multi-user workload, we mean the combined workload of multiple clients concurrently accessing a persistent store. The infrastructure is novel in that single-client workloads are generated and captured in trace files, and these trace files are later interleaved using a concurrency control model and a transaction model to generate a multi-user workload.
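
The interleaving step can be sketched as follows. This is a simplified stand-in, not the dissertation's method: it assumes a trace is a list of hypothetical `(op, object_id)` tuples, that a transaction ends at a `"commit"` event, and that the concurrency control model is the simplest possible one, in which whole transactions run one at a time in round-robin order across clients. The function names `split_transactions` and `interleave` are likewise assumptions.

```python
from itertools import zip_longest


def split_transactions(trace):
    """Group a single-client trace into transactions, each ending at 'commit'."""
    txns, current = [], []
    for event in trace:
        current.append(event)
        if event[0] == "commit":
            txns.append(current)
            current = []
    if current:  # trailing events without a commit form a final group
        txns.append(current)
    return txns


def interleave(traces):
    """Merge single-client traces into one multi-user trace.

    traces maps a client id to a list of (op, object_id) tuples.
    Transactions are kept atomic and scheduled round-robin (assumed model).
    """
    per_client = [
        [[(client, op, obj) for op, obj in txn]
         for txn in split_transactions(trace)]
        for client, trace in traces.items()
    ]
    merged = []
    for round_of_txns in zip_longest(*per_client):
        for txn in round_of_txns:
            if txn is not None:  # this client has no transaction left
                merged.extend(txn)
    return merged


traces = {
    "A": [("read", "o1"), ("commit", None), ("write", "o2"), ("commit", None)],
    "B": [("write", "o1"), ("commit", None)],
}
for event in interleave(traces):
    print(event)
```

A more realistic model would interleave individual events and delay a client whose next operation conflicts with a lock held by another client. Keeping transactions atomic avoids that bookkeeping and keeps the sketch short.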

The infrastructure offers the following benefits: building new experiments for analysis becomes easier; experiments to evaluate the performance of implementations can be conducted and reproduced with less effort; and pertinent information can be gathered in a cost-effective manner.

Committee: Alexander Wolf, Professor (Co-Chair)
Benjamin Zorn, Associate Professor (Co-Chair)
Clarence (Skip) Ellis, Professor
Dennis Heimbigner, Research Associate Professor
Akhil Kumar, College of Business
Department of Computer Science
University of Colorado Boulder
Boulder, CO 80309-0430 USA
May 5, 2012 (14:20)