|
An Infrastructure to Generate Experimental Workloads for Persistent Object System Performance Evaluation
Computer Science PhD Candidate
Performance evaluation of persistent object system implementations requires the
use and evaluation of experimental workloads. Such workloads include a schema
describing how the data are related, and application behaviors that capture how
the data are manipulated over time. The tools currently available for
generating these experimental workloads are inadequate.
Although trace-driven simulation has been used for several years to effectively
evaluate the performance of memory management systems, few researchers have
used this approach to evaluate the performance of persistent object systems.
Building on the work of the POSSE group in the area of trace-driven simulation,
this dissertation contributes an instrumentation infrastructure for generating
and sharing experimental workloads to be used in evaluating the performance of
persistent object system implementations. The infrastructure consists of a
toolkit that aids the analyst in modeling and instrumenting experimental
workloads, and a trace format that allows the analyst to easily reuse and share
the workloads.
The POSSE Trace Format (PTF) is a general-purpose trace format that specifies a
set of events characterizing application operations on persistent object
stores. PTF is novel in that the semantics of the higher-level application are
preserved in the trace events (e.g., the notion of an object is captured in the
trace events). It also captures information about an application that is not
specific to a particular object system implementation.
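As an illustration only, an object-level and implementation-independent trace
event might be sketched as follows. The abstract does not list the actual PTF
events or their encoding, so the event kinds, the field names, and the
`format_event` text layout below are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EventKind(Enum):
    # Hypothetical event kinds; the real PTF event set is not given here.
    BEGIN_TRANSACTION = "begin"
    READ_OBJECT = "read"
    WRITE_OBJECT = "write"
    COMMIT_TRANSACTION = "commit"


@dataclass(frozen=True)
class TraceEvent:
    client_id: int
    kind: EventKind
    # Logical object identity rather than a page number or memory address,
    # so the event does not depend on any one storage implementation.
    object_id: Optional[int] = None


def format_event(event: TraceEvent) -> str:
    """Render one event as a line of text (an assumed, illustrative layout)."""
    target = "-" if event.object_id is None else str(event.object_id)
    return f"{event.client_id} {event.kind.value} {target}"
```

Because each event names an object rather than a physical location, the same
trace could in principle be replayed against different object system
implementations.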
The instrumentation infrastructure designed in this dissertation also captures
the behaviors of a multi-user workload, that is, the combined workload of
multiple clients concurrently accessing a persistent store. The infrastructure
is novel in that single-client workloads are generated and captured in trace
files, and these trace files are later interleaved using a concurrency control
model and a transaction model to generate a multi-user workload.
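A minimal sketch of the interleaving idea follows. It is not the concurrency
control or transaction model used in the dissertation, which the abstract does
not specify. It assumes a deliberately simple scheme: clients take turns in
round-robin order, every read or write takes an exclusive lock on the object,
and all of a client's locks are released at commit. The function name
`interleave` and the tuple layout of the events are hypothetical.

```python
def interleave(traces):
    """Merge single-client traces into one multi-user trace.

    `traces` maps a client id to a list of (op, object_id) tuples, where op is
    "begin", "read", "write", or "commit". Assumed model: round-robin
    scheduling with exclusive object locks held until the client commits.
    """
    positions = {client: 0 for client in traces}
    lock_owner = {}  # object id -> client currently holding its lock
    merged = []
    while any(positions[c] < len(traces[c]) for c in traces):
        progressed = False
        for client, events in traces.items():
            pos = positions[client]
            if pos >= len(events):
                continue  # this client's trace is exhausted
            op, obj = events[pos]
            if op in ("read", "write"):
                owner = lock_owner.setdefault(obj, client)
                if owner != client:
                    continue  # blocked until the owner commits
            elif op == "commit":
                # Release every lock held by the committing client.
                for held in [o for o, c in lock_owner.items() if c == client]:
                    del lock_owner[held]
            merged.append((client, op, obj))
            positions[client] = pos + 1
            progressed = True
        if not progressed:
            # This sketch has no deadlock resolution; it simply stops.
            raise RuntimeError("deadlock: no client can make progress")
    return merged
```

Swapping in a different locking or scheduling rule changes the resulting
multi-user trace without regenerating the single-client traces, which is the
kind of reuse the interleaving approach is meant to allow.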
The infrastructure offers the following benefits: building new experiments for
analysis is made easier; experiments to evaluate the performance of
implementations can be conducted and reproduced with less effort; and pertinent
information can be gathered in a cost-effective manner.
|