Department of Computer Science University of Colorado Boulder

Thesis Defense - Moseley

DLC 170

Performance Accountability for Optimizing Compilers
Computer Science PhD Candidate

Compilers employ many aggressive code transformations to achieve highly optimized code. However, because of complex target architectures and unpredictable optimization interactions, these transformations may not always be beneficial. Advances in hardware ensure that instruction set architectures are undergoing continual evolution. As a result, compilers are under constant pressure to adapt and take full advantage of available features.

Current techniques for evaluating code generation compare profiles only at the application level, but a fundamental step in tuning compiler performance is identifying the specific examples that can be improved. Quantitative function- and loop-level comparisons were previously not possible because no techniques existed to compare these regions coherently after optimizations had been applied. To ensure the best performance is achieved, a more rigorous approach is necessary.

This work presents two toolchain-independent techniques to better measure and understand relative profile differences across binary programs (produced from the same source) compiled with different compilers, optimizations, or target architectures. First, OptiScope uses aggregate profile counts to match analogous code regions in different binaries. OptiScope has low overhead, but it only works on programs with identical inter-procedural optimizations applied. For programs with arbitrary transformations applied, including inter-procedural optimization, the Chainsaw tool uses more heavyweight analysis on execution logs to match semantically identical intervals of execution. To support each of these applications, I present novel techniques for low-overhead profile collection and seekable event log compression. Both binary matching approaches generate thousands of comparable regions from only a small set of benchmarks, and hundreds of key performance metrics for each region are loaded into a relational database to enable rapid querying for outliers. Case studies show the tools are proficient in identifying performance differences from 32.5% to 893% on select regions of SPEC 2006 benchmarks.
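The abstract does not spell out how OptiScope matches regions, so the following is only an illustrative sketch of the general idea of pairing code regions by aggregate profile counts and then querying a relational database for outliers. Everything in it is an assumption made for illustration: the profile layout (region name mapped to function, execution count, and cycles), the matching rule (a region pair matches when its function and execution count are unique in both binaries), the table name `region_pair`, and the sample numbers. It is not the tool's actual algorithm or schema.

```python
import sqlite3
from collections import defaultdict


def match_regions(profile_a, profile_b):
    """Pair regions whose (function, execution count) key is unique in both profiles.

    Each profile maps a hypothetical region name to (function, exec_count, cycles).
    Ambiguous keys (shared by several regions in either binary) are left unmatched.
    """
    def index(profile):
        by_key = defaultdict(list)
        for region, (function, exec_count, _cycles) in profile.items():
            by_key[(function, exec_count)].append(region)
        return by_key

    idx_a, idx_b = index(profile_a), index(profile_b)
    matches = []
    for key in sorted(idx_a):
        regions_a = idx_a[key]
        regions_b = idx_b.get(key, [])
        if len(regions_a) == 1 and len(regions_b) == 1:
            matches.append((regions_a[0], regions_b[0]))
    return matches


def find_outliers(profile_a, profile_b, matches, threshold_pct):
    """Load matched pairs into an in-memory SQLite table and return the pairs
    whose cycle counts differ by at least threshold_pct percent."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE region_pair ("
        "region_a TEXT, region_b TEXT, cycles_a INTEGER, cycles_b INTEGER)"
    )
    conn.executemany(
        "INSERT INTO region_pair VALUES (?, ?, ?, ?)",
        [(a, b, profile_a[a][2], profile_b[b][2]) for a, b in matches],
    )
    rows = conn.execute(
        "SELECT region_a, region_b, "
        "100.0 * (cycles_b - cycles_a) / cycles_a AS pct_diff "
        "FROM region_pair "
        "WHERE ABS(100.0 * (cycles_b - cycles_a) / cycles_a) >= ? "
        "ORDER BY pct_diff DESC",
        (threshold_pct,),
    ).fetchall()
    conn.close()
    return rows


# Made-up sample profiles for two builds of the same source.
profile_a = {
    "a.loop1": ("foo", 1000, 5000),
    "a.loop2": ("foo", 250, 800),
    "a.loop3": ("bar", 40, 100),
}
profile_b = {
    "b.L1": ("foo", 1000, 9000),
    "b.L2": ("foo", 250, 820),
    "b.L3": ("bar", 40, 100),
    "b.L4": ("bar", 7, 10),  # no counterpart in profile_a, stays unmatched
}

matches = match_regions(profile_a, profile_b)
outliers = find_outliers(profile_a, profile_b, matches, threshold_pct=10.0)
print(matches)
print(outliers)
```

Keeping the per-region metrics in a relational table means that an outlier search is a single query, which is the kind of rapid querying the abstract mentions. A matching rule this simple breaks down once inlining or other inter-procedural optimizations change the counts, which is consistent with the limitation the abstract states for the count-based approach.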

Committee: Dirk Grunwald, Associate Professor (Chair)
Amer Diwan, Associate Professor
Ramesh Peri, Intel Corporation
Jeremy Siek, Assistant Professor
Manish Vachharajani, Assistant Professor

May 5, 2012 (13:40)