Department of Computer Science University of Colorado Boulder

Colloquium - Stanzione

 
12/4/2008
2:00pm-3:00pm
NCAR

A Scalable Framework for Offline Parallel Debugging
Daniel Stanzione, Arizona State University

As supercomputers continue to grow larger, a growing community has increasingly easy access to run jobs on thousands, tens of thousands, or even hundreds of thousands of cores. However, the ability to debug these jobs at scale has not kept up with the growth in hardware. In this talk, the GDBase framework for offline debugging will be presented. GDBase solves three problems in large-scale debugging: (1) it integrates with batch systems to allow debugging jobs to be run without the need to interrupt production operation; (2) it moves debugging from online to offline, to reduce the amount of system time consumed; (3) it stores results in a database to allow automated analysis of the vast quantities of debugging data that large jobs can produce. GDBase has been used to date to debug runs of more than 8,000 MPI tasks.
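As a rough illustration of point (3), the sketch below stores one debugging record per MPI task in a database and then queries for the tasks that stopped somewhere unusual. It is not GDBase's actual schema or API: the `stops` table, its columns, the function names, and the sample data are all hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3


def load_records(conn, records):
    """Store one (mpi_rank, function, line) row per task. Hypothetical schema."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stops "
        "(mpi_rank INTEGER, function TEXT, line INTEGER)"
    )
    conn.executemany("INSERT INTO stops VALUES (?, ?, ?)", records)
    conn.commit()


def locations_by_count(conn):
    """Return (function, line, task_count) rows, rarest location first."""
    cur = conn.execute(
        "SELECT function, line, COUNT(*) AS n FROM stops "
        "GROUP BY function, line ORDER BY n ASC, function ASC, line ASC"
    )
    return cur.fetchall()


def ranks_at(conn, function, line):
    """Return the sorted list of ranks recorded at a given source location."""
    cur = conn.execute(
        "SELECT mpi_rank FROM stops WHERE function = ? AND line = ? "
        "ORDER BY mpi_rank",
        (function, line),
    )
    return [row[0] for row in cur.fetchall()]


def demo():
    """Simulate 8 tasks where rank 5 stopped somewhere different (made-up data)."""
    conn = sqlite3.connect(":memory:")
    records = [(rank, "exchange_halo", 120) for rank in range(8)]
    records[5] = (5, "read_input", 42)
    load_records(conn, records)
    rare = locations_by_count(conn)[0]
    return rare, ranks_at(conn, rare[0], rare[1])


if __name__ == "__main__":
    rare_location, odd_ranks = demo()
    print(rare_location, odd_ranks)
```

With thousands of tasks, a grouped query of this kind reduces the data to a handful of distinct locations, so the few ranks that diverge from the majority stand out without anyone inspecting each task by hand.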

Dr. Daniel Stanzione, Director of the High Performance Computing Initiative (HPCI) at Arizona State University, joined the Ira A. Fulton School of Engineering in 2004. Prior to ASU, he served as an AAAS Science Policy Fellow in the Division of Graduate Education at the National Science Foundation. Stanzione began his career at Clemson University, where he earned his doctoral and master's degrees in computer engineering as well as his bachelor of science in electrical engineering. He then directed the supercomputing laboratory at Clemson and also served as an assistant research professor of electrical and computer engineering. At the HPCI, Stanzione's team collaborates with UT-Austin in operating the "Ranger" supercomputer for NSF's TeraGrid, currently the 6th largest system in the world.

Dr. Stanzione's research focuses on parallel programming, scientific computing, scheduling in computational grids, alternative architectures for high-end computing, reconfigurable/adaptive computing, and algorithms for high-performance bioinformatics. Also an advocate of engineering education, he facilitates student research through the HPCI and teaches specialized computation engineering courses.

This talk is sponsored by the National Center for Atmospheric Research Computational & Information Systems Laboratory and will be held in the Main Seminar Room at the Mesa Lab.


The Department holds colloquia throughout the Fall and Spring semesters. These colloquia, open to the public, are typically held on Thursday afternoons, but sometimes occur at other times as well. If you would like to receive email notification of upcoming colloquia, subscribe to our Colloquia Mailing List. If you would like to schedule a colloquium, see Colloquium Scheduling.

Sign language interpreters are available upon request. Please contact Stephanie Morris at least five days prior to the colloquium.
