Department of Computer Science University of Colorado Boulder

Senior Project - cray watch

 

Machine Room Diagnostic Daemon

Senior Project: 1996-1997
Greg Bachmeyer, Christopher Corliss, Theodore Hong and Michael Melanson
Scientific Computing Division
Boulder, CO

The NCAR Scientific Computing Division (SCD) provides state-of-the-art, high-performance computing resources to support research in the atmospheric and related sciences around the country. The supercomputers, mass storage devices, and associated high-speed networks that form the core of NCAR's computing resources are, in turn, supported by an array of peripheral devices and diagnostic equipment that keeps the machine room operators informed of the physical status of the machines in the room.

Problems arose, however, in determining when a given device would require operator attention because of a hardware failure or software error. Although many of the devices in the machine room have on-board diagnostics and can report their own condition, the operators had no way to monitor all of these devices from a single station. As a result, an unstable device could go unnoticed or unattended until it failed, costing many hours of research productivity before the device was brought back online.

This project built a software interface that collects diagnostic information from a number of key peripheral devices and reports it to a single communications node on an operator's workstation. Reports are generated in real time and provide current device status, diagnostic information, and other critical component information as available from each device. The diagnostics daemon is also highly extensible: devices can be added or removed as SCD acquires new hardware and test software over time. The software was implemented in C++ using an object-oriented approach in a UNIX and Motif environment.
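
The project's source code is not reproduced on this page, but the extensible, object-oriented design described above can be illustrated with a minimal sketch. Every name in it (Device, TemperatureSensor, DeviceRegistry, StatusReport, the three status levels) is hypothetical and chosen only for illustration, and it uses modern C++ facilities that would not have been available in 1996-1997. The idea shown is a common one: each device type implements a shared polling interface, and a registry lets devices be added or removed without changing the code that gathers reports.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical status levels; the original project's categories are not documented here.
enum class Status { Ok, Warning, Failed };

// One diagnostic report from one device (illustrative fields only).
struct StatusReport {
    std::string device;  // name of the reporting device
    Status status;       // current condition
    std::string detail;  // free-form diagnostic text
};

// Interface that every monitored device type would implement.
class Device {
public:
    explicit Device(std::string name) : name_(std::move(name)) {}
    virtual ~Device() = default;
    const std::string& name() const { return name_; }
    virtual StatusReport poll() const = 0;  // query the device's current state
private:
    std::string name_;
};

// Example device type: a temperature probe with a warning threshold (assumed).
class TemperatureSensor : public Device {
public:
    TemperatureSensor(std::string name, double celsius, double limit)
        : Device(std::move(name)), celsius_(celsius), limit_(limit) {}
    StatusReport poll() const override {
        if (celsius_ > limit_)
            return {name(), Status::Warning, "temperature above limit"};
        return {name(), Status::Ok, "temperature normal"};
    }
private:
    double celsius_;
    double limit_;
};

// Registry the daemon could use to add, remove, and poll devices by name.
class DeviceRegistry {
public:
    // Returns false if a device with the same name is already registered.
    bool add(std::unique_ptr<Device> device) {
        assert(device && "device must not be null");
        std::string key = device->name();
        return devices_.emplace(std::move(key), std::move(device)).second;
    }
    // Returns false if no device with that name was registered.
    bool remove(const std::string& name) { return devices_.erase(name) > 0; }
    // Collects one report per registered device, ordered by device name.
    std::vector<StatusReport> pollAll() const {
        std::vector<StatusReport> reports;
        reports.reserve(devices_.size());
        for (const auto& entry : devices_)
            reports.push_back(entry.second->poll());
        return reports;
    }
private:
    std::map<std::string, std::unique_ptr<Device>> devices_;
};
```

In a design like this, supporting a new piece of hardware means writing one more subclass of Device and registering an instance of it; the polling loop and the operator display would not need to change.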
