Department of Computer Science University of Colorado Boulder

Senior Project - ARMSS


Automated Mapping between RNA Sequence and Structure

Senior Project: 1997-1998
James Gibbens, Tera Newman, Tuancuong Nguyen and Eric Scott

The workings of a living cell are being deciphered in great detail, due in large part to new sequencing technology. It is now possible to determine the nucleotide sequence of an organism's entire genetic blueprint, its genome, which is stored in the organism's chromosomes. The genomes of smaller bacteria contain about 500,000 nucleotides, while the genomes of multicellular organisms are considerably larger.

Researchers in the Department of Chemistry seek to understand the biological meaning of this large amount of nucleotide sequence information. They are studying the structure and evolution of a few select RNA molecules, work in which thousands of sequences are aligned and analyzed. A better understanding of the "structural building blocks" of these molecules will greatly improve the understanding of RNA structure and conformation, and of how a sequence (primary structure) folds into its biologically functional secondary and tertiary structure.

To date, most of the mappings between sequence and higher-order structure have been identified by visual inspection of the sequence and structure data. This manual method can at best determine only the strongest and most obvious sequence/structure relationships. Software was developed approximately ten years ago with the goal of finding more of these relationships. Built into this program is an "RNA structure language", in which a user writes a descriptor for a specific structural motif. The program then searches a sequence database for cases where part of a sequence can form the structure defined in the user-specified descriptor.
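The descriptor-based search described above can be illustrated with a minimal C++ sketch. It is not the project's descriptor language or the earlier program's code: the names `HairpinDescriptor`, `Match`, `canPair`, and `findHairpins` are hypothetical, and the only motif covered is a simple hairpin (a stem of consecutive base pairs closed by a loop). The descriptor states the stem length and the allowed loop sizes, and the search reports every position in a sequence where the stem can form.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical descriptor for a single motif: a hairpin, i.e. a stem of
// consecutive base pairs closed by an unpaired loop.
struct HairpinDescriptor {
    std::size_t stemLength;  // number of base pairs in the stem
    std::size_t minLoop;     // smallest allowed loop size (nucleotides)
    std::size_t maxLoop;     // largest allowed loop size (nucleotides)
    bool allowGUWobble;      // also accept G-U pairs, not only Watson-Crick
};

// One place in a sequence where the descriptor can be satisfied.
struct Match {
    std::size_t start;     // index of the first stem nucleotide (5' side)
    std::size_t loopSize;  // loop size that produced the match
};

// True if nucleotides a and b can form a base pair under the given rule.
bool canPair(char a, char b, bool allowGU) {
    if ((a == 'A' && b == 'U') || (a == 'U' && b == 'A') ||
        (a == 'G' && b == 'C') || (a == 'C' && b == 'G')) {
        return true;
    }
    return allowGU && ((a == 'G' && b == 'U') || (a == 'U' && b == 'G'));
}

// Brute-force search: try every start position and every allowed loop size,
// and check that each stem position pairs with its partner on the 3' side.
std::vector<Match> findHairpins(const std::string& seq,
                                const HairpinDescriptor& d) {
    std::vector<Match> matches;
    const std::size_t n = seq.size();
    for (std::size_t start = 0; start < n; ++start) {
        for (std::size_t loop = d.minLoop; loop <= d.maxLoop; ++loop) {
            const std::size_t total = 2 * d.stemLength + loop;
            if (start + total > n) break;  // larger loops cannot fit either
            bool ok = true;
            for (std::size_t k = 0; k < d.stemLength && ok; ++k) {
                ok = canPair(seq[start + k], seq[start + total - 1 - k],
                             d.allowGUWobble);
            }
            if (ok) matches.push_back({start, loop});
        }
    }
    return matches;
}
```

Calling `findHairpins` once per database sequence gives the overall search. A descriptor language of the kind described here would need to go well beyond this sketch, for example by combining several stems, allowing mismatches, or constraining the sequence inside the loop.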

Both the RNA structure language and the software had limitations. The goal of the project was to develop an improved descriptor language, able to describe more complex structural motifs, along with software to carry out the corresponding database searches. The software was implemented in C++ using an object-oriented approach and runs in a UNIX environment.

May 5, 2012 (14:07)