Department of Computer Science University of Colorado Boulder

Senior Project - HuGe Man


Trace File Assembler for Genomics Research

Senior Project: 2002-2003
Ty Alexander, Sidharth Bhat, Jacob Borer, Joslin Dunning, Jing Fang and Nathan Kurach

Genome research has made dramatic progress in recent years. After decades of effort by laboratories around the world, the human genome has been sequenced, and the human chromosomal DNA sequence is now available for biomedical research. Using modern computing power, scientists have mapped large amounts of genetic and biochemical data from humans and other organisms, and bioinformatics has emerged as a fast-growing field.

The function of a gene is reflected in the phenotype it produces. For example, the development of breast and ovarian cancer has been linked to the genes BRCA1 and BRCA2. However, among the tens of thousands of genes on human chromosomes, only a small fraction have had their functions identified. Many of the genetic traits under study today have only been roughly located within a region of a human chromosome.

A DNA region in which scientists believe a gene affecting a particular trait is located is called a quantitative trait locus (QTL). A QTL may contain anywhere from hundreds to thousands of genes. To locate and identify the responsible gene, a consensus DNA sequence is compared with DNA sequences from strains that carry the clinical trait. If a base-pair mismatch is detected in a gene between the mapped strain and a trait-carrying strain, that gene is a strong candidate for causing the clinical trait.
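The comparison step described above can be sketched in Java. This is only an illustration and not the project's actual code: the class name `MismatchFinder`, the `Mismatch` record type, and the example sequences are hypothetical. The sketch assumes the two sequences are already aligned and of equal length, and it assumes that an `N` marks an ambiguous base call that should be skipped rather than reported.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: report positions where a sample differs from a consensus. */
public class MismatchFinder {

    /** One position at which the two aligned sequences disagree. */
    public static final class Mismatch {
        public final int position;       // 0-based index into the aligned sequences
        public final char consensusBase; // base in the consensus (mapped) sequence
        public final char sampleBase;    // base in the trait-carrying sequence

        Mismatch(int position, char consensusBase, char sampleBase) {
            this.position = position;
            this.consensusBase = consensusBase;
            this.sampleBase = sampleBase;
        }

        @Override
        public String toString() {
            return position + ":" + consensusBase + ">" + sampleBase;
        }
    }

    /**
     * Compares two pre-aligned sequences base by base.
     * Assumption: both strings have the same length and contain no gaps.
     */
    public static List<Mismatch> findMismatches(String consensus, String sample) {
        if (consensus.length() != sample.length()) {
            throw new IllegalArgumentException("Sequences must be aligned to equal length");
        }
        List<Mismatch> mismatches = new ArrayList<>();
        for (int i = 0; i < consensus.length(); i++) {
            char c = Character.toUpperCase(consensus.charAt(i));
            char s = Character.toUpperCase(sample.charAt(i));
            // Assumption: 'N' is an ambiguous base call, so it is not counted as a variant.
            if (c == 'N' || s == 'N') {
                continue;
            }
            if (c != s) {
                mismatches.add(new Mismatch(i, c, s));
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        String consensus = "ATGGCCTTAGAC"; // made-up example data
        String sample    = "ATGGCGTTANAC"; // differs at index 5; 'N' at index 9 is skipped
        System.out.println(findMismatches(consensus, sample)); // prints [5:C>G]
    }
}
```

Each reported position is a candidate variant. A real pipeline would also need to align the sequences first and decide whether a variant falls inside a coding region, neither of which this sketch attempts.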

A research group in the Department of Pharmacology at the University of Colorado Health Sciences Center is interested in finding the genes that influence alcohol and cocaine preference and tolerance. The group is currently trying to identify such genes in mice using a strategy called in silico identification of coding sequence variation within a QTL.

This project automates the identification process, providing an order-of-magnitude improvement in productivity. The software was implemented in Java.

©2012 Regents of the University of Colorado
May 5, 2012 (14:07)