Department of Computer Science University of Colorado Boulder

Colloquium - Yeh

DLC 170

Making Computer Vision Accessible for GUI Testing and Automation
Tom Yeh, University of Maryland

Most people dislike manual, repetitive tasks and would automate them if the right tool were available. For GUI automation, the right tool does not always exist. Some tools require programming and are inaccessible to end users with little programming knowledge. Others require interacting with a GUI's internal structure and cannot handle proprietary or legacy applications whose internal structure is inaccessible. To make automation accessible, we need a new modality that is available in all GUI applications and easily understood by end users. One such modality I have tried with great success is vision. In this talk, I will introduce computational techniques that use images of GUI applications as first-class objects, allowing end users to automate any GUI application they see on a computer screen. I will present Sikuli, software I created that has enabled tens of thousands of users to automate repetitive tasks they were previously unable to automate. I will show many real uses of Sikuli, such as automating daily disk cleanup, a complex sign-up process, Facebook status updates, dialing on an Android phone, and Angry Birds. I will illustrate the real benefit of automation with case studies, including a software project that used Sikuli to automate 400+ previously manual tests, doubling the software's release rate. I will discuss lessons learned from Sikuli's user community and the new research problems it has inspired. Finally, I will outline key challenges for future research on making automation accessible across the entire software lifecycle, including design, development, testing, use, and support.

Tom Yeh is an assistant research scientist at the University of Maryland Institute for Advanced Computer Studies (UMIACS). He received his PhD in Computer Science from MIT in 2009 and then spent two years as a postdoc at the University of Maryland, College Park. His research interests span human-computer interaction, computer vision, and software engineering. He has written more than 30 research publications on algorithms for interactive computer vision, vision-based interactive systems, multimedia information retrieval, and visual software test automation. He has served on the program committees of conferences in his area, including the Symposium on User Interface Software and Technology (UIST) and the Workshop on Computer Vision Application. He received the Best Student Paper award at UIST 2009 and the Best Paper award at UIST 2010. He also holds a master's degree in Computer Science from MIT and a bachelor's degree in Computer Science from Simon Fraser University.

The Department holds colloquia throughout the Fall and Spring semesters. These colloquia, open to the public, are typically held on Thursday afternoons, but sometimes occur at other times as well. If you would like to receive email notification of upcoming colloquia, subscribe to our Colloquia Mailing List. If you would like to schedule a colloquium, see Colloquium Scheduling.

Sign language interpreters are available upon request. Please contact Stephanie Morris at least five days prior to the colloquium.

Department of Computer Science
College of Engineering and Applied Science
University of Colorado Boulder
Boulder, CO 80309-0430 USA
Engineering Center Office Tower
ECOT 717
FAX +1-303-492-2844
May 5, 2012 (13:29)