
Colloquium - Yeh

Making Computer Vision Accessible for GUI Testing and Automation
University of Maryland

Most people dislike manual, repetitive tasks and would automate them if the right tool were available. For GUI automation, the right tool does not always exist. Some tools require programming and are inaccessible to end-users with little programming knowledge. Others require access to a GUI's internal structure and cannot handle proprietary or legacy applications whose internals are inaccessible. To make automation accessible, we need a new modality that is available in all GUI applications and easily understood by end-users. One such modality I have tried with great success is vision. In this talk, I will introduce computational techniques that treat images of GUI applications as first-class objects, allowing end-users to automate any GUI application they see on a computer screen. I will present Sikuli, software I created that has enabled tens of thousands of users to automate repetitive tasks they were unable to automate before. I will show many real uses of Sikuli, such as automating daily disk cleanup, a complex sign-up process, Facebook status updates, dialing on an Android phone, and Angry Birds. I will illustrate the real benefit of automation with case studies, such as a software project that uses Sikuli to automate 400+ previously manual tests, doubling the software's release rate. I will discuss lessons learned from Sikuli's user community and the new research problems it has inspired. Finally, I will outline key challenges for future research to make automation accessible across the entire software lifecycle, including design, development, testing, use, and support.

Tom Yeh is an assistant research scientist in the University of Maryland Institute for Advanced Computer Studies (UMIACS). He received his PhD in Computer Science from MIT in 2009. He then spent two years as a postdoc at the University of Maryland, College Park. His research interests span human-computer interaction, computer vision, and software engineering. He has authored over 30 research publications on algorithms for interactive computer vision, vision-based interactive systems, multimedia information retrieval, and visual software test automation. He has served on the program committees of conferences in his area, including the Symposium on User Interface Software and Technology (UIST) and the Workshop on Computer Vision Application. He received the Best Student Paper award at UIST 2009 and the Best Paper award at UIST 2010. He earned his master's degree in Computer Science at MIT and his bachelor's degree in Computer Science at Simon Fraser University.

Department of Computer Science
University of Colorado Boulder
Boulder, CO 80309-0430 USA
May 5, 2012 (14:13)