Department of Computer Science University of Colorado Boulder

Colloquium - Grudic

ECCR 245

Fast Reinforcement Learning in Large Continuous Domains
Gregory Z. Grudic
University of Pennsylvania

Reinforcement Learning (RL) is a framework in which an agent autonomously learns to improve the rewards it receives from its environment, thus conceptually embodying one of the fundamental aims of AI. RL algorithms have been shown to be effective on a variety of discrete problem domains, where simulation allows millions of learning episodes to be executed. However, the application of RL to large continuous problem domains, where only a limited number of learning episodes can be afforded, has not met with as much success. I propose a set of methods in which the policy that encodes how an agent interacts with its environment is represented by a finite set of parameters. Policy Gradient (PG) RL is used to incrementally modify this parameterized policy along a gradient of improved reward. I will describe three new PG algorithms we have developed to make RL feasible in high-dimensional continuous state spaces: Boundary Localized Reinforcement Learning (BLRL), Action Transition Policy Gradient (ATPG), and Deterministic Policy Gradient (DPG). These algorithms directly encode prior domain knowledge, vastly improving convergence speed, and are all theoretically guaranteed to converge to locally optimal control policies. The computational feasibility of these algorithms is demonstrated experimentally on simulated and real problems taken from robotics. We show convergence to locally optimal policies in fewer than a few hundred episodes, whereas other PG algorithms require orders of magnitude more episodes to converge.
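
As a rough illustration of the general policy-gradient idea mentioned in the abstract (a parameterized policy nudged along a gradient of improved reward), the sketch below applies a generic REINFORCE-style update to a toy one-dimensional problem. It is not BLRL, ATPG, or DPG, whose details the abstract does not give. The function name, the quadratic reward, the target action, and all numeric settings are assumptions made up for this example.

```python
import random


def run_policy_gradient(episodes=2000, lr=0.01, seed=0):
    """Generic REINFORCE-style sketch on a toy problem (hypothetical setup)."""
    rng = random.Random(seed)
    theta = 0.0      # the single policy parameter: mean of a Gaussian over actions
    sigma = 0.5      # fixed exploration noise (assumed value)
    target = 2.0     # assumed best action; the toy reward peaks here
    baseline = 0.0   # running average of reward, used to reduce gradient variance

    for _ in range(episodes):
        # One-step "episode": sample an action from the current policy.
        action = rng.gauss(theta, sigma)
        reward = -(action - target) ** 2

        # Gradient of log N(action; theta, sigma) with respect to theta.
        grad_log_pi = (action - theta) / sigma ** 2

        # Move theta along the estimated gradient of expected reward.
        theta += lr * (reward - baseline) * grad_log_pi

        # Track the average reward so far.
        baseline += 0.1 * (reward - baseline)

    return theta


if __name__ == "__main__":
    print(run_policy_gradient())
```

In this toy setting the parameter should drift toward the assumed target action. The episode count used here is arbitrary and says nothing about the convergence speed of the algorithms presented in the talk.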

Hosted by Michael Mozer.
Refreshments will be served immediately following the talk in ECOT 831.

The Department holds colloquia throughout the Fall and Spring semesters. These colloquia, open to the public, are typically held on Thursday afternoons, but sometimes occur at other times as well. If you would like to receive email notification of upcoming colloquia, subscribe to our Colloquia Mailing List. If you would like to schedule a colloquium, see Colloquium Scheduling.

Sign language interpreters are available upon request. Please contact Stephanie Morris at least five days prior to the colloquium.
