Department of Computer Science University of Colorado Boulder
home · events · colloquia · 1995-1996 · 

Colloquium - Singh

ECCR 2-28

Learning to Act in Dynamic Environments Under Uncertainty and Incomplete Knowledge
Satinder Singh
Harlequin, Inc.

Making intelligent decisions/plans in the face of uncertainty and incomplete knowledge about their consequences in dynamic environments is a central problem in artificial intelligence. It is also a central problem in control theory, operations research, and statistics. An exciting confluence of ideas from these diverse disciplines with their complementary strengths is taking place in the field of reinforcement learning. I will motivate this confluence and present some of my contributions to it.

Reinforcement learning algorithms require little a priori knowledge about the environment, continually learn about the consequences of their decisions, and seek policies that optimize expected cumulative reward. The theoretical framework inherited from optimal control allows the formulation of precise questions. I will show that reinforcement learning algorithms are stochastic approximation methods, in Dvoretzky's form, for performing dynamic programming (from operations research), and also how they are related to Monte Carlo methods from statistics. These relationships lead to a general technique for proving that reinforcement learning algorithms converge to optimal policies, as well as for deriving new reinforcement learning algorithms.
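As a rough illustration of the link between dynamic programming and stochastic approximation, the sketch below compares value iteration (full Bellman backups with a known model) against tabular Q-learning (one sampled backup per step with a decaying step size). Q-learning is chosen here only as a familiar example and is not necessarily the algorithm analyzed in the talk. The two-state MDP, its rewards, the discount factor, and the step-size schedule are all hypothetical.

```python
import random

# Hypothetical 2-state, 2-action MDP (not from the talk):
# P[s][a] = [(probability, next_state, reward), ...]
P = {
    0: {0: [(0.9, 0, 0.0), (0.1, 1, 0.0)],
        1: [(0.2, 0, 0.0), (0.8, 1, 1.0)]},
    1: {0: [(0.7, 1, 2.0), (0.3, 0, 0.0)],
        1: [(1.0, 0, 0.5)]},
}
GAMMA = 0.5  # assumed discount factor
STATES, ACTIONS = (0, 1), (0, 1)


def value_iteration(sweeps=200):
    """Dynamic programming with a known model: repeat the full Bellman backup."""
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(sweeps):
        q = {
            (s, a): sum(p * (r + GAMMA * max(q[(s2, b)] for b in ACTIONS))
                        for p, s2, r in P[s][a])
            for s in STATES for a in ACTIONS
        }
    return q


def sample_transition(rng, s, a):
    """Draw one (next_state, reward) pair; the learner never reads P directly."""
    u, acc = rng.random(), 0.0
    for p, s2, r in P[s][a]:
        acc += p
        if u < acc:
            return s2, r
    return s2, r  # guard against floating-point round-off


def q_learning(steps=200_000, seed=0):
    """Sample-based version: one noisy backup per step, decaying step size."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    visits = {key: 0 for key in q}
    s = 0
    for _ in range(steps):
        a = rng.choice(ACTIONS)  # uniform random exploration
        s2, r = sample_transition(rng, s, a)
        visits[(s, a)] += 1
        # Step sizes n**-0.7: their sum diverges, the sum of squares converges.
        alpha = visits[(s, a)] ** -0.7
        target = r + GAMMA * max(q[(s2, b)] for b in ACTIONS)
        q[(s, a)] += alpha * (target - q[(s, a)])
        s = s2
    return q


if __name__ == "__main__":
    q_dp, q_rl = value_iteration(), q_learning()
    for key in sorted(q_dp):
        print(key, round(q_dp[key], 3), round(q_rl[key], 3))
```

The update in `q_learning` replaces the expectation in `value_iteration` with a single sampled target, so each step is a noisy version of the same backup. The decaying step size averages that noise out over repeated visits.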

These results assume completely observable states and lookup-table representations, neither of which is feasible in many AI applications. I will derive bounds on the effect of hidden state on reinforcement learning and present a neural-network architecture based on soft clustering that uses compact representations to accelerate learning. I will also demonstrate the utility of reinforcement learning algorithms with results from an application to the difficult problem of channel assignment in cellular telephone systems.

Refreshments will be served immediately before the talk at 3:30pm.

The Department holds colloquia throughout the Fall and Spring semesters. These colloquia, open to the public, are typically held on Thursday afternoons, but sometimes occur at other times as well. If you would like to receive email notification of upcoming colloquia, subscribe to our Colloquia Mailing List. If you would like to schedule a colloquium, see Colloquium Scheduling.

Sign language interpreters are available upon request. Please contact Stephanie Morris at least five days prior to the colloquium.
