
Colloquium - Precup

Options: A Framework for Temporal Abstraction in Reinforcement Learning
Department of Computer Science, University of Massachusetts

Decision making routinely involves choosing among different courses of action over a broad range of time scales. For instance, a person planning a trip to a distant location makes high-level decisions regarding what means of transportation to use, but also chooses low-level actions, such as the individual movements involved in getting into a car. The problem of picking an appropriate time scale for reasoning and learning has been explored in artificial intelligence, control theory, and robotics.

In this talk I will present a novel approach to this problem, in the context of Markov Decision Processes (MDPs) and reinforcement learning. I will introduce a formal framework for representing temporally extended actions, called options. Options are a minimal extension to MDPs, allowing the incorporation of existing controllers, heuristics for picking actions, or learned courses of action. The outcomes of options can be predicted using multi-time models, learned by interacting with the environment. Such models can then be used to produce plans of behavior very quickly, using classical dynamic programming or reinforcement learning techniques.
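As a rough illustration of the ideas above (a minimal sketch; the class names and toy environment are my assumptions, not the talk's notation), an option can be viewed as a triple of an initiation set, an internal policy, and a termination condition, and executing it yields exactly the quantities a multi-time model would predict:

```python
import random
from dataclasses import dataclass
from typing import Callable, Set

# Illustrative sketch of an option as (initiation set I, policy pi,
# termination condition beta); names are assumptions for this example.
@dataclass
class Option:
    initiation_set: Set[int]             # states where the option may start
    policy: Callable[[int], int]         # maps state -> primitive action
    termination: Callable[[int], float]  # beta(s): probability of stopping in s

def run_option(option, state, step, gamma=0.9):
    """Execute `option` from `state` in an environment given by `step`,
    returning the stopping state, the discounted reward accumulated along
    the way, and the accumulated discount factor."""
    assert state in option.initiation_set
    total_reward, discount = 0.0, 1.0
    while True:
        action = option.policy(state)
        state, reward = step(state, action)
        total_reward += discount * reward
        discount *= gamma
        if random.random() < option.termination(state):
            return state, total_reward, discount

# Toy corridor: states 0..5, action +1 moves right, -1 reward per step.
corridor_step = lambda s, a: (min(s + a, 5), -1.0)
go_right = Option(
    initiation_set={0, 1, 2, 3, 4},
    policy=lambda s: +1,
    termination=lambda s: 1.0 if s == 5 else 0.0,
)
s, r, d = run_option(go_right, 0, corridor_step)
# reaches state 5 after 5 steps
```

A multi-time model of `go_right` would predict this triple of stopping state, expected discounted reward, and discount directly, without simulating the option step by step, which is what makes planning with options fast.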

The most interesting feature of the framework is that it allows an agent to work simultaneously with high-level and low-level temporal representations. The interplay between these levels can be exploited to learn and plan more efficiently and more accurately. I will present new algorithms that take advantage of this structure to improve the quality of plans, and to learn in parallel about the effects of many different options.
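The idea of learning in parallel about many options can be sketched as an intra-option style update (a hedged illustration; the tuple representation and update rule here are assumptions, not the talk's exact algorithms): a single primitive transition updates the value of every option whose policy is consistent with the action taken, not just the option that was executing.

```python
from collections import defaultdict

# Hedged sketch of intra-option value learning; all names illustrative.
# Each option is a tuple (initiation_set, policy, termination), and one
# primitive transition (s, a, r, s2) updates every consistent option.
def intra_option_update(Q, options, s, a, r, s2, alpha=0.1, gamma=0.9):
    for i, (init, policy, beta) in enumerate(options):
        if s in init and policy(s) == a:  # option could have chosen `a`
            b = beta(s2)
            # Either the option continues (bootstrap from its own value)
            # or it terminates (bootstrap from the best available option).
            backup = (1 - b) * Q[(s2, i)] + b * max(
                Q[(s2, j)] for j in range(len(options)))
            Q[(s, i)] += alpha * (r + gamma * backup - Q[(s, i)])

# Toy corridor (states 0..5): two options, "go right" and "go left".
Q = defaultdict(float)
go_right = ({0, 1, 2, 3, 4}, lambda s: +1, lambda s: 1.0 if s == 5 else 0.0)
go_left  = ({1, 2, 3, 4, 5}, lambda s: -1, lambda s: 1.0 if s == 0 else 0.0)
options = [go_right, go_left]

# One transition at state 2 taking action +1 updates only "go right",
# because only its policy agrees with the action taken.
intra_option_update(Q, options, s=2, a=+1, r=-1.0, s2=3)
```

The design point is data efficiency: experience generated while following one course of action also informs the values of every other option consistent with that experience.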

Hosted by Clayton Lewis.

Department of Computer Science
University of Colorado Boulder
Boulder, CO 80309-0430 USA
May 5, 2012 (14:13)