Big Data and Analytics at Verizon

Ashok Srivastava

CIS

Members: Free
IEEE Members: Free
Non-members: Free

Length: 00:48:52

09 Dec 2014

Approximation of cost functions within a low-dimensional space of basis functions is a major approach in approximate dynamic programming. It may be implemented by well-established methods such as temporal differences, aggregation, and Bellman error. We show that all of these methods can be viewed within a unified framework based on an extended form of the Galerkin approximation approach that involves projected equations. The extension differs from the standard Galerkin approach in two major ways: first, the implementation is simulation-based; second, the projection uses a (weighted) Euclidean seminorm rather than a norm. The extension carries over to weighted multistep projected Bellman equations, similar to those of multistep TD(λ)-type methods. An important new feature is that the associated weights need not be geometrically distributed and may be state-dependent. This allows greater flexibility in designing simulation methods with desirable bias-variance and exploration characteristics, in the context of standard and optimistic policy iteration methods. Moreover, it allows us to establish a close connection between projected equation and aggregation methods, and to develop for the first time multistep aggregation methods of the TD(λ) type.
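To make the idea of a projected Bellman equation concrete, here is a minimal sketch in Python. It is not taken from the talk: the 3-state chain, the costs, the single basis function phi, and the weights xi are all hypothetical values chosen for illustration, and the equation is solved exactly rather than by simulation. With one basis function the approximation is J(i) ≈ phi[i] * r, and the projected equation reduces to a single linear equation in the scalar r.

```python
# Hypothetical 3-state Markov chain under a fixed policy (illustrative numbers only).
P = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]
g = [1.0, 0.0, 2.0]       # expected one-stage cost in each state
gamma = 0.9               # discount factor
phi = [1.0, 2.0, 3.0]     # single basis function: J(i) is approximated by phi[i] * r
xi = [0.25, 0.50, 0.25]   # projection weights (assumed; here the stationary distribution of P)
n = len(g)


def bellman(J):
    """One-step Bellman operator for the fixed policy: (T J)(i) = g(i) + gamma * sum_j P[i][j] * J(j)."""
    return [g[i] + gamma * sum(P[i][j] * J[j] for j in range(n)) for i in range(n)]


def project(J):
    """Coefficient r of the xi-weighted least-squares projection of J onto span{phi}."""
    num = sum(xi[i] * phi[i] * J[i] for i in range(n))
    den = sum(xi[i] * phi[i] * phi[i] for i in range(n))
    return num / den


def solve_projected_equation():
    """Solve r = project(T(phi * r)) in closed form (one linear equation in r)."""
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    a = sum(xi[i] * phi[i] * (phi[i] - gamma * P_phi[i]) for i in range(n))
    b = sum(xi[i] * phi[i] * g[i] for i in range(n))
    return b / a


r = solve_projected_equation()
J_approx = [p * r for p in phi]
# Fixed-point check: projecting T(J_approx) back onto span{phi} recovers r.
residual = abs(project(bellman(J_approx)) - r)
print(r, residual)
```

Setting some entries of xi to zero would turn the weighted norm into a seminorm, which is the case the abstract highlights; the sketch then still runs as long as the denominators stay nonzero. The multistep variants replace the one-step operator with a weighted mixture of its powers, which this example does not attempt to show.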

Tags:

others

IEEE

cis

video
