Fast Approximate Dynamic Programming for Infinite-Horizon Markov Decision Processes

Research output: Chapter in Book/Conference proceedings/Edited volumeConference contributionScientificpeer-review

8 Downloads (Pure)

Abstract

In this study, we consider the infinite-horizon, discounted cost, optimal control of stochastic nonlinear systems with separable cost and constraints in the state and input variables. Using the linear-time Legendre transform, we propose a novel numerical scheme for implementation of the corresponding value iteration (VI) algorithm in the conjugate domain. Detailed analyses of the convergence, time complexity, and error of the proposed algorithm are provided. In particular, with a discretization of size X and U for the state and input spaces, respectively, the proposed approach reduces the time complexity of each iteration in the VI algorithm from O(XU) to O(X+U), by replacing the minimization operation in the primal domain with a simple addition in the conjugate domain.
Original languageEnglish
Title of host publicationAdvances in Neural Information Processing Systems (NeurIPS 2021)
Subtitle of host publicationNeurIPS Proceedings
EditorsM. Ranzato, A. Beygelzimer, Y. Dauphin , P.S. Liang, J. Wortman Vaughan
Pages23652-23663
Volume34
ISBN (Electronic)9781713845393
Publication statusPublished - 2021
Event35th Conference on Neural Information Processing Systems (Virtual) -
Duration: 6 Dec 202114 Dec 2021
Conference number: 35th
https://nips.cc/Conferences/2021

Conference

Conference35th Conference on Neural Information Processing Systems (Virtual)
Abbreviated titleNeurIPS 2021
Period6/12/2114/12/21
Internet address

Fingerprint

Dive into the research topics of 'Fast Approximate Dynamic Programming for Infinite-Horizon Markov Decision Processes'. Together they form a unique fingerprint.

Cite this