Data-driven approximate dynamic programming: A linear programming approach

Tobias Sutter, Angeliki Kamoutsi, Peyman Mohajerin Esfahani, John Lygeros

Research output: Chapter in Book/Conference proceedings/Edited volumeConference contributionScientificpeer-review

1 Citation (Scopus)
14 Downloads (Pure)

Abstract

This article presents an approximation scheme for the infinite-dimensional linear programming formulation of discrete-time Markov control processes via a finite-dimensional convex program, when the dynamics are unknown and learned from data. We derive a probabilistic explicit error bound between the data-driven finite convex program and the original infinite linear program. We further discuss the sample complexity of the error bound which translates to the number of samples required for an a priori approximation accuracy. Our analysis sheds light on the impact of the choice of basis functions for approximating the true value function. Finally, the relevance of the method is illustrated on a truncated LQG problem.

Original languageEnglish
Title of host publicationProceedings of the 2017 IEEE 56th Annual Conference on Decision and Control
EditorsA Astolfi et al
Place of PublicationPiscataway, NJ, USA
PublisherIEEE
Pages5174-5179
ISBN (Electronic)978-150902873-3
DOIs
Publication statusPublished - 2017
EventCDC 2017: 56th IEEE Annual Conference on Decision and Control - Melbourne, Australia
Duration: 12 Dec 201715 Dec 2017
http://cdc2017.ieeecss.org/

Conference

ConferenceCDC 2017: 56th IEEE Annual Conference on Decision and Control
CountryAustralia
CityMelbourne
Period12/12/1715/12/17
OtherThe CDC is recognized as the premier scientific and engineering conference dedicated to the advancement of the theory and practice of systems and control. The CDC annually brings together an international community of researchers and practitioners in the field of automatic control to discuss new research results, perspectives on future developments, and innovative applications relevant to decision making, systems and control, and related areas.
Internet address

Fingerprint

Dive into the research topics of 'Data-driven approximate dynamic programming: A linear programming approach'. Together they form a unique fingerprint.

Cite this