Abstract
Globalized dual heuristic programming (GDHP) is the most comprehensive adaptive critic design: its critic minimizes the error with respect to both the cost-to-go and its derivatives simultaneously. Its implementation, however, faces a dilemma: either increase the computational load by explicitly calculating the second partial derivative term, or sacrifice accuracy by loosening the association between the cost-to-go and its derivatives. This article aims to increase the online learning efficiency of GDHP while retaining its analytical accuracy by introducing a novel GDHP design based on a critic network and an associated dual network. The associated dual network is derived from the critic network explicitly and precisely, and its structure has the same level of complexity as dual heuristic programming critics. Three simulation experiments are conducted to validate the learning ability, efficiency, and feasibility of the proposed GDHP critic design.
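
The idea of a dual network derived analytically from the critic can be illustrated with a minimal sketch. This is not the architecture from the article: it assumes a critic that is a weighted sum of Gaussian radial basis functions, and the names `rbf_features`, `critic_value`, `dual_output`, `gdhp_error`, the shared `width`, and the weighting factor `beta` are all hypothetical. The dual output is the exact gradient of the critic output with respect to the state and reuses the critic's weights, and the combined error mixes a cost-to-go term with a derivative term.

```python
import math


def rbf_features(x, centers, width):
    """Gaussian RBF activations phi_i(x) = exp(-||x - c_i||^2 / (2 * width^2))."""
    return [
        math.exp(-sum((xj - cj) ** 2 for xj, cj in zip(x, c)) / (2.0 * width ** 2))
        for c in centers
    ]


def critic_value(x, weights, centers, width):
    """Assumed critic: J(x) = sum_i w_i * phi_i(x)."""
    phi = rbf_features(x, centers, width)
    return sum(w * p for w, p in zip(weights, phi))


def dual_output(x, weights, centers, width):
    """Analytic gradient dJ/dx, sharing the critic's weights.

    Uses d phi_i / dx = -phi_i(x) * (x - c_i) / width^2.
    """
    phi = rbf_features(x, centers, width)
    grad = [0.0] * len(x)
    for w, p, c in zip(weights, phi, centers):
        for j in range(len(x)):
            grad[j] += -w * p * (x[j] - c[j]) / width ** 2
    return grad


def gdhp_error(x, j_target, lam_target, weights, centers, width, beta=0.5):
    """Combined error: beta weights the cost-to-go term, (1 - beta) the derivative term.

    The targets are taken as given here; how they are formed is outside this sketch.
    """
    e_j = critic_value(x, weights, centers, width) - j_target
    lam = dual_output(x, weights, centers, width)
    e_lam = [g - t for g, t in zip(lam, lam_target)]
    return beta * 0.5 * e_j ** 2 + (1.0 - beta) * 0.5 * sum(e * e for e in e_lam)
```

Because `dual_output` is computed from the same weights as `critic_value`, the derivative estimate stays tied to the cost-to-go estimate by construction, and a finite-difference check on `critic_value` should agree with it.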
Original language | English |
---|---|
Pages (from-to) | 10079-10090 |
Number of pages | 12 |
Journal | IEEE Transactions on Neural Networks and Learning Systems |
Volume | 34 |
Issue number | 12 |
Publication status | Published - 2022 |
Keywords
- Adaptation models
- Adaptive critic designs (ACDs)
- Backpropagation
- Complexity theory
- Costs
- globalized dual heuristic programming
- incremental model
- Mathematical models
- neural networks
- Programming
- radial basis functions
- reinforcement learning (RL)
- Task analysis