Online learning algorithms: For passivity-based and distributed control

Subramanya Nageshrao

doi:10.4233/uuid:9f3a2496-7851-40f6-a947-102080bdd5fd

Research output: Thesis › Dissertation (TU Delft)

Abstract

Over the last couple of decades, the demand for high precision and enhanced performance of physical systems has been steadily increasing. This demand often results in miniaturization and complex design, thus increasing the need for complex nonlinear control methods. Some state-of-the-art nonlinear methods are stymied by the requirement of full state information, model and parameter uncertainties, mathematical complexity, etc. For many scenarios it is nearly impossible to consider all the uncertainties during the design of a feedback controller. Additionally, when designing a model-based nonlinear controller there is no standard mechanism to incorporate performance measures. Some of these issues can be addressed by using online learning.
Animals and humans have the ability to share, explore, act or respond, memorize the outcome and repeat the task to achieve a better outcome when they encounter the same or a similar scenario. This is called learning from interaction. One instance of this approach is reinforcement learning (RL). However, RL methods are hindered by the curse of dimensionality, non-interpretability and non-monotonic convergence of the learning algorithms. This can be attributed to an intrinsic characteristic of RL: it is a model-free approach, and hence no standard mechanism exists to incorporate a priori model information.
In this thesis, learning methods are proposed which explicitly use the available system knowledge. This can be seen as a new class of approaches that bridge model-based and model-free methods. These methods can address some of the hurdles mentioned earlier. For example, i) prior system information can speed up the learning, ii) new control objectives can be achieved which would otherwise be extremely difficult to attain using only model-based methods, and iii) physical meaning can be attributed to the learned controller.
The developed approach is as follows: the model of the given physical system is represented in port-Hamiltonian (PH) form. For the system dynamics in PH form a passivity-based control (PBC) law is formulated, which often requires the solution to a set of partial differential equations (PDEs). Instead of finding an analytical solution, the PBC law is parameterized using an unknown parameter vector. Then, by using a variation of the standard actor-critic learning algorithm, the unknown parameters can be learned online. Using the principles of stochastic approximation theory, a proof of convergence for the developed method is shown. The proposed methods are evaluated for the stabilization and regulation of mechanical and electro-mechanical systems. The simulation and experimental results show comparable learning curves.
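As a reading aid, the PH and PBC ingredients named in this paragraph can be written out in one common textbook form. This is a sketch in generic notation, not necessarily the notation or the exact parameterization used in the thesis: the symbols J, R, g, H, H_d, K, the basis functions \phi and the parameter vector \xi are assumptions introduced here for illustration, and the abstract does not say how the matching condition is handled once H_d is parameterized.

```latex
% Input-state-output port-Hamiltonian (PH) system, standard form.
% J: interconnection matrix, R: dissipation matrix, H: Hamiltonian (stored energy).
\[
\dot{x} = \bigl(J(x) - R(x)\bigr)\nabla_x H(x) + g(x)\,u, \qquad
y = g(x)^{\top}\nabla_x H(x), \qquad
J = -J^{\top},\quad R = R^{\top} \succeq 0 .
\]

% Passivity with H as storage function (the J-term vanishes by skew-symmetry).
\[
\dot{H} = -\nabla_x H^{\top} R\,\nabla_x H + y^{\top}u \;\le\; y^{\top}u .
\]

% Energy shaping towards a desired Hamiltonian H_d, keeping J and R unchanged.
% The second relation is the matching PDE; g^{\perp} is a left annihilator of g.
\[
g(x)\,u_{\mathrm{es}} = \bigl(J(x) - R(x)\bigr)\bigl(\nabla_x H_d(x) - \nabla_x H(x)\bigr),
\qquad
g^{\perp}(x)\bigl(J(x) - R(x)\bigr)\bigl(\nabla_x H_d(x) - \nabla_x H(x)\bigr) = 0 .
\]

% Damping injection added on top of energy shaping.
\[
u = u_{\mathrm{es}} - K(x)\,y, \qquad K(x) = K(x)^{\top} \succeq 0 .
\]

% Hypothetical linear-in-parameters candidate for H_d (an assumption, see lead-in):
% the weights \xi would be the unknown parameters adapted online.
\[
\hat{H}_d(x,\xi) = \xi^{\top}\phi(x) .
\]
```

Read this way, solving the matching PDE analytically is replaced by a search over \xi, which is the role the abstract assigns to the actor-critic learning algorithm.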
In the final part of the thesis a novel integral reinforcement learning approach is developed to solve the optimal output tracking control problem for a set of linear heterogeneous multi-agent systems. Unlike existing methods, this approach neither needs to solve the output regulator equation nor requires a p-copy of the leader’s dynamics in the agent’s control law. A detailed numerical evaluation has been conducted to show the feasibility of the developed method.
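For orientation, the idea behind integral reinforcement learning can be illustrated with the interval form of the Bellman equation for a single continuous-time system. This is the generic single-agent form, not the thesis's multi-agent formulation: the value function V, the running cost r, the reinforcement interval T and the discount rate \gamma are symbols assumed here for illustration.

```latex
% Value of a policy from time t onward, with discount rate \gamma \ge 0
% and running cost r (for example a quadratic penalty on tracking error and input).
\[
V\bigl(x(t)\bigr) = \int_{t}^{\infty} e^{-\gamma(\tau - t)}\, r\bigl(x(\tau),u(\tau)\bigr)\,d\tau .
\]

% Splitting the integral at t + T gives the integral (interval) Bellman equation.
\[
V\bigl(x(t)\bigr) = \int_{t}^{t+T} e^{-\gamma(\tau - t)}\, r\bigl(x(\tau),u(\tau)\bigr)\,d\tau
\;+\; e^{-\gamma T}\, V\bigl(x(t+T)\bigr) .
\]
```

The second relation can be evaluated from data measured along a trajectory over the interval [t, t+T], which is why this family of methods is attractive for online learning.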

Original language	English
Awarding Institution	Delft University of Technology
Supervisors/Advisors	Babuska, R., Supervisor; Delgado Lopes, Gabriel, Advisor
Award date	18 Apr 2016
Place of Publication	Delft, The Netherlands
Print ISBNs	9789461866219
DOIs	https://doi.org/10.4233/uuid:9f3a2496-7851-40f6-a947-102080bdd5fd
Publication status	Published - 2016

Access to Document

10.4233/uuid:9f3a2496-7851-40f6-a947-102080bdd5fd

dissertation: Final published version, 5.6 MB

Cite this

@phdthesis{9f3a2496785140f6a947102080bdd5fd,

title = "Online learning algorithms: For passivity-based and distributed control",

author = "Subramanya Nageshrao",

year = "2016",

doi = "10.4233/uuid:9f3a2496-7851-40f6-a947-102080bdd5fd",

language = "English",

isbn = "9789461866219",

type = "Dissertation (TU Delft)",

school = "Delft University of Technology",

}

TY - THES

T1 - Online learning algorithms

T2 - For passivity-based and distributed control

AU - Nageshrao, Subramanya

PY - 2016

Y1 - 2016

UR - http://resolver.tudelft.nl/uuid:9f3a2496-7851-40f6-a947-102080bdd5fd

U2 - 10.4233/uuid:9f3a2496-7851-40f6-a947-102080bdd5fd

DO - 10.4233/uuid:9f3a2496-7851-40f6-a947-102080bdd5fd

M3 - Dissertation (TU Delft)

SN - 9789461866219

CY - Delft, The Netherlands

ER -
