Online Adaptive Approximate Optimal Tracking Control with Simplified Dual Approximation Structure for Continuous-time Unknown Nonlinear Systems

Jing Na; Guido Herrmann

Volume 1 Issue 4

Oct. 2014

IEEE/CAA Journal of Automatica Sinica

Jing Na and Guido Herrmann, "Online Adaptive Approximate Optimal Tracking Control with Simplified Dual Approximation Structure for Continuous-time Unknown Nonlinear Systems," IEEE/CAA J. of Autom. Sinica, vol. 1, no. 4, pp. 412-422, 2014.

Jing Na¹,
Guido Herrmann²

1. Faculty of Mechanical and Electrical Engineering, Kunming University of Science and Technology, 650093, China;
2. Department of Mechanical Engineering, University of Bristol, BS8 1TR, UK

Funds:

This work was supported by the National Natural Science Foundation of China (61203066).

Abstract

This paper proposes an online adaptive approximate solution for the infinite-horizon optimal tracking control problem of continuous-time nonlinear systems with unknown dynamics. The need for complete knowledge of the system dynamics is avoided by employing an adaptive identifier in conjunction with a novel adaptive law, such that the estimated identifier weights converge to a small neighborhood of their ideal values. An adaptive steady-state controller is developed to maintain the desired tracking performance at steady state, and an adaptive optimal controller is designed to stabilize the tracking error dynamics in an optimal manner. For this purpose, a critic neural network (NN) is utilized to approximate the optimal value function of the Hamilton-Jacobi-Bellman (HJB) equation, which is used in the construction of the optimal controller. The two NNs, i.e., the identifier NN and the critic NN, are trained continuously and simultaneously by means of a novel adaptive law design methodology based on the parameter estimation error. Stability of the whole system, consisting of the identifier NN, the critic NN and the optimal tracking control, is guaranteed using Lyapunov theory, and convergence to a near-optimal control law is proved. Simulation results illustrate the effectiveness of the proposed method.
- Adaptive control
- Optimal control
- Approximate dynamic programming
- System identification
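
The critic-based construction summarized in the abstract can be sketched with the standard infinite-horizon formulation for a control-affine error system. This is only an illustrative outline under stated assumptions, not the paper's own derivation: the symbols $e$ (tracking error), $u_e$ (optimal control component), $f$, $g$, $Q$, $R$, $W$, $\phi$, $\varepsilon$ and the quadratic cost are assumed here, and the paper's notation and cost function may differ.

```latex
% Assumed tracking-error dynamics (control-affine form)
\dot{e} = f(e) + g(e)\,u_e

% Assumed infinite-horizon quadratic cost, with Q \ge 0 and R > 0
V(e(t)) = \int_t^{\infty} \left( e^{\top} Q e + u_e^{\top} R u_e \right) d\tau

% HJB equation satisfied by the optimal value function V^*
0 = \min_{u_e} \left[ e^{\top} Q e + u_e^{\top} R u_e
      + (\nabla V^*)^{\top} \bigl( f(e) + g(e)\,u_e \bigr) \right]

% Stationarity in u_e: 2 R u_e + g^{\top} \nabla V^* = 0
u_e^* = -\tfrac{1}{2} R^{-1} g(e)^{\top} \nabla V^*(e)

% Critic NN approximation of V^* and the resulting approximate control,
% with \hat{g} taken from the identifier and \hat{W} the critic weight estimate
V^*(e) = W^{\top} \phi(e) + \varepsilon(e), \qquad
\hat{u}_e = -\tfrac{1}{2} R^{-1} \hat{g}(e)^{\top} \nabla \phi(e)^{\top} \hat{W}
```

In this reading, the identifier supplies the estimate $\hat{g}$ and the critic supplies $\hat{W}$, so the approximate control is computed from the two approximators alone, with no separate actor network. That is one plausible interpretation of the "simplified dual approximation structure" in the title.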
