A journal of IEEE and CAA that publishes high-quality papers in English on original theoretical and experimental research and development in all areas of automation
Volume 6 Issue 3
May 2019

IEEE/CAA Journal of Automatica Sinica

Ruizhuo Song and Liao Zhu, "Optimal Fixed-Point Tracking Control for Discrete-Time Nonlinear Systems via ADP," IEEE/CAA J. Autom. Sinica, vol. 6, no. 3, pp. 657-666, May 2019. doi: 10.1109/JAS.2019.1911453

# Optimal Fixed-Point Tracking Control for Discrete-Time Nonlinear Systems via ADP

##### doi: 10.1109/JAS.2019.1911453
Funds:

the National Natural Science Foundation of China 61873300

the National Natural Science Foundation of China 61722312

the Fundamental Research Funds for the Central Universities FRF-GF-17-B45

Based on adaptive dynamic programming (ADP), the fixed-point tracking control problem is solved by a value iteration (VI) algorithm. First, a class of discrete-time (DT) nonlinear systems with disturbance is considered. Second, the convergence of the VI algorithm is established: it is proven that the iterative cost function converges precisely to the optimal value, and that the control input and disturbance input also converge to their optimal values. Third, a novel analysis of the range of the discount factor is presented, in which the cost function serves as a Lyapunov function. Finally, neural networks (NNs) are employed to approximate the cost function, the control law, and the disturbance law. Simulation examples illustrate the effectiveness of the proposed method.
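To make the value-iteration idea in the abstract concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a scalar linear system x_{k+1} = a x_k + b u_k + d w_k (x can be read as the deviation from the fixed point), a quadratic utility q x² + r u² − γ² w², and a discount factor α, so that each iterate has the form V_i(x) = p_i x² and the min-max update reduces to a scalar recursion. All names (`vi_step`, `value_iteration`, `saddle_gains`) and all parameter values are hypothetical, and no neural networks are involved.

```python
def vi_step(p, a, b, d, q, r, gamma, alpha):
    """One value-iteration update for V_i(x) = p * x**2.

    Solves min over u, max over w of
        q*x**2 + r*u**2 - gamma**2 * w**2 + alpha * p * (a*x + b*u + d*w)**2
    in closed form and returns the coefficient of x**2 in V_{i+1}.
    """
    P = alpha * p
    # The inner maximisation over w is concave only if gamma**2 > P * d**2.
    if gamma**2 - P * d**2 <= 0:
        raise ValueError("no saddle point: gamma is too small for this p")
    denom = 1.0 + P * (b**2 / r - d**2 / gamma**2)
    return q + P * a**2 / denom


def value_iteration(a, b, d, q, r, gamma, alpha, tol=1e-12, max_iter=10_000):
    """Iterate from V_0 = 0 until successive coefficients differ by < tol."""
    p = 0.0
    history = [p]
    for _ in range(max_iter):
        p_next = vi_step(p, a, b, d, q, r, gamma, alpha)
        history.append(p_next)
        if abs(p_next - p) < tol:
            return p_next, history
        p = p_next
    raise RuntimeError("value iteration did not converge")


def saddle_gains(p, a, b, d, r, gamma, alpha):
    """Linear control and disturbance laws induced by V(x) = p * x**2."""
    P = alpha * p
    denom = 1.0 + P * (b**2 / r - d**2 / gamma**2)
    closed_loop = a / denom                    # x_{k+1} = closed_loop * x_k
    k_u = -P * b * closed_loop / r             # u_k = k_u * x_k
    k_w = P * d * closed_loop / gamma**2       # w_k = k_w * x_k
    return k_u, k_w, closed_loop


# Illustrative parameters only; they are not taken from the paper.
params = dict(a=0.9, b=1.0, d=0.5, q=1.0, r=1.0, gamma=2.0, alpha=0.95)
p_star, history = value_iteration(**params)
k_u, k_w, closed_loop = saddle_gains(
    p_star, params["a"], params["b"], params["d"],
    params["r"], params["gamma"], params["alpha"],
)
print(f"p* = {p_star:.6f} after {len(history) - 1} iterations")
print(f"u = {k_u:.4f} x, w = {k_w:.4f} x, closed-loop pole = {closed_loop:.4f}")
```

Starting from V_0 = 0, the coefficients in this special case form a nondecreasing sequence, which mirrors the kind of monotone convergence argument commonly used for VI. For nonlinear dynamics no such closed form exists in general, which is where function approximators such as the NNs mentioned in the abstract come in.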

