A journal of IEEE and CAA that publishes high-quality papers in English on original theoretical/experimental research and development in all areas of automation
Volume 8 Issue 4
Apr. 2021

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 6.171, Top 11% (SCI Q1)
    CiteScore: 11.2, Top 5% (Q1)
    Google Scholar h5-index: 51, TOP 8
Xueli Wang, Derui Ding, Hongli Dong and Xian-Ming Zhang, "Neural-Network-Based Control for Discrete-Time Nonlinear Systems with Input Saturation Under Stochastic Communication Protocol," IEEE/CAA J. Autom. Sinica, vol. 8, no. 4, pp. 766-778, Apr. 2021. doi: 10.1109/JAS.2021.1003922

Neural-Network-Based Control for Discrete-Time Nonlinear Systems with Input Saturation Under Stochastic Communication Protocol

doi: 10.1109/JAS.2021.1003922
Funds:  This work was supported in part by the Australian Research Council Discovery Early Career Researcher Award (DE200101128), and Australian Research Council (DP190101557)
  • This paper investigates an adaptive dynamic programming (ADP) strategy for discrete-time nonlinear systems with unknown dynamics subject to input saturation. To save communication resources between the controller and the actuators, stochastic communication protocols (SCPs) are adopted to schedule the control signal, so the closed-loop system is essentially a protocol-induced switching system. A neural network (NN)-based identifier with a robust term is used to approximate the unknown nonlinear system, and a set of switch-based updating rules for the NN weights, with an additional tunable parameter, is developed by means of gradient descent. By virtue of a novel Lyapunov function, a sufficient condition is derived that guarantees the stability of both the system identification errors and the update dynamics of the NN weights. Then, an offline value-iteration ADP algorithm is proposed to solve the optimal control problem of protocol-induced switching systems with saturation constraints, and its convergence is analyzed by mathematical induction. Furthermore, an actor-critic NN scheme is developed to approximate the control law and the proposed performance index function in the ADP framework, and the stability of the closed-loop system is analyzed using Lyapunov theory. Finally, numerical simulation results are presented to demonstrate the effectiveness of the proposed control scheme.
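As a rough illustration of why protocol scheduling turns the closed loop into a switching system, the following minimal Python sketch applies saturation to a control command and then lets a stochastic protocol update only one actuator channel per step. It assumes that exactly one channel is granted access at each step with fixed selection probabilities and that the other channels hold their previous value; the paper's exact protocol model may differ. The names `saturate`, `scp_step`, `probs`, and all numeric values are hypothetical.

```python
import random

def saturate(u, bound):
    """Clip each control component to the interval [-bound, bound]."""
    return [max(-bound, min(bound, ui)) for ui in u]

def scp_step(u_new, u_held, probs, rng):
    """Grant network access to one randomly selected actuator channel.

    Only the selected channel receives its new value; every other channel
    keeps the value it held before (assumed zero-order hold).
    """
    idx = rng.choices(range(len(u_new)), weights=probs, k=1)[0]
    u_applied = list(u_held)
    u_applied[idx] = u_new[idx]
    return idx, u_applied

rng = random.Random(0)            # seeded for repeatability
u_held = [0.0, 0.0, 0.0]          # values currently held at the actuators
probs = [0.5, 0.3, 0.2]           # assumed channel selection probabilities

for k in range(5):
    # Hypothetical controller output, saturated before transmission.
    u_cmd = saturate([2.0 * (k + 1), -0.3, 0.1 * k], bound=1.0)
    idx, u_held = scp_step(u_cmd, u_held, probs, rng)
    print(k, idx, u_held)
```

Each value of `idx` selects a different input map, which is the sense in which the scheduled closed loop switches between modes.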

     




    Highlights

    • An NN-based identifier with a robust term is presented to approximate the unknown nonlinear system, where the weight update rules include an additional tunable parameter;
    • A value-iteration ADP algorithm is proposed to solve, in an offline way, the suboptimal control problem of protocol-induced switching systems with saturation constraints;
    • The convergence of the ADP algorithm is discussed, and the algorithm is implemented via an actor-critic NN scheme;
    • A set of conditions is derived to check the stability of both the identification error dynamics and the update error dynamics of the NN weights.
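The value-iteration idea named in the highlights can be sketched on a toy problem. The Python example below is not the authors' algorithm: it replaces the NN approximators with a lookup table on a state grid, enforces saturation simply by restricting the candidate inputs to a bounded set, and omits the communication protocol entirely. The scalar dynamics `f`, the quadratic stage cost, the grid sizes, and the bound `U_BAR` are all assumptions chosen for illustration.

```python
import math

X_MAX, N_X = 2.0, 41    # state grid on [-2, 2]
U_BAR, N_U = 0.5, 21    # assumed saturation bound and number of candidate inputs

xs = [-X_MAX + 2 * X_MAX * i / (N_X - 1) for i in range(N_X)]
us = [-U_BAR + 2 * U_BAR * j / (N_U - 1) for j in range(N_U)]

def f(x, u):
    """Toy dynamics (hypothetical); |f| <= 1.3, so the state stays on the grid."""
    return 0.8 * math.sin(x) + u

def interp(V, x):
    """Linearly interpolate the tabulated value function V at state x."""
    t = (x + X_MAX) / (2 * X_MAX) * (N_X - 1)
    i = min(max(int(math.floor(t)), 0), N_X - 2)
    w = t - i
    return (1 - w) * V[i] + w * V[i + 1]

def q_value(V, x, u):
    """Stage cost x^2 + u^2 plus the value of the successor state."""
    return x * x + u * u + interp(V, f(x, u))

def value_iteration(n_iter):
    """Iterate V_{i+1}(x) = min_u q_value(V_i, x, u), starting from V_0 = 0."""
    V = [0.0] * N_X
    history = [V]
    for _ in range(n_iter):
        V = [min(q_value(V, x, u) for u in us) for x in xs]
        history.append(V)
    return history

history = value_iteration(30)
V_final = history[-1]
# Greedy control law with respect to the final value table; it respects
# the bound by construction because every candidate lies in [-U_BAR, U_BAR].
policy = [min(us, key=lambda u: q_value(V_final, x, u)) for x in xs]
```

Starting from a zero initial value function, the iterates in `history` form a nondecreasing sequence at every grid point, which is the kind of property that induction-based convergence arguments for value iteration typically rely on.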
