A journal of IEEE and CAA, publishing high-quality papers in English on original theoretical and experimental research and development in all areas of automation
Volume 7 Issue 2
Mar.  2020

IEEE/CAA Journal of Automatica Sinica

Teng Liu, Bin Tian, Yunfeng Ai and Fei-Yue Wang, "Parallel Reinforcement Learning-Based Energy Efficiency Improvement for a Cyber-Physical System," IEEE/CAA J. Autom. Sinica, vol. 7, no. 2, pp. 617-626, Mar. 2020. doi: 10.1109/JAS.2020.1003072

Parallel Reinforcement Learning-Based Energy Efficiency Improvement for a Cyber-Physical System

doi: 10.1109/JAS.2020.1003072
Funds: This work was supported in part by the National Natural Science Foundation of China (61533019, 91720000), the Beijing Municipal Science and Technology Commission (Z181100008918007), and the Intel Collaborative Research Institute for Intelligent and Automated Connected Vehicles (“ICRI-IACV”).
  • As a complex and critical cyber-physical system (CPS), the hybrid electric powertrain plays a significant role in mitigating air pollution and improving fuel economy, and the energy management strategy (EMS) is key to improving the energy efficiency of this CPS. This paper presents a novel parallel reinforcement learning (PRL) approach, based on a bidirectional long short-term memory (LSTM) network, for constructing the EMS of a hybrid tracked vehicle (HTV). The method has two levels. The higher level first establishes a parallel system comprising a real powertrain system and an artificial system, and then trains a bidirectional LSTM network on the data synthesized from this parallel system. The lower level determines the optimal EMS using the trained action state function within the model-free reinforcement learning (RL) framework. PRL is a fully data-driven, learning-enabled approach that does not depend on any prediction or predefined rules. Finally, real vehicle testing is carried out, and the relevant experimental data are collected and calibrated. Experimental results show that the proposed EMS achieves considerable energy efficiency improvement compared with the conventional RL approach and deep RL.

     



    Figures(10)  / Tables(2)


    Highlights

    • Parallel reinforcement learning is used to construct the energy management strategy.
    • A parallel system comprising a real powertrain and an artificial system is built.
    • A bidirectional long short-term memory (LSTM) network is trained on data from the parallel system.
    • Real vehicle testing is carried out, and experimental data are collected and calibrated.
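
    The abstract places the lower level of the method in a model-free reinforcement learning framework, but this page gives no algorithmic detail. As a rough illustration of what "model-free" means, the sketch below runs plain tabular Q-learning on a toy battery-level problem. It is not the authors' bidirectional-LSTM-based PRL method: the environment `toy_step`, its five battery levels, the reward values, and all hyperparameters are hypothetical and chosen only to keep the example small.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(state, action)."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    return max(range(len(q[state])), key=lambda a: q[state][a])


def toy_step(state, action):
    """Hypothetical powertrain stand-in: state is a coarse battery level 0..4.

    Action 0 runs the engine (battery level rises, assumed fuel cost -1).
    Action 1 drives on the battery (level falls, no fuel cost), with an
    assumed penalty of -5 for trying to discharge an empty battery.
    """
    if action == 0:
        return min(state + 1, 4), -1.0
    if state == 0:
        return 0, -5.0
    return state - 1, 0.0


def train(step, n_states=5, n_actions=2, episodes=200, horizon=20,
          epsilon=0.2, seed=0):
    """Learn a Q-table purely from sampled transitions (no model of `step`)."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state = rng.randrange(n_states)
        for _ in range(horizon):
            action = epsilon_greedy(q, state, epsilon, rng)
            next_state, reward = step(state, action)
            q_update(q, state, action, reward, next_state)
            state = next_state
    return q


if __name__ == "__main__":
    q_table = train(toy_step)
    # Greedy action per battery level under the learned Q-table.
    policy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(5)]
    print(policy)
```

    The learner only ever calls `step` to sample transitions and never inspects its internals, which is the sense in which the approach needs no prediction model or predefined rules. In the paper's setting the table would presumably give way to the trained network, but that substitution is not shown here.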
