IEEE/CAA Journal of Automatica Sinica
Citation: Fei-Yue Wang, Jie Zhang, Qinglai Wei, Xinhu Zheng and Li Li, "PDP: Parallel Dynamic Programming," IEEE/CAA J. Autom. Sinica, vol. 4, no. 1, pp. 1-5, Jan. 2017.