A journal of IEEE and CAA, publishing high-quality papers in English on original theoretical/experimental research and development in all areas of automation
Volume 3 Issue 3
Jul. 2016

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 11.8, Top 4% (SCI Q1)
    CiteScore: 17.6, Top 3% (Q1)
    Google Scholar h5-index: 77, TOP 5
Li Li, Yisheng Lv and Fei-Yue Wang, "Traffic Signal Timing via Deep Reinforcement Learning," IEEE/CAA J. of Autom. Sinica, vol. 3, no. 3, pp. 247-254, 2016.

Traffic Signal Timing via Deep Reinforcement Learning

Funds:

This work was supported by the National Natural Science Foundation of China (61533019, 71232006, 61233001).

  • In this paper, we propose a set of algorithms to design signal timing plans via deep reinforcement learning. The core idea of this approach is to set up a deep neural network (DNN) to learn the Q-function of reinforcement learning from the sampled traffic state/control inputs and the corresponding traffic system performance output. Based on the obtained DNN, we can find the appropriate signal timing policies by implicitly modeling the control actions and the change of system states. We explain the possible benefits and implementation tricks of this new approach. The relationships between this new approach and some existing approaches are also carefully discussed.
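
The core idea in the abstract, a neural network that approximates the Q-function from sampled state/control inputs and observed performance, can be illustrated with a minimal sketch. This is not the authors' algorithm or network: the two-approach intersection model, the arrival probabilities, the reward (negative normalised total queue), the single small tanh hidden layer standing in for a deep network, and every name and constant below are assumptions made purely for illustration.

```python
import math
import random

# All constants are illustrative assumptions, not values from the paper.
N_HIDDEN = 8   # hidden units; a shallow stand-in for a deep network
GAMMA = 0.9    # discount factor
ALPHA = 0.01   # learning rate
SERVICE = 2    # vehicles discharged per step on the green approach
MAX_Q = 20     # queue cap, also used to normalise inputs and reward


def step(state, action, arrivals):
    """Hypothetical toy intersection: action 0 = green N-S, 1 = green E-W."""
    q = list(state)
    q[action] = max(q[action] - SERVICE, 0)
    q = [min(q[i] + arrivals[i], MAX_Q) for i in range(2)]
    return tuple(q), -float(sum(q)) / MAX_Q  # reward penalises total queue


def init_net(rng):
    """Random small weights for a 2 -> N_HIDDEN -> 2 network."""
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(N_HIDDEN)],
        "b1": [0.0] * N_HIDDEN,
        "w2": [[rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)] for _ in range(2)],
        "b2": [0.0, 0.0],
    }


def forward(net, state):
    """Return normalised input, hidden activations, and one Q-value per action."""
    x = [s / MAX_Q for s in state]
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(net["w1"], net["b1"])]
    q = [sum(w * hi for w, hi in zip(row, h)) + b
         for row, b in zip(net["w2"], net["b2"])]
    return x, h, q


def q_values(net, state):
    return forward(net, state)[2]


def update(net, state, action, reward, next_state):
    """One semi-gradient Q-learning step on the squared TD error."""
    x, h, q = forward(net, state)
    target = reward + GAMMA * max(q_values(net, next_state))
    err = q[action] - target
    for j in range(N_HIDDEN):
        # Gradient w.r.t. hidden unit j, computed before w2 is changed.
        grad_h = err * net["w2"][action][j] * (1.0 - h[j] ** 2)
        net["w2"][action][j] -= ALPHA * err * h[j]
        for i in range(2):
            net["w1"][j][i] -= ALPHA * grad_h * x[i]
        net["b1"][j] -= ALPHA * grad_h
    net["b2"][action] -= ALPHA * err
    return err


def train(steps=2000, eps=0.1, seed=0):
    """Epsilon-greedy interaction with the toy intersection."""
    rng = random.Random(seed)
    net = init_net(rng)
    state = (0, 0)
    for _ in range(steps):
        if rng.random() < eps:
            action = rng.randrange(2)
        else:
            qs = q_values(net, state)
            action = 0 if qs[0] >= qs[1] else 1
        # Assumed Bernoulli arrivals: heavier demand on the N-S approach.
        arrivals = (int(rng.random() < 0.6), int(rng.random() < 0.3))
        next_state, reward = step(state, action, arrivals)
        update(net, state, action, reward, next_state)
        state = next_state
    return net
```

Calling `net = train()` and then `q_values(net, (6, 1))` returns two estimated Q-values, one per signal phase; a greedy timing policy would select the phase with the larger value. Whether this toy setup learns a sensible policy depends on the assumed constants, and it says nothing about the performance reported in the paper.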

     

