IEEE/CAA Journal of Automatica Sinica
Citation: Parham M. Kebria, Abbas Khosravi, Syed Moshfeq Salaken and Saeid Nahavandi, "Deep Imitation Learning for Autonomous Vehicles Based on Convolutional Neural Networks," IEEE/CAA J. Autom. Sinica, vol. 7, no. 1, pp. 82–95, Jan. 2020. doi: 10.1109/JAS.2019.1911825
[1] 
Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436–444, May 2015. doi: 10.1038/nature14539

[2] 
M. Wainberg, D. Merico, A. Delong, and B. J. Frey, "Deep learning in biomedicine," Nat. Biotechnol., vol. 36, no. 9, pp. 829–838, Oct. 2018. doi: 10.1038/nbt.4233

[3] 
P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun, "OverFeat: integrated recognition, localization and detection using convolutional networks," arXiv preprint arXiv:1312.6229, Dec. 2013.

[4] 
K. Simonyan and A. Zisserman, "Two-stream convolutional networks for action recognition in videos," in Proc. 27th Int. Conf. Neural Information Processing Systems, Montreal, Canada, 2014, pp. 568–576.

[5] 
K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, Sept. 2014.

[6] 
M. D. Zeiler and R. Fergus, "Visualizing and understanding convolutional networks," in Proc. 13th European Conf. Computer Vision, Zurich, Switzerland, 2014, pp. 818–833.

[7] 
A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proc. 25th Int. Conf. Neural Information Processing Systems, Lake Tahoe, Nevada, USA, 2012, pp. 1097–1105.

[8] 
L. Chen, X. M. Hu, T. Xu, H. L. Kuang, and Q. Q. Li, "Turn signal detection during nighttime by CNN detector and perceptual hashing tracking," IEEE Trans. Intell. Transp. Syst., vol. 18, no. 12, pp. 3303–3314, Dec. 2017. doi: 10.1109/TITS.2017.2683641

[9] 
Q. Wang, J. Y. Gao, and Y. Yuan, "Embedding structured contour and location prior in siamesed fully convolutional networks for road detection," IEEE Trans. Intell. Transp. Syst., vol. 19, no. 1, pp. 230–241, Jan. 2018. doi: 10.1109/TITS.2017.2749964

[10] 
S. P. Zhang, Y. K. Qi, F. Jiang, X. Y. Lan, P. C. Yuen, and H. Y. Zhou, "Point-to-set distance metric learning on deep representations for visual tracking," IEEE Trans. Intell. Transp. Syst., vol. 19, no. 1, pp. 187–198, Jan. 2018. doi: 10.1109/TITS.2017.2766093

[11] 
P. M. Kebria, A. Khosravi, S. Nahavandi, Z. Najdovski, and S. J. Hilton, "Neural network adaptive control of teleoperation systems with uncertainties and time-varying delay," in Proc. 2018 IEEE 14th Int. Conf. Automation Science and Engineering, Munich, Germany, 2018, pp. 252–257.

[12] 
P. M. Kebria, A. Khosravi, S. Nahavandi, D. R. Wu, and F. Bello, "Adaptive type-2 fuzzy neural-network control for teleoperation systems with delay and uncertainties," IEEE Trans. Fuzzy Syst.

[13] 
M. Kuderer, S. Gulati, and W. Burgard, "Learning driving styles for autonomous vehicles from demonstration," in Proc. 2015 IEEE Int. Conf. Robotics and Automation, Seattle, WA, USA, 2015, pp. 2641–2646.

[14] 
S. X. Gu, E. Holly, T. Lillicrap, and S. Levine, "Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates," in Proc. 2017 IEEE Int. Conf. Robotics and Automation, Singapore, 2017, pp. 3389–3396.

[15] 
B. D. Argall, S. Chernova, M. Veloso, and B. Browning, "A survey of robot learning from demonstration," Rob. Auton. Syst., vol. 57, no. 5, pp. 469–483, May 2009. doi: 10.1016/j.robot.2008.10.024

[16] 
A. Hussein, M. M. Gaber, E. Elyan, and C. Jayne, "Imitation learning: a survey of learning methods," ACM Comput. Surv., vol. 50, no. 2, pp. 21, Jun. 2017.

[17] 
D. Silver, J. A. Bagnell, and A. Stentz, "Applied imitation learning for autonomous navigation in complex natural terrain," in Field and Service Robotics, A. Howard, K. Iagnemma, and A. Kelly, Eds. Berlin, Heidelberg, Germany: Springer, 2010, pp. 249–259.

[18] 
A. Martínez-Tenor, J. A. Fernández-Madrigal, A. Cruz-Martín, and J. González-Jiménez, "Towards a common implementation of reinforcement learning for multiple robotic tasks," Expert Syst. Appl., vol. 100, pp. 246–259, Jun. 2018. doi: 10.1016/j.eswa.2017.11.011

[19] 
K. Menda, Y. C. Chen, J. Grana, J. W. Bono, B. D. Tracey, M. J. Kochenderfer, and D. Wolpert, "Deep reinforcement learning for event-driven multi-agent decision processes," IEEE Trans. Intell. Transp. Syst., vol. 20, no. 4, pp. 1259–1268, Apr. 2019. doi: 10.1109/TITS.2018.2848264

[20] 
B. Ghazanfari and N. Mozayani, "Extracting bottlenecks for reinforcement learning agent by holonic concept clustering and attentional functions," Expert Syst. Appl., vol. 54, pp. 61–77, Jul. 2016. doi: 10.1016/j.eswa.2016.01.030

[21] 
J. Courbon, Y. Mezouar, and P. Martinet, "Autonomous navigation of vehicles from a visual memory using a generic camera model," IEEE Trans. Intell. Transp. Syst., vol. 10, no. 3, pp. 392–402, Sep. 2009. doi: 10.1109/TITS.2008.2012375

[22] 
T. H. Zhang, Z. McCarthy, O. Jow, D. Lee, X. Chen, K. Goldberg, and P. Abbeel, "Deep imitation learning for complex manipulation tasks from virtual reality teleoperation," in Proc. 2018 IEEE Int. Conf. Robotics and Automation, Brisbane, QLD, Australia, 2018, pp. 5628–5635.

[23] 
W. Sun, A. Venkatraman, G. J. Gordon, B. Boots, and J. A. Bagnell, "Deeply AggreVaTed: Differentiable imitation learning for sequential prediction," in Proc. 34th Int. Conf. Machine Learning, Sydney, Australia, 2017, pp. 3309–3318.

[24] 
B. K. Chen, C. Gong, and J. Yang, "Importance-aware semantic segmentation for autonomous vehicles," IEEE Trans. Intell. Transp. Syst., vol. 20, no. 1, pp. 137–148, Jan. 2019. doi: 10.1109/TITS.2018.2801309

[25] 
W. Sun, J. A. Bagnell, and B. Boots, "Truncated horizon policy search: Combining reinforcement learning & imitation learning," in Proc. ICLR 2018 Conf. Acceptance Decision, Vancouver, BC, Canada, 2018.

[26] 
J. Merel, Y. Tassa, T. B. Dhruva, S. Srinivasan, J. Lemmon, Z. Y. Wang, G. Wayne, and N. Heess, "Learning human behaviors from motion capture by adversarial imitation," arXiv preprint arXiv:1707.02201, Jul. 2017.

[27] 
T. Liu, B. Tian, Y. F. Ai, L. Li, D. P. Cao, and F. Y. Wang, "Parallel reinforcement learning: a framework and case study," IEEE/CAA J. Autom. Sinica, vol. 5, no. 4, pp. 827–835, Jul. 2018. doi: 10.1109/JAS.2018.7511144

[28] 
L. Cardamone, D. Loiacono, and P. L. Lanzi, "Learning drivers for TORCS through imitation using supervised methods," in Proc. 2009 IEEE Symp. Computational Intelligence and Games, Milano, Italy, 2009, pp. 148–155.

[29] 
J. Ho and S. Ermon, "Generative adversarial imitation learning," in Proc. 30th Conf. Neural Information Processing Systems, Barcelona, Spain, 2016, pp. 4565–4573.

[30] 
Y. Duan, M. Andrychowicz, B. Stadie, J. Ho, J. Schneider, I. Sutskever, P. Abbeel, and W. Zaremba, "One-shot imitation learning," in Proc. 31st Conf. Neural Information Processing Systems, Long Beach, CA, USA, 2017, pp. 1087–1098.

[31] 
B. C. Stadie, P. Abbeel, and I. Sutskever, "Third-person imitation learning," arXiv preprint arXiv:1703.01703, Mar. 2017.

[32] 
J. Saunders, C. L. Nehaniv, and K. Dautenhahn, "Teaching robots by moulding behavior and scaffolding the environment," in Proc. 1st ACM SIGCHI/SIGART Conf. Human-robot Interaction, Salt Lake City, Utah, USA, 2006, pp. 118–125.

[33] 
S. Nahavandi, "Trusted autonomy between humans and robots: toward human-on-the-loop in robotics and autonomous systems," IEEE Syst., Man, Cybern. Mag., vol. 3, no. 1, pp. 10–17, Jan. 2017. doi: 10.1109/MSMC.2016.2623867

[34] 
D. Gandhi, L. Pinto, and A. Gupta, "Learning to fly by crashing," in Proc. IEEE/RSJ Int. Conf. Intelligent Robots and Systems, Vancouver, BC, Canada, 2017, pp. 3948–3955.

[35] 
M. Mueller, V. Casser, N. Smith, and B. Ghanem, "Teaching UAVs to race using UE4Sim," arXiv preprint arXiv:1708.05884, Aug. 2017.

[36] 
C. Innocenti, H. Lindén, G. Panahandeh, L. Svensson, and N. Mohammadiha, "Imitation learning for vision-based lane keeping assistance," in Proc. 20th Int. IEEE Conf. Intelligent Transportation Systems, Yokohama, Japan, 2017, pp. 3948–3955.

[37] 
S. Priesterjahn, O. Kramer, A. Weimer, and A. Goebels, "Evolution of reactive rules in multi player computer games based on imitation," in Proc. 1st Int. Conf. Natural Computation, Changsha, China, 2005, pp. 744–755.

[38] 
P. M. Kebria, A. Khosravi, S. M. Salaken, I. Hossain, H. M. D. Kabir, A. Koohestani, R. Alizadehsani, and S. Nahavandi, "Deep imitation learning: the impact of depth on policy performance," in Proc. 25th Int. Conf. Neural Information Processing, Siem Reap, Cambodia, 2018, pp. 172–181.

[39] 
A. Diosi, S. Segvic, A. Remazeilles, and F. Chaumette, "Experimental evaluation of autonomous driving based on visual memory and image-based visual servoing," IEEE Trans. Intell. Transp. Syst., vol. 12, no. 3, pp. 870–883, Sep. 2011. doi: 10.1109/TITS.2011.2122334

[40] 
H. B. Gao, B. Cheng, J. Q. Wang, K. Q. Li, J. H. Zhao, and D. Y. Li, "Object classification using CNN-based fusion of vision and LIDAR in autonomous vehicle environment," IEEE Trans. Ind. Inform., vol. 14, no. 9, pp. 4224–4231, Sep. 2018. doi: 10.1109/TII.2018.2822828

[41] 
P. M. Kebria, R. Alizadehsani, S. M. Salaken, I. Hossain, A. Khosravi, D. Kabir, A. Koohestani, H. Asadi, S. Nahavandi, E. Tunsel, and M. Saif, "Evaluating architecture impacts on deep imitation learning performance for autonomous driving," in Proc. IEEE Int. Conf. Industrial Technology, Melbourne, Australia, 2019, pp. 865–870.

[42] 
J. H. Kim, G. Batchuluun, and K. R. Park, "Pedestrian detection based on faster R-CNN in nighttime by fusing deep convolutional features of successive images," Expert Syst. Appl., vol. 114, pp. 15–33, Dec. 2018. doi: 10.1016/j.eswa.2018.07.020

[43] 
Y. W. Seo, J. Lee, W. D. Zhang, and D. Wettergreen, "Recognition of highway workzones for reliable autonomous driving," IEEE Trans. Intell. Transp. Syst., vol. 16, no. 2, pp. 708–718, Apr. 2015.

[44] 
A. Dominguez-Sanchez, M. Cazorla, and S. Orts-Escolano, "Pedestrian movement direction recognition using convolutional neural networks," IEEE Trans. Intell. Transp. Syst., vol. 18, no. 12, pp. 3540–3548, Dec. 2017. doi: 10.1109/TITS.2017.2726140

[45] 
S. Di, H. G. Zhang, C. G. Li, X. Mei, D. Prokhorov, and H. B. Ling, "Cross-domain traffic scene understanding: a dense correspondence-based transfer learning approach," IEEE Trans. Intell. Transp. Syst., vol. 19, no. 3, pp. 745–757, Mar. 2018. doi: 10.1109/TITS.2017.2702012

[46] 
Y. J. Zeng, X. Xu, D. Y. Shen, Y. Q. Fang, and Z. P. Xiao, "Traffic sign recognition using kernel extreme learning machines with deep perceptual features," IEEE Trans. Intell. Transp. Syst., vol. 18, no. 6, pp. 1647–1653, Jun. 2017.

[47] 
Q. Wang, J. Y. Gao, and Y. Yuan, "A joint convolutional neural networks and context transfer for street scenes labeling," IEEE Trans. Intell. Transp. Syst., vol. 19, no. 5, pp. 1457–1470, May 2018. doi: 10.1109/TITS.2017.2726546

[48] 
J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, "How transferable are features in deep neural networks?" in Proc. 27th Int. Conf. Neural Information Processing Systems, Montreal, Canada, 2014, pp. 3320–3328.

[49] 
Z. H. Zhou, J. X. Wu, and W. Tang, "Ensembling neural networks: many could be better than all," Artif. Intell., vol. 137, no. 1–2, pp. 239–263, May 2002. doi: 10.1016/S0004-3702(02)00190-X

[50] 
L. Breiman, "Bagging predictors," Mach. Learn., vol. 24, no. 2, pp. 123–140, Aug. 1996.

[51] 
T. G. Dietterich, "An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization," Mach. Learn., vol. 40, no. 2, pp. 139–157, Aug. 2000. doi: 10.1023/A:1007607513941

[52] 
C. Strobl, J. Malley, and G. Tutz, "An introduction to recursive partitioning: rationale, application and characteristics of classification and regression trees, bagging, and random forests," Psychol. Methods, vol. 14, no. 4, pp. 323–348, Dec. 2009. doi: 10.1037/a0016973

[53] 
J. Mendes-Moreira, C. Soares, A. M. Jorge, and J. F. De Sousa, "Ensemble approaches for regression: a survey," ACM Comput. Surv., vol. 14, no. 1, pp. 10, Nov. 2012.

[54] 
Udacity. Self-driving car. (2018) [Online]. Available: https://www.udacity.com/

[55] 
V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie, A. Sadik, I. Antonoglou, H. King, D. Kumaran, D. Wierstra, S. Legg, and D. Hassabis, "Human-level control through deep reinforcement learning," Nature, vol. 518, no. 7540, pp. 529–533, 2015. doi: 10.1038/nature14236

[56] 
T. Mareda, L. Gaudard, and F. Romerio, "A parametric genetic algorithm approach to assess complementary options of large scale wind-solar coupling," IEEE/CAA J. Autom. Sinica, vol. 4, no. 2, pp. 260–272, Apr. 2017. doi: 10.1109/JAS.2017.7510523

[57] 
Q. Kang, X. Y. Song, M. C. Zhou, and L. Li, "A collaborative resource allocation strategy for decomposition-based multiobjective evolutionary algorithms," IEEE Trans. Syst., Man, Cybern.: Syst., vol. 49, no. 12, pp. 2416–2423, Dec. 2019.
