Available online, doi: 10.1109/JAS.2025.125957
Abstract:
Data-driven autonomous driving is a hot topic in academic and industrial research due to its impressive performance, flexible mobility, and reduced human intervention. However, the development of this technology relies heavily on large datasets of accurately annotated data, obtained through manual or semi-automated annotation strategies. Consequently, datasets play a crucial role in autonomous driving, and their characteristics significantly affect the effectiveness of algorithms. Several diverse datasets covering various tasks are currently available, such as KITTI and Cityscapes. However, researchers often overlook the unique features, similarities, and specificities of these datasets. Furthermore, to the best of our knowledge, there is a lack of survey articles focusing on task-specific metrics and benchmark performance across different autonomous driving datasets. Therefore, the purpose of this article is to analyze autonomous driving datasets, guide researchers in collecting and using relevant datasets, summarize evaluation strategies, analyze benchmark performance, and outline future research directions to enrich the autonomous driving community. We believe that this work will assist researchers in evaluating their data using suitable metrics and offer a fresh perspective on autonomous driving.
Y. Li, S. Teng, Z. Wu, J. Wang, M. Liu, Z. Xuanyuan, and L. Chen, “Datasets, metrics, benchmarks and future research in autonomous driving: A review,” IEEE/CAA J. Autom. Sinica, early access, 2026. doi: 10.1109/JAS.2025.125957.