Motion-aware vehicle detection in driving videos

This paper focuses on vehicle detection based on motion features in driving videos. Because driving is a complicated and dynamic process, long-term motion information can assist detection in driving scenarios. The proposed method is a deep-learning-based model that processes a motion frame image, which merges spatial (frame) and temporal (motion) information. Hence, the model jointly detects vehicles and their motion from a single image. The model trained on the Toyota Motor Europe Motorway Dataset reaches 83% mean average precision (mAP). Our experiments demonstrate that the proposed method achieves a higher mAP than a tracking-based model. The proposed method runs in real time on driving videos, which enables its use in time-critical applications such as autonomous driving and advanced driving assistance systems.
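The abstract does not specify how the motion frame image is built, so the following is only an illustrative sketch of one way spatial and temporal information could be fused into a single image. The function name `motion_frame_image`, the three-channel layout, and the `decay` parameter are all assumptions introduced here for illustration; they are not taken from the paper.

```python
import numpy as np


def motion_frame_image(frames, decay=0.5):
    """Fuse a short grayscale clip into one 3-channel image (hypothetical layout).

    frames: sequence of HxW uint8 grayscale frames, oldest first.
    Channel 0: the most recent frame (spatial information).
    Channel 1: absolute difference of the last two frames (short-term motion).
    Channel 2: decayed maximum of past differences (longer-term motion).
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames to compute motion")

    stack = np.stack([np.asarray(f, dtype=np.float32) for f in frames])
    diffs = np.abs(np.diff(stack, axis=0))  # shape (T-1, H, W)

    # Older motion fades by `decay` per step; newer motion overwrites it.
    history = np.zeros_like(diffs[0])
    for d in diffs:
        history = np.maximum(d, decay * history)

    fused = np.stack([stack[-1], diffs[-1], history], axis=-1)
    return np.clip(fused, 0, 255).astype(np.uint8)


# Toy clip: a bright pixel appears in the second frame and then stays put.
f0 = np.zeros((4, 4), dtype=np.uint8)
f1 = f0.copy()
f1[1, 1] = 200
f2 = f1.copy()
fused = motion_frame_image([f0, f1, f2])
```

In this toy clip the pixel is visible in the spatial channel, absent from the short-term motion channel because it did not move between the last two frames, and still present at reduced strength in the longer-term channel. A detector fed such an image would see appearance and motion cues together, which is the general idea the abstract describes.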
