Subjective analysis of social distance monitoring using YOLO v3 architecture and crowd tracking system

The World Health Organization (WHO) has declared coronavirus disease (COVID-19), a lethal infection, a pandemic. In the absence of an effective vaccine and with low levels of immunity against COVID-19, human beings remain highly vulnerable; lockdowns and social distancing are therefore the main options left to fight the pandemic. This work offers an autonomous social-distance monitoring system based on deep learning techniques. The proposed architecture tracks humans on roads through CCTV cameras and computes the distance between each pair of them, detecting violations of the social-distancing norm. The framework uses the YOLO v3 object-detection model, trained on the COCO dataset, to detect the person class among its 80 classes. Each bounding box's dimensions and centroid coordinates are computed in a two-dimensional feature space, pairwise distances are obtained as the vectorized L2 norm between centroids, and a fixed threshold determines whether an adequate distance is maintained between people. We demonstrate the superior performance of our framework against other state-of-the-art methods in terms of inference speed, mean average precision, and localization loss.
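The pairwise-distance step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the box format, function names, and the pixel-distance threshold are all hypothetical.

```python
import numpy as np

def centroids(boxes):
    """Boxes as (x_min, y_min, x_max, y_max); return an (N, 2) array of centroids."""
    boxes = np.asarray(boxes, dtype=float)
    return np.column_stack(((boxes[:, 0] + boxes[:, 2]) / 2.0,
                            (boxes[:, 1] + boxes[:, 3]) / 2.0))

def violations(boxes, threshold):
    """Return index pairs whose centroid L2 distance falls below the threshold."""
    c = centroids(boxes)
    diff = c[:, None, :] - c[None, :, :]             # pairwise centroid differences
    dist = np.linalg.norm(diff, axis=-1)             # vectorized pairwise L2 norm
    i, j = np.where(np.triu(dist < threshold, k=1))  # upper triangle: each pair once
    return list(zip(i.tolist(), j.tolist()))

# Example: two nearby detections and one far away, with an assumed 50-pixel threshold.
boxes = [(0, 0, 10, 10), (5, 0, 15, 10), (100, 100, 110, 110)]
print(violations(boxes, 50.0))  # → [(0, 1)]
```

In a real pipeline the threshold would be calibrated from the camera geometry (pixels per metre) rather than fixed in image space.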
