Characterization of different crowd behaviors using novel deep learning framework

Understanding crowd behavior is a complex problem because human behavior is unpredictable and individuals in groups interact in complex ways. Crowd managers need to understand crowd dynamics to manage crowds efficiently and effectively. Current crowd management practice relies on manual analysis of the scene, which is tedious and error-prone because of limited human capabilities. The task of automating crowd analysis has therefore received considerable attention from the research community in recent years. In this paper, we propose a deep model framework that automatically characterizes different crowd behaviors based on motion and appearance. We first extract dense trajectories from the input video segment and then generate a trajectory image by projecting the trajectories onto the image plane. The trajectory image effectively captures relative motion in the scene. We use a stack of trajectory images to train a deep convolutional network that learns a compact and powerful representation of motion in the scene. We evaluate our approach on the UCF, CUHK, and Crowd-11 benchmark datasets. The experimental results demonstrate, both quantitatively and qualitatively, that the proposed framework outperforms existing methods by a large margin.
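The projection step described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each trajectory is a list of (x, y) points in pixel coordinates, and it assumes a simple encoding in which each pixel stores the number of trajectories passing through it. The function names `rasterize` and `trajectory_image` are hypothetical.

```python
from typing import List, Set, Tuple

Point = Tuple[float, float]   # (x, y) in pixel coordinates (assumed layout)
Trajectory = List[Point]


def rasterize(trajectory: Trajectory, height: int, width: int) -> Set[Tuple[int, int]]:
    """Return the set of (row, col) pixels touched by one trajectory.

    Consecutive points are joined by straight segments, sampled at roughly
    one-pixel steps. A trajectory with fewer than two points touches no pixels.
    """
    pixels: Set[Tuple[int, int]] = set()
    for (x0, y0), (x1, y1) in zip(trajectory, trajectory[1:]):
        steps = max(1, int(round(max(abs(x1 - x0), abs(y1 - y0)))))
        for i in range(steps + 1):
            t = i / steps
            col = int(round(x0 + t * (x1 - x0)))
            row = int(round(y0 + t * (y1 - y0)))
            # Discard samples that fall outside the image plane.
            if 0 <= row < height and 0 <= col < width:
                pixels.add((row, col))
    return pixels


def trajectory_image(trajectories: List[Trajectory], height: int, width: int) -> List[List[int]]:
    """Project trajectories onto a height x width grid.

    Assumed encoding: each pixel counts how many trajectories pass through it.
    """
    image = [[0] * width for _ in range(height)]
    for trajectory in trajectories:
        for row, col in rasterize(trajectory, height, width):
            image[row][col] += 1
    return image


# Two toy trajectories on a 3 x 4 grid: one horizontal, one vertical.
image = trajectory_image([[(0, 0), (3, 0)], [(1, 0), (1, 2)]], height=3, width=4)
for row in image:
    print(row)
```

In this toy case the two trajectories cross at pixel (row 0, column 1), so that pixel receives a count of 2. A stack of such images, one per video segment or time window, would then form the multi-channel input to a convolutional network; how the channels are actually defined in the paper is not stated in the abstract.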
