Person Re-Identification in Surveillance Videos using Deep Learning based Body Part Partition and Gaussian Filtering

In this paper, we concentrate on Person Re-Identification (Re-ID), which consists of searching for a person who has been previously observed over a camera network. Person Re-ID is important for finding suspicious or missing persons when sample images of the person of interest are available. Although there is a large body of research on vision-based person re-identification, it remains a challenging problem. We propose a person re-identification system that uses deep learning based human body part segmentation and Gaussian filtering based smooth mask generation. A semantic partition technique is used to segment human body parts and generate local binary masks. These masks are deterministic binary images with strict boundaries, so some features are lost when they are applied. Therefore, we apply a Gaussian filter to smooth the masks so that features near the boundaries also contribute slightly. These smooth masks are applied to the final feature maps generated at the end of the network, in contrast to other methods, which apply masks at the beginning or in the middle of the deep learning network. Our work is therefore new and differs from other works in two respects: semantic partitioning and masking are applied at the end of the network, and the masks are smoothed with a Gaussian filter to handle errors made during the partitioning stage. We use a well-known pre-trained network, namely ResNet-50, to extract global features, and a method called Cross-Domain Complementary Learning for human body partitioning. Applying Gaussian-filtered smooth local masks to the global features, which are extracted at the end of the ResNet-50 network, increases the performance of the Person Re-Identification system. Evaluation is conducted on the widely used Market-1501 dataset, and the results are promising.
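The mask smoothing idea can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the abstract does not state how the smooth masks are combined with the feature maps, so the mask-weighted average used here is an assumption, as are the toy sizes, the value of sigma, the replicated-border handling, and all function names. The sketch also assumes the body part mask has already been resized to the spatial resolution of the final feature maps.

```python
import math

def gaussian_kernel(sigma, radius):
    """Return a 1-D Gaussian kernel of length 2*radius+1 that sums to 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(image, kernel):
    """Filter every row with the kernel, replicating the border pixels."""
    radius = len(kernel) // 2
    last = len(image[0]) - 1
    return [[sum(k * row[min(max(x + j - radius, 0), last)]
                 for j, k in enumerate(kernel))
             for x in range(last + 1)]
            for row in image]

def transpose(image):
    return [list(column) for column in zip(*image)]

def smooth_mask(binary_mask, sigma=1.0):
    """Separable Gaussian blur of a binary mask: rows first, then columns."""
    kernel = gaussian_kernel(sigma, radius=max(1, int(3 * sigma)))
    blurred = blur_rows(binary_mask, kernel)
    return transpose(blur_rows(transpose(blurred), kernel))

def masked_average(feature_maps, mask):
    """Mask-weighted average of each channel (assumed pooling scheme)."""
    mask_sum = sum(sum(row) for row in mask)
    if mask_sum == 0:
        return [0.0 for _ in feature_maps]
    return [sum(f * m
                for f_row, m_row in zip(channel, mask)
                for f, m in zip(f_row, m_row)) / mask_sum
            for channel in feature_maps]

# Toy example: a 6x4 "upper body" mask covering the top three rows.
HEIGHT, WIDTH = 6, 4
hard_mask = [[1.0 if y < 3 else 0.0 for _ in range(WIDTH)]
             for y in range(HEIGHT)]
soft_mask = smooth_mask(hard_mask, sigma=1.0)

# Two hypothetical feature channels: a constant map and a vertical ramp.
feature_maps = [
    [[1.0] * WIDTH for _ in range(HEIGHT)],
    [[float(y)] * WIDTH for y in range(HEIGHT)],
]

hard_features = masked_average(feature_maps, hard_mask)
soft_features = masked_average(feature_maps, soft_mask)
print("hard mask features:", hard_features)
print("soft mask features:", soft_features)
```

With the hard mask, rows below the boundary contribute nothing to the pooled feature. With the smoothed mask, the rows just below the boundary receive small non-zero weights, so a part that the segmentation cut slightly too short still influences the local feature.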
