Selin GÖK IŞIK, Hazım EKENEL

Gait-Based Identity Recognition Using Deep Convolutional Neural Networks with Silhouette and RGB Images

Many biometric traits are used for person recognition today. Unlike physical biometric traits such as the eye, iris, ear, fingerprint, and DNA, behavioral biometric traits are learned and develop over time. Gait, unlike physical biometrics that require image acquisition at close range, makes it possible to recognize a person from images recorded at a distance. In this paper, we propose an appearance-based approach that uses deep learning for gait-based person recognition. We examine how two inputs commonly used in gait recognition, the binary human silhouette and the gait energy image, affect recognition performance. In addition, to make the method better suited to practical applications, preprocessing steps such as human silhouette extraction and gait cycle estimation are removed, and RGB frames are used directly as input. We also observe the contribution of transfer learning; to this end, a popular object recognition model is fine-tuned on the CASIA-B gait dataset. To obtain the feature vector that represents a gait sequence, different methods of combining the feature vectors extracted from individual frames are tried and their performance is compared. The proposed approach is evaluated through experiments on the CASIA-B and OU-ISIR Large Population gait datasets, which are widely used in this field, and on the PRID-2011 person re-identification dataset, which contains gait data collected in real-world conditions. To observe the effect of view-angle differences, the experiments are repeated under identical-view and cross-view conditions. The results obtained with the deep learning approach outperform those of traditional methods.
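Two of the operations named in the abstract can be illustrated with a small sketch. The first function computes a gait energy image in its commonly used form, the pixel-wise average of aligned binary silhouettes. The second combines per-frame feature vectors into one sequence-level vector. The abstract does not say which combination methods were compared, so the mean and max pooling shown here are assumptions for illustration, and the function names `gait_energy_image` and `aggregate_features` are hypothetical rather than taken from the paper.

```python
import numpy as np


def gait_energy_image(silhouettes):
    """Average aligned binary silhouettes of shape (T, H, W) into one (H, W) image."""
    frames = np.asarray(silhouettes, dtype=np.float64)
    if frames.ndim != 3 or frames.shape[0] == 0:
        raise ValueError("expected a non-empty array of shape (T, H, W)")
    # Each output pixel is the fraction of frames in which it is foreground.
    return frames.mean(axis=0)


def aggregate_features(frame_features, method="mean"):
    """Combine per-frame feature vectors of shape (T, D) into one (D,) vector.

    The pooling choices below are illustrative assumptions, not the
    specific combination methods evaluated in the paper.
    """
    feats = np.asarray(frame_features, dtype=np.float64)
    if feats.ndim != 2 or feats.shape[0] == 0:
        raise ValueError("expected a non-empty array of shape (T, D)")
    if method == "mean":
        return feats.mean(axis=0)
    if method == "max":
        return feats.max(axis=0)
    raise ValueError(f"unknown method: {method!r}")


# Toy inputs: two 2x2 silhouettes and two 2-dimensional frame features.
toy_silhouettes = np.array([[[0, 1], [1, 1]],
                            [[0, 0], [1, 1]]])
toy_features = np.array([[1.0, 4.0],
                         [3.0, 2.0]])

gei = gait_energy_image(toy_silhouettes)
mean_vec = aggregate_features(toy_features, "mean")
max_vec = aggregate_features(toy_features, "max")
```

In practice the per-frame features would come from a convolutional network applied to silhouettes or RGB frames, and the pooled vector would then be compared between gallery and probe sequences.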

Keywords:

Deep Learning, Gait Recognition, Transfer Learning, Cross-View, Biometrics

