Two person interaction recognition based on a dual-coded modified metacognitive (DCMMC) extreme learning machine

Human action recognition has been an active research area for over three decades. However, state-of-the-art algorithms are still far from delivering error-free, fully generalized systems for accurate interaction recognition. This work proposes a new method for two-person interaction recognition from videos, based on well-known cognitive theories. The main idea is to perform classification according to a theory of cognition known as dual coding theory. The theory states that the human brain processes and represents two types of information, called analogue and symbolic codes, in order to learn and classify data (visual information as analogue codes and verbal information as symbolic codes). To implement this theory in a two-person interaction classification system, we use dense trajectories as analogue codes and a bag of words as symbolic codes, the two code types hypothesized in the theory. In addition to dual coding theory, we implement a metacognitive classifier model, which adds a metalevel with its own rules to make the training process more accurate. We also propose a modification to one metacognitive component to prevent cognitive interference, well known as the Stroop effect. Evaluations on the SBU interaction and UT-interaction datasets show that the method offers comparable recognition accuracy (95.6% and 91.1%, respectively).
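The bag-of-words representation mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state the descriptor type, codebook size, or distance measure, so the function name `bag_of_words_histogram`, the Euclidean distance, and the toy 2-D data below are all assumptions chosen only to show the general idea of quantizing local descriptors against a codebook.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def bag_of_words_histogram(descriptors, codebook):
    """Assign each descriptor to its nearest codeword and return the
    normalized histogram of codeword counts (hypothetical helper)."""
    counts = [0] * len(codebook)
    for d in descriptors:
        # Index of the nearest codeword; ties go to the lowest index.
        nearest = min(range(len(codebook)), key=lambda k: dist(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    # With no descriptors, return an all-zero histogram instead of dividing by zero.
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Toy data (assumed for illustration): three 2-D codewords, four descriptors.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
descriptors = [(0.1, 0.1), (0.9, 0.2), (1.1, -0.1), (0.2, 0.8)]
histogram = bag_of_words_histogram(descriptors, codebook)
print(histogram)  # [0.25, 0.5, 0.25]
```

In a real pipeline the codebook would typically be learned by clustering training descriptors, and the resulting histogram would serve as one fixed-length input to the classifier.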

Turkish Journal of Electrical Engineering and Computer Sciences
  • ISSN: 1300-0632
  • Publication frequency: 6 issues per year
  • Publisher: TÜBİTAK