Reverberation Effect on Online Hazardous Sound Event Detection

This paper reports the results of research on hazardous Sound Event Detection (SED). We used Deep Neural Networks (DNN) to detect car crashes and screams. Audio surveillance applications for security are expected to detect these sounds. Much research has been done on detecting these sounds in offline (recorded) data, but detecting them online is also important. The contribution of our research is that we performed detection on recorded data and online detection on the same data, and compared the results. During online detection we noticed several important points that should help practitioners who want to apply hazardous SED in surveillance applications. Our tests show that the problems encountered in distant speaker recognition (DSR) also appear in hazardous SED. Existing research on SED generally accounts for background noise during model development. Our online tests show that reverberation can degrade performance significantly. Based on the results of the online tests, we make some recommendations for developing a hazardous SED model for use in real-world applications.

This paper reports the results of research on hazardous Sound Event Detection (SED). We used Deep Neural Networks (DNN) to detect car crashes and screams, two of the hazardous sound events whose detection has been studied. We selected these sounds because detecting them and giving an early warning can save lives. Research on hazardous sound events is generally carried out on recorded data. In this paper we show that results on recorded data differ from results on online data, that is, audio played back and captured in real time. Because an audio surveillance algorithm is ultimately used in real time, testing it with online data is an important part of development. In this research we developed an online detection environment consisting of a database, automatic audio playback and receiving software, detection software, and automatic evaluation software. Our tests show that reverberation degrades performance significantly. Current research on SED usually takes into account only background noise, which is inserted artificially during model development. The problems we observed in these online tests are the same as those encountered in far-field speaker recognition.
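As an illustration of how reverberation, like background noise, could be introduced artificially during model development, a minimal sketch follows. It is not the setup used in this work. The impulse response is a synthetic, exponentially decaying noise burst rather than a measured room response, and the names `synthetic_rir` and `add_reverb`, the 0.5 s decay time, and the 16 kHz sample rate are all assumptions chosen for the example.

```python
import numpy as np


def synthetic_rir(rt60_s, sample_rate, rng):
    """Hypothetical room impulse response: decaying white noise.

    The envelope drops by 60 dB (amplitude factor 1000) over rt60_s seconds.
    """
    n = int(rt60_s * sample_rate)
    t = np.arange(n) / sample_rate
    decay = 10.0 ** (-3.0 * t / rt60_s)
    rir = rng.standard_normal(n) * decay
    rir[0] = 1.0  # direct path
    return rir / np.sqrt(np.sum(rir ** 2))  # normalize to unit energy


def add_reverb(dry, rir):
    """Convolve a dry signal with an impulse response, keeping the tail."""
    return np.convolve(dry, rir)


rng = np.random.default_rng(0)
sample_rate = 16000  # assumed sample rate in Hz

# A 10 ms noise burst stands in for an impulsive event; the rest is silence.
dry = np.zeros(sample_rate)
dry[:160] = rng.standard_normal(160)

rir = synthetic_rir(0.5, sample_rate, rng)
wet = add_reverb(dry, rir)

# Energy that the room response smears past the end of the original burst.
tail_energy = np.sum(wet[160:] ** 2)
```

The reverberant signal is longer than the dry one and carries energy well after the original event has ended, which is one way a detector trained only on dry or noise-mixed recordings could be misled in an online test.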
