Speaker Recognition and Speaker Verification by Comparison of MFCC and LBP Methods

Speaker recognition, also called speaker identification, is the automatic identification of a speaker by analyzing parameters of that speaker's voice signal. A human voice is highly characteristic of its owner. In this study, a dataset of recordings of 46 different people reciting Surah Yasin was collected from YouTube in order to determine which of them is speaking. Features were extracted from the audio files with Mel-frequency cepstral coefficients (MFCC) and local binary patterns (LBP). The feature vectors were evaluated with various classification algorithms: MFCC reached a success rate of 35.10%, whereas LBP reached 90.74%. For speaker verification, LBP achieved a classification success rate of 100%.
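To make the LBP feature extraction step more concrete, the following is a minimal sketch of a one-dimensional LBP histogram computed directly on audio samples. It is an illustration only: the abstract does not state which LBP variant, window size, or input representation was used, so the function name `lbp_1d_histogram`, the `half_window=4` default (8 neighbours, 256 possible codes), and the choice to operate on the raw waveform are all assumptions.

```python
import numpy as np


def lbp_1d_histogram(signal, half_window=4):
    """Return the normalized histogram of 1D local binary pattern codes.

    Assumption (not taken from the paper): each sample is compared with
    `half_window` neighbours on each side, and a neighbour that is greater
    than or equal to the centre sample contributes a 1 bit to the code.
    """
    x = np.asarray(signal, dtype=float)
    n_bits = 2 * half_window
    if x.size <= n_bits:
        raise ValueError("signal is too short for the chosen window")

    # Centre samples are those with a full set of neighbours on both sides.
    centres = x[half_window:x.size - half_window]
    codes = np.zeros(centres.size, dtype=np.int64)

    # Left neighbours fill the low bits, right neighbours the high bits.
    offsets = [o for o in range(-half_window, half_window + 1) if o != 0]
    for bit, offset in enumerate(offsets):
        start = half_window + offset
        neighbours = x[start:start + centres.size]
        codes |= (neighbours >= centres).astype(np.int64) << bit

    # One bin per possible code; normalizing makes recordings of
    # different lengths comparable.
    hist = np.bincount(codes, minlength=2 ** n_bits).astype(float)
    return hist / hist.sum()
```

Under these assumptions each recording is summarized by a fixed-length vector of 256 values, which can then be passed to any standard classifier in the same way as an MFCC-based feature vector.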
