Classification of Psychogenic and Laryngeal Voice Diseases Based on Teager Energy Operator

Among the many modes of communication, voice remains the fastest natural tool for human-to-human and human-to-machine interaction. For this reason, research on automatic voice pathology detection and classification has attracted considerable interest in recent years. Such automatic systems can serve as assistive tools for physicians during the assessment stage, helping them decide whether a voice signal belongs to a healthy or an unhealthy subject and identify the nature of the pathology. In this context, this paper presents a voice pathology detection and classification system based on wavelet analysis and the Teager Energy Operator (TEO). First, a set of features is extracted from input voice signals taken from the Saarbrücken Voice Database (SVD) [1]. The resulting feature vectors are then fed into a Gaussian Mixture Model (GMM) [2] for classification. The obtained results are 96.66% for the detection task and 92.5% using TEO. These results show that our proposal outperforms some state-of-the-art methods used in voice pathology identification.
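As a rough illustration of the operator named above, the following minimal sketch computes the standard discrete Teager Energy Operator, Ψ[x(n)] = x(n)² − x(n−1)·x(n+1). The abstract does not state which TEO variant the system uses or how it is combined with the wavelet analysis, so the function name `teager_energy` and the synthetic test tone are assumptions made purely for illustration, not the authors' implementation.

```python
import math


def teager_energy(x):
    """Discrete Teager Energy Operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values, one for each interior sample of x.
    """
    if len(x) < 3:
        raise ValueError("need at least 3 samples")
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Hypothetical test tone (not data from the paper): A * cos(omega * n).
amplitude = 2.0
omega = 0.3  # angular frequency in radians per sample
tone = [amplitude * math.cos(omega * n) for n in range(64)]

psi = teager_energy(tone)

# For a pure sinusoid the operator is constant: A^2 * sin^2(omega).
expected = (amplitude * math.sin(omega)) ** 2
```

For a pure tone the output depends on both amplitude and frequency, which is why the operator is often described as tracking the energy needed to generate a signal and not only its squared amplitude. A real voice frame would give a time-varying output that could then be summarized into features.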

___

  • Reference1: Saarbrücken Voice Database (SVD), version 2.0. Available at http://www.stimmdatenbank.coli.uni-saarland.de/help_en.php4 [accessed December 2017].
  • Reference2: R. J. Schalkoff, “Pattern Recognition: Statistical, Structural and Neural Approaches,” New York: Wiley, 1991.
  • Reference3: National Institute on Deafness and Other Communication Disorders, “Statistics on Voice, Speech, and Language.” Available at http://www.nidcd.nih.gov/health/statistics/vsl/Pages/stats.aspx. Accessed July 2016.
  • Reference4: A. Al nasheri et al., “Voice Pathology Detection and Classification Using Auto-Correlation and Entropy Features in Different Frequency Regions,” in IEEE Access, vol. PP, no. 99, pp. 1-1.
  • Reference5: C. L. Ludlow, “Central nervous system control of the laryngeal muscles in humans,” Respiratory Physiology & Neurobiology, vol. 147, pp. 205-255, 2005.
  • Reference6: J. Baker, “Functional voice disorders: clinical presentations and differential diagnosis,” in M. Hallett, J. Stone, and A. Carson, Eds., Handbook of Clinical Neurology, Elsevier, vol. 139, pp. 389-405, 2016.
  • Reference7: J. R. Orozco-Arroyave et al., “Characterization Methods for the Detection of Multiple Voice Disorders: Neurological, Functional, and Laryngeal Diseases,” in IEEE Journal of Biomedical and Health Informatics, vol. 19, no. 6, pp. 1820-1828, 2015.
  • Reference8: J. Rusz, M. Novotný, J. Hlavnička, T. Tykalová and E. Růžička, “High-Accuracy Voice-Based Classification Between Patients With Parkinson’s Disease and Other Neurological Diseases May Be an Easy Task With Inappropriate Experimental Design,” in IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 25, no. 8, pp. 1319-1321, Aug. 2017.
  • Reference9: J. R. Orozco-Arroyave, F. Hönig, J. D. Arias-Londoño, J. F. Vargas-Bonilla, and E. Nöth, “Spectral and cepstral analyses for Parkinson’s disease detection in Spanish vowels and words,” Expert Systems, pp. 1–10, in press, 2015.
  • Reference10: T. Bocklet, S. Steidl, E. Nöth, and S. Skodda, “Automatic Evaluation of Parkinson’s Speech - Acoustic, Prosodic and Voice Related Cues,” Proceedings of the 15th INTERSPEECH, pp. 1149–1153, 2013.
  • Reference11: J. Rusz, R. Cmejla, H. Ruzickova, and E. Ruzicka, “Quantitative acoustic measurements for characterization of speech and voice disorders in early untreated Parkinson’s disease,” Journal of the Acoustical Society of America, vol. 129, pp. 350–367, 2011.
  • Reference12: D. Rahn, M. Chou, J. J. Jiang, and Y. Zhang, “Phonatory impairment in Parkinson’s disease: Evidence from nonlinear dynamic analysis and perturbation analysis,” in Journal of Voice, vol. 21, pp. 64–71, 2007.
  • Reference13: A. Benba, A. Jilbab and A. Hammouch, “Discriminating Between Patients With Parkinson’s and Neurological Diseases Using Cepstral Analysis,” in IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 24, no. 10, pp. 1100-1108, Oct. 2016.
  • Reference14: H. Cordeiro, J. Fonseca, I. Guimarães, and C. Meneses, “Hierarchical Classification and System Combination for Automatically Identifying Physiological and Neuromuscular Laryngeal Pathologies,” in Journal of Voice, vol. 31.
  • Reference15: S. P. Nanavati and P. K. Panigrahi, “Wavelet Transform,” Resonance, vol. 9, no. 3, pp. 50-64, March 2004.
  • Reference16: S. Mallat, “A Theory for multiresolution signal decomposition: Wavelet representation,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 11, no. 7, pp. 674-693, July 1989.
  • Reference17: L. Salhi, M. Talbi, and A. Cherif, “Voice Disorders Identification Using Hybrid Approach: Wavelet Analysis and Multilayer Neural Networks,” World Academy of Science, Engineering and Technology, vol. 21, 2008.
  • Reference18: H. M. Teager and S. M. Teager, “A Phenomenological Model for Vowel Production in the Vocal Tract,” Ch. 3, pp. 73–109. San Diego, CA: College-Hill Press, 1983.
  • Reference19: H. M. Teager and S. M. Teager, “Evidence for Nonlinear Sound Production Mechanisms in the Vocal Tract,” vol. 55 of D, pp. 241–261. France: Kluwer Acad. Publ., 1990.
  • Reference20: R. Hamila, J. Astola, F. Alaya Cheikh, M. Gabbouj, and M. Renfors, “Teager energy and the ambiguity function,” in IEEE Transactions on Signal Processing, vol. 47, 1999.
  • Reference21: P. Maragos, J. F. Kaiser, and T. F. Quatieri, “Energy separation in signal modulations with application to speech analysis,” in IEEE Trans. Signal Processing, vol. 41, p. 3024, 1993.
  • Reference22: E. Kvedalen, “Signal processing using the Teager Energy Operator and other nonlinear operators,” Cand. Scient. Thesis, University of Oslo, Department of Informatics, May 2003.
  • Reference23: A. Ramović, L. Bandić, J. Kevrić, E. Germović, and A. Subasi, “Wavelet and Teager Energy Operator (TEO) for Sound Processing and Identification,” CMBEBIH 2017, IFMBE Proceedings, vol. 62. Springer, Singapore.
  • Reference24: D. A. Cairns, J. H. L. Hansen, and J. E. Riski, “A noninvasive technique for detecting hypernasal speech using a nonlinear operator,” in IEEE Transactions on Biomedical Engineering, vol. 43, no. 1, pp. 35, Jan. 1996.
  • Reference25: C. L. Jones and H. F. Jelinek, “Wavelet Packet Fractal Analysis of Neuronal Morphology,” Methods, vol. 24, pp. 347–358, 2001.
  • Reference26: M. V. Wickerhauser, M. Farge, and E. Goirand, “Theoretical Dimension and the Complexity of Simulated Turbulence,” Wavelet Analysis and Its Applications, vol. 6, Elsevier, pp. 473-492, 1997.
  • Reference27: P. Henríquez, J. B. Alonso, M. A. Ferrer, C. M. Travieso, J. I. Godino-Llorente, and F. Díaz-de-María, “Characterization of Healthy and Pathological Voice Through Measures Based on Nonlinear Dynamics,” in IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, no. 6, Aug. 2009.
International Journal of Applied Mathematics Electronics and Computers
  • ISSN: 2147-8228
  • Publication frequency: 4 issues per year
  • First published: 2013
  • Publisher: Selçuk University