Punjabi Emotional Speech Database: Design, Recording and Verification

This paper introduces the Punjabi Emotional Speech Database, created to evaluate the recognition of emotions in speech by both human listeners and automatic systems. The database has been designed, recorded and verified following established standards, and the results set a baseline for identifying emotions in Punjabi speech. Six simulated emotions are covered in the corpus: happy, sad, fear, anger, neutral and surprise. Fifteen speakers, aged 20–45 years, participated in the recordings. Finally, the database has been used to design and develop a speech emotion recognition system for the Punjabi language.
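As a minimal sketch of how a corpus like this might be indexed and verified, the snippet below groups recordings by speaker and emotion and flags missing coverage. The file-naming scheme (`speaker_emotion_utterance.wav`) and the helper names are assumptions for illustration only, not the database's actual layout:

```python
# Hypothetical sketch: indexing an emotional speech corpus with the six
# emotion classes described above. The speaker_emotion_utterance.wav
# naming scheme is an assumption, not the paper's actual convention.
from collections import defaultdict

EMOTIONS = {"happy", "sad", "fear", "anger", "neutral", "surprise"}

def index_corpus(filenames):
    """Group recording filenames by speaker and the emotions they cover."""
    coverage = defaultdict(set)
    for name in filenames:
        stem = name.rsplit(".", 1)[0]          # drop the .wav extension
        speaker, emotion, _ = stem.split("_", 2)
        if emotion in EMOTIONS:
            coverage[speaker].add(emotion)
    return coverage

def missing_emotions(coverage):
    """Report which of the six emotions each speaker has not recorded."""
    return {spk: sorted(EMOTIONS - found) for spk, found in coverage.items()}
```

A check like `missing_emotions` is one simple way to verify, during recording sessions, that every speaker has contributed utterances for all six target emotions before the corpus is finalized.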
