Timur DÜZENLİ, Nalan ÖZKURT

Comparison of Wavelet Based Feature Extraction Methods for Speech/Music Discrimination

Speech/music discrimination systems have been gaining importance in intelligent audio retrieval algorithms as the multimedia sources in our daily lives continue to grow. This study proposes a speech/music discrimination system that exploits the advantages of the wavelet transform. The performance of the discrete wavelet transform and the dual-tree wavelet transform is also compared with that of the conventional time, frequency and cepstral domain features used in speech/music discrimination. Speech and music samples collected from common databases, CD recordings and internet radio stations were classified by artificial neural networks using different feature sets. Principal component analysis was applied to eliminate correlated features before the classification stage. Considering the number of vanishing moments and orthogonality, the Daubechies8 wavelet gave the best performance among the members of the Daubechies family. According to the results, the proposed feature set outperforms the traditional ones.
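As a rough illustration of what a wavelet-domain feature can look like, the sketch below decomposes one audio frame with a discrete wavelet transform and returns the share of energy in each sub-band. It is not the feature set of this study: the abstract does not list the exact features, so relative sub-band energy is an assumption, the Haar wavelet stands in for Daubechies8 to keep the code dependency-free, and the names `haar_dwt_step`, `subband_energies` and `frame_features` are hypothetical. The principal component analysis and neural network stages are left out.

```python
import math


def haar_dwt_step(signal):
    """One level of the orthonormal Haar DWT on an even-length list.

    Returns (approximation, detail) coefficient lists.
    Assumption: Haar is used here only for brevity; the study reports
    its best results with Daubechies8.
    """
    scale = 1.0 / math.sqrt(2.0)
    approx, detail = [], []
    for a, b in zip(signal[0::2], signal[1::2]):
        approx.append((a + b) * scale)
        detail.append((a - b) * scale)
    return approx, detail


def subband_energies(signal, levels):
    """Energies of the detail bands (finest first), then the final approximation."""
    energies = []
    current = list(signal)
    for _ in range(levels):
        if len(current) < 2:
            break
        if len(current) % 2:
            # Drop a trailing odd sample so the samples pair up.
            current = current[:-1]
        current, detail = haar_dwt_step(current)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in current))
    return energies


def frame_features(frame, levels=3):
    """Hypothetical feature vector: each sub-band's share of the frame energy."""
    energies = subband_energies(frame, levels)
    total = sum(energies)
    return [e / total if total > 0 else 0.0 for e in energies]


if __name__ == "__main__":
    # A rapidly alternating frame keeps its energy in the finest detail band,
    # while a constant frame keeps it in the final approximation band.
    print(frame_features([1.0, -1.0] * 4))
    print(frame_features([1.0] * 8))
```

In a pipeline such as the one the abstract outlines, vectors like these would be computed per frame, decorrelated, and then passed to the classifier.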

___

Nalan Özkurt received her B.S., M.S. and Ph.D. degrees in Electrical Engineering from Dokuz Eylul University in 1994, 1998 and 2004, respectively. She is currently an assistant professor in the Department of Electrical Engineering at Yaşar University. Her research interests are wavelets, nonlinear static and dynamical systems, and chaos. She is a member of the Association of Electrical and Electronic Engineers of Turkey.
Timur Düzenli received his B.S. in 2007 and his M.S. in 2010, both in Electronics Engineering, from Dokuz Eylul University. He is currently a Ph.D. student in the same department. His research interests are wavelets, time-frequency analysis, and digital communication systems.