Speech Denoising using Common Vector Analysis in Frequency Domain

Signal denoising approaches for data of any dimension largely rely on the assumption that the signal and the noise components are mostly uncorrelated. However, a denoising process that depends heavily on this assumption falls short when the operation also adversely affects the signal component. Several proposed algorithms therefore try to separate the data into two or more parts with different noise levels, so that denoising can be applied to each part with its own parameters and constraints. In this paper, the proposed method separates the speech data into magnitude and phase, and the magnitude part is further separated into common and difference parts using common vector analysis. The noise is assumed to reside largely in the difference part, which is therefore denoised with a known algorithm. The speech data is then reconstructed by combining the common, difference and phase parts. Applying the Linear Minimum Mean Square Error (LMMSE) estimation algorithm to the difference part yields excellent denoising results. The results are compared with those of state-of-the-art methods using well-known speech quality measures.
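The pipeline described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation. The way frames are grouped, the function names, and the `noise_var` parameter are all hypothetical. The shrinkage step is a simple Wiener-like placeholder standing in for the LMMSE estimator, whose exact form is not given here. The common/difference split follows the usual common vector construction: the difference subspace is spanned by the differences between each vector and a reference vector, and the common vector is what remains after projecting onto that subspace.

```python
import numpy as np

def common_difference_split(vectors):
    """Split each row of `vectors` (m x n, 2 <= m <= n) into common + difference parts."""
    ref = vectors[0]
    diffs = (vectors[1:] - ref).T              # n x (m-1); columns span the difference subspace
    # Orthonormal basis of the difference subspace (SVD used here in place of Gram-Schmidt).
    u, s, _ = np.linalg.svd(diffs, full_matrices=False)
    basis = u[:, s > 1e-10 * s.max()]
    difference = vectors @ basis @ basis.T     # projection of every row onto the subspace
    common = ref - difference[0]               # the same vector results for any row
    return common, difference

def shrink_difference(difference, noise_var):
    """Placeholder Wiener-like shrinkage (assumption: NOT the paper's LMMSE estimator)."""
    power = np.mean(difference ** 2, axis=0)   # per-bin power across frames
    gain = np.maximum(power - noise_var, 0.0) / np.maximum(power, 1e-12)
    return difference * gain

def denoise_frames(frames, noise_var):
    """frames: (m x N) array of time-domain frames; returns denoised frames of the same shape."""
    spectrum = np.fft.rfft(frames, axis=1)
    magnitude, phase = np.abs(spectrum), np.angle(spectrum)
    common, difference = common_difference_split(magnitude)
    magnitude_hat = common + shrink_difference(difference, noise_var)
    # Recombine the common part, the denoised difference part, and the untouched phase.
    return np.fft.irfft(magnitude_hat * np.exp(1j * phase), n=frames.shape[1], axis=1)
```

With `noise_var = 0` the shrinkage gain is one wherever the difference part has energy, so the frames are reconstructed unchanged. This is a quick sanity check that the three parts recombine into the original signal. The number of frames in a group should not exceed the number of frequency bins, otherwise the difference subspace can span the whole space and the common part vanishes.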
