Automatic Speaker Gender Identification for the German Language

Abstract: Authentication systems require biometric data to be transmitted, designed and classified securely. In voice biometrics, moreover, determining the speaker's gender can contribute to successful results. The aim of this study was to design a system for automatic recognition of speaker gender that takes German sound forms and properties into account. Approximately 2658 German voice samples of words and clauses of differing lengths were collected from 50 male and 50 female speakers. A sample may contain more than one word but is treated as a single utterance. Features of these voice samples were extracted using MFCC (Mel Frequency Cepstral Coefficients). The resulting feature vectors were used for training with three methods: Hidden Markov Model, Dynamic Time Warping and Artificial Neural Network. In the test phase, the gender of a given voice sample was identified with reference to the trained voice samples. The results and performance of the classification algorithms employed in the study are also compared.
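To make the Dynamic Time Warping step more concrete, the following is a minimal sketch of nearest-template gender classification over sequences of feature vectors. It is not the implementation used in the study: the function names `dtw_distance` and `classify_gender`, the Euclidean frame distance and the one-dimensional toy "features" are all assumptions made for illustration. In practice each frame would be an MFCC vector.

```python
import math


def dtw_distance(a, b):
    """Dynamic Time Warping distance between two sequences of feature vectors.

    Each sequence is a list of equal-length lists of floats (one per frame).
    The frame-to-frame cost is the Euclidean distance (an assumption).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def classify_gender(sample, templates):
    """Return the label of the template closest to `sample` under DTW.

    `templates` is a list of (label, sequence) pairs (hypothetical layout).
    """
    return min(templates, key=lambda t: dtw_distance(sample, t[1]))[0]


# Toy data: one-dimensional stand-ins for MFCC frames (invented values).
templates = [
    ("male", [[1.0], [1.2], [1.1]]),
    ("female", [[2.0], [2.2], [2.1], [2.0]]),
]
sample = [[1.9], [2.1], [2.1]]
predicted = classify_gender(sample, templates)
```

Because DTW aligns sequences of different lengths, the templates and the test sample do not need the same number of frames, which suits utterances of differing lengths such as those described above.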
