Assessment of disordered voices based on an optimized glottal source model

This paper proposes a method for the assessment of disordered voices. A feature called the mean opening quotient (MOQ), obtained from the estimated glottal source, is used as an acoustic cue that summarizes the degree of severity of the voice disorder. The analysis method uses the empirical mode decomposition algorithm to estimate the glottal source excitation signal from the speech signal. The logarithm of the magnitude spectrum of the speech signal is decomposed into oscillatory modes, called intrinsic mode functions, which are clustered into two classes: the spectral envelope and the harmonic component. Combining the phase information with the estimated harmonic component enables the estimation of the glottal source signal. A parametric model is then fitted to the estimated glottal source excitation signal, and the optimal parameters of this model, from which the MOQ is defined, are obtained with a genetic algorithm. The method is tested on a corpus of natural speech consisting of the vowel [a] uttered by 22 normophonic speakers and 229 speakers with different degrees of dysphonia. Experimental results show that the proposed method is effective for assessing the degree of severity of the voice disorder.
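
As a rough illustration of the final fitting step only, the sketch below fits a two-parameter Rosenberg-type pulse to synthetic glottal cycles with a small genetic algorithm (selection, blend crossover, Gaussian mutation) and averages the fitted opening-phase fraction over the cycles. The pulse model, the parameter bounds, the GA settings, the function names `rosenberg_pulse` and `fit_pulse_ga`, and the reading of the MOQ as a per-cycle mean of the opening-phase fraction are all assumptions made for illustration. The abstract does not specify them, and the EMD-based source estimation is not reproduced here.

```python
import numpy as np

def rosenberg_pulse(opening, closing, n=200):
    """One period of a Rosenberg-type glottal flow pulse (assumed model).

    opening, closing: durations of the opening and closing phases,
    expressed as fractions of the period.
    """
    t = np.arange(n) / n
    g = np.zeros(n)
    rise = t <= opening
    fall = (t > opening) & (t <= opening + closing)
    g[rise] = 0.5 * (1.0 - np.cos(np.pi * t[rise] / opening))
    g[fall] = np.cos(0.5 * np.pi * (t[fall] - opening) / closing)
    return g

def fit_pulse_ga(target, pop_size=60, generations=80, seed=0):
    """Fit (opening, closing) to one target cycle with a minimal GA."""
    rng = np.random.default_rng(seed)
    low = np.array([0.10, 0.05])   # assumed lower bounds
    high = np.array([0.70, 0.30])  # assumed upper bounds
    pop = rng.uniform(low, high, size=(pop_size, 2))

    def cost(p):
        model = rosenberg_pulse(p[0], p[1], len(target))
        return float(np.mean((model - target) ** 2))

    for _ in range(generations):
        costs = np.array([cost(p) for p in pop])
        # Selection: keep the best quarter of the population unchanged.
        elite = pop[np.argsort(costs)[: pop_size // 4]]
        n_children = pop_size - len(elite)
        # Crossover: random convex blend of two elite parents.
        a = elite[rng.integers(len(elite), size=n_children)]
        b = elite[rng.integers(len(elite), size=n_children)]
        w = rng.uniform(size=(n_children, 1))
        children = w * a + (1.0 - w) * b
        # Mutation: small Gaussian perturbation, clipped to the bounds.
        children += rng.normal(scale=0.02, size=children.shape)
        children = np.clip(children, low, high)
        pop = np.vstack([elite, children])

    costs = np.array([cost(p) for p in pop])
    return pop[np.argmin(costs)]

# Synthetic cycles with slightly different opening phases (illustrative data).
true_openings = [0.38, 0.40, 0.42]
fits = [fit_pulse_ga(rosenberg_pulse(op, 0.16), seed=i)
        for i, op in enumerate(true_openings)]

# Assumed definition: MOQ as the mean fitted opening fraction over cycles.
moq = float(np.mean([f[0] for f in fits]))
print(f"estimated MOQ: {moq:.3f}")
```

In this sketch the best quarter of each generation is carried over unchanged, so the best cost never increases from one generation to the next. On real data the target cycles would come from the estimated glottal source signal rather than from the model itself.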