Özkan ARSLAN

Determination of Optimum Parameters for Cochlear Implants Speech Processors by Using Objective Measures

In a cochlear implant (CI) processor, several parameters, such as the number of channels, the bandwidths, the rectification type and the cutoff frequency, play an important role in producing enhanced speech. An effective, general-purpose CI approach has long been a research topic. This study aims to determine the optimum parameters for CI users by comparing different channel numbers (4, 8, 12, 16 and 22), rectification types (half-wave and full-wave) and cutoff frequencies (200, 250, 300, 350 and 400 Hz). The CI approaches were tested on Turkish sentences taken from the METU database. The optimum CI structure was determined with objective measures of speech quality and intelligibility: weighted spectral slope (WSS), short-time objective intelligibility (STOI) and perceptual evaluation of speech quality (PESQ). Experimental results show that the CI approach with a 400 Hz cutoff frequency, a full-wave rectifier and 16 channels gives better quality and higher intelligibility scores than the other CI approaches according to the STOI, PESQ and WSS results. The proposed CI approach enables CI users to perceive 91% of the output vocoded Turkish speech.

Keywords:

Cochlear implant, Vocoder, Filter bank, Objective intelligibility measures
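The parameter search described in the abstract can be illustrated with a minimal sketch. It enumerates the channel, rectifier and cutoff combinations listed above and extracts a single-channel envelope by rectification followed by low-pass filtering. The function names, the first-order low-pass filter and the 16 kHz sampling rate are assumptions made for illustration; the abstract does not state the filter design used in the study, and the band-pass filter bank is omitted here.

```python
import math
from itertools import product

# Parameter values listed in the abstract.
CHANNEL_COUNTS = (4, 8, 12, 16, 22)
RECTIFIERS = ("half", "full")
CUTOFFS_HZ = (200, 250, 300, 350, 400)


def parameter_grid():
    """Return every (channels, rectifier, cutoff_hz) combination."""
    return list(product(CHANNEL_COUNTS, RECTIFIERS, CUTOFFS_HZ))


def rectify(samples, kind):
    """Half-wave keeps the positive part; full-wave takes the magnitude."""
    if kind == "full":
        return [abs(x) for x in samples]
    if kind == "half":
        return [max(x, 0.0) for x in samples]
    raise ValueError("kind must be 'half' or 'full'")


def lowpass(samples, cutoff_hz, fs):
    """First-order IIR low-pass (an assumed, simplified smoothing filter)."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    y, out = 0.0, []
    for x in samples:
        y += a * (x - y)  # y[n] = y[n-1] + a * (x[n] - y[n-1])
        out.append(y)
    return out


def envelope(samples, kind, cutoff_hz, fs):
    """Envelope of one channel signal: rectify, then smooth."""
    return lowpass(rectify(samples, kind), cutoff_hz, fs)


def steady_state_mean(values):
    """Mean over the second half, after the filter transient has decayed."""
    tail = values[len(values) // 2:]
    return sum(tail) / len(tail)


fs = 16000  # assumed sampling rate in Hz
tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(1600)]
full_mean = steady_state_mean(envelope(tone, "full", 400, fs))
half_mean = steady_state_mean(envelope(tone, "half", 400, fs))

print(len(parameter_grid()))  # 5 * 2 * 5 = 50 configurations
print(full_mean, half_mean)
```

For a symmetric test tone, the full-wave envelope settles at about twice the mean level of the half-wave envelope, which is one reason the rectifier type affects the vocoded output.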


