Lung disease classification using machine learning algorithms

In this study, we compared support vector machine (SVM), k-nearest neighbor (k-NN), and Gaussian Bayes (GB) algorithms for classifying respiratory diseases from text and audio data. An electronic stethoscope and its software were used to record patient information and 17,930 lung sounds from 1,630 subjects. The SVM, k-NN, and GB algorithms were run on six datasets to classify patients as: (1) sick or healthy using text data, (2) sick or healthy using audio MFCC features, (3) sick or healthy using text data and audio MFCC features, (4) one of 12 diseases using text data, (5) one of 12 diseases using audio MFCC features, and (6) one of 12 diseases using text data and audio MFCC features. The accuracies on datasets (1) to (6), respectively, were 75%, 88%, 64%, 73%, 63%, and 70% for SVM; 95%, 92%, 92%, 67%, 64%, and 66% for k-NN; and 98%, 91%, 97%, 58%, 48%, and 58% for GB. In the 12-class classification of lung diseases, SVM was the most accurate algorithm with text data (73%), k-NN was the most accurate with audio data (64%), and SVM was the most accurate with combined audio and text data (70%). In the healthy-versus-sick classification, GB was the most accurate with text data (98%) and with combined data (97%), closely followed by k-NN; with audio data, the two were nearly tied (91% for GB versus 92% for k-NN). These results suggest that, with a large number of features but a limited number of samples, SVM and k-NN are better suited to classifying a dataset into more than two classes, whereas GB is better suited to two-class classification.
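The comparison above can be checked with a short tabulation. The following is a minimal sketch that only re-arranges the accuracies reported in the abstract; the dataset labels (`binary/text`, `12-class/audio`, and so on) and the helper name `best_per_dataset` are shorthand introduced here for illustration and do not come from the study.

```python
# Accuracies (%) exactly as reported in the abstract, in dataset order (1)-(6).
# The dataset labels below are illustrative shorthand, not the authors' names.
DATASETS = [
    "binary/text", "binary/audio", "binary/text+audio",
    "12-class/text", "12-class/audio", "12-class/text+audio",
]
ACCURACY = {
    "SVM":  [75, 88, 64, 73, 63, 70],
    "k-NN": [95, 92, 92, 67, 64, 66],
    "GB":   [98, 91, 97, 58, 48, 58],
}


def best_per_dataset(accuracy, datasets):
    """Return {dataset: (best_algorithm, accuracy)}.

    If two algorithms tie, the one listed first in `accuracy` is kept.
    """
    best = {}
    for i, name in enumerate(datasets):
        algo = max(accuracy, key=lambda a: accuracy[a][i])
        best[name] = (algo, accuracy[algo][i])
    return best


if __name__ == "__main__":
    for name, (algo, acc) in best_per_dataset(ACCURACY, DATASETS).items():
        print(f"{name:<20} {algo:<5} {acc}%")
```

On the reported figures, SVM leads two of the three 12-class datasets and k-NN leads the third, while GB leads the two-class text and combined datasets; the two-class audio dataset goes to k-NN by one percentage point (92% versus 91%).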
