Disease prognosis using machine learning algorithms based on new clinical dataset


Today, artificial intelligence-based solutions are being developed to make human life easier in almost every field, and the healthcare sector is one of the sectors that has benefited from them. Because of the world’s ever-expanding population, ongoing epidemics, and the emergence of new disease types, it is becoming increasingly difficult for patients to access health services quickly and to receive an accurate diagnosis. In this setting, artificial intelligence can reduce patient density in hospitals, give patients access to accurate information, and let medical students practice on new cases. In this study, a new and reliable dataset was created from disease information obtained from various sources under the supervision of a specialist medical doctor. New patient histories were then added to the dataset used in the previous study, the experiments were repeated with the same algorithms, and the accuracy scores were compared. The resulting dataset includes 2006 unique patient histories, 358 symptoms, and 141 diseases, and we think it will be valuable for researchers who apply machine learning in healthcare. Various machine learning algorithms were trained to predict diseases from different branches of medicine, such as diabetes, bronchial asthma, and COVID-19. In addition to the Support Vector Machine, Naive Bayes, K-Nearest Neighbors, Multilayer Perceptron, Decision Tree, and Random Forest algorithms, we also studied popular boosting algorithms such as XGBoost and LightGBM. All algorithms were validated with cross-validation, and their performance was compared using accuracy, precision, recall, and F1-score. Among the studies examined, this is also the first to achieve an accuracy score of 99.33% with a dataset that covers a greater number of diseases than the datasets used in those studies.
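As a rough illustration of the evaluation protocol named in the abstract (cross-validation scored by accuracy, precision, recall, and F1-score), the sketch below runs k-fold cross-validation of a small K-Nearest Neighbors classifier on binary symptom vectors. It is not the authors' code or data: the toy dataset, the disease labels, the Hamming distance, the macro averaging, and all function names (`make_toy_data`, `knn_predict`, `macro_metrics`, `cross_validate`) are assumptions made only for this example.

```python
import random
from collections import Counter


def make_toy_data(n_per_class=20, seed=0):
    """Build synthetic binary symptom vectors (hypothetical, NOT the paper's dataset)."""
    rng = random.Random(seed)
    prototypes = {  # one assumed symptom pattern per made-up disease label
        "disease_a": [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
        "disease_b": [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0],
        "disease_c": [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
    }
    X, y = [], []
    for label, proto in prototypes.items():
        for _ in range(n_per_class):
            # Flip each symptom with probability 0.1 to simulate noisy histories.
            X.append([s ^ (rng.random() < 0.1) for s in proto])
            y.append(label)
    return X, y


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training rows closest to x in Hamming distance."""
    dists = sorted(
        (sum(a != b for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 over the true labels."""
    labels = sorted(set(y_true))
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


def cross_validate(X, y, n_folds=5, k=3, seed=0):
    """Shuffle, split into n_folds, and score predictions pooled over all held-out folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    y_true, y_pred = [], []
    for fold in folds:
        held_out = set(fold)
        train_X = [X[i] for i in idx if i not in held_out]
        train_y = [y[i] for i in idx if i not in held_out]
        for i in fold:
            y_true.append(y[i])
            y_pred.append(knn_predict(train_X, train_y, X[i], k))
    return macro_metrics(y_true, y_pred)


if __name__ == "__main__":
    X, y = make_toy_data()
    for name, value in cross_validate(X, y).items():
        print(f"{name}: {value:.4f}")
```

Scores from this toy setup say nothing about the reported 99.33% accuracy; the sketch only shows how the four metrics can be computed from cross-validated predictions. In practice a library such as scikit-learn would likely replace the hand-written classifier and fold logic.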
