How to explain a machine learning model: HbA1c classification example

Introduction/Aim: Machine learning tools have many applications in healthcare. However, the implementation of the developed models remains limited because of various challenges. One of the most important problems is the lack of explainability of machine learning models. Explainability refers to the capacity to reveal the reasons and logic behind the decisions of artificial intelligence systems, making clear to users how the process can be understood and how the system reached a particular outcome. This study aims to compare the performance of different model-agnostic explanation methods using two different ML models for HbA1c classification. Methods: Two ML models for HbA1c classification (gradient boosting machine (GBM) and default random forest (DRF)) were developed with the H2O AutoML engine using the NHANES open data set containing 3,036 records. For the developed models, performance metrics, feature importance analysis, and global and local model-agnostic explanation methods such as partial dependence, break-down, and Shapley explanation plots were used. Results: Although the GBM and DRF models had similar performance metrics, their variable importance differed slightly. Local explainability methods also attributed different contributions to the features. Conclusion: This study evaluated the importance of explainable machine learning techniques for integrating artificial intelligence into healthcare and for understanding models. The results show that, despite the limitations of current explainability methods, both global and local explanation methods give insight for evaluating machine learning models and can be used to improve or compare them.
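The global, model-agnostic variable importance mentioned in the abstract can be illustrated with a permutation-based sketch. This is not the authors' H2O/DALEX pipeline: the data are synthetic (not NHANES), and the feature names and the two stand-in classifiers `model_a` and `model_b` are hypothetical, chosen only to show how two models can rely on the same features to different degrees.

```python
import random

def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted class equals the true label."""
    hits = sum(predict(row) == label for row, label in zip(rows, labels))
    return hits / len(rows)

def permutation_importance(predict, rows, labels, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling one feature column at a time.

    Only `predict` is called, so the procedure is model-agnostic.
    """
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importance = {}
    for name in rows[0]:
        drops = []
        for _ in range(n_repeats):
            column = [row[name] for row in rows]
            rng.shuffle(column)
            shuffled = [{**row, name: value} for row, value in zip(rows, column)]
            drops.append(baseline - accuracy(predict, shuffled, labels))
        importance[name] = sum(drops) / n_repeats
    return importance

# Synthetic data (assumption): three made-up features and a made-up label rule.
data_rng = random.Random(42)
rows = [
    {
        "glucose": data_rng.gauss(110, 25),
        "bmi": data_rng.gauss(28, 5),
        "noise": data_rng.random(),
    }
    for _ in range(500)
]
labels = [int(r["glucose"] + 2 * (r["bmi"] - 28) > 120) for r in rows]

# Hypothetical stand-ins for two trained classifiers.
def model_a(row):
    return int(row["glucose"] > 120)  # uses glucose only

def model_b(row):
    return int(row["glucose"] + 2 * (row["bmi"] - 28) > 120)  # uses glucose and BMI

results = {}
for model_name, model in [("model_a", model_a), ("model_b", model_b)]:
    results[model_name] = permutation_importance(model, rows, labels)
    print(model_name, "accuracy:", round(accuracy(model, rows, labels), 3))
    for feature, drop in sorted(results[model_name].items(), key=lambda kv: -kv[1]):
        print(f"  {feature:8s} {drop:.3f}")
```

A feature that a model never uses gets an importance of exactly zero here, because shuffling it cannot change any prediction. With real models, the same function could be applied to any object exposing a prediction function.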

Keywords:

Machine learning, explainable artificial intelligence, glycated hemoglobin

How to explain a machine learning model: HbA1c classification example

Aim: Machine learning tools have various applications in healthcare. However, the implementation of developed models is still limited because of various challenges. One of the most important problems is the lack of explainability of machine learning models. Explainability refers to the capacity to reveal the reasoning and logic behind the decisions made by AI systems, making it straightforward for human users to understand the process and how the system arrived at a specific outcome. The study aimed to compare the performance of different model-agnostic explanation methods using two different ML models created for HbA1c classification. Material and Method: The H2O AutoML engine was used to develop two ML models (gradient boosting machine (GBM) and default random forest (DRF)) using 3,036 records from the NHANES open data set. Both global and local model-agnostic explanation methods, including performance metrics, feature importance analysis, and partial dependence, break-down, and Shapley additive explanation plots, were applied to the developed models. Results: While the GBM and DRF models had similar performance metrics, such as mean per-class error and area under the receiver operating characteristic curve, their variable importance differed slightly. Local explainability methods also attributed different contributions to the features. Conclusion: This study evaluated the significance of explainable machine learning techniques for comprehending complicated models and their role in incorporating AI in healthcare. The results indicate that, although current explainability methods have limitations, particularly for clinical use, both global and local explanation methods offer insight for evaluating a model and can be used to improve or compare models.
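The local explanations mentioned above assign each feature a contribution to one individual prediction. The following is a minimal sketch of exact Shapley values computed by enumerating feature coalitions, not the authors' DALEX workflow. The scoring function `predict`, the feature names, the background rows, and the explained instance are all hypothetical. A break-down plot would instead report contributions along a single feature ordering, whereas Shapley values average over all orderings.

```python
from itertools import combinations
from math import exp, factorial

def predict(row):
    """Hypothetical black-box score in (0, 1); it ignores the 'noise' feature."""
    score = (
        0.04 * (row["glucose"] - 100)
        + 0.10 * (row["bmi"] - 25)
        + 0.02 * (row["age"] - 50)
        + 0.002 * (row["glucose"] - 100) * (row["bmi"] - 25)
    )
    return 1.0 / (1.0 + exp(-score))

def coalition_value(predict, instance, background, present):
    """Mean prediction when features in `present` are fixed to the instance's
    values and all other features are taken from each background row."""
    total = 0.0
    for row in background:
        mixed = {k: (instance[k] if k in present else row[k]) for k in instance}
        total += predict(mixed)
    return total / len(background)

def shapley_values(predict, instance, background):
    """Exact Shapley values; cost grows as 2^n, so only for a few features."""
    names = list(instance)
    n = len(names)
    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        contribution = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_feature = coalition_value(
                    predict, instance, background, set(subset) | {name}
                )
                without_feature = coalition_value(
                    predict, instance, background, set(subset)
                )
                contribution += weight * (with_feature - without_feature)
        phi[name] = contribution
    return phi

# Made-up background sample and instance to explain (assumptions).
background = [
    {"glucose": 90, "bmi": 22, "age": 35, "noise": 0.1},
    {"glucose": 105, "bmi": 27, "age": 50, "noise": 0.7},
    {"glucose": 130, "bmi": 31, "age": 62, "noise": 0.4},
    {"glucose": 98, "bmi": 24, "age": 44, "noise": 0.9},
]
instance = {"glucose": 150, "bmi": 33, "age": 58, "noise": 0.5}

phi = shapley_values(predict, instance, background)
mean_prediction = sum(predict(row) for row in background) / len(background)
for feature, value in phi.items():
    print(f"{feature:8s} {value:+.4f}")
print("sum of contributions:", round(sum(phi.values()), 6))
print("prediction - mean   :", round(predict(instance) - mean_prediction, 6))
```

The contributions sum to the difference between the instance's prediction and the mean background prediction, which is what makes the decomposition additive. In practice, libraries approximate these values by sampling because exact enumeration is infeasible for many features.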

Keywords:

Machine learning, Explainable artificial intelligence, Glycated hemoglobin

