Socio-demographic determinants of smoking: A data mining analysis of the Global Adult Tobacco Surveys

Objective: This paper presents a) how Global Adult Tobacco Survey (GATS) data can be used to extract valuable information about tobacco use behaviors and b) the prediction performance of classification algorithms applied to the GATS data.

Methods: Three well-known classification methods (K-nearest neighbor, the C4.5 algorithm, and the multilayer perceptron) were assessed on how well they classify the smoking status of GATS participants (pre-defined classes: smoker and non-smoker) based on socio-demographic characteristics (age group, gender, residence, education level, and working status). The first analysis was performed on the GATS data from Turkey. Subsequently, the model that performed best for Turkey was applied to six other European countries: Greece, Kazakhstan, Poland, Romania, Russia, and Ukraine.

Results: All of the tree algorithms were more accurate in classifying non-smokers. The C4.5 algorithm had the highest correct classification rate among the algorithms for the GATS Turkey data. In addition, the C4.5 algorithm classified males in more detail than females. The comparative analysis indicated that the C4.5 algorithm correctly classified the smoking status of over 80% of participants from Ukraine, while the rate was below 70% for Greece. Thus, the effects of demographic factors on smoking status can differ from one country to another.

Conclusion: This paper indicates that data supplied by GATS, such as demographic data, may help to compute the likelihood that an individual will be a smoker in the future.
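The evaluation described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain K-nearest neighbor classifier with Hamming distance over the five categorical attributes, and the records, category labels, and function names (hamming, knn_predict, evaluate) are hypothetical and invented for illustration only. No GATS data is used. The sketch reports the overall correct classification rate and the rate per class, which is the kind of comparison the Results draw between smokers and non-smokers.

```python
from collections import Counter

# Hypothetical toy records, NOT GATS data. Each record is
# (age group, gender, residence, education level, working status) -> status.
TRAIN = [
    (("25-44", "male", "urban", "secondary", "employed"), "smoker"),
    (("25-44", "male", "rural", "primary", "employed"), "smoker"),
    (("45-64", "male", "urban", "primary", "employed"), "smoker"),
    (("15-24", "female", "urban", "secondary", "not employed"), "non-smoker"),
    (("65+", "female", "rural", "primary", "not employed"), "non-smoker"),
    (("45-64", "female", "urban", "higher", "employed"), "non-smoker"),
    (("65+", "male", "rural", "primary", "not employed"), "non-smoker"),
]

TEST = [
    (("25-44", "male", "urban", "primary", "employed"), "smoker"),
    (("65+", "female", "urban", "primary", "not employed"), "non-smoker"),
    (("15-24", "female", "rural", "secondary", "not employed"), "non-smoker"),
    (("45-64", "male", "rural", "higher", "employed"), "smoker"),
]


def hamming(a, b):
    """Count the attributes on which two records disagree."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train, features, k=3):
    """Majority vote among the k training records closest to `features`."""
    neighbours = sorted(train, key=lambda rec: hamming(rec[0], features))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def evaluate(train, test, k=3):
    """Return (overall correct classification rate, rate per actual class)."""
    correct, total = Counter(), Counter()
    for features, actual in test:
        total[actual] += 1
        if knn_predict(train, features, k) == actual:
            correct[actual] += 1
    overall = sum(correct.values()) / sum(total.values())
    per_class = {label: correct[label] / total[label] for label in total}
    return overall, per_class


overall, per_class = evaluate(TRAIN, TEST)
print("overall correct classification rate:", overall)
for label, rate in per_class.items():
    print(label, "rate:", rate)
```

Reporting the rate per class alongside the overall rate matters when one class is easier to detect than the other, since a single overall figure can hide that gap. With real survey data, the held-out records would typically come from a cross-validation split instead of a fixed list.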

Turkish Journal of Public Health
  • Established: 2003
  • Publisher: Halk Sağlığı Uzmanları Derneği
  • Authors: Zeynep DURMUŞOĞLU, Pınar KOCABEY ÇİFTÇİ