A Study of the Human Development Index Using Two-Stage Clustering in Data Mining

This study examines countries' levels of development using the Human Development Index and the Gender Development Index, two indicators that measure socioeconomic development at the national level and allow differences between economies to be compared. Two-stage clustering analysis, a data mining technique, is used to investigate which variables are influential in grouping countries. All data used in the study were obtained from the United Nations Development Programme (UNDP). The analyses were carried out with 2268 observations for 189 countries, taken from the 2019 data in the report published by UNDP. According to the Human Development Index results, countries fall into three groups: high, medium, and low development. According to the Gender Development Index results, countries fall into five groups: high, high-medium, medium, medium-low, and low development. The number of groups found for the Human Development Index differs from the number of groups given in the UNDP report, whereas the number of groups for the Gender Development Index matches the report. Overall, the results show that women live longer than men, that women and men have almost the same level of education, and that men have a higher national income than women. The study thereby reveals the effect of gender on the development index.
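The general idea of a two-stage clustering can be illustrated with a minimal sketch. The abstract does not state which software or algorithm variant was used, so everything below is an assumption made for illustration: stage 1 compresses the data into many small pre-clusters with k-means, and stage 2 merges those pre-clusters agglomeratively by centroid distance until the requested number of groups remains. The function names (`kmeans`, `merge_preclusters`, `two_stage_cluster`), the parameters `n_pre` and `n_clusters`, and the synthetic two-dimensional points are all hypothetical; no UNDP values are used.

```python
import random
from math import dist


def farthest_point_init(points, k):
    """Deterministic start: each new centroid is the point farthest from those chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    return centroids


def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))


def kmeans(points, k, iters=100):
    """Stage 1 (assumed variant): Lloyd's k-means, returning (centroids, labels)."""
    centroids = farthest_point_init(points, k)
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # An empty pre-cluster keeps its previous centroid.
            updated.append(tuple(sum(x) / len(members) for x in zip(*members)) if members else old)
        if updated == centroids:
            break
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


def merge_preclusters(centroids, sizes, n_clusters):
    """Stage 2 (assumed variant): repeatedly merge the two closest pre-cluster centroids."""
    groups = [(c, s, [i]) for i, (c, s) in enumerate(zip(centroids, sizes)) if s > 0]
    while len(groups) > n_clusters:
        pairs = [(i, j) for i in range(len(groups)) for j in range(i + 1, len(groups))]
        a, b = min(pairs, key=lambda ij: dist(groups[ij[0]][0], groups[ij[1]][0]))
        (ca, sa, ma), (cb, sb, mb) = groups[a], groups[b]
        # Size-weighted mean keeps the merged centroid at the true mean of its members.
        merged_centroid = tuple((x * sa + y * sb) / (sa + sb) for x, y in zip(ca, cb))
        groups = [g for idx, g in enumerate(groups) if idx not in (a, b)]
        groups.append((merged_centroid, sa + sb, ma + mb))
    # Map each pre-cluster id to its final cluster id.
    return {pre: final for final, (_, _, members) in enumerate(groups) for pre in members}


def two_stage_cluster(points, n_pre, n_clusters):
    """Run both stages and return one final cluster label per point."""
    centroids, pre_labels = kmeans(points, n_pre)
    sizes = [pre_labels.count(j) for j in range(n_pre)]
    mapping = merge_preclusters(centroids, sizes, n_clusters)
    return [mapping[lab] for lab in pre_labels]


# Synthetic demo data: three well-separated groups of 20 points each.
rng = random.Random(0)
centers = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
points = [(cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1)) for cx, cy in centers for _ in range(20)]
labels = two_stage_cluster(points, n_pre=9, n_clusters=3)
print(len(set(labels)), "clusters found")
```

In this sketch the number of final groups is supplied by the caller. Two-stage procedures in statistical packages often choose it automatically with an information criterion, which is not reproduced here.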
