A Data Mining Application in Customer Churn Prediction

Customer satisfaction and loyalty depend on reasonable prices, product variety, fast supply and delivery, product quality, pre- and post-sales services, and the analysis of customer behavior. Businesses that analyze customer behavior can both retain existing customers and gain new ones. This study aims to build supervised models that predict which customers are likely to leave a business. To this end, experiments were carried out using a total of 21 classification methods on datasets from the telecommunications, banking, and e-commerce sectors. In addition, RFM (Recency, Frequency, Monetary Value) segmentation, a simple but effective marketing analysis tool that businesses use to rank and classify customers by their spending habits, was used together with the Chi-Square Test as a dimensionality reduction method. The goal was to obtain feature subsets with an optimal number of elements and to compare model performance before and after feature selection.
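The abstract does not spell out how RFM scores and the Chi-Square Test are combined, so the following is only a minimal sketch of one plausible reading: bin recency, frequency and monetary value into rank-based scores, then rank all candidate features by their chi-square statistic against the churn label and keep the top k. The function names (`rank_bin`, `chi_square`, `select_top_k`), the three-bin scoring, and the six-customer toy data are all hypothetical and are not taken from the study.

```python
from collections import Counter


def rank_bin(values, bins=3, higher_is_better=True):
    """Assign each value a score from 1 to `bins` based on its rank.

    Hypothetical helper: real RFM tools often use quintiles (bins=5).
    """
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=not higher_is_better)
    scores = [0] * len(values)
    for rank, i in enumerate(order):
        scores[i] = 1 + rank * bins // len(values)
    return scores


def chi_square(feature, label):
    """Pearson chi-square statistic between two categorical sequences."""
    n = len(feature)
    observed = Counter(zip(feature, label))
    feature_totals = Counter(feature)
    label_totals = Counter(label)
    stat = 0.0
    for f, f_count in feature_totals.items():
        for l, l_count in label_totals.items():
            expected = f_count * l_count / n
            stat += (observed[(f, l)] - expected) ** 2 / expected
    return stat


def select_top_k(features, label, k):
    """Return the k feature names with the largest chi-square statistic."""
    stats = {name: chi_square(col, label) for name, col in features.items()}
    ranked = sorted(stats, key=stats.get, reverse=True)
    return ranked[:k], stats


# Toy data (invented for illustration only).
recency_days = [5, 40, 90, 10, 75, 60]    # days since last purchase
frequency = [12, 3, 4, 9, 2, 1]           # number of purchases
monetary = [300, 80, 20, 250, 40, 120]    # total spend
gender = ["f", "m", "f", "m", "f", "m"]   # an unrelated categorical feature
churned = [0, 0, 1, 0, 1, 1]              # 1 = customer left

features = {
    # Fewer days since the last purchase is better, so recency is inverted.
    "recency_score": rank_bin(recency_days, higher_is_better=False),
    "frequency_score": rank_bin(frequency),
    "monetary_score": rank_bin(monetary),
    "gender": gender,
}

selected, stats = select_top_k(features, churned, k=2)
print(selected)
print(stats)
```

With so few rows the chi-square statistic has no inferential meaning; the sketch only illustrates the mechanics. Comparing model performance before and after feature selection would then amount to training the same classifier once on all columns and once on the selected subset.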

Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen ve Mühendislik Dergisi
  • ISSN: 1302-9304
  • Publication frequency: 3 issues per year
  • First published: 1999
  • Publisher: Dokuz Eylül Üniversitesi Mühendislik Fakültesi