Categorization of Customer Complaints in Food Industry Using Machine Learning Approaches

Customer feedback is one of the most important factors shaping product development and market dynamics. Analyzing product-related complaints helps sellers identify the quality characteristics that matter to consumers. Many studies have designed Machine Learning (ML) systems to identify the causes of customer dissatisfaction, but most of this research has been carried out on English text. This paper contributes an accurate categorization of customer complaints about packaged food products written in Turkish. Various ML algorithms were applied with TF-IDF and word2vec feature representations to determine the category of each complaint. Results are reported for the Logistic Regression (LR), Naive Bayes (NB), k-Nearest Neighbour (kNN), Support Vector Machine (SVM), Random Forest (RF), and Extreme Gradient Boosting (XGBoost) classifiers. The experiments show that the best-performing method is XGBoost with the TF-IDF weighting scheme, which achieves an F-measure of 86%. The word2vec-based classifiers perform worse in terms of F-measure than their TF-IDF counterparts. Each TF-IDF-based algorithm tested also predicts more accurately on the feature subsets selected by the Chi-Square (CH2) method: applying CH2 to the TF-IDF features raises the F-measure of XGBoost from 86% to 88%.
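To make the two feature steps named in the abstract concrete, the following is a minimal standard-library sketch of TF-IDF weighting followed by chi-square term ranking. It is an illustration under stated assumptions, not the authors' implementation: the toy complaints, their labels, the whitespace tokenizer, and the function names are all hypothetical, and the plain `tf * log(N / df)` weighting is one common variant that may differ from the one used in the paper. The classifiers themselves are omitted.

```python
import math
from collections import Counter

# Hypothetical toy complaints (English stand-ins; the study uses Turkish text).
DOCS = [
    ("package was torn and leaking", "packaging"),
    ("torn package with broken seal", "packaging"),
    ("the taste was sour and stale", "taste"),
    ("stale biscuits with a bitter taste", "taste"),
]


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real Turkish text needs more)."""
    return text.lower().split()


def tf_idf(docs):
    """Return one {term: weight} dict per document, with weight = tf * log(N / df)."""
    tokenized = [tokenize(text) for text, _ in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def chi_square(docs, term, label):
    """Chi-square statistic of the 2x2 table (term present?) x (document has label?)."""
    a = b = c = d = 0
    for text, y in docs:
        present = term in tokenize(text)
        if present and y == label:
            a += 1  # term present, label matches
        elif present:
            b += 1  # term present, other label
        elif y == label:
            c += 1  # term absent, label matches
        else:
            d += 1  # term absent, other label
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def top_terms(docs, k):
    """Rank terms by their best chi-square score over all labels; keep the top k."""
    vocab = sorted({t for text, _ in docs for t in tokenize(text)})
    labels = sorted({y for _, y in docs})
    scores = {t: max(chi_square(docs, t, y) for y in labels) for t in vocab}
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(vocab, key=lambda t: (-scores[t], t))[:k]


if __name__ == "__main__":
    print(tf_idf(DOCS)[0])
    print(top_terms(DOCS, 4))
```

In this toy data, terms that occur in every complaint of one category and in none of the other receive the highest chi-square score, while terms spread evenly across categories score zero. Keeping only the top-ranked terms is the kind of feature subset on which the abstract reports improved F-measure.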

___

Journal: Zeki Sistemler Teori ve Uygulamaları Dergisi
  • Established: 2018
  • Publisher: Özer UYGUN
Categorization of Customer Complaints in Food Industry Using Machine Learning Approaches

Fatma BOZYİĞİT, Onur DOĞAN, Deniz KILINÇ
