Application of Random Forest Algorithm for the Prediction of Online Food Delivery Service Delay

The online shopping industry has been growing rapidly with the evolution of technology. As the sector has spread, consumers have started to shop according to specific criteria. The online food sector is one of the sectors that informs future customers about service quality through feedback on past purchases (comments or ratings). In this study, a classification analysis is conducted to investigate compliance with the fast delivery criterion, which is one of the cornerstone criteria in the online food industry. The Random Forest (RF) algorithm is applied for the classification. The main advantages of RF are that it handles a large number of input variables and that it is fast. In addition, RF reduces overfitting, which lowers variance and thereby improves accuracy. The application is implemented in the R programming language. The online food delivery status variable is defined with two categories (On time or Early, and Late) and is predicted with the RF algorithm, which is applied to this data set for the first time. According to the results, the correct classification rate on the testing data for the delivery status variable is 95.85%. In addition, the performance of the restaurants is compared from the customers' point of view. It turns out that the traffic situation does not greatly affect the delivery status. In summary, the RF algorithm is applied to data obtained by web scraping techniques, and the delivery status performance of the restaurants is revealed.
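The idea behind the classifier can be illustrated with a minimal sketch. The study itself was implemented in R; the Python code below is not the authors' implementation, and it uses only the standard library. It simplifies RF by replacing full decision trees with one-split "stumps", while keeping the two ingredients that define the method: each stump is trained on a bootstrap sample, and each stump sees only a random subset of the features. The final class is decided by majority vote. The two features (delivery distance and preparation time), the toy rows, and all function names are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter

# Hypothetical toy data: (distance_km, preparation_minutes) per order.
ROWS = [(1.0, 10), (1.5, 12), (2.0, 15), (2.5, 14),
        (6.0, 35), (7.0, 40), (6.5, 38), (8.0, 45)]
LABELS = ["On time or Early"] * 4 + ["Late"] * 4


def train_stump(rows, labels, feature_indices):
    """Return the (feature, threshold, left_label, right_label) split
    with the fewest training errors among the allowed features."""
    best = None
    for f in feature_indices:
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            # If nothing falls to the right, reuse the left label.
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    return best[1:]


def train_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap: draw len(rows) indices with replacement.
        idx = [rng.randrange(len(rows)) for _ in rows]
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        # Random feature subset for this stump.
        features = rng.sample(range(len(rows[0])), n_features)
        forest.append(train_stump(sample_rows, sample_labels, features))
    return forest


def predict(forest, row):
    """Majority vote over all stumps."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return Counter(votes).most_common(1)[0][0]


def accuracy(forest, rows, labels):
    """Correct classification rate: share of rows predicted correctly."""
    correct = sum(predict(forest, r) == y for r, y in zip(rows, labels))
    return correct / len(rows)


if __name__ == "__main__":
    forest = train_forest(ROWS, LABELS)
    print(predict(forest, (0.5, 5)))   # a short, quickly prepared order
    print(predict(forest, (9.0, 60)))  # a distant, slowly prepared order
```

The `accuracy` function computes the same kind of quantity as the correct classification rate reported above, although here it would be evaluated on made-up rows rather than on the scraped data. In practice one would use an established implementation with full trees instead of this sketch.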
