Comparison of Machine Learning Algorithms in Explaining Hotel Room Prices

Purpose: This study analyzes several of the main machine learning algorithms on big data obtained through web mining and tests how well each of them explains hotel room prices, in order to identify the model that explains hotel room prices most accurately.

Design/methodology/approach: The research data were obtained by web mining (web scraping). The target website was crawled for about six months with the help of a purpose-built algorithm, and the resulting data on 6558 accommodation facilities were used in the analyses. The second part of the research consists of statistical analyses and the application and comparison of machine learning algorithms. The Python programming language was used for the analyses and for running the algorithms: pandas and numpy for data processing, seaborn and matplotlib for graphics and visualization, and scikit-learn for the machine learning algorithms. After the analyses, a model was built with logistic regression, which was considered the most suitable for the data.

Results: The Random Forest and Decision Tree algorithms each explained approximately 99.89% of the data set, which indicates that the tree/branching was carried out successfully. The KNN algorithm achieved its highest performance, 62.12%, with a three-cluster classification. Among the algorithms that use the linear classification method (Logistic Regression, Stochastic Gradient Descent and Support Vector Machines), logistic regression obtained the highest score, 39.13%. In the model built with logistic regression, the score guests gave the hotel, the hotel's rank among other hotels in the region, the type of the hotel and the city in which it is located were found to be statistically significant (p
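The comparison described in the abstract can be sketched roughly as follows with scikit-learn, the library the study names. This is only an illustrative sketch: the data are synthetic (the scraped hotel data set, its features and its price classes are not given here), the hyperparameters are arbitrary defaults, and the helper name `compare_models` is hypothetical. It reports accuracy on both the training split and a held-out split, since the abstract does not state which of the two its percentages refer to.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in for the scraped hotel data. The number of
# samples, features and price classes below is made up for illustration.
X, y = make_classification(
    n_samples=1000, n_features=8, n_informative=5,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The six classifier families named in the abstract. Distance-based and
# linear models are wrapped in a scaling pipeline; all settings are
# illustrative choices, not the study's configuration.
models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Stochastic Gradient Descent": make_pipeline(
        StandardScaler(), SGDClassifier(random_state=0)
    ),
    "Support Vector Machine": make_pipeline(
        StandardScaler(), LinearSVC(random_state=0)
    ),
}


def compare_models(models, X_train, y_train, X_test, y_test):
    """Fit each model and return its training and held-out accuracy."""
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = {
            "train": model.score(X_train, y_train),
            "test": model.score(X_test, y_test),
        }
    return scores


scores = compare_models(models, X_train, y_train, X_test, y_test)

# Print the models from best to worst held-out accuracy.
for name, s in sorted(scores.items(), key=lambda kv: kv[1]["test"], reverse=True):
    print(f"{name:28s} train={s['train']:.4f}  test={s['test']:.4f}")
```

In a sketch like this, an unpruned tree typically reaches near-perfect accuracy on the data it was trained on, so the held-out column is the more informative one when ranking the algorithms.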