A Study of Static and Dynamic Significance Weighting Multipliers on the Pearson Correlation for Collaborative Filtering

Recommender systems, a field of data mining and knowledge discovery, have a tremendous impact on movie recommendation platforms. Whether a recommendation suits a viewer, given the viewer's profile, is a measurable question. Statistical analyses can be performed by inferring linear relationships in numerical data such as user ratings, so that an item such as a movie can be either recommended or withheld. The computed correlation, namely the similarity weight, should be rescaled by a further multiplier before prediction so that user similarities carry more effect. This extra step, which stresses the impact of similarities, is called significance weighting. The affinity between users can simply be the total number of co-rated items, or a further inference based on more complex computations. In this work, significance weighting for the Pearson correlation is examined through a comparison of approaches. Both the ML100K and ML1M releases of the MovieLens dataset are used in the experiments. k-fold cross-validation is applied in a shifting fashion to increase the number of tests. After the Pearson correlation coefficients are obtained for user-user similarities, the weights are adjusted using three different significance weighting approaches. Neighbors are then sorted to select the top-N closest users for the user under test. Based on the experimental results, the explicit method that uses only the co-rated item count is preferred over the other two techniques, considering its simplicity and performance. The plots in the experimental results section present accuracy and error metrics for the three significance weighting approaches. For the ML100K dataset in particular, the simple weighting method outperforms the others in terms of error metrics.
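The pipeline described above (Pearson similarity, a significance multiplier based on the co-rated item count, then top-N neighbor selection) can be illustrated with a minimal Python sketch. The function names `pearson`, `significance_weight`, and `top_n_neighbors`, the `cutoff` parameter, and the toy ratings are all hypothetical. The multiplier `min(n, cutoff) / cutoff` is an assumption: it is one form commonly found in the collaborative filtering literature, not necessarily the exact multiplier evaluated in this study.

```python
import math


def pearson(ratings_a, ratings_b):
    """Return (r, n): Pearson correlation over co-rated items and their count."""
    common = sorted(set(ratings_a) & set(ratings_b))
    n = len(common)
    if n < 2:
        return 0.0, n  # correlation is undefined with fewer than two points
    xs = [ratings_a[item] for item in common]
    ys = [ratings_b[item] for item in common]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0, n  # a constant rating vector has no defined correlation
    return cov / math.sqrt(var_x * var_y), n


def significance_weight(r, n_common, cutoff=50):
    """Scale r by the co-rated item count (assumed form: min(n, cutoff) / cutoff)."""
    return r * min(n_common, cutoff) / cutoff


def top_n_neighbors(target, ratings, n_neighbors=2, cutoff=50):
    """Rank all other users by significance-weighted similarity to `target`."""
    scored = []
    for user, user_ratings in ratings.items():
        if user == target:
            continue
        r, n_common = pearson(ratings[target], user_ratings)
        scored.append((user, significance_weight(r, n_common, cutoff)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n_neighbors]


# Toy data (hypothetical): u3 agrees perfectly with u1 but on only two items.
ratings = {
    "u1": {"m1": 5, "m2": 3, "m3": 4, "m4": 1},
    "u2": {"m1": 4, "m2": 2, "m3": 3, "m4": 1},
    "u3": {"m1": 5, "m2": 3},
    "u4": {"m1": 1, "m2": 5, "m3": 2, "m4": 5},
}
neighbors = top_n_neighbors("u1", ratings, n_neighbors=2, cutoff=4)
```

In this toy example the raw correlation of `u3` with `u1` is higher than that of `u2`, but it rests on only two co-rated items. After weighting, `u2` ranks first, which is the effect significance weighting is meant to produce.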
