Collaborative Filtering Algorithms’ Bias Towards Highly-liked Items

Recommender systems are automated tools that suggest suitable products and services to individual users based on their past preferences and characteristics, without requiring any effort on the user's part. In these systems, collaborative filtering algorithms are the most widely used approach for producing individual rating predictions or a ranked list of preferable items for each user. Although the effectiveness of such algorithms is generally assessed by the accuracy of the recommendations they provide, beyond-accuracy measures such as item catalog coverage are also considered critical to recommendation quality. However, many recent studies demonstrate that these algorithms tend to feature certain items more than others in the ranked lists they produce because of specific item properties (e.g., popularity). In this study, we examine item profiles from a different point of view, namely the degree to which items are liked, and investigate whether collaborative filtering algorithms are biased towards highly liked items. To this end, we adopt nine prominent collaborative filtering algorithms from three different categories and perform various experiments on two real-world datasets. The experimental results demonstrate that almost all of the algorithms are strongly biased towards highly liked items, and that matrix-factorization-based algorithms such as SVD and SVD++ are more successful than the others in producing high-quality recommendations.

Keywords:

Recommender Systems, Collaborative Filtering, Algorithmic Bias, Item Profile, Catalog Coverage
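
The two quantities the abstract refers to, catalog coverage and the degree to which items are liked, can be illustrated with a small sketch. The abstract does not give the study's exact definitions, so everything below is an assumption made for illustration: the function names, the toy ratings, and in particular the choice to define an item's liked degree as the share of its ratings at or above a threshold of 4 on a 5-point scale.

```python
from collections import defaultdict


def liked_degree(ratings, like_threshold=4.0):
    """Share of each item's ratings that are >= like_threshold.

    Assumption: this is one plausible definition of "how liked" an item is,
    not necessarily the one used in the study.
    """
    counts = defaultdict(lambda: [0, 0])  # item -> [liked, total]
    for _user, item, rating in ratings:
        counts[item][1] += 1
        if rating >= like_threshold:
            counts[item][0] += 1
    return {item: liked / total for item, (liked, total) in counts.items()}


def catalog_coverage(top_n_lists, catalog):
    """Fraction of catalog items that appear in at least one top-N list."""
    catalog = set(catalog)
    recommended = {item for items in top_n_lists.values() for item in items}
    return len(recommended & catalog) / len(catalog)


def mean_liked_degree(top_n_lists, degrees):
    """Average liked degree over all recommendation slots."""
    values = [degrees[item]
              for items in top_n_lists.values()
              for item in items
              if item in degrees]
    return sum(values) / len(values) if values else 0.0


# Hypothetical toy data: (user, item, rating on a 1-5 scale).
ratings = [
    ("u1", "A", 5), ("u2", "A", 4), ("u3", "A", 5),
    ("u1", "B", 2), ("u2", "B", 4),
    ("u3", "C", 1), ("u1", "C", 2),
    ("u2", "D", 3),
]
# Hypothetical top-N lists, as some recommender might produce them.
top_n = {"u1": ["A", "B"], "u2": ["A"], "u3": ["A", "B"]}

degrees = liked_degree(ratings)
catalog_mean = sum(degrees.values()) / len(degrees)
recommended_mean = mean_liked_degree(top_n, degrees)

print("coverage:", catalog_coverage(top_n, list(degrees)))
print("mean liked degree, catalog:", catalog_mean)
print("mean liked degree, recommendations:", recommended_mean)
```

Under these assumptions, a mean liked degree that is clearly higher in the recommendation lists than in the catalog as a whole, combined with low coverage, would suggest that the lists concentrate on highly liked items.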
