Feature selection for movie recommendation

TV users can choose from an abundance of movies, and the quantity and quality of data now available on both user behavior and content make better recommenders possible. In this paper, we evaluate and combine different content-based and collaborative recommendation methods for a Turkish movie recommendation system. Our recommendation methods can use a user's own behavior, different types of content features, and other users' behavior to predict movie ratings. We gather different types of data on movies, such as the description, actors, directors, year, and genre. We use natural language processing methods to convert the Turkish movie descriptions into keyword vectors. Then, for each user, we use the content features and the user's past implicit ratings to produce content feature-based user profiles. To make these profiles more reliable, we apply feature selection to them. We show that a different amount of feature selection may be optimal for each feature space, such as actor, director, or keyword. Different recommenders may also perform best depending on how many movies are available as training data for a user. We also combine different content-based and collaborative recommenders, either by aggregating them or by choosing the best of the available recommenders. Experimental results on a dataset with hundreds of users and movies show that feature selection can increase recommendation success, especially for users who have watched a small number of movies in the past.
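The profile-building and feature-selection steps described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names (`build_profile`, `select_features`, `score`), the toy data, and the selection rule are all assumptions made for illustration. In particular, a simple "keep the k most frequent features" rule stands in for whatever selection criterion the paper actually uses, and a watched movie is treated as an implicit positive rating.

```python
from collections import Counter


def build_profile(watched, movie_features):
    """Count how often each content feature occurs in the movies a user watched.

    Watching a movie is treated as an implicit positive rating (an assumption).
    """
    profile = Counter()
    for movie in watched:
        profile.update(movie_features.get(movie, ()))
    return profile


def select_features(profile, k):
    """Keep the k highest-weighted features; ties are broken alphabetically.

    Hypothetical stand-in for the paper's feature selection method.
    """
    ranked = sorted(profile.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:k])


def score(movie, profile, movie_features):
    """Score a candidate movie as the sum of profile weights of its features."""
    return sum(profile.get(f, 0) for f in movie_features.get(movie, ()))


# Hypothetical toy catalogue: each movie maps to a set of content features.
movie_features = {
    "m1": {"actor:A", "genre:drama"},
    "m2": {"actor:A", "genre:comedy"},
    "m3": {"actor:B", "genre:drama"},
    "m4": {"actor:A", "genre:drama"},
    "m5": {"actor:C", "genre:comedy"},
}

watched = ["m1", "m2", "m3"]
profile = build_profile(watched, movie_features)
selected = select_features(profile, k=2)

# Rank unseen movies using only the selected features.
scores = {m: score(m, selected, movie_features) for m in ("m4", "m5")}
print(selected)
print(scores)
```

In this sketch, `k` plays the role of the "amount of feature selection". Since the abstract reports that the best amount may differ per feature space, one could keep a separate profile and a separate `k` for actors, directors, and keywords rather than the single mixed profile used here.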

___

Turkish Journal of Electrical Engineering and Computer Sciences
  • ISSN: 1300-0632
  • Publication frequency: 6 issues per year
  • Publisher: TÜBİTAK