A hybrid sentiment analysis method for Turkish

This paper presents a hybrid methodology for Turkish sentiment analysis that combines lexicon-based and machine learning (ML)-based approaches. On the lexicon-based side, we use a sentiment dictionary that is extended with a synonym lexicon. On the ML side, we tackle the classification problem with three supervised classifiers: naive Bayes, support vector machines, and J48. Our hybrid methodology combines these two approaches by generating a new lexicon-based value according to our feature generation algorithm and feeding it as one of the features to the ML classifiers. Despite the linguistic challenges caused by the morphological structure of Turkish, the experimental results show that the hybrid methodology improves accuracy by 7% on average.
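The core idea, a lexicon-derived value appended to the feature vector seen by an ML classifier, can be illustrated with a minimal sketch. The abstract does not specify the feature generation algorithm, so everything below is an assumption made for illustration: the toy lexicon entries and their scores, the synonym mapping, the summed-polarity scoring rule, and the function names `lexicon_score` and `build_features` are all hypothetical. The sketch also ignores Turkish morphology (no stemming or suffix handling).

```python
from collections import Counter

# Toy polarity lexicon (hypothetical entries and scores, not taken from the paper).
LEXICON = {"güzel": 1.0, "iyi": 1.0, "kötü": -1.0, "berbat": -1.0}

# Toy synonym lexicon mapping a word to a lexicon entry (hypothetical).
SYNONYMS = {"hoş": "güzel", "fena": "kötü"}


def lexicon_score(tokens):
    """Sum polarity scores; fall back to a synonym when a token is not in the lexicon."""
    total = 0.0
    for tok in tokens:
        if tok in LEXICON:
            total += LEXICON[tok]
        elif tok in SYNONYMS and SYNONYMS[tok] in LEXICON:
            total += LEXICON[SYNONYMS[tok]]
    return total


def build_features(text, vocabulary):
    """Bag-of-words counts over `vocabulary`, with the lexicon score as the last feature."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    bow = [float(counts[w]) for w in vocabulary]
    # Assumed hybrid step: the lexicon-based value becomes one more feature.
    return bow + [lexicon_score(tokens)]


vocabulary = ["film", "otel", "hoş", "fena"]
features = build_features("film çok hoş", vocabulary)
```

A vector built this way could then be passed to any supervised classifier, such as naive Bayes, a support vector machine, or a decision tree, in place of the plain bag-of-words vector.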
