Extending a sentiment lexicon with synonym–antonym datasets: SWNetTR++

In our previous studies on developing a general-purpose Turkish sentiment lexicon, we constructed SWNetTR-PLUS, a sentiment lexicon of 37K words. In this paper, we show how to use Turkish synonym and antonym word pairs to extend SWNetTR-PLUS by almost 33% to obtain SWNetTR++, a Turkish sentiment lexicon of 49K words. The extension was done by transferring the problem into the graph domain, where nodes are words and edges are synonym–antonym relations between words, and propagating the existing tone and polarity scores to the newly added words using an algorithm we have developed. We tested the existing and new lexicons using a manually labeled Turkish news media corpus of 500 news texts. The results show that our method yielded a significantly more accurate lexicon than SWNetTR-PLUS, resulting in an accuracy increase from 72.2% to 80.4%. At this level, we have now maximized the accuracy rates of translation-based sentiment analysis approaches, which first translate a Turkish text to English and then do the analysis using English sentiment lexicons.
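To make the graph formulation concrete, the following is a minimal illustrative sketch and not the algorithm developed in this paper, whose details are not given in the abstract. It assumes a single signed polarity score per word (the tone score is omitted), that a synonym inherits its neighbor's score unchanged, and that an antonym receives the negated score. The function name `propagate_scores`, the breadth-first traversal order, and the example words and scores are all hypothetical.

```python
from collections import deque


def propagate_scores(seed_scores, synonyms, antonyms):
    """Spread polarity scores from scored words to unscored neighbors.

    seed_scores: dict mapping word -> polarity (the existing lexicon)
    synonyms, antonyms: iterables of (word_a, word_b) pairs
    Assumption: synonyms copy the score, antonyms flip its sign.
    """
    # Build an undirected signed graph: +1 edges for synonyms, -1 for antonyms.
    graph = {}
    for pairs, sign in ((synonyms, 1.0), (antonyms, -1.0)):
        for a, b in pairs:
            graph.setdefault(a, []).append((b, sign))
            graph.setdefault(b, []).append((a, sign))

    scores = dict(seed_scores)  # seed words keep their original scores
    queue = deque(w for w in seed_scores if w in graph)
    while queue:
        word = queue.popleft()
        for neighbor, sign in graph[word]:
            if neighbor not in scores:  # only newly added words get a score
                scores[neighbor] = sign * scores[word]
                queue.append(neighbor)
    return scores


# Hypothetical toy data: two scored seed words and three unscored words.
seeds = {"iyi": 0.8, "kötü": -0.7}
synonym_pairs = [("iyi", "güzel"), ("kötü", "fena")]
antonym_pairs = [("güzel", "çirkin")]
result = propagate_scores(seeds, synonym_pairs, antonym_pairs)
```

In this toy run the three unscored words are reached through synonym or antonym edges and the lexicon grows from two to five entries, while the seed scores stay untouched. A real propagation scheme would also need to decide how to combine conflicting scores arriving from several neighbors, which this sketch sidesteps by keeping the first score assigned.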
