Extending a sentiment lexicon with synonym--antonym datasets: SWNetTR++

In our previous studies on developing a general-purpose Turkish sentiment lexicon, we constructed SWNetTR-PLUS, a sentiment lexicon of 37K words. In this paper, we show how Turkish synonym and antonym word pairs can be used to extend SWNetTR-PLUS by almost 33 %, yielding SWNetTR++, a Turkish sentiment lexicon of 49K words. We performed the extension by recasting the problem as a graph in which nodes are words and edges are synonym and antonym relations between words, and then propagating the existing tone and polarity scores to the newly added words with an algorithm we have developed. We tested the existing and new lexicons on a manually labeled Turkish news media corpus of 500 news texts. The results show that our method yielded a significantly more accurate lexicon than SWNetTR-PLUS, raising accuracy from 72.2 % to 80.4 %. At this level, we have now maximized the accuracy rates of translation-based sentiment analysis approaches, which first translate a Turkish text to English and then perform the analysis using English sentiment lexicons.
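The graph formulation can be illustrated with a minimal Python sketch. This is not the propagation algorithm developed in the paper, which the abstract does not specify. It assumes a single polarity score per word in [-1, 1], treats a synonym edge as sign +1 and an antonym edge as sign -1, and gives each unscored word the average of the signed scores of its already scored neighbours. The function name `propagate_polarity` and the example words and scores are hypothetical.

```python
from collections import defaultdict


def propagate_polarity(seed_scores, synonyms, antonyms):
    """Spread polarity scores from scored words to unscored neighbours.

    Illustrative assumption: a synonym keeps the sign of the score,
    an antonym flips it, and several scored neighbours are averaged.
    """
    # Build an undirected signed graph: +1.0 for synonyms, -1.0 for antonyms.
    graph = defaultdict(list)
    for a, b in synonyms:
        graph[a].append((b, 1.0))
        graph[b].append((a, 1.0))
    for a, b in antonyms:
        graph[a].append((b, -1.0))
        graph[b].append((a, -1.0))

    scores = dict(seed_scores)  # existing lexicon entries are never overwritten
    while True:
        new_scores = {}
        for word, neighbours in graph.items():
            if word in scores:
                continue
            # Collect signed "votes" from neighbours that already have a score.
            votes = [sign * scores[n] for n, sign in neighbours if n in scores]
            if votes:
                new_scores[word] = sum(votes) / len(votes)
        if not new_scores:
            break  # no unscored word is reachable any more
        scores.update(new_scores)
    return scores


# Hypothetical toy data: one scored seed word and two unscored words.
seeds = {"mutlu": 0.8}
extended = propagate_polarity(
    seeds,
    synonyms=[("mutlu", "mesut")],
    antonyms=[("mesut", "mutsuz")],
)
```

In this toy run the seed word keeps its score, its synonym inherits it, and the antonym of that synonym receives the negated value. Words that no chain of relations connects to a scored word stay outside the extended lexicon.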

Keywords:

Turkish sentiment lexicon, sentiment analysis, sentiment lexicon graph model, GDELT, SWNetTR++


Turkish Journal of Electrical Engineering and Computer Science

ISSN: 1300-0632
Publication frequency: 6 issues per year
Publisher: TÜBİTAK
