Pre-Processing Techniques for Movie Review Sentiment Analysis: A Comparative Study for Best Feature Set Determination

Sentiment analysis is the process of extracting the overall expression, opinions, or feelings from reviews of items such as products, services, or movies. Pre-processing is a crucial phase of sentiment analysis in text mining because it lets us analyze reviews according to their intended meaning by removing superfluous words, that is, words that do not affect the semantics of a sentence. As a result, the number of features decreases, which is expected to improve classification accuracy. Accordingly, we designed an experiment to identify the most influential pre-processing techniques: we compare the techniques individually and in combination, and judge each by the number of features it produces and by the resulting classification accuracy. The comparison uses three classification algorithms, support vector machine (SVM), naive Bayes (NB), and decision tree (DT), applied after feature selection and feature extraction, with three tokenization techniques. We conclude that some of these techniques, such as lemmatization, have a negative effect; some make no noticeable difference; and only a small number have an appreciable effect.
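The feature-count side of this comparison can be illustrated with a minimal sketch. It is not the study's implementation: the toy reviews, the stop-word list, and the helper names (`tokenize`, `lowercase`, `remove_stop_words`, `count_features`) are all assumptions made for illustration, and only two simple pre-processing steps are shown. The sketch applies each configuration to the same reviews and reports the resulting vocabulary size, which stands in for the number of features.

```python
import re

# Hypothetical toy reviews; the study's actual dataset is not reproduced here.
REVIEWS = [
    "The movie was GREAT and the actors were amazing!",
    "The plot was boring, and the acting was terrible.",
    "Great acting, great plot, a great movie.",
]

# A tiny illustrative stop-word list (an assumption, not the study's list).
STOP_WORDS = {"the", "and", "was", "were", "a"}


def tokenize(text):
    """Split text into alphabetic tokens, keeping the original case."""
    return re.findall(r"[A-Za-z]+", text)


def lowercase(tokens):
    """Fold every token to lower case."""
    return [t.lower() for t in tokens]


def remove_stop_words(tokens):
    """Drop tokens that appear in STOP_WORDS (case-insensitive match)."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]


def count_features(reviews, steps):
    """Return the vocabulary size after applying the steps in order."""
    vocabulary = set()
    for review in reviews:
        tokens = tokenize(review)
        for step in steps:
            tokens = step(tokens)
        vocabulary.update(tokens)
    return len(vocabulary)


# Each configuration is one technique or a combination of techniques.
CONFIGURATIONS = {
    "none": [],
    "lowercase": [lowercase],
    "stop words": [remove_stop_words],
    "lowercase + stop words": [lowercase, remove_stop_words],
}

feature_counts = {
    name: count_features(REVIEWS, steps)
    for name, steps in CONFIGURATIONS.items()
}

for name, count in feature_counts.items():
    print(f"{name}: {count} features")
```

In a fuller experiment, each configuration's output would also be passed to a classifier so that accuracy can be reported alongside the feature count; that step is omitted here.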
