CLASSIFICATION OF TURKISH TWEETS BY DOCUMENT VECTORS AND INVESTIGATION OF THE EFFECTS OF PARAMETER CHANGES ON CLASSIFICATION SUCCESS

Natural language processing is a field of artificial intelligence that has gained popularity in recent years. Inferring sentiment from texts on a given topic and classifying documents have become increasingly important as data volumes grow. Understanding and interpreting written text is a human ability, but natural language processing, a sub-branch of machine learning and artificial intelligence, makes it possible to draw inferences from texts and to classify them automatically. In this study, Turkish tweets were classified using the two methods of the algorithm known in the literature as document vectors, and the effect of changes in the methods' parameters on classification success was investigated. The DBoW (Distributed Bag of Words) method yielded higher accuracy than the DM (Distributed Memory) method, and the DBoW-NS (Negative Sampling) architecture yielded higher accuracy than the other architectures.
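
The method and architecture combinations named in the abstract can be pictured as a small parameter grid. The sketch below is illustrative only: the abstract does not name a toolkit or the alternative to negative sampling, so it assumes keyword names as used by gensim's Doc2Vec (`dm`, `hs`, `negative`, `vector_size`, `window`) and assumes hierarchical softmax (HS) as the other output layer. The helper `doc2vec_configs` and its default values are hypothetical and are not taken from the study.

```python
from itertools import product


def doc2vec_configs(vector_size=100, window=5, negative=5):
    """Build one keyword-argument dict per method/architecture pair.

    Hypothetical helper: the keys mirror gensim's Doc2Vec parameter
    names (an assumed library), and the default values are placeholders.
    """
    configs = {}
    for mode, output in product(("DBoW", "DM"), ("NS", "HS")):
        configs[f"{mode}-{output}"] = {
            # dm=0 selects Distributed Bag of Words, dm=1 Distributed Memory
            "dm": 1 if mode == "DM" else 0,
            # hs=1 selects hierarchical softmax (assumed alternative to NS)
            "hs": 1 if output == "HS" else 0,
            # a positive count of noise words enables negative sampling
            "negative": negative if output == "NS" else 0,
            "vector_size": vector_size,
            "window": window,
        }
    return configs


if __name__ == "__main__":
    for name, params in doc2vec_configs().items():
        print(name, params)
```

Each dictionary could be passed to a document-vector trainer, with the resulting vectors fed to a classifier, so that accuracy is compared across the four settings while `vector_size` or `window` is varied.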

Sigma Journal of Engineering and Natural Sciences
  • ISSN: 1304-7191
  • Established: 1983
  • Publisher: Yıldız Technical University