The Corpus Based Approach to Sentiment Analysis in Modern Standard Arabic and Arabic Dialects: A Literature Review

Sentiment Analysis is the analysis of ideas, emotions, evaluations, values, attitudes and feelings about products, services, companies, individuals, tasks, events, titles and their characteristics. With the growth of Internet applications and social networks, Sentiment Analysis has taken a considerable place in text mining research and is used to explore users' opinions about products and topics discussed online. The Sentiment Analysis literature shows that the Internet information sources underlying the analysis are mostly in English. Developments in Natural Language Processing and Computational Linguistics have contributed positively to Sentiment Analysis studies on languages other than English. The purpose of this study is to review the literature on Sentiment Analysis of Arabic Internet information sources. The review covers studies that follow the corpus-based approach and draw on Arabic Internet information sources, focusing on works that build their own corpora for Modern Standard Arabic and Arabic dialects and perform sentiment analysis on them.
