Xhemal Zenuni, Jaumin Ajdari, Florije Ismaili, Bujar Raufi

AUTOMATIC HATE SPEECH DETECTION IN ONLINE CONTENTS USING LATENT SEMANTIC ANALYSIS

The Internet in general, and social media in particular, have greatly facilitated communication, interaction and collaboration among people and different entities. Because these media are generally uncensored, they are sometimes used to spread discourse containing hateful messages that target ethnic, religious or sexual groups, which may escalate into violent acts against members of those groups. We therefore explore building an automatic classifier for detecting hate speech on public Albanian-language pages. We create a hate speech corpus for the Albanian language and, based on the Support Vector Machine (SVM) approach, propose an automatic hate speech detection system. Such a system can be used to detect and analyze hate speech in online content over time and to improve our understanding of how it affects opinion formation in society.

Keywords:

Hate speech detection, text classification, support vector machines, NLP, Albanian language
