OVERVIEW AND COMPARISON OF THREE CLASSIFIERS: ARABIC DOCUMENTS AS A CASE STUDY

Text classification is used in many fields of research and application, such as information retrieval, text mining, and data mining. This study tests the Naïve Bayes, K-Nearest Neighbors, and Support Vector Machine algorithms on a relatively large dataset of Arabic documents. The dataset comprises 1,000 Arabic documents distributed across 10 classes. The comparison is based on the recall and precision measures. The evaluation results show that the Support Vector Machine classifier outperforms the other two.
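
To make the two evaluation measures concrete, the following is a minimal sketch of how per-class precision and recall can be computed for a single-label classifier. The class names and label lists are hypothetical toy data, not taken from the study, and the macro-averaging step is an assumption, since the text above does not say how scores are aggregated across the 10 classes.

```python
from collections import Counter


def per_class_precision_recall(true_labels, predicted_labels):
    """Return {class: (precision, recall)} for single-label classification."""
    true_count = Counter(true_labels)            # documents that belong to each class
    predicted_count = Counter(predicted_labels)  # documents assigned to each class
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)

    scores = {}
    for label in set(true_labels) | set(predicted_labels):
        # Precision: share of documents assigned to the class that belong to it.
        precision = correct[label] / predicted_count[label] if predicted_count[label] else 0.0
        # Recall: share of documents in the class that were assigned to it.
        recall = correct[label] / true_count[label] if true_count[label] else 0.0
        scores[label] = (precision, recall)
    return scores


def macro_average(scores):
    """Unweighted mean of precision and recall over classes (assumed aggregation)."""
    n = len(scores)
    mean_precision = sum(p for p, _ in scores.values()) / n
    mean_recall = sum(r for _, r in scores.values()) / n
    return mean_precision, mean_recall


# Hypothetical toy data: three classes, six documents.
true_labels = ["sports", "sports", "economy", "economy", "health", "health"]
predicted_labels = ["sports", "economy", "economy", "economy", "health", "sports"]

scores = per_class_precision_recall(true_labels, predicted_labels)
macro_precision, macro_recall = macro_average(scores)
```

Running the same two functions on the output of each of the three classifiers would give directly comparable precision and recall figures per class.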
