LBP Feature Extraction and Statistical Pooling-Based Image Spam Detection Model

Email, short for electronic mail, is a form of digital communication between two or more people. Although such tools make communication easier, they can also affect our lives negatively because of junk emails, widely known as spam. These messages are typically sent for commercial purposes by organizations or individuals seeking direct or indirect benefit. They not only distract recipients but also consume significant system resources such as processing power, memory, and network bandwidth. In this study, a method based on LBP (Local Binary Patterns) feature extraction and statistical pooling is proposed to classify images as spam or ham (non-spam). Two datasets are used to test the proposed method. The ISH dataset is widely used in the literature and contains 1738 images. In addition, a dataset that we collected contains 1015 images in total. Features were extracted from these images and classified with the SVM (Support Vector Machine) algorithm. The proposed method achieved an accuracy of 98.56% on the ISH dataset and 79.01% on our collected dataset. The results were compared with those of other studies in the literature.
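The feature-extraction idea described above can be illustrated with a minimal sketch. The abstract does not specify the LBP variant, the block size, or which statistics are pooled, so everything below is an assumption made for illustration: a basic 3x3 LBP, non-overlapping square blocks, and the mean, standard deviation, and maximum of each histogram bin across blocks. The function names `lbp_codes`, `lbp_histogram`, and `pooled_features` are hypothetical and do not come from the paper.

```python
from statistics import mean, pstdev

# Clockwise 8-neighbour offsets (row, col), starting at the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_codes(image):
    """Basic 3x3 LBP codes for the interior pixels of a grayscale image.

    `image` is a list of rows of intensities. A neighbour contributes a
    1-bit when its value is greater than or equal to the centre pixel.
    """
    rows, cols = len(image), len(image[0])
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            centre = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(NEIGHBOURS):
                if image[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes


def lbp_histogram(codes):
    """Normalised 256-bin histogram of LBP codes."""
    hist = [0] * 256
    total = 0
    for row in codes:
        for code in row:
            hist[code] += 1
            total += 1
    return [count / total for count in hist]


def pooled_features(image, block=8):
    """Assumed pooling scheme: one LBP histogram per non-overlapping block,
    then mean, population std, and max of every bin across the blocks."""
    rows, cols = len(image), len(image[0])
    histograms = []
    for r0 in range(0, rows - block + 1, block):
        for c0 in range(0, cols - block + 1, block):
            patch = [row[c0:c0 + block] for row in image[r0:r0 + block]]
            histograms.append(lbp_histogram(lbp_codes(patch)))
    features = []
    for bin_values in zip(*histograms):
        features.extend([mean(bin_values), pstdev(bin_values), max(bin_values)])
    return features


# A 16x16 constant image gives four 8x8 blocks and a 768-value vector.
flat_image = [[10] * 16 for _ in range(16)]
vector = pooled_features(flat_image)
```

Under these assumptions, each image yields a fixed-length vector (256 bins times 3 statistics) regardless of its size, which is what a classifier such as an SVM needs as input. In practice the vectors could be passed to an off-the-shelf SVM implementation, for example scikit-learn's `SVC`; that choice is also an assumption rather than something stated in the abstract.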
