LBP Özellik Çıkarma ve İstatistiksel Havuzlama Tabanlı Görüntü Spam Tespit Modeli

Elektronik posta anlamına gelen e-posta, iki veya daha fazla kişi arasındaki bir dijital iletişim biçimidir. İletişimi kolaylaştıran bu teknolojik araçlar, yaygın olarak spam mail olarak bilinen önemsiz e-postalar nedeniyle hayatımızı olumlu ve olumsuz bir şekilde etkileyebilir. Genellikle kuruluşlar/bireyler tarafından dolaylı veya doğrudan çıkar elde etmek amacıyla ticari amaçlarla iletilen bu spam iletiler, yalnızca insanların dikkatini dağıtmakla kalmaz, aynı zamanda işlem gücü, bellek ve ağ bant genişliği gibi önemli miktarda sistem kaynağını da tüketir. Bu çalışmada, spam veya ham (spam olmayan) görüntüleri sınıflandırmak için LBP (Local Binary Patterns) özellik çıkarma ve istatistiksel havuzlamaya dayalı bir yöntem önerilmiştir. Önerilen yöntemi test etmek için iki veri seti kullanılmıştır. ISH veri seti literatürde yaygın olarak kullanılmaktadır ve 1738 görüntü içermektedir. Bu veri setine ek olarak, topladığımız veri seti toplamda 1015 görüntüden oluşmaktadır. Bu görüntüler üzerinde özellik çıkarımı yapılmıştır. Elde edilen öznitelikler SVM (Support Vector Machine) algoritması ile sınıflandırılmıştır. Önerilen yöntemde ISH veri setinde %98.56, topladığımız veri setinde ise %79.01 doğruluk oranı hesaplanmıştır. Elde edilen sonuçlar literatürdeki çalışmalarla karşılaştırıldı.

Anahtar Kelimeler:

Spam görüntü algılama, Makine öğrenimi, LBP özellik çıkarma, SVM.

LBP Feature Extraction and Statistical Pooling-Based Image Spam Detection Model

E-mail, short for electronic mail, is a form of digital communication between two or more individuals. Tools that facilitate communication in this way can affect our lives both positively and negatively, the latter largely because of junk e-mails, widely known as spam. These spam messages, typically sent by organizations or individuals for commercial purposes to obtain direct or indirect benefit, not only distract people but also consume a significant amount of system resources such as processing power, memory, and network bandwidth. In this study, a method based on LBP (Local Binary Patterns) feature extraction and statistical pooling is proposed to classify images as spam or ham (non-spam). Two datasets were used to test the proposed method. The ISH dataset, which is widely used in the literature, contains 1738 images. In addition, the dataset we collected consists of 1015 images in total. Features were extracted from these images and classified with the SVM (Support Vector Machine) algorithm. The proposed method achieved an accuracy of 98.56% on the ISH dataset and 79.01% on our collected dataset. The results were compared with those of other studies in the literature.

Keywords:

Spam image detection, Machine learning, LBP feature extraction, SVM.
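
The abstract names LBP feature extraction and statistical pooling but does not give the details, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the basic 3x3 LBP operator (each of the 8 neighbours contributes one bit, set when the neighbour is greater than or equal to the centre pixel) and an illustrative choice of pooling statistics (mean, standard deviation, minimum, maximum). The names `lbp_codes` and `pooled_features` are hypothetical, and the SVM classification step is left out.

```python
from statistics import mean, pstdev

# Clockwise 8-neighbour offsets (row, col), starting at the top-left pixel.
# The bit order is an assumption; any fixed order gives a valid LBP code.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_codes(gray):
    """Return basic 3x3 LBP codes for the interior pixels of a 2D grayscale image."""
    rows, cols = len(gray), len(gray[0])
    codes = []
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = gray[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(NEIGHBOURS):
                # A neighbour at least as bright as the centre sets its bit.
                if gray[r + dr][c + dc] >= center:
                    code |= 1 << bit
            codes.append(code)
    return codes


def pooled_features(codes):
    """Summarise the LBP codes with a few statistics (assumed pooling choice)."""
    return [mean(codes), pstdev(codes), min(codes), max(codes)]


# Tiny synthetic image: a bright 2x2 block on a dark background.
image = [
    [10, 10, 10, 10],
    [10, 50, 50, 10],
    [10, 50, 50, 10],
    [10, 10, 10, 10],
]
codes = lbp_codes(image)
features = pooled_features(codes)
print(codes, features)
```

In a full pipeline, one such feature vector per image would be passed to a classifier such as an SVM. The statistics actually used in the study may differ from the four chosen here.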



ISSN: 2548-1304
Publication Frequency: 2 issues per year
First Published: 2016
Publisher: Ali KARCI

Authors: Aytaç KAŞOĞLU, Orhan YAMAN