Image Based Malware Classification with Multimodal Deep Learning

Many methods exist today for analyzing and detecting malware. Some rely on statistical analysis, some on static and dynamic analysis, and some on machine learning. Studies that classify malware with statistical machine learning generally depend on complex feature extraction, and extracting those features manually is tedious. Deep learning methods can extract complex features automatically, which simplifies this step. This study proposes a novel multimodal convolutional neural network-based deep learning architecture, together with a singular value decomposition-based image feature extraction method, to classify malware files using intermediate-level feature fusion. The performance of classical machine learning algorithms, neural networks, and the proposed multimodal convolutional neural network is compared. The proposed algorithm is also compared with results reported in the literature for studies that use the same data set. The experimental results show that the proposed method matches or outperforms the other methods even though it uses no manual feature extraction. With this architecture, intermediate fusion appears to capture more specific features more effectively than the other methods, which in turn improves performance.
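The two ideas named in the abstract, singular value decomposition-based image features and intermediate-level fusion, can be illustrated with a minimal NumPy sketch. It is not the architecture proposed in the study: the image width, the rank `k`, the helper names `bytes_to_image`, `svd_features`, and `fuse`, and the use of a column mean as a stand-in for learned CNN branch features are all assumptions made for illustration, and the input is random bytes rather than a real malware sample.

```python
import numpy as np

def bytes_to_image(data: bytes, width: int = 64) -> np.ndarray:
    """Reshape raw bytes into a 2-D grayscale image, zero-padding the last row."""
    arr = np.frombuffer(data, dtype=np.uint8)
    height = -(-len(arr) // width)  # ceiling division
    padded = np.zeros(height * width, dtype=np.uint8)
    padded[:len(arr)] = arr
    return padded.reshape(height, width)

def svd_features(image: np.ndarray, k: int = 8):
    """Return the top-k singular values and the rank-k approximation of the image."""
    u, s, vt = np.linalg.svd(image.astype(np.float64), full_matrices=False)
    k = min(k, len(s))
    low_rank = (u[:, :k] * s[:k]) @ vt[:k, :]
    return s[:k], low_rank

def fuse(features_a, features_b) -> np.ndarray:
    """Simplest form of intermediate fusion: concatenate per-modality features."""
    return np.concatenate([np.ravel(features_a), np.ravel(features_b)])

# Hypothetical input: random bytes standing in for an executable file.
rng = np.random.default_rng(0)
fake_binary = rng.integers(0, 256, size=5000, dtype=np.uint8).tobytes()

image = bytes_to_image(fake_binary, width=64)
singular_values, low_rank = svd_features(image, k=8)

# Stand-in for learned features: a real model would use CNN branch activations
# for each modality instead of a column mean and raw singular values.
fused = fuse(image.mean(axis=0), singular_values)
```

In a multimodal network, each modality (for example the raw byte image and its SVD-derived counterpart) would pass through its own convolutional branch, and the concatenation would happen on the intermediate feature maps before the shared classification layers.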
