Forensic Analysis of APT Attacks based on Unsupervised Machine Learning

Advanced Persistent Threats (APTs) have become a major concern for many enterprise networks. An APT can remain undetected for a long time and lead to undesirable consequences such as theft of sensitive data and disrupted workflows. APTs often use evasion techniques to avoid detection by security systems such as Intrusion Detection Systems (IDS), Security Information and Event Management (SIEM) systems, or firewalls. These techniques also make it difficult to identify the root cause through forensic analysis. Companies therefore try to identify APTs by defining rules on their IDS. However, besides the time and effort needed to iteratively refine those rules, rule-based detection cannot catch new attacks. In this paper, we propose a framework to detect and conduct forensic analysis of APTs in HTTP and SMTP traffic. At the heart of the proposed framework is a detection algorithm driven by unsupervised machine learning. Experimental results on public datasets demonstrate the effectiveness of the proposed framework, with a detection rate above 80% and a false-positive rate below 5%.
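
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how a detection rate and a false-positive rate are conventionally computed from labelled events. The function name `detection_metrics` and the toy labels are assumptions made for illustration; they are not taken from the paper, and the numbers are not the paper's results.

```python
def detection_metrics(y_true, y_pred):
    """Return (detection_rate, false_positive_rate) for binary labels.

    y_true and y_pred are equal-length sequences of 0 (benign) and 1 (attack).
    Detection rate      = TP / (TP + FN)  (share of attacks that were flagged)
    False-positive rate = FP / (FP + TN)  (share of benign events that were flagged)
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_positive_rate


# Toy data (hypothetical): 10 attack events followed by 40 benign events.
y_true = [1] * 10 + [0] * 40
# The hypothetical detector flags 9 of the 10 attacks and 1 of the 40 benign events.
y_pred = [1] * 9 + [0] + [1] + [0] * 39

dr, fpr = detection_metrics(y_true, y_pred)
print(f"detection rate = {dr:.3f}, false-positive rate = {fpr:.3f}")
```

On this toy data, 9 of 10 attacks are flagged and 1 of 40 benign events is flagged. The two rates use different denominators (attacks versus benign events), so they do not sum to one and must be reported separately.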
