Nazife BAYKAL, Emre SÜREN, Pelin ANGIN

I see EK: A lightweight technique to reveal exploit kit family by overall URL patterns of infection chains

The prevalence and continuously evolving technical sophistication of exploit kits (EKs) is one of the most challenging shifts in the modern cybercrime landscape. Over the last few years, malware infections via drive-by download attacks have been orchestrated with EK infrastructures. Malicious advertisements and compromised websites redirect victim browsers to web-based EK families that are assembled to exploit client-side vulnerabilities and finally deliver malicious payloads. A key observation is that while webpage contents differ drastically between distinct intrusions executed through the same EK, the patterns in the URL addresses stay similar. This is because URLs autogenerated by EK platforms follow specific templates. This practice enables the development of an efficient system that is capable of classifying the responsible EK instances. This paper proposes novel URL features and a new technique to quickly categorize EK families with high accuracy using machine learning algorithms. Rather than analyzing each URL individually, the proposed overall URL patterns approach automatically examines all URLs associated with an EK infection. The method has been evaluated with a popular and publicly available dataset that contains 240 different real-world infection cases involving over 2250 URLs, with the incidents linked to the 4 major EK flavors that occurred throughout the year 2016. The system achieves up to 100% classification accuracy with the tested estimators.
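
To make the idea of chain-level ("overall") URL patterns concrete, the following is a minimal sketch, not the method evaluated in the paper. It assumes four illustrative lexical features per URL (length, path depth, query parameter count, digit ratio), averages them over all URLs of one infection chain, and uses a plain nearest-centroid rule as a stand-in for the machine learning estimators mentioned above. The feature choices, function names, family labels, and example URLs are all hypothetical.

```python
from statistics import mean
from urllib.parse import urlsplit, parse_qsl


def url_features(url):
    """Lexical features of one URL (illustrative choices, not the paper's feature set)."""
    parts = urlsplit(url)
    path_segments = [s for s in parts.path.split("/") if s]
    query_params = parse_qsl(parts.query, keep_blank_values=True)
    digit_count = sum(ch.isdigit() for ch in url)
    return [
        len(url),                        # total URL length
        len(path_segments),              # path depth
        len(query_params),               # number of query parameters
        digit_count / max(len(url), 1),  # share of digit characters
    ]


def chain_features(urls):
    """Average the per-URL features over every URL of one infection chain."""
    rows = [url_features(u) for u in urls]
    return [mean(column) for column in zip(*rows)]


def fit_centroids(chains, labels):
    """Compute one mean feature vector (centroid) per EK family label."""
    grouped = {}
    for urls, label in zip(chains, labels):
        grouped.setdefault(label, []).append(chain_features(urls))
    return {
        label: [mean(column) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(centroids, urls):
    """Return the family whose centroid is closest (squared Euclidean distance)."""
    vector = chain_features(urls)

    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(vector, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Synthetic chains with made-up URLs and family names, for illustration only.
train_chains = [
    ["http://a.example.test/index.php?id=1&t=22",
     "http://a.example.test/view.php?id=33&t=4"],
    ["http://b.example.test/x/y/z/aaaa/bbbb/cccc/dddd/eeee.html"],
]
train_labels = ["family_a", "family_b"]
centroids = fit_centroids(train_chains, train_labels)
```

The features here are left unscaled, so URL length dominates the distance; a real pipeline would normally standardize the features and replace the centroid rule with a trained estimator.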

