Fatih TİRYAKİ, Ümit ŞENTÜRK, İbrahim YÜCEDAĞ

Development and Evaluation of an Artificial Intelligence Model for Malicious URL Detection

Internet use grows every year and has become a very important part of our lives; it has a significant effect on encouraging and growing business in many applications, including new communication technologies, social networks, e-commerce, and online banking. In this study, we aimed to work with a large dataset and obtain the best result in detecting malicious URLs with the artificial intelligence model we used. A 7-layer RNN model was used; two similar datasets, one national and one international, were merged to create a very large new dataset of 579,112 URLs to run through the model. This new dataset was then split into training and test sets. The model was first trained on the first set and then tested on the second. When this dataset was processed by our model, a success rate above 91% was obtained. This rate is a very good result for detecting malicious URLs. As internet use grows, this study makes an important contribution to developing more effective methods for detecting malicious sites; the parallel use of artificial intelligence models makes such sites easier to detect, and the aim is to help protect users from various types of cyber attack.

Keywords:

Malicious URL, Cyber Security, Artificial Intelligence, RNN, Accuracy

Developing and Evaluating an Artificial Intelligence Model for Malicious URL Detection

Internet use grows every year and has become an important part of our lives; new communication technologies, social networks, e-commerce, online banking, and many other applications have a significant impact on promoting and growing business. In this study, we aimed to work with a large dataset and achieve the best possible result in detecting malicious URLs with an artificial intelligence model. A 7-layer RNN model was used, and two similar datasets, one national and one international, were merged into a large new dataset of 579,112 URLs. This new dataset was then divided into training and test sets. The model was first trained on the first set and then tested on the second. On this dataset, the model achieved a success rate of over 91%, which is a very good result for malicious URL detection. As internet usage increases, this work contributes to the development of more effective methods for detecting harmful sites; the parallel use of artificial intelligence models makes the detection of such sites more effective, and the aim is to help protect users from various types of cyber-attacks.
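The data handling summarized in the abstract (merging two labeled URL sets, splitting the result into training and test sets, and reporting accuracy) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the exact-match deduplication rule, the 25% test fraction, the label convention (1 = malicious, 0 = benign), and the toy `example` URLs are all assumptions made for illustration, and the RNN itself is left out.

```python
import random


def merge_url_datasets(first, second):
    """Merge two lists of (url, label) pairs, dropping repeated URLs.

    Assumption: two rows are duplicates when their URLs match exactly
    after surrounding whitespace is stripped.
    """
    merged = {}
    for url, label in list(first) + list(second):
        # Keep the first label seen for a URL (an arbitrary tie-break).
        merged.setdefault(url.strip(), label)
    return sorted(merged.items())


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle deterministically, then split into (train, test) lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(round(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that equal the true labels."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and equally long")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical toy data; 1 = malicious, 0 = benign.
national = [
    ("http://bad.example/login", 1),
    ("http://evil.example/pay", 1),
]
international = [
    ("https://www.example.org/", 0),
    ("http://bad.example/login", 1),  # duplicate of a national entry
    ("https://docs.example.com/guide", 0),
]

rows = merge_url_datasets(national, international)
train_rows, test_rows = train_test_split(rows)
print(len(rows), len(train_rows), len(test_rows))
```

In a full pipeline, a classifier would be fitted on `train_rows` and its predictions for `test_rows` passed to `accuracy`; a figure such as the 91% reported above is this kind of ratio computed on the held-out set.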

Keywords:

Malicious URL, Cyber Security, Artificial Intelligence, RNN Model, Accuracy

