Measuring the Security Effectiveness of Machine Learning Methods Used Against Cyber Attacks in Web Applications

The rapid pace of recent technological development, and the speed with which large numbers of people follow these developments and share content online, have drawn the attention of cybercriminals. People now meet basic needs, make requests, share content, and do their work on smart devices over the internet. In doing so, users can unintentionally leave an opening through web applications, and as a result information specific to the user can easily fall into the hands of others. Activity carried out through websites has increased substantially in recent years; one of the causes of this increase, and the most important, is the worldwide pandemic. Cybercriminals seek to exploit such situations for financial gain: they look for vulnerabilities in heavily used websites and try to obtain users' account and card information. This study proposes an approach that measures the performance of machine learning methods against the security vulnerabilities of various websites. The data set used in the study consists of the parameter features of 1000 websites. The experimental analysis used the Multilayer Perceptron, Support Vector Machine, Decision Tree, Naive Bayes, and Random Forest methods. The overall accuracies obtained with these methods were 74%, 73.7%, 100%, 69.5%, and 100%, respectively. The experimental analysis shows that machine learning methods are effective in detecting cyber attacks.
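The overall accuracy figures quoted in the abstract are the share of test samples a classifier labels correctly. The following is a minimal sketch of that calculation, assuming a binary task (1 = attack, 0 = benign). The labels, the names `model_a` and `model_b`, and the helper functions are hypothetical illustrations and are not taken from the study's data or code.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            counts["tp"] += 1
        elif truth != positive and pred != positive:
            counts["tn"] += 1
        elif truth != positive and pred == positive:
            counts["fp"] += 1
        else:
            counts["fn"] += 1
    return counts


def accuracy(y_true, y_pred):
    """Overall accuracy: correctly labeled samples divided by all samples."""
    if not y_true:
        raise ValueError("y_true must not be empty")
    c = confusion_counts(y_true, y_pred)
    return (c["tp"] + c["tn"]) / len(y_true)


# Hypothetical toy labels (1 = attack, 0 = benign); not data from the study.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = {
    "model_a": [1, 0, 1, 1, 0, 0, 1, 0],  # agrees with every label
    "model_b": [1, 0, 0, 1, 0, 1, 1, 0],  # one missed attack, one false alarm
}

scores = {name: accuracy(y_true, y_pred) for name, y_pred in predictions.items()}
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.1%}")
```

Comparing several classifiers then amounts to computing this score for each model's predictions on the same held-out test set and ranking the results.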
