Determination of Cyber Security Awareness Levels of Students with Machine Learning Methods

With the rapid development of information and communication technologies, the number of devices that use the internet has increased, and these devices have entered every area of life. These developments have also raised the risk that users and devices will encounter cyber threats. This study aims to determine students' cyber security awareness levels with machine learning methods. To this end, data were collected with a survey from a sample that is statistically representative of undergraduate students. The data were analyzed under a descriptive survey model, and the results of that analysis are presented in the study. The data set built from the survey responses was then used to determine the students' cyber security awareness levels with the Naive Bayes, Decision Tree, Random Forest, Nearest Neighbor, XGBoost, Gradient Boost, Support Vector Machine, and Multi-Layer Perceptron algorithms. The tests yielded accuracy values between 0.7 and 0.98 and F1 scores between 0.7 and 0.96. The Multi-Layer Perceptron algorithm performed best, with an accuracy of 0.98 and an F1 score of 0.96.

Keywords:

Cyber security, Cyber security awareness level, Machine learning
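
The abstract reports both accuracy and F1 score for each algorithm. As a rough illustration of how the two metrics are computed and why they can differ, the sketch below uses only the Python standard library. The labels "high" and "low", the ten example predictions, and the macro averaging of F1 are all assumptions made for illustration; the abstract does not state the class labels or the averaging scheme the study used.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_for_label(y_true, y_pred, label):
    """F1 score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


# Hypothetical awareness labels for ten students (not data from the study).
y_true = ["high"] * 8 + ["low"] * 2
y_pred = ["high"] * 8 + ["high", "low"]  # one "low" student misclassified

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

In this toy case a single error on the smaller class leaves accuracy at 9/10 while pulling the macro F1 noticeably lower, which is one way an F1 score can sit below the accuracy for the same model.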
