Makine Öğrenmesi Yöntemleriyle Anormal Ağ Trafiğinin Tespit Edilmesi

Detection of Abnormal Network Traffic by Machine Learning Methods

As computer networks and the applications built on them grow, the damage caused by network attacks is expected to increase significantly. Intrusion Detection Systems (IDS) are among the most important defenses against the steadily growing number of network attacks. The aim is to train an IDS with machine learning algorithms so that, after training, it detects attacks in real time as they occur and allows the necessary countermeasures to be taken. This study uses decision tree and random forest methods to classify normal and abnormal packets flowing through computer networks. The classifiers base their decisions on 78 variables extracted with CICFlowMeter from the PCAP file in which the network traffic is recorded. The results show that the proposed method classifies more than one million records with close to 100% success and is effective in detecting abnormal traffic.

Keywords:

Intrusion Detection Systems, Decision Tree, Random Forest
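
The classification step summarized in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy are available and uses synthetic data in place of real CICFlowMeter output; the helper names `make_synthetic_flows` and `train_and_score`, the sample size, the class shift, and the hyperparameters are all hypothetical choices made for illustration, not the settings used in the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

N_FEATURES = 78  # number of flow variables named in the abstract


def make_synthetic_flows(n_samples=2000, seed=0):
    """Return (X, y), a hypothetical stand-in for CICFlowMeter features.

    Label 0 = normal, 1 = abnormal. Abnormal rows are shifted in the first
    five columns so the two classes are separable; this is an assumption
    of the sketch, not a property of real traffic.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, N_FEATURES))
    y = rng.integers(0, 2, size=n_samples)
    X[y == 1, :5] += 4.0
    return X, y


def train_and_score(seed=0):
    """Fit a decision tree and a random forest, return test accuracies."""
    X, y = make_synthetic_flows(seed=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y
    )
    models = {
        "decision_tree": DecisionTreeClassifier(random_state=seed),
        "random_forest": RandomForestClassifier(
            n_estimators=50, random_state=seed
        ),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


if __name__ == "__main__":
    for name, acc in train_and_score().items():
        print(f"{name}: accuracy = {acc:.3f}")
```

To try the same flow on recorded traffic, the synthetic generator would be replaced by loading the feature table that CICFlowMeter produces from the PCAP file; the column names and label encoding of that table are not given here and would need to be checked against the actual output.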

