Comparative analysis of classification techniques for network fault management

Network troubleshooting is an important process that has been studied extensively. Its first step is collecting information in order to identify problems. Syslog messages, which are sent by almost all network devices, contain a massive amount of data about network problems. Several studies have found that analyzing syslog data can serve as a guide to network problems and their causes. Detecting network problems becomes more efficient when the detected problems are classified by network layer. Classifying syslog data requires identifying the syslog messages that describe the network problems at each layer, and it also requires taking into account the syslog formats used by different vendors' devices. This study proposes a method for classifying the syslog messages that identify network problems, with the classification based on the network layers. The method uses a data mining tool to classify the syslog messages, and the description part of each syslog message is used for the classification. The relevant syslog messages were identified, and features were then selected to train the classifiers. Six classification algorithms were trained: LibSVM, SMO, KNN, Naïve Bayes, J48, and Random Forest. A real dataset obtained from an educational network device was used for the prediction stage. LibSVM outperformed the other classifiers in terms of the probability rate of the classified instances, which ranged from 32.80% to 89.90%. Furthermore, the validation results indicate that the probability rate of the correctly classified instances exceeds 70%.
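The pipeline summarized above (extract the description part of a syslog message, then classify it by network layer) can be illustrated with a minimal sketch. It is not the study's implementation: it assumes Cisco-style messages of the form `%FACILITY-SEVERITY-MNEMONIC: description`, uses a small hand-written multinomial Naïve Bayes with Laplace smoothing in place of the data mining tool, and trains on six illustrative messages whose layer labels are assumptions made for this example. The names `description_of`, `tokenize`, and `NaiveBayes` are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Assumed Cisco-style layout: %FACILITY-SEVERITY-MNEMONIC: description
SYSLOG_RE = re.compile(
    r"%(?P<facility>[A-Z0-9_]+)-(?P<severity>\d)-(?P<mnemonic>[A-Z0-9_]+):\s*"
    r"(?P<description>.*)"
)

def description_of(message):
    """Return the free-text description part, or the whole message if no match."""
    match = SYSLOG_RE.search(message)
    return match.group("description") if match else message

def tokenize(text):
    """Lowercase the text and keep alphabetic runs only (digits are dropped)."""
    return re.findall(r"[a-z]+", text.lower())

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)   # label -> token counts
        self.class_counts = Counter(labels)       # label -> number of messages
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            denom = total_words + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Illustrative training messages; the layer labels are assumptions for this sketch.
TRAINING = [
    ("%LINK-3-UPDOWN: Interface GigabitEthernet0/1, changed state to down", "physical"),
    ("%PHY-4-SFP_NOT_SUPPORTED: The SFP in Gi0/1 is not supported", "physical"),
    ("%LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/1, "
     "changed state to down", "data link"),
    ("%SPANTREE-2-LOOPGUARD_BLOCK: Loop guard blocking port FastEthernet0/2 "
     "on VLAN0010", "data link"),
    ("%OSPF-5-ADJCHG: Process 1, Nbr 10.0.0.2 on Vlan10 from FULL to DOWN, "
     "Neighbor Down: Dead timer expired", "network"),
    ("%BGP-5-ADJCHANGE: neighbor 192.0.2.1 Down BGP Notification sent", "network"),
]

model = NaiveBayes().fit(
    [description_of(msg) for msg, _ in TRAINING],
    [layer for _, layer in TRAINING],
)

if __name__ == "__main__":
    new_message = ("Mar  1 00:00:01: %LINEPROTO-5-UPDOWN: Line protocol on "
                   "Interface FastEthernet0/5, changed state to up")
    print(model.predict(description_of(new_message)))
```

Only the description text is fed to the classifier, mirroring the choice described in the abstract. A realistic setup would need far more labeled messages per layer and vendor-specific parsing rules, and the comparison of the six classifiers would be carried out in the data mining tool itself.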
