An improved version of multi-view k-nearest neighbors (MVKNN) for multiple view learning
Multi-view learning (MVL) is a special type of machine learning that utilizes more than one view, where each view provides a different description of the same sample. Traditionally, classification algorithms such as k-nearest neighbors (KNN) are designed to learn from single-view data. However, many real-world applications involve datasets with multiple views, and each view may contain different and partly independent information, which makes traditional single-view classification approaches ineffective. Therefore, this article proposes an improved MVL algorithm, called multi-view k-nearest neighbors (MVKNN), based on the existing KNN algorithm. The experimental results show that the proposed MVKNN algorithm achieves a significant improvement over well-known machine learning algorithms (KNN, support vector machine, decision tree, and naive Bayes) in the case of multi-view data. The results also show that our method outperforms state-of-the-art multi-view learning methods in terms of accuracy.
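Since the abstract describes MVKNN only at a high level, the following is a minimal sketch of one plausible reading of the idea: a separate KNN classifier is run on each view of the data, and the per-view predictions are combined by majority vote. The function names, the Euclidean distance, and the voting scheme are illustrative assumptions, not the authors' exact method.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Plain single-view KNN: predict the majority label among the k nearest points."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    return Counter(y_train[nearest]).most_common(1)[0][0]

def mvknn_predict(views_train, y_train, views_x, k=3):
    """Hypothetical MVKNN sketch: run KNN independently on each view,
    then combine the per-view predictions by majority vote."""
    votes = [knn_predict(Xv, y_train, xv, k)
             for Xv, xv in zip(views_train, views_x)]
    return Counter(votes).most_common(1)[0][0]

# Toy example: two views (1-D and 2-D features) of the same six samples.
view1 = np.array([[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]])
view2 = np.array([[0.0, 0.1], [0.1, 0.0], [0.2, 0.2],
                  [1.0, 1.1], [1.1, 1.0], [1.2, 1.2]])
y = np.array([0, 0, 0, 1, 1, 1])

query = [np.array([0.15]), np.array([0.1, 0.1])]
print(mvknn_predict([view1, view2], y, query))  # both views agree on class 0
```

Real MVKNN variants may instead weight each view's vote (e.g. by per-view validation accuracy) rather than counting all views equally; the majority vote above is simply the most direct way to fuse per-view decisions.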
___
- [1] Dilmac S, Ölmez Z, Ölmez T. Comparative analysis of MABC with KNN, SOM, and ACO algorithms for ECG heartbeat classification. Turkish Journal of Electrical Engineering & Computer Sciences 2018; 26 (6): 2819-2830. doi: 10.3906/elk-1712-328
- [2] Yumurtaci M, Gökmen G, Kocaman Ç, Ergin S, Kilic O. Classification of short-circuit faults in high-voltage energy transmission line using energy of instantaneous active power components-based common vector approach. Turkish Journal of Electrical Engineering & Computer Sciences 2016; 24 (3): 1901-1915. doi: 10.3906/elk-1312-131
- [3] Madi M, Jarghon F, Fazea Y, Almomani O, Saaidah A. Comparative analysis of classification techniques for network fault management. Turkish Journal of Electrical Engineering & Computer Sciences 2020; 28 (3): 1442-1457. doi: 10.3906/elk-1907-84
- [4] Köse E, Hocaoglu AK. A new spectral estimation-based feature extraction method for vehicle classification in distributed sensor networks. Turkish Journal of Electrical Engineering & Computer Sciences 2019; 27 (2): 1120-1131. doi: 10.3906/elk-1807-49
- [5] Boyaci D, Erdoğan M, Yildiz F. Pixel-versus object-based classification of forest and agricultural areas from multiresolution satellite images. Turkish Journal of Electrical Engineering & Computer Sciences 2017; 25 (1): 365-375. doi: 10.3906/elk-1504-261
- [6] Zhuang F, Karypis G, Ning X, He Q, Shi Z. Multi-view learning via probabilistic latent semantic analysis. Information Sciences 2012; 199: 20-30. doi: 10.1016/j.ins.2012.02.058
- [7] Sun S, Liu Y, Mao L. Multi-view learning for visual violence recognition with maximum entropy discrimination and deep features. Information Fusion 2019; 50: 43-53. doi: 10.1016/j.inffus.2018.10.004
- [8] Hajmohammadi MS, Ibrahim R, Selamat A. Cross-lingual sentiment classification using multiple source languages in multi-view semi-supervised learning. Engineering Applications of Artificial Intelligence 2014; 36 (1): 195-203. doi: 10.1016/j.engappai.2014.07.020
- [9] Wu B, Zhong E, Horner A, Yang Q. Music emotion recognition by multi-label multi-layer multi-instance multi-view learning. In: 22nd ACM International Conference on Multimedia; Orlando, FL, USA; 2014. pp. 117-126.
- [10] Sun S. Multi-view Laplacian support vector machines. In: Tang J, King I, Chen L, Wang J (editors). Advanced Data Mining and Applications, Lecture Notes in Computer Science. Berlin, Heidelberg: Springer, 2011, pp. 209-222.
- [11] Houthuys L, Langone R, Suykens JA. Multi-view least squares support vector machines classification. Neurocomputing 2018; 282: 78-88. doi: 10.1016/j.neucom.2017.12.029
- [12] Liang Z, Zhang G, Li G, Fu W. An algorithm for acupuncture clinical assessment based on multi-view KNN. Journal of Computational Information Systems 2012; 8 (21): 9105-9112.
- [13] Minh HQ, Bazzani L, Murino V. A unifying framework in vector-valued reproducing kernel Hilbert spaces for manifold regularization and co-regularized multi-view learning. The Journal of Machine Learning Research 2016; 17 (1): 769-840. doi: 10.5555/2946645.2946670
- [14] Guan X, Liang J, Qian Y, Pang J. A multi-view OVA model based on decision tree for multi-classification tasks. Knowledge-Based Systems 2017; 138: 208-219. doi: 10.1016/j.knosys.2017.10.004
- [15] Wang H, Nie F, Huang H. Multi-view clustering and feature learning via structured sparsity. In: 30th International Conference on Machine Learning; Atlanta, Georgia, USA; 2013. pp. 352-360.
- [16] Zhao J, Xie X, Xu X, Sun S. Multi-view learning overview: recent progress and new challenges. Information Fusion 2017; 38: 43-54. doi: 10.1016/j.inffus.2017.02.007
- [17] Sun S. A survey of multi-view machine learning. Neural Computing and Applications 2013; 23 (7-8): 2031-2038. doi: 10.1007/s00521-013-1362-6
- [18] Culp M, Michailidis G, Johnson K. On multi-view learning with additive models. The Annals of Applied Statistics 2009; 3 (1): 292-318. doi: 10.1214/08-AOAS202
- [19] Xu C, Tao D, Xu C. A survey on multi-view learning. arXiv preprint, arXiv:1304.5634, 2013.
- [20] Kumar V, Minz S. Multi-view ensemble learning: an optimal feature set partitioning for high-dimensional data classification. Knowledge and Information Systems 2016; 49 (1): 1-59.
- [21] Wang S, Chen Z, Yan Q, Ji K, Peng L et al. Deep and broad URL feature mining for android malware detection. Information Sciences 2020; 513: 600-613. doi: 10.1016/j.ins.2019.11.008
- [22] Sun S, Jin F. Robust co-training. International Journal of Pattern Recognition and Artificial Intelligence 2011; 25 (7): 1113-1126. doi: 10.1142/S0218001411008981
- [23] Chang X, Yang Y, Wang H. Multi-view construction for clustering based on feature set partitioning. In: IEEE International Joint Conference on Neural Networks (IJCNN); Rio de Janeiro, Brazil; 2018. pp. 1-8.
- [24] Yu S, Krishnapuram B, Rosales R, Rao RB. Bayesian co-training. Journal of Machine Learning Research 2011; 12 (79): 2649-2680.
- [25] Peng J, Luo P, Guan Z, Fan J. Graph-regularized multi-view semantic subspace learning. International Journal of Machine Learning and Cybernetics 2019; 10 (5): 879-895. doi: 10.1007/s13042-017-0766-5
- [26] Zhang H, Gönen M, Yang Z, Oja E. Understanding emotional impact of images using bayesian multiple kernel learning. Neurocomputing 2015; 165: 3-13. doi: 10.1016/j.neucom.2014.10.093
- [27] Niu Y, Shang Y, Tian Y. Multi-view SVM classification with feature selection. Procedia Computer Science 2019; 162: 405-412. doi: 10.1016/j.procs.2019.12.004
- [28] Xu Z, Sun S. An Algorithm on multi-view adaboost. In: Wong KW, Mendis BSU, Bouzerdoum A (editors). Neural Information Processing Theory and Algorithms, Lecture Notes in Computer Science. Berlin, Heidelberg: Springer, 2010, pp. 355-362.
- [29] Sun S, Zhang Q. Multiple-view multiple-learner semi-supervised learning. Neural Processing Letters 2011; 34 (3): 229. doi: 10.1007/s11063-011-9195-8
- [30] Wu X, Kumar V, Quinlan JR, Ghosh J, Yang Q et al. Top 10 algorithms in data mining. Knowledge and Information Systems 2008; 14 (1): 1-37. doi: 10.1007/s10115-007-0114-2
- [31] Jiang Z, Bian Z, Wang S. Multi-view local linear KNN classification: theoretical and experimental studies on image classification. International Journal of Machine Learning and Cybernetics 2020; 11 (3): 525-543. doi: 10.1007/s13042-019-00992-9
- [32] Srivastava SK, Singh SK. Multi-label classification of twitter data using modified ML-KNN. In: Kolhe M, Trivedi M, Tiwari S, Singh V (editors). Advances in Data and Information Sciences. Singapore: Springer, 2019, pp. 31-41.
- [33] Xia Y, Peng Y, Zhang X, Bae H. DEMST-KNN: A novel classification framework to solve imbalanced multi-class problem. In: Silhavy R, Senkerik R, Kominkova Oplatkova Z, Prokopova Z, Silhavy P (editors). Artificial Intelligence Trends in Intelligent Systems. Cham, Switzerland: Springer, 2017, pp. 291-301.
- [34] Villar P, Montes R, Sánchez AM, Herrera F. Fuzzy-Citation-KNN: a fuzzy nearest neighbor approach for multi-instance classification. In: IEEE International Conference on Fuzzy Systems (FUZZ-IEEE); Vancouver, BC, Canada; 2016. pp. 946-952.
- [35] Gupta S, Rana S, Saha B, Phung D, Venkatesh S. A new transfer learning framework with application to model-agnostic multi-task learning. Knowledge and Information Systems 2016; 49 (3): 933-973. doi: 10.1007/s10115-016-0926-z
- [36] Peng P, Xu X, Wang X. Instance-based k-nearest neighbor algorithm for multi-instance multi-label learning. International Journal of Innovative Computing, Information and Control 2014; 10 (5): 1861-1871.
- [37] Zhao S, Rui C, Zhang Y. MICkNN: multi-instance covering kNN algorithm. Tsinghua Science and Technology 2013; 18 (4): 360-368. doi: 10.1109/TST.2013.6574674
- [38] Liu J, Wang C, Gao J, Han J. Multi-view clustering via joint nonnegative matrix factorization. In: Proceedings of the 2013 SIAM (Society for Industrial and Applied Mathematics) International Conference on Data Mining; Austin, TX, USA; 2013. pp. 252-260. doi: 10.1137/1.9781611972832.28
- [39] Ahmad SR, Bakar AA, Yaakub MR, Yusop NMM. Statistical validation of ACO-KNN algorithm for sentiment analysis. Journal of Telecommunication, Electronic and Computer Engineering (JTEC) 2017; 9 (2-11): 165-170.
- [40] Kramer O. K-nearest neighbors. In: Dimensionality reduction with unsupervised nearest neighbors. Berlin, Heidelberg, Germany: Springer, 2013, pp. 13-23.
- [41] Gunarathna MHJP, Sakai K, Nakandakari T, Momii K, Kumari MKN. Machine learning approaches to develop pedotransfer functions for tropical Sri Lankan soils. Water 2019; 11 (9): 1940. doi: 10.3390/w11091940
- [42] Park J, Lee DH. Parallelly running k-nearest neighbor classification over semantically secure encrypted data in outsourced environments. IEEE Access 2020; 8: 64617-64633. doi: 10.1109/ACCESS.2020.2984579
- [43] Murphy K, van Ginneken B, Schilham AM, de Hoop BJ, Gietema HA et al. A large-scale evaluation of automatic pulmonary nodule detection in chest CT using local image features and k-nearest-neighbour classification. Medical image analysis 2009; 13 (5): 757-770. doi: 10.1016/j.media.2009.07.001
- [44] Lall U, Sharma A. A nearest neighbor bootstrap for resampling hydrologic time series. Water Resources Research 1996; 32 (3): 679–693. doi: 10.1029/95WR02966
- [45] Ruan L, Dias MPI, Wong E. Enhancing latency performance through intelligent bandwidth allocation decisions: a survey and comparative study of machine learning techniques. Journal of Optical Communications and Networking 2020; 12 (4): 20-32. doi: 10.1364/JOCN.379715
- [46] Hossin M, Sulaiman MN. A review on evaluation metrics for data classification evaluations. International Journal of Data Mining & Knowledge Management Process 2015; 5 (2): 1-11. doi: 10.5121/ijdkp.2015.5201
- [47] Witten IH, Frank E, Hall MA, Pal CJ. Data Mining: Practical Machine Learning Tools and Techniques. 4th ed. Cambridge, MA, USA: Morgan Kaufmann, 2016.
- [48] Weiss NA. Introductory Statistics. 9th ed. Boston, MA, USA: Pearson Education, 2012.
- [49] Yao Z, Song J, Liu Y, Zhang T, Wang J. Research on cross-version software defect prediction based on evolutionary information. IOP Conference Series: Materials Science and Engineering 2019; 563 (5): 1-7. doi: 10.1088/1757-899X/563/5/052092