A new approach: semisupervised ordinal classification

Semisupervised learning is a machine learning technique that constructs a classifier from a small collection of labeled samples and a large collection of unlabeled ones. Although progress has been made in this research area, existing semisupervised methods address nominal classification tasks; semisupervised learning for ordinal classification remains largely unexplored. To bridge this gap, this study combines the two concepts of “semisupervised learning” and “ordinal classification” of categorical class labels for the first time and introduces the new concept of “semisupervised ordinal classification”. This paper proposes a new semisupervised learning algorithm that takes into account the relationships between class labels, in particular class orderings such as low, medium, and high. We also performed an extensive empirical study on 10 benchmark ordinal datasets, with the proportion of labeled samples varying from 15% to 50% in increments of 5%, to evaluate the performance of our method in combination with different base learners. The experimental results were validated with a nonparametric statistical test. The experiments show that the proposed method improves classification accuracy compared to an existing semisupervised method on ordinal data.
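The abstract does not spell out the algorithm, so the following is only a minimal sketch of how the two ideas it combines could fit together: a self-training loop that pseudo-labels confidently predicted unlabeled samples, wrapped around a Frank-and-Hall-style ordinal decomposition that respects the class order. The class name `OrdinalSelfTraining`, the `LogisticRegression` base learner, the 0.95 confidence threshold, and integer-encoded ordinal labels are all illustrative assumptions, not details taken from the paper.

```python
# Sketch only: self-training combined with an ordinal (Frank & Hall style)
# decomposition. Hyperparameters and names are assumptions for illustration.
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression


class OrdinalSelfTraining:
    """K ordered classes c_0 < ... < c_{K-1} are encoded as K-1 binary
    problems "is y > c_k?". Cumulative probabilities are differenced to
    score each class, and confidently predicted unlabeled samples are
    pseudo-labeled and folded back into the training set."""

    def __init__(self, base_learner=None, confidence=0.95, max_iter=10):
        # Illustrative defaults, not values taken from the paper.
        self.base_learner = base_learner or LogisticRegression(max_iter=1000)
        self.confidence = confidence
        self.max_iter = max_iter

    def _fit_ordinal(self, X, y):
        # One binary classifier per ordinal threshold "y > c_k", k = 0..K-2.
        # Assumes integer-encoded labels and both classes in every split.
        self.classes_ = np.sort(np.unique(y))
        self.models_ = []
        for c in self.classes_[:-1]:
            model = clone(self.base_learner)
            model.fit(X, (y > c).astype(int))
            self.models_.append(model)

    def predict_proba(self, X):
        # P(y > c_k) for each threshold, then P(y = c_k) by differencing.
        greater = np.column_stack(
            [m.predict_proba(X)[:, 1] for m in self.models_])
        K = len(self.classes_)
        proba = np.empty((X.shape[0], K))
        proba[:, 0] = 1.0 - greater[:, 0]
        for k in range(1, K - 1):
            proba[:, k] = greater[:, k - 1] - greater[:, k]
        proba[:, K - 1] = greater[:, K - 2]
        return np.clip(proba, 0.0, 1.0)

    def predict(self, X):
        return self.classes_[np.argmax(self.predict_proba(X), axis=1)]

    def fit(self, X_labeled, y_labeled, X_unlabeled):
        X, y, pool = X_labeled.copy(), y_labeled.copy(), X_unlabeled.copy()
        for _ in range(self.max_iter):
            self._fit_ordinal(X, y)
            if len(pool) == 0:
                break
            proba = self.predict_proba(pool)
            confident = proba.max(axis=1) >= self.confidence
            if not confident.any():
                break
            # Move confidently pseudo-labeled samples into the labeled set.
            X = np.vstack([X, pool[confident]])
            y = np.concatenate(
                [y, self.classes_[proba[confident].argmax(axis=1)]])
            pool = pool[~confident]
        self._fit_ordinal(X, y)
        return self
```

For an experiment along the lines of the one described above, each dataset would be split so that, say, 15% of the training samples keep their labels and the remainder are passed as `X_unlabeled`, with the labeled fraction then increased in 5% steps up to 50%.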
