Temporal bagging: a new method for time-based ensemble learning

One of the main problems with the bagging technique in ensemble learning is its random sample selection, in which every sample has the same chance of being selected. In time-varying dynamic systems, however, the samples in the training set are not equally important: recent samples carry more useful and accurate information than older ones. To overcome this problem, this paper proposes a new time-based ensemble learning method called temporal bagging (T-Bagging). The main advantage of the method is that it assigns larger weights to more recent samples than to older ones, reducing the selection chances of older samples and thereby adapting to changes in dynamic systems. Experiments show that the proposed T-Bagging method improves prediction accuracy on temporal data compared with standard bagging.
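A minimal sketch of the idea, assuming scikit-learn-style base learners in Python: bootstrap indices are drawn with recency-proportional probabilities instead of uniformly, and the members' predictions are combined by majority vote. The linear weighting scheme, the function names, and the assumption of integer class labels are illustrative choices; the abstract does not specify the exact weighting used in the paper.

```python
import numpy as np
from sklearn.base import clone
from sklearn.tree import DecisionTreeClassifier


def temporal_bagging_fit(X, y, n_estimators=10, base_estimator=None, seed=None):
    """Fit an ensemble on time-ordered data (oldest sample first).

    Bootstrap samples are drawn with probabilities that grow with the
    sample's time index, so recent samples are selected more often than
    older ones. Linear recency weights are an assumption here; the paper
    may use a different weighting scheme.
    """
    rng = np.random.default_rng(seed)
    if base_estimator is None:
        base_estimator = DecisionTreeClassifier()
    n = len(X)
    weights = np.arange(1, n + 1, dtype=float)   # oldest sample -> 1, newest -> n
    probs = weights / weights.sum()              # selection probabilities

    ensemble = []
    for _ in range(n_estimators):
        idx = rng.choice(n, size=n, replace=True, p=probs)  # time-weighted bootstrap
        ensemble.append(clone(base_estimator).fit(X[idx], y[idx]))
    return ensemble


def temporal_bagging_predict(ensemble, X):
    """Combine member predictions by majority vote (integer class labels assumed)."""
    votes = np.stack([est.predict(X) for est in ensemble])  # shape: (n_estimators, n_samples)
    return np.apply_along_axis(
        lambda col: np.bincount(col.astype(int)).argmax(), 0, votes
    )
```

With `X` and `y` as NumPy arrays ordered from oldest to newest, calling `temporal_bagging_fit(X, y)` and then `temporal_bagging_predict(ensemble, X_new)` follows the standard bagging workflow while favouring recent observations.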
