Classification of generic system dynamics model outputs via supervised time series pattern discovery

System dynamics (SD) is a simulation-based approach for analyzing feedback-rich systems. An ideal SD modeling cycle requires evaluating the qualitative pattern characteristics of a large set of time series model outputs for testing, validation, scenario analysis, and policy analysis purposes. This evaluation traditionally relies on expert judgement, which limits the extent of experimentation due to time constraints. Although time series recognition approaches can help to automate such an evaluation, their use has been limited to a hidden Markov model classifier, namely the Indirect Structure Testing Software (ISTS) algorithm. Despite being used within several automated model-analysis tools, ISTS has several shortcomings. We therefore propose an interpretable time series classification algorithm for the SD field that also addresses the shortcomings of ISTS. Our approach, which can highlight the regions of a given time series that are influential in the class assignment, extends the symbolic multivariate time series approach with a local importance measure. We compare the performance of the proposed approach against both ISTS and nearest-neighbor (NN) classifiers. Our experiments on an SD-specific application show that the proposed approach outperforms ISTS as well as conventional NN classifiers on both noisy and noise-free datasets. Additionally, its class assignments are interpretable, unlike those of the other approaches considered in the experiments.
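To make the nearest-neighbor baseline concrete, the following is a minimal sketch of a 1-NN classifier that assigns a simulated output to the behavior pattern of its closest labeled example. It is an illustration only, not the implementation evaluated here: the Euclidean distance, the z-normalization step, the function names, and the three toy pattern classes are all assumptions chosen for brevity.

```python
import math

def z_normalize(series):
    """Rescale a series to zero mean and unit variance (constant series map to zeros)."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in series]

def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify_1nn(train, query):
    """Return the label of the training series closest to the query."""
    q = z_normalize(query)
    best_label, best_dist = None, math.inf
    for series, label in train:
        d = euclidean(z_normalize(series), q)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

# Hypothetical labeled examples of three behavior patterns (11 time points each).
t = [i / 10 for i in range(11)]
train = [
    ([math.exp(3 * x) for x in t], "exponential growth"),
    ([1 - math.exp(-3 * x) for x in t], "goal-seeking growth"),
    ([math.exp(-3 * x) for x in t], "exponential decay"),
]

# A simulated output with a different scale and growth rate than any training series.
query = [5 * math.exp(2.5 * x) for x in t]
print(classify_1nn(train, query))
```

Normalizing each series before comparison makes the match depend on shape rather than magnitude, which fits the goal of classifying qualitative patterns. A classifier of this kind returns only a label, so it gives no indication of which region of the series drove the assignment.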
