Filiz KARDİYEN, Hilal GÜNEY

A Study on Time Series Clustering

In recent years, classification, which offers advantages in time and cost, has attracted great interest in various fields. In particular, when there are a large number of series, it is much more practical to classify the series into similar groups and make an estimate for each group than to make a prediction for every series individually. For this reason, studies have been carried out to develop classification and clustering methods that use the characteristics of time series. In this study, model-based approaches (Maharaj’s p-value based distance, Piccolo’s AR distance, and the cepstral-based distance) and model-free methods (the autocorrelation-based distance, the Chouakria-Douzal dissimilarity measure, and the Minkowski distance) are compared in terms of their performance in clustering time series. The performance of the clustering methods is also investigated for different rankings of the processes and different correlation structures among the series. The results show that Maharaj’s p-value based distance gives the best clustering performance and that clustering based on Piccolo’s AR distance is the least affected by the different rankings of the processes.
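
As a rough illustration of the three measures the abstract lists outside the model-based group, the sketch below computes a Minkowski distance, an autocorrelation-based distance, and a Chouakria-Douzal style dissimilarity for two equal-length series. It is not the study's implementation: the function names, the uniform weighting of the autocorrelation lags, the default number of lags, and the tuning constant `k` are all assumptions chosen for illustration.

```python
import math


def minkowski_distance(x, y, q=2):
    """Minkowski distance of order q between two equal-length sequences."""
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1.0 / q)


def acf(x, max_lag):
    """Sample autocorrelations of x at lags 1..max_lag."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    denom = sum(v * v for v in centered)
    return [
        sum(centered[t] * centered[t + lag] for t in range(n - lag)) / denom
        for lag in range(1, max_lag + 1)
    ]


def acf_distance(x, y, max_lag=10):
    """Euclidean distance between the two autocorrelation vectors.

    Assumption: all lags get equal weight; other weightings are possible.
    """
    return minkowski_distance(acf(x, max_lag), acf(y, max_lag), q=2)


def cort_dissimilarity(x, y, k=2.0):
    """Chouakria-Douzal style dissimilarity.

    The Euclidean distance between the raw values is scaled by a factor
    that depends on the temporal correlation of the first differences.
    Assumption: k = 2.0 is an arbitrary illustrative tuning value.
    """
    dx = [x[t + 1] - x[t] for t in range(len(x) - 1)]
    dy = [y[t + 1] - y[t] for t in range(len(y) - 1)]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx)) * math.sqrt(sum(b * b for b in dy))
    cort = num / den  # lies in [-1, 1]; undefined for constant series
    return 2.0 / (1.0 + math.exp(k * cort)) * minkowski_distance(x, y, q=2)
```

A matrix of pairwise values from any of these functions could then be passed to a standard hierarchical clustering routine. The model-based measures (Maharaj, Piccolo, cepstral) need fitted AR models and are not sketched here.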

___

  • Box, G. E. P., Jenkins, G. M., and Reinsel, G. C., Time Series Analysis: Forecasting and Control, 3rd edn., Prentice Hall, New Jersey, 1994.
  • Caiado, J., Crato, N., and Peña, D., “A Periodogram-Based Metric for Time Series Classification”, Computational Statistics & Data Analysis, Volume 50, 2668-2684, 2006.
  • Chouakria, A. D., and Nagabhushan, P. N., “Adaptive Dissimilarity Index for Measuring Time Series Proximity”, Advances in Data Analysis and Classification, Volume 1, 5-21, 2007.
  • Cleveland, W. S., “The Inverse Autocorrelations of a Time Series and Their Applications”, Technometrics, Volume 14, 277-293, 1972.
  • Corduas, M., and Piccolo, D., “Time Series Clustering and Classification by the Autoregressive Metric”, Computational Statistics & Data Analysis, Volume 52, 1860-1872, 2008.
  • D’Urso, P., and Maharaj, E. A., “Autocorrelation-based Fuzzy Clustering of Time Series”, Fuzzy Sets and Systems, Volume 160, 3565-3589, 2009.
  • Galeano, P., and Peña, D., “Multivariate Analysis in Vector Time Series”, Resenhas, Volume 4, 383-404, 2000.
  • Kalpakis, K., Gada, D., and Puttagunta, V., “Distance Measures for Effective Clustering of ARIMA Time-Series”, Proceedings 2001 IEEE International Conference on Data Mining, 273-280, 2001.
  • Kakizawa, Y., Shumway, R. H., and Taniguchi, M., “Discrimination and Clustering for Multivariate Time Series”, Journal of the American Statistical Association, Volume 93, 328-340, 1998.
  • Liao, T. W., “Clustering of Time Series Data - A Survey”, Pattern Recognition, Volume 38, 1857-1874, 2005.
  • Maharaj, E. A., “A Significance Test for Classifying ARMA Models”, Journal of Statistical Computation and Simulation, Volume 54, 305-331, 1996.
  • Maharaj, E. A., “Clusters of Time Series”, Journal of Classification, Volume 17, 297-314, 2000.
  • Manso, P. M., A Package for Stationary Time Series Clustering, Master's thesis, Universidade da Coruña, 2013.
  • Piccolo, D., “A Distance Measure for Classifying ARIMA Models”, Journal of Time Series Analysis, Volume 11, No 2, 153-164, 1990.
  • Shaw, C. T., and King, G. P., “Using Cluster Analysis to Classify Time Series”, Physica D, Volume 58, 288-298, 1992.
  • Tong, H., and Dabas, P., “Cluster of Time Series Models: An Example”, Journal of Applied Statistics, Volume 17, No 2, 187-198, 1990.
  • Xiong, Y., and Yeung, D. Y., “Time Series Clustering with ARMA Mixtures”, Pattern Recognition, Volume 37, 1675-1689, 2004.
  • Wei, W. W. S., Time Series Analysis: Univariate and Multivariate Methods, Addison-Wesley Publishing Company, Canada, 1989.