A Note on the Robustness of the Performance of Methods and Rankings for the M4 Competition

The M4 forecasting competition provided useful information to the forecasting literature. This information is based on error metrics calculated to rank the methods entered in the competition. The organizers used the arithmetic mean as the descriptive statistic for evaluating the performance of all competitors. In this paper, the effect of different descriptive statistics on the ranking of methods is investigated. It is found that the distribution of the error metrics across the competing forecasting methods is not symmetric. The arithmetic mean is therefore a poor measure of the centre of such non-symmetric distributions and does not represent their centre well. It is shown that the median represents the centre of the error-metric distributions better. When the median is used as the descriptive statistic for ranking, the resulting ranks differ from those obtained with the arithmetic mean. Moreover, the directional accuracy metric is calculated for the best ten methods in the competition; when the methods are ranked by directional accuracy, the ranks again differ from the competition results.
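To make the mean-versus-median point concrete, the following is a minimal sketch in Python (not taken from the paper): two hypothetical methods are ranked by the arithmetic mean and by the median of synthetic, right-skewed per-series errors. The method names and error distributions are invented for illustration; a heavy right tail inflates the mean but not the median, so the two summaries can order the methods differently.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-series error values (e.g., sMAPE) for two methods over
# 1,000 series; the right-skewed gamma draws mimic the asymmetric error
# distributions described in the paper. All numbers are synthetic.
errors = {
    "method_A": rng.gamma(shape=1.0, scale=10.0, size=1000),  # heavy right tail
    "method_B": rng.gamma(shape=9.0, scale=1.0, size=1000),   # near-symmetric
}

def rank_by(stat):
    """Rank methods from best (smallest error summary) to worst."""
    scores = {name: stat(values) for name, values in errors.items()}
    return sorted(scores, key=scores.get)

# The tail of method_A pulls its mean above method_B's, while its median
# stays below, so the two statistics can disagree on the ordering.
print("rank by mean:  ", rank_by(np.mean))
print("rank by median:", rank_by(np.median))
```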
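Likewise, a minimal sketch of a directional accuracy computation, assuming it is defined as the fraction of steps on which the forecast moves in the same direction as the actual series relative to the previous observation (the exact definition used in the study may differ):

```python
import numpy as np

def directional_accuracy(actual, forecast):
    """Fraction of steps where the forecast change from the previous
    actual value has the same sign as the actual change."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    actual_change = np.sign(np.diff(actual))
    forecast_change = np.sign(forecast[1:] - actual[:-1])
    return float(np.mean(actual_change == forecast_change))

# Toy example: the forecast gets 2 of the 3 direction changes right.
print(directional_accuracy([10, 12, 11, 13], [9, 13, 12, 12]))  # 0.666...
```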
