Modelling Sport Events with Supervised Machine Learning

Understanding how multivariable systems change is essential for making predictions about them. Supervised machine learning aims to build a model of how the class of an observation depends on a set of variables, and to use that model to predict future cases. Because sport is followed worldwide, modelling sports events and predicting the results of future matches have gained importance. In this study, match statistics of teams in the Turkish Super League were used to examine how successfully the outcome of a match can be predicted with five classifiers: decision tree, random forest, k-nearest neighbor, naive Bayes, and support vector machine. In the tests on the Turkish Super League data, the support vector machine performed best.
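As a rough illustration of the kind of evaluation summarised above, the sketch below fits one of the five named classifiers, k-nearest neighbor, to labelled match records and scores it on held-out matches. Everything specific in it is an assumption made for illustration: the data are synthetic, and the two features (shot difference and possession difference), the class centres, `k=5`, and the 25% hold-out split are hypothetical choices, not the study's data or settings.

```python
import math
import random
from collections import Counter

# Hypothetical class centres for two illustrative features:
# (shot difference, possession difference in percentage points).
CENTRES = {
    "home_win": (4.0, 6.0),
    "draw": (0.0, 0.0),
    "away_win": (-4.0, -6.0),
}


def make_matches(n_per_class, rng):
    """Generate synthetic (features, outcome) pairs around each class centre."""
    rows = []
    for label, (shots, poss) in CENTRES.items():
        for _ in range(n_per_class):
            features = (rng.gauss(shots, 2.0), rng.gauss(poss, 3.0))
            rows.append((features, label))
    rng.shuffle(rows)
    return rows


def knn_predict(train, x, k=5):
    """Return the majority label among the k training points closest to x."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def holdout_accuracy(rows, k=5, test_fraction=0.25):
    """Hold out the first part of the shuffled rows and score k-NN on it."""
    n_test = int(len(rows) * test_fraction)
    test, train = rows[:n_test], rows[n_test:]
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


rng = random.Random(0)  # fixed seed so the run is repeatable
matches = make_matches(100, rng)
accuracy = holdout_accuracy(matches)
print(f"hold-out accuracy: {accuracy:.2f}")
```

A comparison across classifiers would apply the same held-out scoring to each model in turn, so that the reported accuracies refer to the same unseen matches.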
