Factors Associated with Match Result and Number of Goals Scored and Conceded in the English Premier League

The aim of this research is to identify the factors associated with the match result and with the number of goals scored and conceded in the English Premier League. The data consist of 17 performance indicators and situational variables recorded for English Premier League matches in the 2017-18 season. A Poisson regression model was fitted to identify the significant factors for the number of goals scored and conceded, while multinomial logistic regression and support vector machine methods were used to determine the factors that influence the match result. Scoring first, shots on target and goals conceded were found to have a significant influence on the number of goals scored, whereas scoring first, match location, quality of opponent, goals conceded, shots and clearances influence the number of goals conceded. For the match result, scoring first, match location, shots, shots on target, clearances and quality of opponent significantly affect the probability of losing, while scoring first, match location, shots, shots on target and possession affect the probability of winning. Among all the variables studied, scoring first is the only one that appears important in every analysis, which makes it the most significant factor for success in football.
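As a rough illustration of the Poisson regression step described above, the sketch below fits a log-linear model of goals scored on a single binary predictor (whether the team scored first) using Newton-Raphson. The function name `fit_poisson` and the eight-match toy data set are assumptions made purely for illustration; they are not the study's data, software or full 17-variable model.

```python
import math

def fit_poisson(x, y, iters=25):
    """Fit log E[y] = b0 + b1 * x by Newton-Raphson (one predictor)."""
    # Start from the intercept-only fit: b0 = log(mean goals), b1 = 0.
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            mu = math.exp(b0 + b1 * xi)   # expected goals for this match
            g0 += yi - mu                 # score (gradient) components
            g1 += (yi - mu) * xi
            h00 += mu                     # Fisher information X^T diag(mu) X
            h01 += mu * xi
            h11 += mu * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g, with the 2x2 inverse written out.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical toy data: 1 = team scored first, 0 = it did not.
scored_first = [1, 1, 1, 1, 0, 0, 0, 0]
goals_scored = [2, 3, 1, 2, 0, 1, 1, 0]

b0, b1 = fit_poisson(scored_first, goals_scored)
baseline_rate = math.exp(b0)  # expected goals when not scoring first
rate_ratio = math.exp(b1)     # multiplicative effect of scoring first
print(f"baseline rate: {baseline_rate:.3f}, rate ratio: {rate_ratio:.3f}")
```

With a single binary predictor the fitted rates reduce to the group means, so `exp(b1)` is simply the ratio of mean goals between the two groups. In a model with several indicators, each exponentiated coefficient would be read the same way, as a multiplicative change in expected goals with the other variables held fixed.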
