Predicting the Severity of Motor Vehicle Accident Injuries in Adana, Turkey Using Machine Learning Methods and Detailed Meteorological Data

Traffic accidents are among the most serious problems facing every nation, causing many deaths and injuries as well as economic losses each year. In this study, traffic accidents that occurred in Adana were classified by injury severity (fatal or non-fatal), and the factors affecting the accident outcome were investigated. The data comprise the traffic accident reports kept by the Regional Traffic Division and the weather data provided by the Regional Directorate of Meteorology for the period 2005-2015. Five major machine learning methods (k-Nearest Neighbor, Naive Bayes, Multilayer Perceptron, Decision Tree, and Support Vector Machine) and one statistical method, Logistic Regression, were used to build prediction models, and the performance of the models and the influential parameters were compared. The main objective of the study is to determine how important weather and other phenomena are for the occurrence of traffic accidents. Models based on Decision Tree, k-Nearest Neighbor, and Multilayer Perceptron classified accidents more accurately than the other models. Furthermore, in the Area Under Curve based analysis of factor importance, Mean Cloudiness, Existence of Traffic Control, and Ground Surface Temperature had higher positive effects, while the Maximum Temperature and Weather (as recorded by traffic officers) parameters decreased the accuracy of the models.
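The idea of an Area Under Curve based factor analysis can be illustrated with a small sketch. The abstract does not specify the exact procedure, so the version below is an assumption: it scores a simple k-Nearest Neighbor classifier with all factors, then again with one factor removed, and reports the difference in AUC. The data are synthetic, and the factor names (`informative_factor`, `noise_factor`) and helper functions are hypothetical, not taken from the study.

```python
import random


def auc(labels, scores):
    """AUC as the share of (fatal, non-fatal) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def knn_scores(train_X, train_y, test_X, k=5):
    """Score each test row by the fraction of fatal cases among its k nearest training rows."""
    scores = []
    for x in test_X:
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(x, t)), y)
            for t, y in zip(train_X, train_y)
        )
        scores.append(sum(y for _, y in dists[:k]) / k)
    return scores


def drop_column(X, j):
    """Return a copy of X without column j."""
    return [row[:j] + row[j + 1:] for row in X]


def factor_effects(train_X, train_y, test_X, test_y, names, k=5):
    """AUC with all factors minus AUC with one factor removed (positive: the factor helps)."""
    base = auc(test_y, knn_scores(train_X, train_y, test_X, k))
    effects = {}
    for j, name in enumerate(names):
        reduced = auc(
            test_y,
            knn_scores(drop_column(train_X, j), train_y, drop_column(test_X, j), k),
        )
        effects[name] = base - reduced
    return base, effects


def make_rows(rng, n):
    """Synthetic accidents (assumption): one factor tracks the label, one is pure noise."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)  # 1 = fatal, 0 = non-fatal
        X.append([label + rng.gauss(0, 0.5), rng.gauss(0, 1)])
        y.append(label)
    return X, y


rng = random.Random(0)
names = ["informative_factor", "noise_factor"]  # hypothetical factor names
train_X, train_y = make_rows(rng, 200)
test_X, test_y = make_rows(rng, 100)
base, effects = factor_effects(train_X, train_y, test_X, test_y, names)
print("AUC with all factors:", round(base, 3))
for name, delta in effects.items():
    print(name, "effect on AUC:", round(delta, 3))
```

A factor with a positive effect raises AUC when it is included, and a negative effect means the model ranks accidents better without it, which is the sense in which a parameter can decrease model accuracy. The same comparison could be repeated for each of the six classifiers.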
