Predicting Traffic Accident Severity Using Machine Learning Techniques

Traffic accidents, which damage countries' economies and national assets and cost human lives, are among the most serious problems countries face. Investigating the factors that contribute to accidents and developing an accurate accident severity prediction model are therefore critically important. In this study, traffic accident data collected between 2011 and 2021 from the Texas cities of Austin, Dallas, and San Antonio were used to examine the factors that cause accidents and to compare the severity prediction performance of six machine learning techniques: Deep Learning, Logistic Regression, XGBoost, Random Forest, KNN, and SVM. The findings show that Logistic Regression performed best among them, classifying accident severity with 88% accuracy.
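The comparison described above ranks classifiers by accuracy, that is, the share of crashes whose predicted severity class matches the recorded one. The following is a minimal sketch of that scoring step only, not the study's pipeline. The binary severity labels, the model names (`model_a`, `model_b`, `model_c`), and every prediction are made up for illustration, and the helper names `accuracy` and `rank_models` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of crashes whose predicted severity class matches the label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot score an empty label set")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def rank_models(
    y_true: Sequence[int], predictions: Dict[str, Sequence[int]]
) -> List[Tuple[str, float]]:
    """Score each model's predictions and sort from best to worst accuracy."""
    scores = {name: accuracy(y_true, y_pred) for name, y_pred in predictions.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Illustrative labels only (assumed encoding: 0 = non-severe, 1 = severe).
# These are NOT data or results from the study.
y_true = [0, 0, 1, 0, 1, 0, 0, 1]

# Hypothetical predictions from three placeholder models.
predictions = {
    "model_a": [0, 0, 1, 0, 1, 0, 0, 0],  # one miss
    "model_b": [0, 1, 1, 0, 0, 0, 0, 1],  # two misses
    "model_c": [0, 0, 0, 0, 0, 0, 0, 0],  # always predicts the majority class
}

ranking = rank_models(y_true, predictions)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Note that `model_c` still scores above 60% here by always predicting the majority class, which is why accuracy alone can flatter a classifier when severe crashes are rare.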
