REGRESSION BASED RISK ANALYSIS IN LIFE INSURANCE INDUSTRY

Risk analysis is a crucial step in classifying applicants in the life insurance business. Because traditional underwriting strategies are time-consuming, recent work has focused on machine learning methods that make the underwriting process more efficient and strengthen supervision. The aim of this study is to evaluate linear and non-linear regression-based models for determining the degree of risk. To this end, four linear and non-linear regression algorithms are trained and evaluated on a life insurance dataset, with their hyperparameters optimized using a grid search. The experimental results show that the non-linear regression models achieve more accurate predictions than the linear regression models, and that the LGBM algorithm performs best among all regression models, with the highest R² and the lowest MAE and RMSE values.
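To make the described workflow concrete, the following is a minimal sketch of the kind of pipeline the abstract outlines: a grid search over an LGBM regressor, evaluated with R², MAE, and RMSE. It assumes a scikit-learn/LightGBM setup, uses synthetic placeholder data in place of the life insurance dataset, and the hyperparameter grid shown is illustrative rather than the one used in the study.

```python
# Hedged sketch: grid-search an LGBM regressor and report R2, MAE, RMSE.
# Assumptions: scikit-learn and lightgbm are installed; the data below is a
# synthetic stand-in for the life insurance dataset, not the study's data.
import numpy as np
from lightgbm import LGBMRegressor
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

# Placeholder features X and continuous risk/premium target y.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 6))
y = X @ rng.normal(size=6) + rng.normal(scale=0.5, size=1000)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Illustrative grid; the paper's actual search space is not specified here.
param_grid = {
    "n_estimators": [100, 300],
    "num_leaves": [31, 63],
    "learning_rate": [0.05, 0.1],
}

search = GridSearchCV(LGBMRegressor(random_state=0), param_grid, cv=5, scoring="r2")
search.fit(X_train, y_train)

pred = search.best_estimator_.predict(X_test)
print("best params:", search.best_params_)
print("R2  :", r2_score(y_test, pred))
print("MAE :", mean_absolute_error(y_test, pred))
print("RMSE:", np.sqrt(mean_squared_error(y_test, pred)))
```

The same grid-search loop can be reused for the other regression models (e.g. ridge, lasso, random forest, SVR) by swapping the estimator and its parameter grid, which is the usual way such model comparisons are set up.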

