An Application on Finance Data for Critical Limits of Assumptions in Count Data

Regression analysis is used to study many real-world problems. The type of data obtained depends on the problem and on the variable being studied. For example, linear regression, the most widely used technique, requires a continuous dependent variable; otherwise, the estimates will have high standard errors and will be inconsistent. Alternative regression techniques have therefore been developed for other types of dependent variable. Two of them, Poisson and Negative Binomial regression, are frequently used when the dependent variable is discrete. However, a discrete dependent variable does not by itself guarantee that these models will give correct results, because the models have been parameterized, and various sub-models have emerged, according to the distribution and dispersion of the dependent variable as well as its type. In this study, a data set containing real data such as HDI, GDP and credit score, which has a crucially important place in the field of finance, was used. Poisson regression, Negative Binomial regression and their zero-truncated versions were applied according to the characteristics of the data set, and the results were compared and interpreted using the AIC, RMSE and MAE metrics. The empirical results suggest that Negative Binomial regression gives better results when the dependent variable is underdispersed, but that Poisson regression produces more meaningful results when the assumptions are at their limits. In addition, it was examined whether the number of zeros in the data set is sufficient to justify moving to the zero-truncated models. As a result, it is shown that Negative Binomial regression cannot always be used in cases that would otherwise be analyzed with Poisson regression, even when the data are over- or underdispersed relative to the assumptions.
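The checks and metrics named above can be illustrated with a minimal sketch. It is not the study's code: the function names and the sample counts are hypothetical, and only the Python standard library is assumed. It computes a dispersion index (sample variance over sample mean, which is close to 1 when the Poisson equidispersion assumption holds), the share of zeros in the dependent variable, and the AIC, RMSE and MAE used to compare fitted models.

```python
import math
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean (about 1 under a Poisson assumption)."""
    return variance(counts) / mean(counts)


def zero_share(counts):
    """Fraction of observations equal to zero."""
    return sum(1 for c in counts if c == 0) / len(counts)


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 log L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def rmse(observed, fitted):
    """Root mean squared error between observed and fitted counts."""
    return math.sqrt(mean([(y - f) ** 2 for y, f in zip(observed, fitted)]))


def mae(observed, fitted):
    """Mean absolute error between observed and fitted counts."""
    return mean([abs(y - f) for y, f in zip(observed, fitted)])


# Hypothetical counts for illustration only; they are not the study's data.
counts = [1, 2, 2, 3, 1, 4, 2, 3, 5, 1]

# Intercept-only "model": every fitted value is the sample mean.
fitted = [mean(counts)] * len(counts)

print("dispersion index:", dispersion_index(counts))
print("share of zeros:", zero_share(counts))
print("RMSE:", rmse(counts, fitted))
print("MAE:", mae(counts, fitted))
```

A dispersion index well above 1 points to overdispersion and one well below 1 to underdispersion; a sample with no zeros at all is the setting in which zero-truncated models are usually considered. In practice the fitted values and log-likelihoods would come from the estimated Poisson and Negative Binomial models rather than from the sample mean used here.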
