Exact distribution of Cook's distance and identification of influential observations

This paper derives the exact distribution of Cook's distance, which is used to evaluate influential observations in multiple linear regression analysis. The authors adopt the relationship given by Weisberg (1980) and Belsley et al. (1980) and express the density function of Cook's distance in series form. The first two moments of the distribution are then derived, and critical points of Cook's distance at the 5% and 1% significance levels are computed for different sample sizes and numbers of predictors. Finally, a numerical example illustrates the identification of influential observations; the results from the proposed approach are more scientific and systematic, and its exactness outperforms the traditional rule-of-thumb approach.
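As a rough illustration of the quantity being studied, the sketch below computes Cook's distance from the standard relationship D_i = (r_i^2 / p) * h_ii / (1 - h_ii), where r_i is the internally studentized residual, h_ii the leverage, and p the number of columns of the design matrix. It does not implement the exact critical points derived in the paper. The function name `cooks_distance`, the simulated data, and the 4/n cutoff (one common rule of thumb, assumed here for illustration) are not taken from the paper.

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's distance for each observation of an ordinary least squares fit.

    X is the n x p design matrix, including a column of ones if an
    intercept is wanted. (Hypothetical helper, for illustration only.)
    """
    n, p = X.shape
    # Leverages h_ii are the diagonal of the hat matrix H = Q Q^T (thin QR).
    Q, _ = np.linalg.qr(X)
    h = np.sum(Q**2, axis=1)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta                      # ordinary residuals
    s2 = e @ e / (n - p)                  # residual variance estimate
    r = e / np.sqrt(s2 * (1.0 - h))       # internally studentized residuals
    return (r**2 / p) * h / (1.0 - h)

# Simulated data with one planted influential point (assumption, not paper data).
rng = np.random.default_rng(0)
n = 30
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)
x[0], y[0] = 4.0, -5.0                    # high leverage and far from the trend
X = np.column_stack([np.ones(n), x])

D = cooks_distance(X, y)
flagged = np.flatnonzero(D > 4.0 / n)     # rule-of-thumb cutoff, assumed here
print("largest Cook's distance at index", int(np.argmax(D)))
print("flagged by 4/n rule:", flagged)
```

A rule-of-thumb cutoff such as 4/n does not depend on the number of predictors or on a chosen significance level; the critical points described in the abstract are meant to replace that kind of threshold.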

___

  • [1] Belsley, D. A., Kuh, E., & Welsch, R. E. Regression diagnostics: Identifying influential data and sources of collinearity. (John Wiley, 1980).
  • [2] Bollen, K. A., & Jackman, R. W. Regression diagnostics: An expository treatment of outliers and influential cases. Modern methods of data analysis, 257-291, 1990.
  • [3] Chatterjee, S. and Hadi, A. S., Sensitivity Analysis in Linear Regression, (New York: John Wiley and Sons, 1988)
  • [4] Cook, R. D., Detection of influential observation in linear regression. Technometrics, 15-18, 1977.
  • [5] Cook, R. D., & Weisberg, S. Residuals and influence in regression (Vol. 5). (New York: Chapman and Hall, 1982).
  • [6] Díaz-García, J. A., & González-Farías, G. A note on the Cook’s distance. Journal of Statistical Planning and Inference, 120(1), 119-136, 2004.
  • [7] Eubank, R.L., Diagnostics for smoothing splines. J. Roy. Statist. Soc. Ser. B 47, 332–341, 1985.
  • [8] Kim, C., Cook’s distance in spline smoothing. Statist. Probab. Lett. 31, 139–144, 1996.
  • [9] Kim, C., Kim, W., Some diagnostics results in nonparametric density estimation. Comm. Statist. Theory Methods 27, 291–303, 1998.
  • [10] Kim, C., Lee, Y., Park, B.U., Cook’s distance in local polynomial regression. Statist. Probab. Lett. 54, 33–40, 2001.
  • [11] Silverman, B.W., Some aspects of the spline smoothing approach to non-parametric regression curve fitting (with discussion). J. Roy. Statist. Soc. Ser. B 47, 1–52, 1985.