Exact distribution of Hadi's $(H^2)$ influence measure and identification of potential outliers

This paper proposes an exact distribution for Hadi's influence measure that can be used to evaluate potential outliers in multiple linear regression analysis. The authors express the measure in terms of two independent F-ratios and derive its density function as a series expression involving the Gauss hypergeometric function. The first two moments of the distribution are derived in terms of the Beta function, and critical points of Hadi's measure are computed at the 5% and 1% significance levels for different sample sizes and varying numbers of predictors. Finally, a numerical example illustrates the identification of potential outliers; the authors report that the proposed approach is more systematic than Hadi's traditional approach and, being exact, outperforms it.
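The abstract does not state the measure itself, so the following is only a minimal sketch, assuming the commonly cited form of Hadi's measure, $H_i^2 = \frac{p_{ii}}{1-p_{ii}} + \frac{p}{1-p_{ii}}\,\frac{d_i^2}{1-d_i^2}$, where $p_{ii}$ is the $i$-th leverage, $d_i = e_i/\sqrt{\mathrm{SSE}}$ is the normalized residual, and $p$ counts the fitted coefficients including the intercept. The cutoff shown is the traditional rule of thumb (mean plus $c$ standard deviations), not the exact critical points derived in the paper. The function names `hadi_influence` and `traditional_cutoff` and the simulated data are hypothetical.

```python
import numpy as np

def hadi_influence(X, y):
    """Hadi's influence measure per observation (assumed textbook form).

    X: (n, k) predictor matrix without an intercept column; y: (n,) response.
    """
    n = X.shape[0]
    Xd = np.column_stack([np.ones(n), X])   # design matrix with intercept
    p = Xd.shape[1]                         # number of fitted coefficients
    Q, _ = np.linalg.qr(Xd)                 # thin QR; hat matrix is Q @ Q.T
    h = np.sum(Q**2, axis=1)                # leverages p_ii
    e = y - Q @ (Q.T @ y)                   # ordinary residuals
    d2 = e**2 / np.sum(e**2)                # squared normalized residuals d_i^2
    return h / (1 - h) + (p / (1 - h)) * d2 / (1 - d2)

def traditional_cutoff(H, c=2.0):
    """Rule-of-thumb cutoff mean(H) + c * sd(H); c is a user choice (assumption)."""
    return H.mean() + c * H.std()

# Simulated illustration (not data from the paper): plant one gross outlier.
rng = np.random.default_rng(0)
x = rng.normal(size=30)
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=30)
y[5] += 20.0                                # planted outlier at index 5

H = hadi_influence(x.reshape(-1, 1), y)
flagged = np.flatnonzero(H > traditional_cutoff(H))
```

Replacing `traditional_cutoff` with a tabulated exact critical point for the given sample size and number of predictors would be the change the paper argues for; those values are not reproduced here.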
