PURO: A Package for Unmasking Regression Outliers

Multiple regression outliers should be identified because they can distort both the parameter estimates and the inferences drawn from the regression model. In recent years, researchers have proposed numerous strategies and procedures for identifying such outliers. PURO, a Mathematica package, is introduced; it implements seven of the most recent and most respected outlier detection procedures from the statistics literature.

___

  • Belsley, D.A., Kuh, E., Welsch, R., “Regression Diagnostics”, Wiley and Sons, N.Y. and Toronto (1980).
  • Billor, N., Chatterjee, S., Hadi, A.S., “A Re-Weighted Least Squares Method for Robust Regression Estimation”, American Journal of Mathematical and Management Sciences, 26, Nos. 3&4, 229-252 (2006).
  • Billor, N., Kiral, G., “A comparison of multiple outlier detection methods for regression data”, Comm. in Stat., 37, 521-545 (2008).
  • Cook, R.D., Weisberg, S., “Residuals and Influence in Regression”, London: Chapman and Hall (1982).
  • Hadi, A.S., “Identifying multiple outliers in multivariate data”, J. Roy. Statist. Soc. Ser. B 54, 761–777 (1992).
  • Hadi, A.S., Simonoff, J.S., “Procedures for the identification of multiple outliers in linear models”, J. Amer. Statist. Assoc., 88, 1264–1272 (1993).
  • Hadi, A.S., “A modification of a method for the detection of outliers in multivariate samples”, J. Roy. Statist. Soc. Ser. B 56, 393–396 (1994).
  • Kianifard, F., Swallow, W., A Monte Carlo comparison of some procedures for identifying outliers in linear regression, Commun. Statist. Part A Theory Methods, 19, 1913–1938 (1990).
  • Kim, S., Krzanowski, W.J., Detecting multiple outliers in linear regression using a cluster method combined with graphical visualization, Comp. Stat., 22, 109-119 (2007).
  • Marchette, D.J., Solka, J.L., 2003, Using Data Images for Outlier Detection, Comput. Statist. Data Anal., 43, 541-552.
  • McCullough, B. D., 1998, Assessing the reliability of statistical software: Part I. The American Statistician 52, 358–366.
  • McCullough, B. D., 1999, “Assessing the reliability of statistical software: Part II”. The American Statistician 53, 149-159.
  • McCullough, B. D., 2000, The Accuracy of Mathematica 4 as a Statistical Package, Computational Statistics 15, 279-299.
  • McCullough, B. D., Heiser, D. A., 2008, On the accuracy of statistical procedures in Microsoft Excel 2007, Computational Statistics and Data Analysis 52, 4570-4578.
  • Pena, D., Yohai, V.J., 1995, The detection of Influential Subsets in Linear Regression by Using an Influence Matrix, J. Roy. Statist. Soc. Ser. B 57(1), 145-156.
  • Rousseeuw, P.J., 1984, Least median of squares regression, J. Amer. Statist. Assoc. 79, 871-881.
  • Rousseeuw, P.J., 1985, Multivariate Estimation with High Breakdown Point, Mathematical Statistics and Applications, ed. by W. Grossmann, G. Pflug, I. Vincze, and W. Wertz, Dordrecht: Reidel Publishing Company, 283-297.
  • Rousseeuw, P.J., Leroy, A.M., 1987, “Robust Regression and Outlier Detection”, Wiley, New York.
  • Rousseeuw, P.J., Van Zomeren, B.C., 1990, Unmasking multivariate outliers and leverage points, J. Amer. Statist. Assoc., 85, 633-639.
  • Rousseeuw, P.J., Huber, M., 1997, URL: http://www.agoras.ua.ac.be/abstract/Recdev97.htm
  • Sebert, D.M., Montgomery, D.C., Rollier, D., 1998, A clustering algorithm for identifying multiple outliers in linear regression, Comput. Statist. Data Anal. 27, 461-484.
  • Wisnowski, J.W., Montgomery, D.C., Simpson, J.R., 2001, “A comparative analysis of multiple outlier detection procedures in the linear regression model”, Comput. Statist. Data Anal., 36, 351–382.