Some Regression Methods Based on Principal Components

Principal component analysis (PCA) is a commonly used technique in data processing and dimensionality reduction. However, PCA is very sensitive to outliers. Robust principal component analysis (RPCA) based on projection pursuit (PP) is an appealing way to deal with this problem. Combining PCA on the explanatory variables with least squares regression yields principal component regression (PCR). Building on this general structure of PCR, we combine classical and robust PCA with OLS and MM regression estimators and compare the performance of the resulting methods in extensive simulation studies and on real data examples.
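As an illustration of the general PCR structure described above, the following is a minimal sketch of the classical (non-robust) variant only: PCA on the centered explanatory variables followed by OLS on the first k component scores. The function names `pcr_fit` and `pcr_predict`, the synthetic data, and the choice k = 3 are assumptions made for this example and are not taken from the paper. The robust variants would replace the SVD step with a PP-based robust PCA and the least squares step with an MM estimator; neither is shown here.

```python
import numpy as np


def pcr_fit(X, y, k):
    """Classical PCR: PCA on X, then OLS of y on the first k component scores.

    Returns the intercept and the coefficient vector expressed in the
    original coordinates of X.
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean  # center the explanatory variables

    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:k].T          # p x k loading matrix
    scores = Xc @ V_k       # n x k matrix of component scores

    # OLS of the centered response on the scores.
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)

    beta = V_k @ gamma                  # map back to the original variables
    intercept = y_mean - x_mean @ beta
    return intercept, beta


def pcr_predict(X, intercept, beta):
    """Predict the response for new rows of X."""
    return intercept + X @ beta


# Synthetic data (hypothetical): two nearly collinear explanatory variables.
rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
X[:, 4] = X[:, 0] + 0.01 * rng.normal(size=n)
beta_true = np.array([1.0, -2.0, 0.5, 0.0, 1.0])
y = 3.0 + X @ beta_true + 0.1 * rng.normal(size=n)

intercept, beta = pcr_fit(X, y, k=3)
y_hat = pcr_predict(X, intercept, beta)
```

When k equals the number of explanatory variables, this sketch reproduces the ordinary least squares fit; smaller k discards the low-variance directions, which is where near-collinearity makes OLS unstable.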

Temel Bileşenlere Dayalı Bazı Regresyon Yöntemleri (Some Regression Methods Based on Principal Components)

Principal component analysis (PCA) is a technique frequently used for data processing and dimensionality reduction. However, PCA is highly sensitive to outliers in the data. To cope with this problem, robust PCA based on projection pursuit has been proposed as a very appealing method. On the other hand, principal component regression (PCR) can be viewed as the combination of PCA with least squares regression. In this study, taking this general structure of PCR into account, we examine versions of PCR that are resistant to outliers. Performance comparisons of the examined methods are presented through a detailed simulation study and on several real data sets.
