An application of a new class of information complexity (ICOMP) criteria to customer profiling and segmentation

This study introduces a new class of information complexity criteria, called ICOMP. This criterion supports new approaches in statistical modeling and serves as a decision rule for selecting the best model. The importance and use of ICOMP are demonstrated with an example application to customer profiling and segmentation using multi-class support vector machines with recursive feature elimination (MSVM-RFE), a novel data-mining method. In classifying mobile phone customers, the new modeling proposed in this study improved on the misclassification rate obtained by classical discriminant analysis by more than 32%. These results can be used as a new micro-marketing analysis method. They may also attract the attention of the mobile phone industry, which seeks to win more customers, or to retain its existing ones, through better analysis and classification of its databases.

A new class of information complexity (ICOMP) criteria with an application to customer profiling and segmentation

This paper introduces several forms of a new class of information-theoretic complexity criteria, called ICOMP, as a decision rule for model selection in statistical modeling, helping to provide new approaches relevant to statistical inference. The practical utility and importance of ICOMP are illustrated with a real numerical data-mining example: customer profiling and segmentation of mobile phone customers using a novel multi-class support vector machine-recursive feature elimination (MSVM-RFE) method. The approach proposed in this paper outperforms classical discriminant analysis techniques by more than 32% in misclassification error rate. This improvement, obtained by hybridizing MSVM-RFE with ICOMP, was not attainable with the other methods applied to the mobile phone customer database, and it constitutes a new form of micro-marketing analytics. These results should capture the attention of the mobile phone industry for more refined analysis of its databases for customer management and retention.
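To make the decision-rule idea concrete, the following is a minimal, illustrative sketch of an ICOMP-style model selection score, not the paper's own implementation: it uses Bozdogan's maximal entropic complexity C1(Sigma) = (s/2) log(tr(Sigma)/s) - (1/2) log|Sigma| to penalize the estimated covariance of a fitted Gaussian model, and does not reproduce the paper's new class of ICOMP forms or the MSVM-RFE hybrid. The function names and the full-versus-diagonal covariance comparison are hypothetical choices made for the example.

```python
import numpy as np

def c1_complexity(sigma):
    """Bozdogan's maximal entropic complexity of a covariance matrix:
    C1(Sigma) = (s/2) log(tr(Sigma)/s) - (1/2) log|Sigma|.
    C1 >= 0, with equality when Sigma is proportional to the identity."""
    s = sigma.shape[0]
    _, logdet = np.linalg.slogdet(sigma)
    return 0.5 * s * np.log(np.trace(sigma) / s) - 0.5 * logdet

def icomp_gaussian(X, diagonal=False):
    """ICOMP-style score for a multivariate Gaussian fitted to X by ML:
    -2 * (maximized log-likelihood) + 2 * C1(estimated covariance).
    Lower scores indicate a better lack-of-fit / complexity trade-off."""
    n, p = X.shape
    sigma = np.cov(X, rowvar=False, bias=True)  # ML covariance estimate
    if diagonal:                                # restricted candidate model
        sigma = np.diag(np.diag(sigma))
    _, logdet = np.linalg.slogdet(sigma)
    # Maximized Gaussian log-likelihood at the ML estimates
    loglik = -0.5 * n * (p * np.log(2.0 * np.pi) + logdet + p)
    return -2.0 * loglik + 2.0 * c1_complexity(sigma)

# Decision rule: among candidate models, choose the one with minimal ICOMP.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
scores = {"full": icomp_gaussian(X), "diagonal": icomp_gaussian(X, diagonal=True)}
best = min(scores, key=scores.get)
```

The penalty term distinguishes ICOMP from fixed-per-parameter criteria such as AIC: instead of counting parameters, it charges models for the interdependence (non-sphericity) of their estimated quantities.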
