IMPUTING MISSING VALUES IN MACROECONOMIC DATA WITH THE REGRESSION-BASED NEAREST-NEIGHBOUR “HOT DECK” METHOD

When working with country-level data, researchers usually prefer clustering approaches that rely on multidimensional distances. For interdependent variables, however, missing values are commonly imputed by regression, because the aim is to preserve the dependence structure among the variables. Imputation methods are therefore needed that combine both approaches, so that the dependence among the economic indicators is preserved and the meaningful distances between countries are not destroyed. This study compares the results obtained with the regression-based nearest-neighbour hot deck algorithm with those obtained with other imputation methods.

Keywords:

Missing data imputation, Economic data, Clustering
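
The regression-based nearest-neighbour hot deck named in the abstract can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: a single target variable with missing values, fully observed covariates, an ordinary least squares fit on the complete cases, and donor matching on the absolute difference between predicted values. The function name `rbnn_hot_deck` and the sample figures are hypothetical and are not taken from the study.

```python
import numpy as np


def rbnn_hot_deck(X, y):
    """Fill NaNs in y with observed values borrowed from the nearest donor.

    Assumed procedure: fit OLS on the complete cases, predict y for every
    unit, then give each unit with a missing y the observed value of the
    donor whose predicted value is closest to its own.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    missing = np.isnan(y)
    if not missing.any():
        return y.copy()

    donors = ~missing
    # Design matrix with an intercept column; X may be 1-D or 2-D.
    A = np.column_stack([np.ones(len(y)), X])
    # Least squares fit using only units with an observed y (the donors).
    beta, *_ = np.linalg.lstsq(A[donors], y[donors], rcond=None)
    y_hat = A @ beta

    donor_idx = np.flatnonzero(donors)
    imputed = y.copy()
    for i in np.flatnonzero(missing):
        # Distance is measured between predictions, not raw covariates.
        gaps = np.abs(y_hat[donor_idx] - y_hat[i])
        nearest = donor_idx[np.argmin(gaps)]
        # Hot deck step: copy a value that was actually observed.
        imputed[i] = y[nearest]
    return imputed


# Hypothetical data: one indicator x for five units, y missing for the third.
x = [1.0, 2.0, 3.4, 4.0, 5.0]
y = [2.1, 3.9, float("nan"), 8.2, 9.8]
filled = rbnn_hot_deck(x, y)
```

In this toy case the third unit receives 8.2, the observed value of the donor at x = 4.0, whose prediction lies closest to its own. Because the imputed value is always an observed one, the regression only decides which donor is used. This is how such a scheme can respect the dependence among indicators while still imputing realistic values.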
