Agreement and adjusted degree of distinguishability for square contingency tables
In square contingency tables, the agreement between the row and column classifications is often of interest. In such tables, the kappa or weighted kappa coefficient is used to summarize the degree of agreement between two raters. In addition to investigating inter-rater agreement in square contingency tables, category distinguishability should be considered: because the kappa coefficient is insufficient for measuring category distinguishability, the degree of distinguishability has been suggested for this purpose. In practice, however, problems arise in the use of the degree of distinguishability. The aim of this study is to assess the agreement coefficient and the degree of distinguishability in square contingency tables jointly. An adjusted degree of distinguishability is proposed to solve the problem that the calculated degree of distinguishability can fall outside its defined range. A simulation study is performed to compare the proposed adjusted degree of distinguishability with the classical degree of distinguishability. Furthermore, interpretation levels for the degree of distinguishability are determined on the basis of the simulation study. The results are discussed through numerical examples and the simulation.
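The out-of-range problem the abstract refers to can be made concrete. The Darroch–McCloud degree of distinguishability of categories i and j, delta_ij = 1 - (p_ij * p_ji) / (p_ii * p_jj), is meant to lie in [0, 1] but turns negative whenever the off-diagonal product exceeds the diagonal one. The sketch below is illustrative only (the table, function names, and values are assumptions, and it implements the classical measure, not the paper's adjusted estimator): it computes Cohen's unweighted kappa and the classical delta_ij for a 3x3 table in which one delta falls outside the defined range.

```python
# Minimal sketch: Cohen's kappa and the classical Darroch-McCloud
# degree of distinguishability for a square contingency table.
# The table and function names below are hypothetical examples.
import numpy as np

def cohens_kappa(table):
    """Cohen's (1960) unweighted kappa for a square contingency table."""
    p = table / table.sum()             # cell proportions
    po = np.trace(p)                    # observed agreement (diagonal mass)
    pe = p.sum(axis=1) @ p.sum(axis=0)  # chance agreement from the margins
    return (po - pe) / (1 - pe)

def distinguishability(p, i, j):
    """Classical degree of distinguishability of categories i and j:
    delta_ij = 1 - (p_ij * p_ji) / (p_ii * p_jj).
    It drops below 0 when the off-diagonal product exceeds the
    diagonal one -- the out-of-range problem the adjustment targets."""
    return 1 - (p[i, j] * p[j, i]) / (p[i, i] * p[j, j])

# Hypothetical two-rater table: rows = rater A, columns = rater B.
table = np.array([[20.0, 5.0, 1.0],
                  [4.0, 3.0, 9.0],
                  [2.0, 8.0, 5.0]])
p = table / table.sum()

print(f"kappa = {cohens_kappa(table):.3f}")
for i in range(3):
    for j in range(i + 1, 3):
        print(f"delta_{i}{j} = {distinguishability(p, i, j):.3f}")
```

For this table, delta between the second and third categories evaluates to -3.8, well outside [0, 1]; values of this kind are what the proposed adjusted degree of distinguishability is designed to repair.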