Agreement models for multiraters

Agreement between 2 or more independent raters evaluating the same items on the same scale can be measured by the kappa coefficient. In recent years, modeling the agreement among raters has been preferred to summarizing it in a single index. In this study, the disadvantages of kappa are reviewed, agreement models are introduced, and these models are applied to a real data set.

Materials and methods: Three pathologists classified each of 118 slides in terms of carcinoma in situ of the uterine cervix, based on the most involved lesions. Agreement among the 3 pathologists' evaluations was investigated using log-linear agreement models.

Results: The kappa coefficient among the 3 pathologists was 0.48, which indicates moderate agreement. Log-linear agreement models were fitted to the data, and the agreement parameter was estimated for the best of the fitted models. The probability of the 3 pathologists giving the same decision was 2.5 times higher than that of giving a different decision.

Conclusion: Log-linear models can be used to measure agreement among more than 2 raters. Modeling agreement can provide more information than kappa.
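As an illustration of the summary index discussed above, the following is a minimal sketch of one common multi-rater generalization of kappa, Fleiss' kappa. The abstract does not state which multi-rater kappa was used, so the choice of Fleiss' formula is an assumption. The function name `fleiss_kappa` and the toy ratings are hypothetical and are not the study's 118-slide data.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each rated by the same number of raters.

    `ratings` is a list of lists: ratings[i] holds the category label that
    each rater assigned to item i. (Hypothetical helper for illustration.)
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(r) != n_raters for r in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    # Count how many raters placed each item in each category.
    per_item = [Counter(r) for r in ratings]

    # Observed agreement: mean over items of the proportion of agreeing rater pairs.
    p_bar = sum(
        (sum(c * c for c in counts.values()) - n_raters)
        / (n_raters * (n_raters - 1))
        for counts in per_item
    ) / n_items

    # Chance agreement: sum of squared overall category proportions.
    totals = Counter()
    for counts in per_item:
        totals.update(counts)
    p_e = sum((t / (n_items * n_raters)) ** 2 for t in totals.values())

    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_e) / (1.0 - p_e)


# Toy example (invented labels): 4 slides, 3 raters, 2 categories.
toy = [
    ["pos", "pos", "pos"],
    ["neg", "neg", "neg"],
    ["pos", "pos", "neg"],
    ["pos", "neg", "neg"],
]
kappa = fleiss_kappa(toy)
```

A single number of this kind summarizes agreement but does not describe its structure, which is the limitation that motivates fitting agreement models instead.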

Key words: Agreement, log-linear models, uterine cancer
