International Biometric Society
ESTIMATION OF THE INTER-OBSERVER VARIABILITY IN DIAGNOSTIC TRIALS: KAPPA VS. KRIPPENDORFF'S ALPHA
Antonia Zapf
Department of Medical Statistics, University Medical Center Göttingen
The reproducibility of test results is an important topic in diagnostic accuracy trials.
Especially in trials for imaging agents, where the decision can be rather subjective, the inter-observer variability should be estimated and discussed. In the corresponding European guideline [1], as well as in the STARD (STAndards for Reporting of Diagnostic accuracy) statement [2], the kappa coefficient is mentioned as a measure of agreement. A review of the literature reveals that kappa is generally used together with the percentage agreement (see, for example, Leeuwenburgh et al. [3]). However, as Cohen's kappa can lead to paradoxical results (see, for example, Feinstein and Cicchetti [4]), many alternative measures have been proposed in recent years. Krippendorff's alpha is one such alternative; it has the advantage of great flexibility, since several observers, several categories, and missing values can be accommodated.
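To make the paradox concrete, the following minimal sketch (in Python; the tables are hypothetical and not taken from the abstract) computes the percentage agreement and Cohen's kappa for two 2x2 rating tables of 100 cases each. Both tables show 85% agreement between the two observers, yet kappa drops sharply when the marginal distributions are skewed.

def cohens_kappa(table):
    # Cohen's kappa for a 2x2 contingency table [[a, b], [c, d]],
    # rows = observer 1, columns = observer 2.
    (a, b), (c, d) = table
    n = a + b + c + d
    p_o = (a + d) / n                                      # observed agreement
    p_e = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2   # chance agreement
    return (p_o - p_e) / (1 - p_e)

# Hypothetical data: same raw agreement, different marginals.
balanced = [[40, 9], [6, 45]]     # 85% agreement, balanced marginals
imbalanced = [[80, 10], [5, 5]]   # 85% agreement, skewed marginals

for name, table in [("balanced", balanced), ("imbalanced", imbalanced)]:
    (a, b), (c, d) = table
    p_o = (a + d) / (a + b + c + d)
    print(f"{name}: agreement = {p_o:.2f}, kappa = {cohens_kappa(table):.2f}")
# balanced:   agreement = 0.85, kappa = 0.70
# imbalanced: agreement = 0.85, kappa = 0.32

Despite identical raw agreement, kappa falls from 0.70 to 0.32, illustrating the "high agreement but low kappa" phenomenon described by Feinstein and Cicchetti [4].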
In the talk, the properties of kappa and Krippendorff's alpha will therefore be compared. Results of a simulation study and of illustrative examples will be presented and discussed.
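Krippendorff's alpha is defined as alpha = 1 - D_o/D_e, where D_o is the observed and D_e the chance-expected disagreement. As a rough illustration of the flexibility mentioned above (the data are hypothetical and not from the talk), the following sketch computes alpha for nominal ratings from three observers with missing values:

from collections import Counter
from itertools import permutations

def krippendorff_alpha_nominal(units):
    # units: list of cases, each a list of ratings (one per observer);
    # None marks a missing rating. Categories are nominal (unordered).
    o = Counter()  # coincidence matrix: o[(c, k)] over pairable ratings
    for unit in units:
        values = [v for v in unit if v is not None]
        m = len(values)
        if m < 2:  # a unit needs at least two ratings to form pairs
            continue
        for c, k in permutations(values, 2):
            o[(c, k)] += 1 / (m - 1)
    n_c = Counter()  # marginal totals of the coincidence matrix
    for (c, _), w in o.items():
        n_c[c] += w
    n = sum(n_c.values())
    d_o = sum(w for (c, k), w in o.items() if c != k)             # observed
    d_e = sum(n_c[c] * n_c[k] for c, k in permutations(n_c, 2))   # expected
    return 1 - (n - 1) * d_o / d_e

# Hypothetical example: three observers rating five cases; None = missing.
ratings = [
    [1, 1, 1],
    [1, 1, None],
    [2, 2, 2],
    [1, 2, 2],
    [None, 1, 1],
]
print(f"alpha = {krippendorff_alpha_nominal(ratings):.3f}")  # alpha = 0.700

Missing values simply reduce the number of pairable ratings per case, and cases with fewer than two ratings drop out; this is exactly the flexibility that a pairwise kappa lacks.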
References:
[1] EMA, CHMP (2010). Appendix 1 to the guideline on clinical evaluation of diagnostic agents on imaging agents. Doc. Ref. EMEA/CHMP/EWP/321180/2008.
[2] Bossuyt et al. (2003). The STARD statement for reporting studies of diagnostic accuracy:
explanation and elaboration. Clinical Chemistry, 49(1):7-18.
[3] Leeuwenburgh et al. (2013). Accuracy and interobserver agreement between MR-nonexpert radiologists and MR-experts in reading MRI for suspected appendicitis. European Journal of Radiology, doi: 10.1016/j.ejrad.2013.09.022.
[4] Feinstein and Cicchetti (1990). High agreement but low kappa: I. The problems of two
paradoxes. Journal of Clinical Epidemiology, 43(6):543-549.
International Biometric Conference, Florence, Italy, 6–11 July 2014