International Biometric Society
ESTIMATING THE ACCURACY OF DIAGNOSTIC IMAGING BASED ON MULTIPLE
RATERS USING RANDOM EFFECTS MODEL
Hiroyuki Saeki (1,3), Toshiro Tango (2) and Jinfang Wang (3)
(1) Development Department, FUJIFILM RI Pharma Co., Ltd., Japan
(2) Center for Medical Statistics, Japan
(3) Graduate School of Science, Chiba University, Japan
In clinical trials designed to demonstrate the efficacy of a diagnostic imaging procedure,
multiple independent raters are often required to evaluate the images in order to confirm
inter-rater reliability. Accuracy (sensitivity or specificity) can be estimated from
consensus evaluations or majority votes, which treat the results of the multiple raters as
if they came from a single rater, but these methods are not recommended for the primary
evaluation. Consensus evaluations may be biased because the evaluations are not
independent. Moreover, majority votes cannot take into account the variability in the
results across raters. Therefore, all results from the multiple independent raters should
be used in the analysis. To address this issue, Saeki and Tango (2011) provided a
non-inferiority test, a confidence interval and a sample size formula for inference on the
difference in correlated proportions based on multiple raters. However, there are few
adequate methods for summarizing correlated proportions across multiple raters. In this
presentation, we propose a method to integrate correlated proportions estimated from
multiple independent raters.
We consider a study design in which all images are read by all raters. For ease of
explanation, we present our results for the case of two raters. A fixed effects model
may be used to take into account the within-subject correlation across the two raters using
a bivariate normal distribution. In addition, we consider a random effects model that
accounts for not only the within-subject correlation but also the between-rater variance.
To estimate the between-rater variance, we apply the method of DerSimonian and Laird (1986).
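As a rough illustration of the DerSimonian and Laird step, the sketch below computes the
method-of-moments between-rater variance from rater-wise proportions and their variances.
It is not the authors' estimator: it treats the rater-wise estimates as independent and
uses plain binomial variances, whereas the model in this abstract also accounts for the
within-subject correlation. The function name `dersimonian_laird` and the example
sensitivities (0.80 and 0.90, each on 100 subjects) are assumptions made for illustration.

```python
import numpy as np


def dersimonian_laird(estimates, variances):
    """Method-of-moments random-effects summary (DerSimonian and Laird, 1986).

    Simplifying assumption: the rater-wise estimates are treated as
    independent, which ignores the within-subject correlation.
    """
    y = np.asarray(estimates, dtype=float)
    v = np.asarray(variances, dtype=float)
    k = y.size

    w = 1.0 / v                                  # inverse-variance weights
    mu_fixed = np.sum(w * y) / np.sum(w)         # fixed-effects pooled estimate
    q = np.sum(w * (y - mu_fixed) ** 2)          # Cochran's Q statistic
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)

    tau2 = max(0.0, (q - (k - 1)) / c)           # between-rater variance, truncated at 0

    w_star = 1.0 / (v + tau2)                    # random-effects weights
    mu_random = np.sum(w_star * y) / np.sum(w_star)
    se_random = np.sqrt(1.0 / np.sum(w_star))
    return mu_random, tau2, se_random


# Hypothetical rater-wise sensitivities; not data from the study.
n_subjects = 100
p = np.array([0.80, 0.90])
var_p = p * (1.0 - p) / n_subjects               # binomial variance of each proportion

mu, tau2, se = dersimonian_laird(p, var_p)
print(f"pooled sensitivity={mu:.4f}, tau^2={tau2:.5f}, SE={se:.4f}")
```

With only two raters, the between-rater variance rests on a single degree of freedom, so
the estimate is very unstable; the sketch is meant to show the arithmetic only.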
Then, the maximum likelihood estimator (MLE) and the restricted maximum likelihood estimator
(REML) are obtained under the aforementioned models. We conducted Monte Carlo
simulation studies to examine the bias of the MLE, the REML and the simple mean of the
rater-wise proportions. Our results indicate that the biases of the REML and the simple mean
are smaller than that of the MLE. We will also discuss and compare various types of
confidence intervals for the proportion based on both normal theory and nonparametric
bootstrap methods. The bootstrap intervals are easy to use and have good coverage
properties. Furthermore, we shall illustrate our method using data from a study of
diagnostic imaging to predict the presence of Alzheimer’s disease.
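One way a nonparametric bootstrap interval of this kind could be computed is sketched
below for the simple mean of the rater-wise proportions. Subjects (rows) are resampled
with replacement so that each subject's ratings stay together, which preserves the
within-subject correlation. The abstract does not say which bootstrap intervals the
authors compare; the percentile interval, the function name
`bootstrap_mean_proportion_ci`, and the simulated ratings are all assumptions made for
illustration.

```python
import numpy as np


def bootstrap_mean_proportion_ci(ratings, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the simple mean of rater-wise proportions.

    ratings: (n_subjects, n_raters) array of 0/1 results, e.g. positive calls
    on truly diseased subjects when estimating sensitivity.
    """
    x = np.asarray(ratings, dtype=float)
    n = x.shape[0]
    rng = np.random.default_rng(seed)

    # Rater-wise proportions, then their simple (unweighted) mean.
    point = x.mean(axis=0).mean()

    # Resample whole subjects so ratings of the same image stay paired.
    idx = rng.integers(0, n, size=(n_boot, n))
    boot = x[idx].mean(axis=1).mean(axis=1)      # one statistic per replicate

    lower, upper = np.quantile(boot, [alpha / 2, 1 - alpha / 2])
    return point, lower, upper


# Simulated ratings for 200 subjects and 2 raters (hypothetical, not study data).
# A shared latent term induces correlation between the raters on the same subject.
sim = np.random.default_rng(1)
latent = sim.normal(size=(200, 1))
noise = sim.normal(size=(200, 2))
ratings = (latent + noise > -1.2).astype(int)

est, lo, hi = bootstrap_mean_proportion_ci(ratings)
print(f"mean proportion={est:.3f}, 95% percentile CI=({lo:.3f}, {hi:.3f})")
```

Resampling rows rather than individual ratings is the design choice that matters here:
resampling each rater's column independently would break the pairing and understate or
overstate the variability, depending on the sign of the correlation.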
References
Saeki, H. and Tango, T. (2011) Non-inferiority test and confidence interval for the difference
in correlated proportions in diagnostic procedures based on multiple raters. Statistics in
Medicine 30, 3313-3327.
DerSimonian, R. and Laird, N. (1986) Meta-analysis in clinical trials. Controlled Clinical
Trials 7, 177-188.
International Biometric Conference, Florence, ITALY, 6 – 11 July 2014