CLIN. CHEM. 24/6, 857-861 (1978)

Comparison of Product Moment and Rank Correlation Coefficients in the Assessment of Laboratory Method-Comparison Data

P. Joanne Cornbleet and Margaret C. Shea

We have studied the effects of range and distribution of data on product moment and rank correlation coefficients when deviation from a linear relationship was due solely to experimentally produced random error. All correlation coefficients (Pearson r, Spearman rho, and Kendall tau) were markedly influenced by the range of the data, and, for the rank correlation coefficients, the effect of range varied for different data distributions. While correlation coefficients may be useful in assessing whether an association exists between two variables, they are not useful in assessing the degree of random error about the regression line when a strong linear association is presumed to exist between the two variables. Thus, neither product moment nor rank correlation coefficients are of value in analysis of laboratory method-comparison data. The standard deviation of the residual error of regression should be calculated as a measure of the random error about the regression line.

Additional Keyphrases: statistics · intermethod comparison

Parametric regression and correlation coefficients are frequently used in assessment of laboratory method-comparison data. The slope ($\beta$) and intercept ($\alpha$) of the least-squares line are sensitive to proportional or constant bias between the methods, while the product moment correlation coefficient (r) reflects random rather than systematic error between the two methods (1). An r value of 0.95 or better is thought to indicate an excellent linear fit between the "test" and "reference" methods, with little random scatter about the line. However, the product moment correlation coefficient has been shown to depend markedly on the range (or standard deviation) of the data as well as on random error about the line (1, 2).
Thus, different correlation coefficients for similar method-comparison experiments may arise simply because of differences in the range of values used. Furthermore, the distribution of the r statistic is strongly dependent on the assumption of bivariate normality of the data (2), i.e., both the x and y variables must have a gaussian distribution; for any fixed x, all y must have a gaussian distribution; and for any fixed y, all x must have a gaussian distribution. Since laboratory values collected for method-comparison studies frequently do not meet this criterion, the non-robust r statistic cannot be readily interpreted.

Although nonparametric rank correlation coefficients are not subject to such constraints (3), they are not widely used in the medical literature. Unlike Pearson r, the distribution of both Spearman $\rho$ and Kendall $\tau$ does not depend on the distribution of the sample data. Furthermore, $\rho$ and $\tau$ will reflect curvilinear as well as linear association and have 0.91 the power of the product moment correlation coefficient when the data are bivariate gaussian. If linearly related laboratory method-comparison data are analyzed, values of Spearman $\rho$ and Kendall $\tau$ are interpreted similarly to Pearson r; values range from -1 to +1, with a value of 0 indicating no association between the methods, and a value of -1 or +1 implying a perfect negative or positive relationship between the two methods.

The influence of data range on nonparametric correlation coefficients has not been sufficiently investigated. Wu et al. (4) and Reed (5) cite one example in which the removal of one extreme data point markedly reduced Pearson r, but changed Spearman $\rho$ only slightly. If, indeed, rank correlation coefficients could reflect the degree of random error or scatter about a regression line when a true linear relationship is apparent, uninfluenced by the range or distribution of the data, such statistics would be useful in evaluating random error between two laboratory methods or random error about a calibration line.

This paper compares the product moment and rank correlation coefficients calculated from data simulating results likely to be obtained in clinical laboratory method-comparisons. Both computer-generated distributions and test values for hospital inpatients were used for the "reference" method data. The "test" method data differed from the reference data only by the addition of constant random error. Using this model, the effect of range and data distribution on parametric and nonparametric correlation coefficients is assessed.

Department of Pathology, University of California, San Diego, La Jolla, Calif. 92093.
Nonstandard abbreviations used: r, Pearson product moment correlation coefficient; $\rho$, Spearman rho rank correlation coefficient; $\tau$, Kendall tau rank correlation coefficient; $\sigma_x$, standard deviation of the reference (x) values; $\sigma_y$, standard deviation of the test (y) values; $\sigma_{y \cdot x}$, standard deviation of the residual error of regression (i.e., standard deviation of the differences between the actual y values and the y values predicted by the least-squares line); $\sigma_E$, standard deviation of the random errors, E, artificially added to the "reference" values (x) to generate the "test" values (y); COV (x-E), covariance between the errors and the x values; $\beta$, the slope of the least-squares line; $\alpha$, the intercept of the least-squares line.
Received Dec. 27, 1977; accepted Mar. 13, 1978.

Materials and Methods

Generation of Reference Data

Gaussian or log-gaussian reference populations (x variable) were generated with the aid of a table of the cumulative normal probability function (6). One hundred ascending values at equal probability intervals were selected to give a gaussian distribution with a mean of 0 and a standard deviation of 1. Gaussian populations with a mean of 100 and standard deviations of 20, 25, 40, and 60 were obtained by multiplying each value by the desired standard deviation and adding the mean. Lognormal distributions with a median of 100 and various degrees of skewness were obtained by taking the antilog of gaussian data with a mean of 2 and standard deviations of 0.1, 0.2, 0.3, 0.4, and 0.6. One hundred laboratory values for inpatients for serum alkaline phosphatase and chloride (with medians near 100) were also randomly selected to be reference populations to illustrate skewed and narrowly clustered laboratory data distributions that might be obtained when performing a method-comparison test.

Fig. 1. Frequency distribution of simulated lognormal data, formed by taking the antilog of 100 gaussian distributed values with a mean of 2 and a standard deviation of (1) 0.1, (2) 0.2, (3) 0.3, (4) 0.4, and (5) 0.6. All have a median value close to 100.

Fig. 2. Histogram of alkaline phosphatase reference data.

Generation of Test Data with Constant Random Error

Test data (y variable) were obtained from the reference data (x variable) by alternately adding or subtracting a constant error value, E, of 1, 2, 5, 10, 15, 20, and 25 to the ranked x-data. Although such a manipulation does not generate truly random variation, this type of model has been previously used with success to demonstrate the response of regression and correlation coefficients to random and systematic errors (1). However, for both least-squares regression and product moment correlation coefficients to be valid, there must be no covariance present between the x-data and the random error added, i.e., the correlation coefficient between the x-data and the random error must be zero. This is achieved by reversing the order of addition of positive and negative error at a given point in the data, as described in the Appendix.

Computation of Statistics

All computations were performed by a computer program written in COBOL. The accuracy of this program was thoroughly checked by the use of test data with known statistical parameters. Least-squares regression coefficients, the product moment correlation coefficient, the standard error of regression, and the covariance between the reference values and the error values were calculated by standard formulas (7). All standard deviations were considered to be those of populations rather than samples (divided by n rather than n - 1 or n - 2), so that there would be perfect agreement between the standard deviation used to generate the data and that returned by calculation. The Spearman rank correlation coefficient was calculated by the shortcut method of summing the squares of differences between the ranks of the data (4), and also, in the case of greater than five ties in either the y or x variables, by computing a product moment correlation coefficient between the ranks of x and y. The Kendall rank correlation coefficient was calculated by counting the proportion of concordant pairs of data (i.e., both members of one observation are larger or smaller than the respective members of the other observation) out of the $\binom{n}{2}$ total possible pairs, and applying a correction factor for tied values (7). A two-sided Kolmogorov-Smirnov test (7) was performed to assess the normality of the laboratory data.

Results

Adequacy of the Model

All gaussian reference data had the expected median and mean of 100 and expected standard deviations of 20, 25, 40, and 60. Lognormal reference data sets 1 through 5 showed increasing skewness as the standard deviation of the gaussian population from which they were generated was increased (Figure 1).
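The lognormal reference sets described under Generation of Reference Data can be sketched in a few lines. This is a minimal illustration, not the authors' COBOL program: the function names are hypothetical, the antilog is taken as base 10 (consistent with a gaussian mean of 2 giving a median of 100), and "equal probability intervals" is assumed to mean the midpoints (i - 0.5)/n of n equal-probability bins, which the paper does not specify.

```python
from statistics import NormalDist, fmean, median


def standard_quantiles(n=100):
    """n ascending standard-normal values at equal probability intervals.

    Assumption: the probabilities are the bin midpoints (i - 0.5) / n.
    """
    std = NormalDist()
    return [std.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


def lognormal_reference(log_sd, log_mean=2.0, n=100):
    """Base-10 antilog of gaussian values with the given mean and SD."""
    return [10 ** (log_mean + log_sd * z) for z in standard_quantiles(n)]


for log_sd in (0.1, 0.2, 0.3, 0.4, 0.6):
    x = lognormal_reference(log_sd)
    # The median stays near 100 while the mean drifts upward with skewness.
    print(f"log SD {log_sd}: median {median(x):.1f}, mean {fmean(x):.1f}")
```

Because the quantiles are symmetric about zero, every set keeps a median close to 100, while the mean grows with the standard deviation of the underlying gaussian data. The exact means depend on the quantile convention assumed here and need not match the published values to the last digit.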
Medians of all five lognormal data sets were 100, but the means of the data were 102.7, 111.1, 126.5, 151.2, and 246, and the standard deviations 23.8, 52.7, 93.1, 154.6, and 414.2. The alkaline phosphatase data (Figure 2) were skewed in distribution, differing significantly from gaussian (P < 0.05) by the Kolmogorov-Smirnov test. The median of the alkaline phosphatase data was 84, the mean 126.6, and the standard deviation 106.5. The chloride data (Figure 3) were closely clustered with a median of 104, mean of 103.8, and standard deviation of 4.94, and did not exhibit a significant deviation from normality.

Fig. 3. Histogram of chloride reference data.

A typical plot of test vs. reference data generated by this model is shown in Figure 4. As expected, alternate addition and subtraction of constant error generated points parallel to a line with $\beta$ = 1, $\alpha$ = 0.

Fig. 4. Gaussian reference data with mean of 100, standard deviation of 25, and standard error of regression = 25.

Covariance between the reference values and the random error can be expected to cause deviation of the calculated regression parameters $\alpha$, $\beta$, and $\sigma_{y \cdot x}$ from the expected values of 0, 1, and |E|, the absolute value of the error alternately added to and subtracted from the reference values. That our model minimized such covariance was evidenced by the excellent agreement of the observed and expected values of $\alpha$, $\beta$, and $\sigma_{y \cdot x}$. For all regressions performed, values of $\alpha$ ranged from -1.1 to +2.9 and those for $\beta$ from 0.97 to 1.01. For E = 25, the largest random error used, $\sigma_{y \cdot x}$ equalled 25.0 for all data.

Fig. 5. Effect on the correlation coefficient of varying the standard deviation of reference data.
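The adequacy check above can be reproduced for the gaussian case with a short sketch. It is an illustration under stated assumptions rather than the authors' program: the helper names are hypothetical, the reference values are taken at the probability midpoints (i - 0.5)/n, and the sign of the alternating error is reversed after rank n/2, as the Appendix describes for gaussian data. All moments use n weighting, as in the paper.

```python
from statistics import NormalDist, fmean


def gaussian_reference(mean=100.0, sd=25.0, n=100):
    """n ascending gaussian values at equal probability intervals (assumed midpoints)."""
    std = NormalDist()
    return [mean + sd * std.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


def alternating_errors(n, e):
    """+e on odd ranks and -e on even ranks, with the signs reversed after rank n // 2."""
    errors = []
    for rank in range(1, n + 1):
        sign = 1.0 if rank % 2 == 1 else -1.0
        if rank > n // 2:
            sign = -sign
        errors.append(sign * e)
    return errors


def population_cov(a, b):
    """Covariance with n weighting (divide by n, not n - 1)."""
    ma, mb = fmean(a), fmean(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / len(a)


def least_squares(x, y):
    """Return intercept, slope, and residual SD (n weighting) of y on x."""
    beta = population_cov(x, y) / population_cov(x, x)
    alpha = fmean(y) - beta * fmean(x)
    residuals = [v - (alpha + beta * u) for u, v in zip(x, y)]
    sigma_yx = (sum(r * r for r in residuals) / len(x)) ** 0.5
    return alpha, beta, sigma_yx


x = gaussian_reference(sd=25.0)
errors = alternating_errors(len(x), 25.0)
y = [u + e for u, e in zip(x, errors)]
alpha, beta, sigma_yx = least_squares(x, y)
print(f"COV(x-E) = {population_cov(x, errors):.6f}")
print(f"alpha = {alpha:.4f}, beta = {beta:.4f}, sigma_yx = {sigma_yx:.4f}")
```

With symmetric reference values the reversal makes the covariance between x and E vanish, so the fitted intercept, slope, and residual standard deviation come out at 0, 1, and |E| up to rounding error. For skewed reference data the reversal rank would have to be searched for empirically, which this sketch does not attempt.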
In addition, this model resulted in the appropriate values for r and $\rho$ predicted mathematically for the gaussian data. The relationships

$$r = \sqrt{1 - \left(\frac{\sigma_{y \cdot x}}{\sigma_y}\right)^2} \quad \text{(ref. 7)}$$

and

$$r = 2 \sin\left(\frac{\pi \rho}{6}\right) \quad \text{(ref. 6)}$$

held for all gaussian data regressions. As expected (3), Kendall $\tau$ was lower in absolute magnitude than r and $\rho$ for all regressions.

Fig. 6. Product moment and rank correlation coefficients as a function of $\sigma_{y \cdot x}/\sigma_y$ when x has a gaussian distribution. Symbols: r (●), $\rho$ (×), $\tau$ (○).

Comparison of Product Moment and Rank Correlation Coefficients

Gaussian reference data. Product moment and rank correlation coefficients between the test (y) and reference (x) data were compared when the random error was fixed at a constant level, but the standard deviation of the reference data varied. Results are shown in Figure 5. When $\sigma_{y \cdot x}$ was fixed, all correlation coefficients varied markedly with $\sigma_x$, particularly when $\sigma_{y \cdot x}$ was equal to 50% or more of $\sigma_x$. While r and $\rho$ were nearly equal, $\tau$ was systematically lower in value at every level of $\sigma_x$.

Results were pooled for all gaussian data and are shown in Figure 6. Spearman $\rho$ was very close in value to r for all $\sigma_{y \cdot x}/\sigma_y$ plotted. On the other hand, Kendall $\tau$ was less than r for all $\sigma_{y \cdot x}/\sigma_y$ and showed an almost linear relationship to $\sigma_{y \cdot x}/\sigma_y$ ($\alpha$ = 1.004, $\beta$ = -0.598, $\sigma_{y \cdot x}$ = 0.002, r² = 99.9%). Because of the shape of this curve, Pearson r and Spearman $\rho$ are insensitive to differences in $\sigma_{y \cdot x}/\sigma_y$ when this ratio is small. Thus for ratios of 0.45 to 0, r and $\rho$ range from 0.9 to 1.0. On the other hand, Kendall $\tau$ is more steeply sloped in this area and ranges from 0.75 to 1.00, providing a more sensitive index to changes in $\sigma_{y \cdot x}$ for a given data set.

Lognormal reference data. Plots of the correlation coefficients as a function of $\sigma_{y \cdot x}/\sigma_y$ for each of the five lognormal distributions are shown in Figure 7. The lognormal distribution did not affect the graph of Pearson r, which remained the same function of $\sigma_{y \cdot x}/\sigma_y$ as for gaussian data. However, graphs of $\rho$ or $\tau$ vs. $\sigma_{y \cdot x}/\sigma_y$
showed increasing departure from that of gaussian data as the skewness of the lognormal data increased. For a given $\sigma_{y \cdot x}$ and $\sigma_y$, a lower correlation coefficient was obtained for lognormal data than for gaussian data, and the extent of decrease depended on the skewness of the data. Values of the nonparametric rank correlation coefficients are thus dependent on data distribution as well as on $\sigma_{y \cdot x}$ and $\sigma_y$.

Fig. 7. Product moment and rank correlation coefficients as a function of $\sigma_{y \cdot x}/\sigma_y$ when x has a lognormal distribution with median of 100 and increasing degree of skewness from nos. 1 to 5 (see Fig. 1). Gaussian function (○) is shown for comparison. A, r (where points from all lognormal data fall on the curve for gaussian data); B, $\rho$ (where points from lognormal no. 1 fall on the curve for gaussian data); C, $\tau$.

Laboratory data. When inpatient laboratory values were used for reference data, different correlation coefficients were obtained among the two groups for the same $\sigma_{y \cdot x}$ about regression. For example, when $\sigma_{y \cdot x}$ was equal to 0.10 times the median, Pearson r values for alkaline phosphatase and chloride data were 1.00 and 0.44; Spearman $\rho$, 0.96 and 0.49; and Kendall $\tau$, 0.87 and 0.52, respectively. Graphs of the correlation coefficients vs. $\sigma_{y \cdot x}/\sigma_y$ are shown in Figure 8. Again, the same function as for gaussian data was obtained for all laboratory data for Pearson r, but each set of laboratory data had a distinctly different graph of the rank correlation coefficient as a function of $\sigma_{y \cdot x}/\sigma_y$. Thus the rank correlation coefficients appear to be markedly dependent on the distribution of the data as well as on $\sigma_{y \cdot x}$ and $\sigma_y$.

Fig. 8. Correlation coefficients as a function of $\sigma_{y \cdot x}/\sigma_y$ for alkaline phosphatase (AP) and chloride (Chl) reference data as compared to that for gaussian data. A, r (all points for AP and Chl fall on the curve for gaussian data); B, $\rho$; C, $\tau$.

If data are extensively tied, the short-cut calculation of Spearman $\rho$ may be falsely elevated (9), as compared to the correct $\rho$ (the product moment correlation coefficient calculated on the ranks of the data). Our laboratory data contained many ties; out of 100 reference values, 45 alkaline phosphatase and 92 chloride values were involved in ties. Even so, the short-cut computation of $\rho$ was no more than 0.002 higher than the product moment computation for any level of error added to the reference data.

Addition of spurious values. Two values of 400 were added to the gaussian reference data with a mean of 100 and standard deviation of 25, resulting in a change to a mean of 105.9 and standard deviation of 48.4. Test data were generated in the usual manner. Product moment and rank correlation coefficients for data with and without these spurious values are shown in Table 1. As expected, Pearson r increased with increased $\sigma_x$ of the reference data, particularly for large $\sigma_{y \cdot x}$. However, $\rho$ and $\tau$ did not appear to be sensitive to this increase in $\sigma_x$ through the addition of extreme outliers.

Discussion

For data of both gaussian and non-gaussian distribution, the product moment correlation coefficient showed identical dependence on $\sigma_{y \cdot x}$ and $\sigma_y$, as expressed by the formula:

$$r = \sqrt{1 - \left(\frac{\sigma_{y \cdot x}}{\sigma_y}\right)^2}$$

Since, as shown in Figure 6, the function flattens as r approaches 1.0, the ratio of $\sigma_{y \cdot x}$ to $\sigma_y$ must be greater than 0.3 for r to be less than 0.95. For data with large $\sigma_y$ because of skewing or spurious values, an extremely large value of $\sigma_{y \cdot x}$, or scatter about the regression line, would be necessary before a value of r lower than 0.95 would be obtained. Thus, a "good" product moment correlation coefficient provides no assurance of low random error about the linear relationship between two variables.

It was hoped that rank correlation coefficients might provide a measure of random error about a regression line relatively free from the effect of the range of the data, but this hypothesis did not prove to be true.
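The flattening of r near 1.0 is easy to see numerically. The following sketch simply evaluates the relationship between r and the ratio of the residual standard deviation to the standard deviation of y that the Discussion gives; the function names are hypothetical and chosen only for this illustration.

```python
import math


def pearson_r_from_ratio(ratio):
    """r implied by ratio = sigma_yx / sigma_y, for 0 <= ratio <= 1."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must lie between 0 and 1")
    return math.sqrt(1.0 - ratio ** 2)


def ratio_for_r(r):
    """Inverse relation: the sigma_yx / sigma_y ratio that yields a given |r|."""
    if not 0.0 <= abs(r) <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    return math.sqrt(1.0 - r ** 2)


for ratio in (0.10, 0.20, 0.30, 0.45, 0.60):
    print(f"sigma_yx/sigma_y = {ratio:.2f} -> r = {pearson_r_from_ratio(ratio):.3f}")

# Ratio at which r falls to 0.95 (about 0.31).
print(f"r = 0.95 corresponds to a ratio of {ratio_for_r(0.95):.3f}")
```

A ratio of 0.30 still gives r of about 0.954, so residual scatter equal to nearly a third of the spread of y is compatible with a correlation coefficient above 0.95. Doubling the spread of the reference values, for example through a few extreme results, halves the ratio roughly and pushes r toward 1.0 without any change in the scatter itself.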
Both Spearman $\rho$ and Kendall $\tau$ were markedly dependent on both the range and the distribution of the data and thus cannot readily be compared between different data sets.

Because of difficulties in interpreting correlation coefficients, Westgard and Hunt (1) suggested that the standard error of regression, expressed in concentration units, be calculated as an estimate of random error. This coefficient is the square root of the average of the squares of the vertical distances along the y axis from each experimental point to the regression line. Since r and $\sigma_{y \cdot x}$ are related by the same equation for all data distributions, $\sigma_{y \cdot x}$ may be easily calculated if r and $\sigma_y$ are known:

$$\sigma_{y \cdot x} = \sigma_y \sqrt{1 - r^2}$$

where $\sigma_y$ is calculated with n weighting. The sample statistic, $S_{y \cdot x}$, with n - 2 weighting may also be easily obtained:

$$S_{y \cdot x} = \sigma_{y \cdot x} \sqrt{\frac{n}{n - 2}}$$

The standard error of regression can be interpreted as the standard deviation of values expected by the "test" method (y) for a given value of the "reference" method (x) if there were no uncertainty in the calculated regression line, and should be reviewed with respect to the usual range of values encountered in the clinical laboratory. However, if an imprecise laboratory method with a constant coefficient of variation is compared to an extremely precise reference method, the standard error of regression about the regression line will vary with each value of the reference method. In this situation, a weighted regression should be performed (10), with $\sigma_{y \cdot x}$ interpreted as the fixed proportion by which a given reference value is multiplied to find the standard error of regression at that particular value of x.

Table 1. Effect of Addition of Extreme Values on Regression Coefficients

                         Gaussian reference data     Gaussian reference data
                         ($\sigma$ = 25)             + 2 outliers ($\sigma$ = 49)
$\sigma_{y \cdot x}$     r      $\rho$   $\tau$      r      $\rho$   $\tau$
5                        0.98   0.98     0.89        0.99   0.98     0.89
10                       0.93   0.92     0.79        0.98   0.93     0.80
25                       0.71   0.67     0.58        0.89   0.69     0.59

Thus,
while the product moment and rank correlation coefficients may be useful in assessing whether an association exists between two variables, we have shown that they are not useful parameters in assessing the degree of random error about a presumed linear association. We conclude that in laboratory method-comparison, where a strong linear association frequently is evident graphically, estimation of $\sigma_{y \cdot x}$ is a more useful indicator of scatter about the regression line than are product moment or rank correlation coefficients.

Appendix

"Test" values (y) are generated from "reference" values (x) by the addition of E:

$$y_i = x_i + E_i \quad (1)$$

where E is constant in absolute value, but alternates between a positive and a negative number. COV (x-E) is present where

$$\mathrm{COV}(x\text{-}E) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(E_i - \bar{E})}{n} \quad (2)$$

is not zero. Since $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$ and $\bar{E} = 0$, the covariance will be 0 if equal deviates of x from the mean are multiplied by a constant error of the same sign. For the gaussian data, this is readily accomplished by ranking the x data and adding positive error to odd ranks and negative error to even ranks for ranks 1 through 50, then adding negative error to odd ranks and positive error to even ranks for ranks 51 through 100. Similarly, for non-gaussian data, reversal of the order of addition of positive and negative error at some unknown rank of x can be expected to minimize the covariance between x and the error. For each non-gaussian data set, this rank was found empirically by reversing the order of addition of random error at all even ranks between 50 and 100 and calculating the resulting covariance between x and the error. The rank which minimized the covariance between x and E was then selected for reversing the addition of positive and negative error.

If covariance is present between x and E, $\beta$ is altered from the expected value of 1 as follows:

$$\beta = \frac{\sum_{i=1}^{n} x_i y_i - \left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)/n}{n \sigma_x^2} \quad (3)$$

Substituting equation 1:

$$\beta = \frac{\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2/n + \sum_{i=1}^{n} x_i E_i - \left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} E_i\right)/n}{n \sigma_x^2} \quad (4)$$

Since:

$$\mathrm{COV}(x\text{-}E) = \frac{\sum_{i=1}^{n} x_i E_i - \left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} E_i\right)/n}{n}$$

$$\beta = \frac{\sigma_x^2 + \mathrm{COV}(x\text{-}E)}{\sigma_x^2} = 1 + \frac{\mathrm{COV}(x\text{-}E)}{\sigma_x^2} \quad (5)$$

If covariance is present between x and E, $\sigma_{y \cdot x}$ will also differ from the expected value of |E|. In the least-squares model:

$$y_i = \alpha + \beta x_i + r_i \quad (6)$$

where $r_i$ is the difference between the observed $y_i$ and the predicted $y_i$ for each value of x, and has a standard deviation of $\sigma_{y \cdot x}$. Thus:

$$\sigma_y^2 = \beta^2 \sigma_x^2 + \sigma_{y \cdot x}^2 \quad (7)$$

From equation 1:

$$\sigma_y^2 = \sigma_x^2 + |E|^2 + 2\,\mathrm{COV}(x\text{-}E) \quad (8)$$

Equating equations 7 and 8 and substituting equation 5 for $\beta$:

$$\sigma_{y \cdot x}^2 = |E|^2 - \frac{\left(\mathrm{COV}(x\text{-}E)\right)^2}{\sigma_x^2}$$

We are grateful to Doctors Lemuel Bowie, John Brimm, Nathan Gochman, and Alfred Zettner for reviewing this manuscript.

References

1. Westgard, J. O., and Hunt, M. R., Use and interpretation of common statistical tests in method-comparison studies. Clin. Chem. 19, 49 (1973).
2. Armitage, P., Statistical Methods in Medical Research. John Wiley and Sons, New York, N.Y., 1977.
3. Conover, W. J., Practical Nonparametric Statistics. John Wiley and Sons, Inc., New York, N.Y., 1971, pp 243-253.
4. Wu, G. T., Twomey, S. L., and Thiers, R. E., Statistical evaluation of method comparison data. Clin. Chem. 21, 315 (1975).
5. Reed, A. H., Misleading correlations in clinical applications. Clin. Chim. Acta 40, 266 (1972).
6. Diem, K., and Lentner, C., Eds., Documenta Geigy. Ciba Geigy Limited, Basle, Switzerland, 1970, pp 54-55.
7. Sokal, R. R., and Rohlf, F. J., Biometry. W. H. Freeman and Co., San Francisco, Calif., 1969.
8. Walpole, R. E., and Myers, R. H., Probability and Statistics for Engineers and Scientists. Macmillan Publishing Co., New York, N.Y., 1972.
9. Tate, M. W., and Clelland, R. C., Nonparametric and Short-cut Statistics in the Social, Biological, and Medical Sciences. Interstate Printers and Publishers, Inc., Danville, Ill., 1952.
10. Steel, R. G. D., and Torrie, J. H., Principles and Procedures of Statistics. McGraw-Hill, New York, N.Y., 1960, p 181.