Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Linear least squares (mathematics) wikipedia , lookup
History of statistics wikipedia , lookup
Confidence interval wikipedia , lookup
Association rule learning wikipedia , lookup
Expectation–maximization algorithm wikipedia , lookup
German tank problem wikipedia , lookup
Analysis of variance wikipedia , lookup
FORUM Confidence Intervals for the Abbott's Formula Correction of Bioassay Data for Control Response JAY A. ROSENHEIM' ANDMARJORIE A. HOY Department of Entomological Sciences, University of California, Berkeley, California 94720 J. Econ.Entomol.82(2): 331-335 (1989) Abbott's formula may be used to correct bioassay data for control response and has become a standard in bioassay evaluation. Although Abbott's formula provides an estimate of p_ (the mean bioassay treatment response corrected for control response), it does not provide a measure of associated variance. The current practice of retaining the variance estimate for p.", (the mean bioassay treatment response not corrected for control response) and applying it to p_ is invalid. This invalid procedure results in an exaggeration of the reliability of the estimate of p_ and a confidence interval for PCIm' that is centered around an inappropriate value. We present a technique to incorporate a correction for control response into the statistical analysis of bioassaysconducted with only a single or small number of treatments, which may be qualitative classes rather than a series of doses. The proposed solution isbased upon established techniques for estimating the variance or confidence interval of a ratio of normally distributed variables. The analysissuggeststwo implications for bioassay experimental design and evaluation: first, the optimal allocation of bioassay replications to control and experimental treatments generally occurs when the number of experimental replications is equal to or slightly greater than the number of control replications, and second, bioassay data should be corrected for control mortality more frequently than is currently recommended. Only if such a correction has negligible effects on both p_ and Var(p_) can it be safely omitted. ABSTRACT KEY WORDS Insecta, bioassays,control responses, Abbott's formula BrOASSAYS THATCOMPAREthe physiological or behavioral responses of two or more groups are widely used in the biological sciences. Probit analysis, which can incorporate a correction for control response, may be used to analyze experimental data when bioassay treatments consist of a graded series of doses (Finney 1971). However, many bioassays are done with only a single or a small number of treatments that may be qualitative classes rather than a series of doses. For example, mortality generated by a single pesticide residue on organisms sampled from different populations might be compared in a study of pesticide resistance. A behavioral study might compare orientation responses of parasitoids reared on different hosts to a common chemical cue. Tumor induction frequencies of a single mutagen dose on different age classes of an organism might be compared. In such cases in which bioassays are done without a graded series of doses, observed responses to experimental treatments must still be corrected for control response, thereby preventing the confounding of treatment effects with differences between control groups. For example, different age classes of an organism may show different spontaneous tumorigenesis rates independent of a mutagen's action, thereby confounding mutagen treatment effects with differences tween the untreated age classes. Abbott's formula (Abbott 1925) is - Pc"" - Peon. Pea" = 1 - - be- (1) Peonl where Peon' is the mean control response, Pc"" is the mean experimental treatment response, and Pco" is the mean experimental treatment response corrected for control response. The formula is a means of correcting bioassay data for control response and has become a standard in bioassay evaluation (Busvine 1971, Neal 1976, Hewlett & Plackett 1979, Hubert 1984). (Mean responses, Peon. and Perp, are calculated by averaging the observed proportion responding across replicates; thus, if we suppose that r, is the number of responding subjects out of n, total subjects in replicate i, with i = 1, 2, 1 I, then P = 1/1' ~ rJn,.) '-1 In this paper we show that Abbott's formula alone is an incomplete correction for control response because it fails to provide an estimate of variance for Pa", (Var(pco,,))' Failure to compute a valid estimate of Var(pcorr),and specifically a failure to consider the variance associated with Peon" results in an overestimate of the reliability of the measured Currentaddress:Departmentof Entomology, 3050 MaileWay, Room310, Universityof Hawaiiat Manoa,Honolulu,Hawaii value of Pco" and in a confidence interval for Pco" 96822. that is centered around an inappropriate value. I 0022-0493/89/0331-0335$02.00/0 © 1989 Entomological Societyof America 332 JOURNAL OF ECONOMIC These errors may lead to invalid statistical inferences when responses of different experimental groups to bioassay treatments are compared. We conclude by discussing implications of this study for experimental design and analysis of bioassays. The Problem Abbott's formula provides a convenient formula for computing 15_. Because Abbott's formula includes a ratio of variables, pCOTT will be a biased estimator of the population mean experimental treatment response corrected for control response (Cochran 1977, Buonaccorsi & Liebhold 1988), but the magnitude of the bias will be negligible in most cases. Abbott's formula does not provide a means to calculate an associated variance. The variance of 15_ is not, in general, equal to Var(p.",,); by rearranging Abbott's formula, 15 - = 1 _ 1 - 15."" 1 - Peon, (2) the value of Var(peO") is instead clearly determined by the variance of the ratio of two variables (1 15"",) and (l - Peon,)' Var(p_) therefore incorporates variance components contributed by both 15."" and Peon" We should not, therefore, continue the common practice of simply retaining the variance estimate associated with 15."" and applying it to Pear,. A Proposed Solution Despite the apparent simplicity of the bioassay experimental design and Abbott's formula, a simple, elegant means of generating a variance estimate for 15_ does not appear to be available currently, except for very large data sets for which special assumptions become valid. As clearly stated by Cochran (1977), discussing the ratio estimate R = y/x: "The distribution of the ratio estimate has proved annoyingly intractable because both y and x vary from sample to sample. The known theoretical results fall short of what we would like to know for practical applications." Before we consider a special-case solution and a more general solution, one statistical assumption implicit to Abbott's formula needs to be made explicit. Abbott's formula (Equation 1) may be rearranged to yield: (1 - 15.",,) = (1 - 15_)' (l - Peon') (3) Thus, Abbott's formula implies that the probability of lack of response in the experimental treatment (1 - p"",) is equal to the product of the probability of lack of response in the control (1 - Peon,) and the probability of lack of response to the experimental treatment effect corrected for control response (1 - PCOTT)' This multiplication of the control and experimental effects is valid only if they are statistically independent. Thus, Abbott's formula assumes statistical independence of control and ex- Vol. 82, no. 2 ENTOMOLOGY peri mental responses (Busvine 1971, Finney 1971, Hewlett & Plackett 1979, HoeI1980). A Special-Case Solution. Cochran (1977) describes conditions under which a formula for the variance of a ratio estimate may be used. These conditions will generally be met when the control and experimental bioassay treatments are replicated >30 times and the coefficients of variation of p_ and Peon" equal to SE(p.",,)/p_ and SE(peon')/ Peon" respectively, are <0.10 (Cochran 1977). A formula for the variance of 15_ may then be obtained by modifying Equation 6.13 of Cochran (1977) to incorporate assumptions appropriate for bioassay experimentation (i.e., we are sampling from an infinite population and, as discussed above, Perp is independent of Peon,): 1 Var(PCO")= (1 _ - )2 Pcon' .[var(p.",,) + n."" (1 _ ~.",,)2 1 - Peon' .va~~:~,)] (4) where n."" and neon. are the number of replications for the experimental and control treatments, respectively (see also Finney [1978] and Buonaccorsi & Liebhold [1988]). Apparently, however, the restrictive conditions required for applying this formula will not be met by most bioassay data sets, for which n."" and neon. are <30. A general solution is therefore required. A General Solution. Unfortunately, a general solution for the variance of a ratio estimate has not been developed in the statistical literature (Cochran 1977). Techniques for generating confidence intervals for ratio estimates are, however, available (Elston 1969, Cochran 1977, Finney 1978) and may be used to analyze bioassay data. The method developed by Elston (1969) appears to be the simplest to apply. This method assumes that the variables whose ratio is being considered are distributed normally. This assumption appears reasonable for most bioassay data sets for at least two reasons. First, if we assume that the individual probability of response, p, is fixed, bioassay variation between replications will generally follow some sort of binomial distribution. The normal approximation to the binomial distribution should, therefore, describe the distribution of Peon, and P."" between replications whenever the number of individuals tested per replication, n, is adequate to justify the normal approximation. As a rule of thumb, Sokal & Rohlf (1981) suggest that the normal distribution will closely approximate the binomial when n'p'(1 - p) ~ 3. Second, Abbott's formula calculates the ratio of P."" and Peon. rather than P."" and Peon. (see Equation 2). Thus, even if Peon. and P."" are not distributed exactly normally, their sampling means will be dis- April 1989 ROSENHEIM & Hoy: CONFIDENCE tributed approximately normally under the central limit theorem (Finney 1978, Sokal & Rohlf 1981). The central limit theorem is universally valid only when the number of replications averaged to yield a mean is large; various rules for the minimum number of replications ranging from 10 to 30 have been suggested in the literature. However, when the distribution of replication values is itself nearly normal, as for most bioassay data, the requirement for a large number of replications becomes less stringent (Freund 1981, Sokal & Rohlf 1981). We present Elston's (1969) method with the caveat that data sets should be examined to ensure conformance with the assumption of normality. Confidence limits for Peon may then be calculated as follows: Peon = 1 __(I_-_p_,,_p) (1 - Peon,) (1 - g) ± --(1 - Peon,) (1 - g) {(1 _ g)(va~~: ..,,») 333 INTERVALS FOR ABBOTT'S FORMULA O.B 0.7 -u.<: ~ 10 ~ ~c 0.6 0.5 0.4 Q) 0.3 0 c Q) :2 'E 0 () 0.2 0.1 0.0 0.0 0.1 0.2 0.3 Mean Control Response Fig. 1. Comparison of confidence interval widths for p_ that incorporate variance associated with Peon, (Equation 5) (open squares) and that ignore variance in Peon. (Equation 7) (solid squares). Bioassay data simulated with Pup = 0.5, n.", = 10, and neon, = 5. creases from 0.0 to 1.0.) In addition, to make data sets more realistic, we include random "vial effects" (i.e., the random error incorporated into bioassays by subtle biological differences between + (1 - p•.,,)2. var(Peon')J5 (5) replications and differences in the treatment applied to different replications) by multiplying the (1 - Peony· neon, binomial standard deviation between replications where by a scaling or heterogeneity factor of 2.0 (Finney 1971, Neider 1985, Preisler 1988). Between-repli(6) cate variance was therefore calculated as Var = 4'(p'q/n), where n = 20, P is the per-individual probability of response, and q = 1 - p. t is chosen from the t distribution with the desired Confidence Interval Width. Ignoring the variex level and n - 1 degrees of freedom, and n is the ance associated with Peon, exaggerates the reliability lesser of ne." and neon,' Note that as g approaches zero, Equation 5 becomes analogous to the special- of the estimate of Peo.,.. For hypothetical bioassay data with n.", = 10, ncon, = 5, and 0.00 ~ Peon, ~ case formula (Equation 4). 0.25, confidence intervals calculated with Equation 7 are substantially narrower (18.5-56.2%) than those The Significance of the Problem calculated with Equation 5 (Fig. 1). The magnitude of this effect increases with increasing Peon. (Fig. 1). To assess the magnitude of the error incorpoNote that even if Pro.,= 0.0, and therefore g = 0.0, rated into bioassay data evaluation by assuming that Var(peon-)= Var(Pe.,,),we can compare the con- the confidence interval generated by Equation 7 fidence limits obtained from Equation 5 with the will be narrower than that generated by Equation 5 if the number of degrees of freedom associated confidence limits that would have been generated with Equation 7 (n •." - 1) is greater than that had the variance in Peon, been ignored, i.e.: associated with Equation 5 (the lesser of n•." - 1 and neon. - 1) (Fig. 1). - _ 1 (1 - Pc.,,) Peon- Confidence Interval Location. A more subtle (1 - Peon,) difference between the confidence intervals generated by Equations 5 and 7 is the difference be± t' .(Var(Pe.,,»)O.5 (7) tween the location of their midpoints. The ratio n•." estimate is biased, the bias becoming more prowhere t' is chosen from the t distribution with n•." nounced for small sample sizes (Cochran 1977). In general, the sampling distribution of the ratio es- 1 degrees of freedom. In the following calculations, hypothetical bioas- timate is skewed right for ratios with positive valsay data sets are simulated by assuming that 20 ues. (This may be understood intuitively by conindividuals are tested per replication and that the sidering the rapidly increasing value of the ratio individual probability of response in the experi- when the denominator approaches zero.) Reflecting the bias of the ratio estimate, Equation 5 locates mental treatment, Pc.", is equal to 0.5. (In general, the magnitude of the error generated by assuming the midpoint of the confidence interval at 1 - [(1 - Pcon.)!(1 - g)J, whereas Equation 7 that Var(pco.,.)= Var(p •.,,) will decrease as Pc." in- - Pe.,,)!(1 334 JOURNAL OF ECONOMIC Vol. 82, no. 2 ENTOMOLOGY 0.6 (ij ~ 0.5 .c 0.4 ~ 'iii ~ 0.3 C C Ql 0 c: Ql '0 'E 0 () '0 Ql 0 c: 0.2 Ql '0 0.5 ~0 0.1 m () () 0.0 0.0 0.0 0.1 0.2 0.3 Mean Control Response Fig. 2. Comparison of the locations of confidence interval midpoints for p_ that incorporate variance associated with p=, (Equation 5) (open squares) and that ignore variance in p=, (Equation 7) (solid squares). Bioassay data simulated with p ••• = 0.5, n•••= 10, and neon, 1.0 Ql CD C Ql 1.5 -0 Ql = 5. locates the midpoint at 1 - (1 - P.rp)/{l - Peon') (Fig. 2). Implications for Bioassay Design and Evaluation To provide a valid estimate of variance for P""", variances contributed by both P.rp and Peon, must be considered. Therefore, use of Equations 4 and 5, which provide estimates of the variance or confidence interval of a ratio of normal variates, should replace the current standard analysis, which incorrectly assumes that Var(pc•.... ) = Var(p.rp) (Equation 7). Equation 4, valid only for large data sets, provides a formula for variance which can then be used in parametric statistical tests of between-group differences. For large data sets, Equation 4 should, therefore, be preferred to Equation 5, which provides only a confidence interval for p""". Failure of confidence intervals generated by Equation 5 to overlap is a conservative criterion for statistically significant differences in response between groups. Neal (1976) suggested that Abbott's formula be applied to bioassay data whenever control response exceeded 10%. Although control response values <10% will generally have only small effects on the value of p""", they may continue to make substantial contributions to Var(p"",,) (Fig. 1). Therefore, bioassay data should be corrected for control response more regularly than is currently recommended. Only if such a correction has negligible effects on both p""" and Var(PeOTT) can it be safely omitted. The statistical treatment of bioassay data proposed here creates a new consideration for bioassay experimental design: what is the optimal allocation of bioassay replicates to control and experimental treatments? Because the variance or confidence interval width for p_ is related to n.." and neon, through Equations 4 and 5, the choice of n.rp and neon' will be important. For an analysis of optimal allocation o 10 20 Number of Control Replicates Fig. 3. Confidence interval widths for p""" calculated with different allocations of 20 replicates to control and experimental treatments. m, optimal allocations for each level of control mortality; solid squares, p""" = 0.25; solid triangles. p= = 0.05; open squares. p= = 0.01. Confidence interval widths> 1.6 are not shown. of bioassay replicates based upon the variance estimate for a ratio of variables presented in Equation 4, see Buonaccorsi & Liebhold (1988). Fig. 3 presents confidence interval widths generated by Equation 5 for simulated bioassay data with Parp = 0.50; Peon, = 0.01, 0.05, and 0.25; and n,o, = narp + neon, = 20. Clearly, the width of the confidence interval is strongly dependent upon the relative allocation of replicates to control and experimental treatments (Fig. 3). p""" is estimated least precisely when n.rp > neon, or n.rp <: neon,' Although the optimal allocation varies with the value of Peon•• the narrowest confidence intervals are generated over a wide range of values when n.rp is equal to or slightly greater than neon" Because the optimal allocation will vary with Perp, Pcon•. Var(p •••), and Var(peon'), no allocation will be optimal under all conditions. If approximate values of these parameters are known from previous experiments or pilot studies, the construction of curves such as those shown in Fig. 3 should provide a useful guide for bioassay experimental design. Control response is a nearly universal element of bioassay experimentation. We have attempted here to develop a sound means of incorporating a correction for control response into the statistical analysis of data generated in bioassays employing a single or a small number of experimental treatments. Developing techniques for adjusting data for control response will continue to be necessary to complete the development of new techniques of bioassay analysis (e.g., Roush & Miller 1986, Tabashnik et al. 1987, Preisler 1988). Acknowledgment We thank T. P. Speed and C. M. M. Mueller (University of California, Berkeley), M. E. Ortel (University of Hawaii, Manoa), and H. K. Preisler (PacificSouthwest Forest and Range Experiment Station, USDA-Forest April 1989 ROSEN HElM & Hoy: CONFIDENCE Service) for useful discussions and assistance. The manuscript was reviewed and improved by D. C. Margolies, R. T. Roush, T. P. Speed, and B. E. Tabashnik. This material is based upon work supported in part by USDA Competitive Grant #84-CRCR-1-1452, Regional Research Project W-84, and under a National Science Foundation Graduate Fellowship to J.A.R. References Cited Abbott, W. S. 1925. A method of computing the effectiveness of an insecticide. J. Econ. Entomol. 18: 265-267. Buonaccorsi, J. P. & A. M. Liebhold. 1988. Statistical methods for estimating ratios and products in ecological studies. Environ. Entomol. 17: 572-580. Busvine, J. R. 1971. A critical review of the techniques for testing insecticides, 2nd ed. Commonwealth Agricultural Bureaux, London. Cochran, W. G. 1977. Sampling techniques, 3rd ed. Wiley, New York. Elston, R. C. 1969. An analogue to Fieller's Theorem using Scheffe's solution to the Fisher-Behrens Problem. Am. Stat. 23: 26-28. Finney, D. J. 1971. Probit analysis, 3rd ed. Cambridge University Press, London. 1978. Statistical method in biological assay, 3rd ed. Griffin, London. Freund, J. E. 1981. Statistics: a first course, 3rd ed. Prentice-Hall, Englewood Cliffs, N.J. Hewlett, P. S. & R. L. Plackett. 1979. An introduction INTERVALS FOR ABBOTT'S FORMULA 335 to the interpretation of quantal responses in biology. Arnold, London. Hoel, D. G. 1980. Incorporation of background in dose-response models. Fed. Proc. 39: 73-75. Hubert, J. J. 1984. Bioassay, 2nd ed. Kendall Hunt, Dubuque, Iowa. Neal, J. W., Jr. 1976. A manual for determining small dosage calculations of pesticides and conversion tables. Entomological Society of America, College Park, Maryland. Neider, J. A. 1985. Quasi-likelihood and GUM, pp. 120-127. In R. Gilchrist, B. Francis & J. Whittaker [eds.], Generalized linear models. Proceedings, GUM 85 Conference. Springer-Verlag, Berlin. Preisler, H. K. 1988. Assessing insecticide bioassay data with extra-binomial variation. J. Econ. Entomol. 81: 759-765. Roush, R. T. & G. L. Miller. 1986. Considerations for design of insecticide resistance monitoring programs. J. Econ. Entomol. 79: 293-298. Sokal, R. R. & F. J. Rohlf. 1981. Biometry, 2nd ed. Freeman, New York. Tabashnik, B. E., N. L. Cushing & M. W. Johnson. 1987. Diamondback moth (Lepidoptera: Plutellidae) resistance to insecticides in Hawaii: intra-island variation and cross-resistance. J. Econ. Entomol. 80: 1091-1099. Received for publication November 1988. 5 May 1988; accepted 17