Review of Basic Concepts

Some abbreviations widely used in statistics.^

Statistic                                Preferred symbol                                              Acceptable symbol
Arithmetic mean                          $\bar{X}$ (sample), $\mu$ (population)
Chi-square                               $\chi^2$
Correlation coefficient                  $r$
Coefficient of multiple determination    $R^2$
Coefficient of simple determination      $r^2$
Coefficient of variation                 CV
Degrees of freedom                       df                                                            DF
Least significant difference             LSD
Multiple correlation coefficient         $R$
Not significant                          NS
Probability of type I error              $\alpha$
Probability of type II error             $\beta$
Regression coefficient                   $b$ (sample), $\beta$ (population)
Sample size                              $n$ (sample), $N$ (population)
Standard error of mean                   $s_{\bar{X}}$ (sample), $\sigma_{\bar{X}}$ (population)      SE
Standard deviation of sample             $s$ (sample), $\sigma$ (population)                           SD
Student's t                              $t$
Variance                                 $s^2$ (sample), $\sigma^2$ (population)
Variance ratio                           $F$

^The symbols *, **, and *** are used to show significance at the P = 0.05, 0.01, and 0.001 levels, respectively. Significance at other levels should be designated by a supplemental note.

From: Publications Handbook and Style Manual. 1988. Amer. Soc. Agron. Inc., Crop Sci. Soc. Amer. Inc., Soil Sci. Soc. Amer., Madison, WI.

Review of Basic Concepts - Example Problem

Comparison of the speed (-2 minutes) in seconds of two calculating machines for computing sums of squares (modified from Cochran and Cox)

          Machine A                              Machine B
Rep. or   Time    Deviation              Time    Deviation                      Pair
pair      (sec)   from mean    Dev.²     (sec)   from mean    Dev.²    A - B    total
1         30*         8         64       14          0          0       16       44
2         21         -1          1       21*         7         49        0       42
3         22*         0          0        5         -9         81       17       27
4         22          0          0       13*        -1          1        9       35
5         19         -3          9       14*         0          0        5       33
6         29*         7         49       17          3          9       12       46
7         17         -5         25        8*        -6         36        9       25
8         14*        -8         64       16          2          4       -2       30
9         23*         1          1        8         -6         36       15       31
10        23          1          1       24*        10        100       -1       47
Total     ΣX=220              Σx²=214    ΣX=140               Σx²=316

where $x = X - \bar{X}$

Sums of squares: $SS = \sum x^2 = \sum (X - \bar{X})^2$
Mean: $\bar{X} = \frac{\sum X}{n}$
Variance: $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$
Standard deviation: $s = \sqrt{s^2}$
Standard error: $s_{\bar{X}} = \sqrt{\frac{s^2}{n}}$
Coefficient of variation: $CV = \frac{s}{\bar{X}} \times 100\%$
Confidence limits: $CL = \bar{X} \pm t\, s_{\bar{X}}$

t Distribution

The t distribution was first presented by William S. Gosset, who published under the pseudonym "Student" in 1908. Thus the term Student's t test.
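Before turning to the t test itself, the summary statistics defined in the example problem can be checked numerically. The following is a minimal Python sketch, assuming the times are entered exactly as they appear in the example table (with the time for machine B in replication 7 taken as 8, the value consistent with its deviation of -6 and ΣX = 140). The function name `summarize` and the dictionary keys are illustrative choices, not part of the original notes.

```python
import math

# Times (sec) from the example table; the asterisks in the table are omitted.
machine_a = [30, 21, 22, 22, 19, 29, 17, 14, 23, 23]
machine_b = [14, 21, 5, 13, 14, 17, 8, 16, 8, 24]


def summarize(times):
    """Compute the summary statistics defined in the example problem."""
    n = len(times)
    mean = sum(times) / n                        # X-bar = sum(X) / n
    ss = sum((x - mean) ** 2 for x in times)     # SS = sum of squared deviations
    variance = ss / (n - 1)                      # s^2 = SS / (n - 1)
    sd = math.sqrt(variance)                     # s = sqrt(s^2)
    se = math.sqrt(variance / n)                 # standard error of the mean
    cv = sd / mean * 100                         # coefficient of variation, in %
    return {"n": n, "mean": mean, "ss": ss, "variance": variance,
            "sd": sd, "se": se, "cv": cv}


for name, times in (("A", machine_a), ("B", machine_b)):
    stats = summarize(times)
    print(f"Machine {name}: mean={stats['mean']:.1f}  SS={stats['ss']:.0f}  "
          f"s2={stats['variance']:.2f}  s={stats['sd']:.2f}  "
          f"SE={stats['se']:.2f}  CV={stats['cv']:.1f}%")
```

The totals reported in the table (ΣX = 220 and Σx² = 214 for machine A, ΣX = 140 and Σx² = 316 for machine B) serve as a check on the data entry.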
The t test compares the deviation of the sample mean from the population mean with the standard error of the mean. It is also used to compare the difference between two means with the standard error of the difference. The t distribution resembles the normal distribution in shape and varies with the df. The standard t tables are two-tailed tables, in which the probability (e.g., 5%) is divided between the two ends of the distribution. They apply when the difference between the results of a standard treatment and a new treatment may be either positive or negative, i.e., the result of the new treatment may be either larger or smaller than that of the standard treatment. There are also one-tailed t tables, for cases in which the difference between the result of a standard treatment and the new treatment can be only positive or only negative.

The t test for the difference between the sample mean and the population mean is calculated in the following manner.

$t = \frac{\bar{X} - \mu}{s_{\bar{X}}}$ where $s_{\bar{X}} = \sqrt{\frac{s^2}{n}}$

The t test for the difference between means from two different treatments is calculated in the following manner.

$t = \frac{\bar{X}_a - \bar{X}_b}{s_d}$ where $s_d = \sqrt{\frac{s_a^2}{n_a} + \frac{s_b^2}{n_b}}$

Degrees of freedom (df) for looking up the t value in a t table:
1. If samples are from two populations, the df is the sum of the df for the two populations.
2. If pairs of values or replicated comparisons are being compared, the df is the number of pairs - 1.
3. In an ANOVA, the df is the df for the mean square for error.

Confidence Limits

The confidence limits of a mean may be calculated by the formula $CL = \bar{X} \pm t\, s_{\bar{X}}$, as shown in the example. When the t value for the 0.05 probability is used in the CL, the true mean is expected to lie within the confidence limits with a probability of 95%, that is, unless a 1-in-20 chance has occurred. The 95% confidence limits may be calculated for the means of each of two treatments.
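As an aside, the two t formulas given earlier in this section (sample mean versus population mean, and the difference between two treatment means) can be sketched in Python. This is only an illustration under stated assumptions: the function names are invented here, the inputs are the totals from the calculating-machine example (ΣX and Σx² with n = 10 per machine), and the population mean `mu_hypothetical = 20` is a made-up value, since the notes do not give one. The sketch treats the two machines as samples from two populations (df rule 1); the paired comparison in df rule 2 is not covered.

```python
import math


def t_one_sample(mean, mu, variance, n):
    """t = (X-bar - mu) / s_xbar, with s_xbar = sqrt(s^2 / n)."""
    se_mean = math.sqrt(variance / n)
    return (mean - mu) / se_mean


def t_two_sample(mean_a, mean_b, var_a, n_a, var_b, n_b):
    """t = (X-bar_a - X-bar_b) / s_d, with s_d = sqrt(s_a^2/n_a + s_b^2/n_b)."""
    se_diff = math.sqrt(var_a / n_a + var_b / n_b)
    return (mean_a - mean_b) / se_diff


# Totals taken from the calculating-machine example.
n_a = n_b = 10
mean_a, mean_b = 220 / n_a, 140 / n_b              # 22.0 and 14.0
var_a, var_b = 214 / (n_a - 1), 316 / (n_b - 1)    # s^2 = (sum of squares) / (n - 1)

# Hypothetical population mean, chosen only to exercise the formula.
mu_hypothetical = 20

t_mean = t_one_sample(mean_a, mu_hypothetical, var_a, n_a)
t_diff = t_two_sample(mean_a, mean_b, var_a, n_a, var_b, n_b)
df_diff = (n_a - 1) + (n_b - 1)                    # df rule 1: sum of the two df

print(f"one-sample t = {t_mean:.3f} with {n_a - 1} df")
print(f"two-sample t = {t_diff:.3f} with {df_diff} df")
```

Each computed t is then compared with the tabulated t value for the corresponding df and probability level.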
If the 95% confidence limits of the two treatment means do not overlap, it may be concluded that the two means are significantly different at the 95% probability level.

Confidence Limits - Example Problem

Determine the confidence limits for µ for calculating machine A, given:

ΣX = 220, n = 10, Σx² = 214

$\bar{X} = \frac{\sum X}{n} = \frac{220}{10} = 22.0$

$s^2 = \frac{\sum x^2}{n - 1} = \frac{214}{9} = 23.78$

$s_{\bar{X}} = \sqrt{\frac{s^2}{n}} = \sqrt{\frac{23.78}{10}} = 1.54$

$CL = 22 \pm t\, s_{\bar{X}}$

For P = 0.80, t(.20,9) = 1.383: CL = 22 ± (1.383)(1.54) = 22 ± 2.1 = 19.9, 24.1
For P = 0.95, t(.05,9) = 2.262: CL = 22 ± (2.262)(1.54) = 22 ± 3.5 = 18.5, 25.5
For P = 0.99, t(.01,9) = 3.250: CL = 22 ± (3.250)(1.54) = 22 ± 5.0 = 17.0, 27.0

[Figure: confidence intervals for machine A drawn on a time axis from 16 to 28 sec, centered on $\bar{X}$ = 22.0. P = 0.80: 19.9 to 24.1; P = 0.95: 18.5 to 25.5; P = 0.99: 17.0 to 27.0.]

Thus, with a confidence of 95%, we can say that the true mean, µ, is included in the range 18.5 to 25.5. Or, to state this another way, there is 1 chance in 20, or 5 chances in 100, that the true mean for machine A lies outside this range.
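The confidence-limit arithmetic for machine A can be reproduced with a short Python sketch. It assumes the t values are looked up by hand from a two-tailed t table for 9 df: 1.383 and 2.262 are the values quoted in the example, while 3.250 is the standard tabulated value for P = 0.01 and is an addition here. The function name `confidence_limits` and the dictionary `T_TABLE_9DF` are illustrative names only.

```python
import math


def confidence_limits(mean, variance, n, t_value):
    """Return (lower, upper) for CL = X-bar +/- t * s_xbar."""
    se_mean = math.sqrt(variance / n)    # s_xbar = sqrt(s^2 / n)
    half_width = t_value * se_mean
    return mean - half_width, mean + half_width


# Machine A totals from the example problem.
n = 10
mean_a = 220 / n                  # X-bar = sum(X) / n
variance_a = 214 / (n - 1)        # s^2 = sum(x^2) / (n - 1)

# Two-tailed t values for 9 df, keyed by confidence probability.
# 3.250 is a standard table value assumed here, not quoted in the example.
T_TABLE_9DF = {0.80: 1.383, 0.95: 2.262, 0.99: 3.250}

for p, t_value in T_TABLE_9DF.items():
    lower, upper = confidence_limits(mean_a, variance_a, n, t_value)
    print(f"P = {p:.2f}: {lower:.1f} to {upper:.1f} sec")
```

Because the script carries full precision for the standard error instead of the rounded 1.54, intermediate values may differ slightly from the hand calculation, but the limits agree to one decimal place.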