Lecture # 3: Significance Testing

Is there a significant difference between a measured and a standard amount (one that cannot be accounted for by random error alone)?
a.k.a. hypothesis testing
H0 (null hypothesis): no difference
Decision: accept or reject
The lower the probability that the observed difference occurs by chance, the less likely it is that the null hypothesis is true.

Null and Alternative Hypotheses
H0 -> null hypothesis
Ha -> alternative hypothesis
Hypotheses always pertain to population parameters or characteristics rather than to sample characteristics. It is the population, not the sample, that we want to make an inference about from limited data.
JRA 01/13

Steps in Conducting a Hypothesis Test
Step 1. Set up H0 and Ha.
Step 2. Identify the nature of the sampling distribution curve and specify the appropriate test statistic.
Step 3. Determine whether the hypothesis test is one-tailed or two-tailed.
Step 4. Taking into account the specified significance level, determine the critical value (two critical values for a two-tailed test) for the test statistic from the appropriate statistical table.
Step 5. State the decision rule for rejecting H0.
Step 6. Compute the value of the test statistic from the sample data.
Step 7. Using the decision rule specified in Step 5, either reject H0 or do not reject H0.

Decision based on significance testing
The null hypothesis is rejected if the probability of such a difference occurring by chance is less than 1 in 20 (5% or 0.05). In such a case, the difference is said to be significant at the 0.05 (or 5%) level. Using this level, there is a 1 in 20 chance that we will reject the null hypothesis when it is, in fact, true. We can also use 0.01 or 0.001 (1% or 0.1%).
However, if the null hypothesis is retained, it has not been proved that it is true, only that it has not been demonstrated to be false.
We use the t-test in significance testing. If |calculated t| is greater than the critical value, we reject the null hypothesis.

α, 1-α, β and 1-β
The significance level (α) of a statistical hypothesis test is a fixed probability of wrongly rejecting the null hypothesis H0 if it is, in fact, true. It is the probability of a Type I error. The confidence level is 1-α. Usually, the significance level is chosen to be 0.05 (or 5%).
A Type II error occurs when H0 is not rejected although it is, in fact, false. A Type II error is frequently due to sample sizes being too small. The probability of a Type II error is symbolized by β. The power of the test is 1-β, which is the probability of avoiding a Type II error.
α is user-defined. β is difficult to constrain because its value depends on the unknown value of the population parameter.
An inverse relationship exists between α and β: for a fixed sample size, reducing α increases β.

Significance Testing
Is there a significant difference between a measured amount (X) and a standard amount (µ) that cannot be accounted for by random error alone?
a.k.a. hypothesis testing
H0 (null hypothesis): no difference
Decision: accept or reject

$t = (\bar{X} - \mu)\sqrt{n}/s$

where $\bar{X}$ is the sample mean, s is the standard deviation and n is the sample size.
If |t| (i.e. the calculated value of t without regard to sign) exceeds a certain critical value, then the null hypothesis is rejected.
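The test statistic above can be sketched in a few lines of Python. The four measurements and the standard value below are made up purely for illustration (they are not data from this lecture); the critical value 3.182 is the standard two-tailed table entry for P = 0.05 and 3 degrees of freedom.

```python
import math
import statistics


def one_sample_t(values, mu):
    """Return (t, degrees of freedom) for H0: population mean == mu."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 in the denominator)
    t = (mean - mu) * math.sqrt(n) / s
    return t, n - 1


# Hypothetical replicate measurements and standard value (illustration only)
measurements = [3.10, 3.20, 3.30, 3.20]
MU_STANDARD = 3.0
T_CRITICAL = 3.182  # two-tailed, P = 0.05, 3 degrees of freedom (t table)

t_calc, dof = one_sample_t(measurements, MU_STANDARD)
reject_h0 = abs(t_calc) > T_CRITICAL  # decision rule: reject H0 if |t| exceeds the critical value
print(f"t = {t_calc:.2f}, d.f. = {dof}, reject H0: {reject_h0}")
```

With other data, only the list of measurements, the standard value and the table entry for the matching degrees of freedom need to change.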
Values of Student's t
[Table: values of Student's t]

Comparing an experimental mean to a known value

Significance Tests
“gives us tools to accept conclusions that have a high probability of being correct and to reject conclusions that do not”

Truth table
At P = 0.05, there is a 5% risk that a null hypothesis will be rejected even though it is true (Type I error).
It is also possible to retain a null hypothesis even when it is false (Type II error).

Summary of Errors Involved in Hypothesis Testing

Inference based         Real state of affairs
on sample data          H0 is true                       H0 is false
H0 is true              Correct decision                 Type II error
                        Confidence level = 1-α           P(Type II error) = β
H0 is false             Type I error                     Correct decision
                        Significance level = α*          Power = 1-β
*The term α represents the maximum probability of committing a Type I error.

To calculate the probability of a Type II error, we postulate H1, where H1 is an alternative hypothesis.
Consider the example of a product that contains 3% phosphorus by weight. Four (4) measurements are taken, the mean and SD are calculated, and a significance test is conducted at P = 0.05.
It is suspected that the [P] has increased.
H0: µ = 3.0% (one-tailed t-test, “increase”)

[Figure: sampling distribution of the mean if H0 is true and if H1 is true, n = 4, with the critical value Xc marked]
The probability of a Type I error is 0.05: if the sample mean lies above the critical value Xc, the null hypothesis is rejected.
A Type II error occurs when H0 is retained even though H1 is true, which happens when the sample mean lies below the critical value Xc.

We can assign a confidence level to our measurements.
If we can accept a 5% error level, we can say that these values are reported with a 95% confidence limit.

Increase of sample size to reduce both Type I and Type II errors
[Figure: the same two sampling distributions for n = 9]
$SE = s/\sqrt{n}$
The probability that a false hypothesis is rejected is called the POWER of the test (1 - probability of a Type II error).

For a small number of measurements, we must consult a t value table:
• choose the confidence level (%)
• determine the number of degrees of freedom (n-1)
• plug the t value into the following equation:

$95\%\ \mathrm{CL} = \bar{X} \pm ts/\sqrt{n}$

where t is 4.303 for 2 d.f. (n = 3) and s was 0.28%:
%Cl- = 66.69% ± 0.70%, or 65.99% Cl- to 67.39% Cl-

Comparison of the means from 2 “samples”, $\bar{X}_1$ and $\bar{X}_2$
Comparison of one analytical technique to another (a new method to a standard method).
The null hypothesis is that there is no difference (both methods give the same results): $\bar{X}_1 - \bar{X}_2 = 0$

Comparison of two means with a t-test

$t = \frac{\bar{X}_1 - \bar{X}_2}{s_{pooled}} \sqrt{\frac{n_1 n_2}{n_1 + n_2}}$

where

$s_{pooled} = \sqrt{\frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}}$

t has n1 + n2 - 2 degrees of freedom.
Assumes that the samples are drawn from populations with equal standard deviations.

Comparison of two means with a t-test (no assumptions about equal variance)
[Equations not reproduced in this text]

Comparison of means from two sets of data

Set 1       Set 2
2.31017     2.30143
2.30986     2.29890
2.31010     2.29816
2.31001     2.30182
2.31024     2.29869
2.31010     2.29940
2.31028     2.29849
            2.29889

Set 1: mean = 2.31010, n = 7, s = 0.00014
Set 2: mean = 2.29947, n = 8, s = 0.00137
13 degrees of freedom (n1 + n2 - 2)
H0: no significant difference between the means

Comparison of 2 Means with t-test

$s_{pooled} = \sqrt{\frac{0.00014^2 (7 - 1) + 0.00137^2 (8 - 1)}{7 + 8 - 2}} = 0.00102$

$t = \frac{2.31010 - 2.29947}{0.00102} \sqrt{\frac{7(8)}{7 + 8}} = 20.2$

For 13 degrees of freedom, the critical t at the 95% CL lies between 2.228 (10 d.f.) and 2.131 (15 d.f.).
The calculated t value is greater than the critical t; therefore, reject H0: the difference is significant.

Exercise # 1: Refractive Index Data Analysis
(Eleven different fragments were measured for K1 & Q2 and Q1. K1 and Q2 samples were removed from the same source of fragments. Q1 samples were removed from a different source of glass.)
Sample     RI           Sample    RI
K1&Q2      1.51880      Q1        1.51828
K1&Q2      1.51881      Q1        1.51844
K1&Q2      1.51886      Q1        1.51842
K1&Q2      1.51881      Q1        1.51838
K1&Q2      1.51888      Q1        1.51841
K1&Q2      1.51870      Q1        1.51848
K1&Q2      1.51874      Q1        1.51834
K1&Q2      1.51881      Q1        1.51842
K1&Q2      1.51872      Q1        1.51841
K1&Q2      1.51881      Q1        1.51838
K1&Q2      1.51880      Q1        1.51834
Mean       1.51879      Mean      1.51839
SD         0.00005      SD        0.00006

[Figure: back-to-back histograms of RI (vertical axis, 1.5180 to 1.5195) against count, one panel for Q1 vs. K1 and one for Q2 vs. K1]

t-Test: Two-Sample Assuming Unequal Variances
                                Q1              K1
Mean                            1.51901         1.51911
Variance                        1.01E-08        6.73E-09
Observations                    5               5
Hypothesized Mean Difference    0
df                              8
t Stat                          -1.792570632
P(T<=t) one-tail                0.055401831
t Critical one-tail             1.85954832
P(T<=t) two-tail                0.110803662
t Critical two-tail             2.306005626
There is NOT a significant difference between the 2 values.

t-Test: Two-Sample Assuming Unequal Variances
                                Q2              K1
Mean                            1.51835         1.51911
Variance                        4.665E-08       6.73E-09
Observations                    5               5
Hypothesized Mean Difference    0
df                              5
t Stat                          -7.394163923
P(T<=t) one-tail                0.000355858
t Critical one-tail             2.015049176
P(T<=t) two-tail                0.000711716
t Critical two-tail             2.570577635
There is a significant difference between the 2 values.

Exercise # 2: Elemental Analysis

K1 and Q1 analysis summary:
EXTERNAL CALIBRATION FOR ICP-MS GLASS SAMPLE: K1 / Q1 (concentrations in ppm)

Sample       Ti     Mn     Ga     Rb     Sr     Zr     Ba     La     Ce     Sm     Hf     Pb
K1 and Q1    65.87  19.63  0.515  1.656  32.23  36.24  17.49  2.782  4.512  0.363  0.936  0.889
K1 and Q1    67.22  20.06  0.540  1.669  31.91  37.64  18.03  2.806  4.628  0.394  0.931  1.006
K1 and Q1    67.21  19.53  0.485  1.710  32.50  36.99  17.59  2.722  4.510  0.383  0.930  0.979
Average      66.77  19.74  0.513  1.678  32.21  36.96  17.70  2.770  4.550  0.380  0.932  0.958
SD           0.78   0.28   0.027  0.028  0.30   0.70   0.29   0.043  0.067  0.016  0.003  0.062
%SD          1.2    1.4    5.3    1.7    0.9    1.9    1.6    1.6    1.5    4.2    0.4    6.4

Q2 analysis summary:
EXTERNAL CALIBRATION FOR ICP-MS GLASS SAMPLE: Q2 (concentrations in ppm)

Sample       Ti     Mn     Ga     Rb     Sr     Zr     Ba     La     Ce     Sm     Hf     Pb
Q2           122.6  2.888  2.054  0.264  24.10  81.14  28.47  19.25  183.7  0.630  1.931  11.50
Q2           137.1  6.339  2.131  0.276  23.79  79.74  28.38  18.53  184.0  0.719  1.805  9.932
Q2           126.5  3.760  2.095  0.257  24.25  79.83  29.96  20.67  196.4  0.641  1.965  11.94
Average      128.7  4.329  2.093  0.266  24.05  80.24  28.94  19.48  188.0  0.664  1.900  11.13
SD           7.5    1.794  0.039  0.010  0.23   0.78   0.89   1.09   7.2    0.048  0.085  1.06
%SD          5.8    42     1.9    3.7    1.0    1.0    3.1    5.6    3.8    7.3    4.5    9.5

Analysis summary for comparison of samples using a two-sample t-test:

Sample comparison    Ti     Mn     Ga     Rb     Sr     Zr     Ba     La     Ce     Sm     Hf     Pb
K1/Q1 with Q2        14.2   14.7   57.8   81.4   37.7   71.3   20.8   26.6   44     9.6    19.8   16.6

The critical t value for all entries is 2.8. If the calculated t is larger than this critical t, the samples are distinguishable and marked in red. If any of the elements are considered significantly different (absent an explanation), then the glass samples are considered different.
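As a check on the comparison table, the pooled two-sample t formula from earlier in the lecture can be applied to the tabulated averages and SDs. The sketch below assumes n = 3 replicates per sample (the number of replicate rows in each table) and equal population standard deviations; under those assumptions it gives values close to the tabulated 14.2 (Ti), 14.7 (Mn) and 16.6 (Pb), which suggests, but does not prove, that this is how the table was produced. The function and variable names are illustrative.

```python
import math


def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Two-sample t statistic assuming equal population standard deviations."""
    s_pooled = math.sqrt((s1**2 * (n1 - 1) + s2**2 * (n2 - 1)) / (n1 + n2 - 2))
    return abs(mean1 - mean2) / s_pooled * math.sqrt(n1 * n2 / (n1 + n2))


N = 3             # assumed: three replicate rows per sample in the tables
T_CRITICAL = 2.8  # critical value quoted in the text

# (average, SD) in ppm, copied from the K1/Q1 and Q2 summary rows
k1_q1 = {"Ti": (66.77, 0.78), "Mn": (19.74, 0.28), "Pb": (0.958, 0.062)}
q2 = {"Ti": (128.7, 7.5), "Mn": (4.329, 1.794), "Pb": (11.13, 1.06)}

results = {}
for element, (mean_a, sd_a) in k1_q1.items():
    mean_b, sd_b = q2[element]
    results[element] = pooled_t(mean_a, sd_a, N, mean_b, sd_b, N)

for element, t in results.items():
    verdict = "distinguishable" if t > T_CRITICAL else "not distinguishable"
    print(f"{element}: t = {t:.1f} ({verdict})")
```

Only three of the twelve elements are included to keep the sketch short; the remaining columns can be added to the two dictionaries in the same way.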
Significance tests
• Comparison of an experimental mean with a known value
• Comparison of two experimental means
• “Paired” comparisons
• One-sided or two-sided tests
• Comparison of standard deviations
• Determination of outliers
• Analysis of Variance (ANOVA): comparison of several means

Paired t-test
Comparing results from two different methods.
Need to separate the difference due to the methods (d, if any exists) from differences due to chance.

$t = \bar{d}\sqrt{n}/s_d$

where $\bar{d}$ is the mean of the differences and $s_d$ is the standard deviation of the differences.
Degrees of freedom is n-1.

Paired t-test (example)
$t = 0.159\sqrt{10}/0.570 = 0.88$, and the critical t for 9 d.f. is 2.26 (P = 0.05).
We DO NOT reject H0: no difference.

One-sided and two-sided tests
So far we've tested for differences in means in either direction (two-sided).
There are occasions when only one direction of change is of interest (e.g. an increase in a rate of reaction): one-sided.
For a one-sided test, all of the rejection probability lies in one tail, so the critical value of t is read from the P = 0.10 column of a two-sided table instead of the P = 0.05 column.
Prior knowledge is needed to decide on a one- or two-sided test.

F test for the comparison of s
Comparing the random errors of two sets of data.
Is method A more precise than method B?
Do methods A and B differ in precision?

$F = s_1^2/s_2^2$

where 1 and 2 are allocated so that F is greater than or equal to 1.
Use the 2 degrees-of-freedom values (one for each data set) with the F table.

Rejection of Measurements (test for outliers)
In the case where you suspect something went wrong, you use the Q-test (Dixon's test). Minimum of 3 measurements.

$Q_{exp} = |\text{suspect value} - \text{nearest value}| / \text{total range} = d/w$

[Figure: values X1 to X6 on a number line, showing the gap d between the suspect value and its nearest neighbour and the total range w]
Calculate Qexp. If Qexp ≥ Qcritical (from the Q table), then reject.

Significance Testing - Part 2

Analysis of Variance (ANOVA)
Used to separate and estimate the different causes of variation in a data set, e.g.
1) comparing the mean concentration of metals in solution stored under different conditions;
2) comparing the concentration of the metals in solution analyzed by different methods;
3) comparing results obtained by different analysts.

Sources of variation
1. Random error in measurement: causes a different result each time a measurement is repeated.
2. Controlled or fixed-effect factor:
   the storage conditions
   the methods of analysis
   the people conducting the analysis

ANOVA is used to separate any variation which is caused by changing the controlled factor from the variation due to random error (i.e. to test whether altering the controlled factor leads to a significant difference between the mean values obtained during the analysis).

Example
Measurement of the concentration of NaCl in a large barrel.
Samples are removed from several different locations in the barrel.
Several replicate analyses are performed on each of these samples.
Sources of variation:
1. Random error in the measured concentration
2. Variation in concentration between samples from different parts of the barrel

Fluorescence from solutions stored under different conditions
Is the difference between the sample means too great to be explained by the random error?

Steps in ANOVA
Null hypothesis: all the samples are drawn from a population with mean µ and variance σ².
Estimate the within-sample variation (note: within-sample variation does not depend on the means of the samples).
Determine the within-sample estimate of the variance.
Determine the between-sample variation and compare it with the within-sample estimate (F-test).
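The steps above can be sketched as a one-way ANOVA in plain Python. The three groups below are hypothetical readings invented for illustration (they are not the fluorescence data from the lecture); the critical value 5.14 is the standard F table entry for P = 0.05 with 2 and 6 degrees of freedom.

```python
import statistics


def one_way_anova(groups):
    """Return (between-sample mean square, within-sample mean square, F)."""
    h = len(groups)                          # number of samples (levels of the controlled factor)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Within-sample variation: spread of each value about its own sample mean
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    # Between-sample variation: spread of the sample means about the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ms_within = ss_within / (n_total - h)    # n_total - h degrees of freedom
    ms_between = ss_between / (h - 1)        # h - 1 degrees of freedom
    return ms_between, ms_within, ms_between / ms_within


# Hypothetical replicate readings under three storage conditions (illustration only)
groups = [[10, 11, 12], [11, 12, 13], [15, 16, 17]]
F_CRITICAL = 5.14  # P = 0.05, 2 and 6 degrees of freedom (F table)

ms_between, ms_within, f_calc = one_way_anova(groups)
means_differ = f_calc > F_CRITICAL  # reject H0 if F exceeds the critical value
print(f"between MS = {ms_between:.2f}, within MS = {ms_within:.2f}, F = {f_calc:.2f}")
```

If the null hypothesis holds, both mean squares estimate the same variance σ² and F should be close to 1; a large F indicates that the controlled factor contributes variation beyond random error.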