
4.1 Hypothesis Testing

• z-test for a single value
• double-sided and single-sided z-test for one average
• z-test for two averages
• double-sided and single-sided t-test for one average
• the F-parameter and F-table
• the F-test
• the t-test for two averages

4.1 : 1/14

z-Test for a Single Value

Suppose that an analytical technique is performed sufficiently often that σ is known. Also assume that only a single measured value is available. The following hypothesis test covers the situation where μ is specified or obtained from theory.

Null Hypothesis: the measured value, x, comes from a normal pdf having μ as its mean.

Alternative Hypothesis: the measured value, x, comes from a normal pdf not having μ as its mean.

To test the hypothesis, compute $z_{calc} = |x - \mu|/\sigma$. If $z_{calc} \le 1.96$, accept the null hypothesis. If $z_{calc} > 1.96$, accept the alternative hypothesis.

4.1 : 2/14

Graphical Interpretation of Test

The null hypothesis will be accepted for any pdf having a mean over the range μ = x - 1.96σ (blue curve) to μ = x + 1.96σ (red curve).

Accepting the null hypothesis does not guarantee that x comes from a pdf having μ as its mean. Acceptance implies only that the pdf resulting in x is statistically indistinguishable from one having μ as its mean.

Rejection of the null hypothesis states with 95% confidence that x comes from a pdf not having μ as its mean.

[Figure: f(x) versus σ units from x, showing the pdf with the lowest possible mean and the pdf with the highest possible mean, each with 2.5% of its area in the tail beyond x.]

4.1 : 3/14

z-Test for One Average

Suppose that an analytical technique is performed sufficiently often that σ is known. Also assume that an average, $\bar{x}$, has been obtained using N replicate measurements. The following hypothesis test covers the situation where μ is specified or obtained from theory.

Null Hypothesis: the average, $\bar{x}$, comes from a normal pdf having μ as its mean.

Alternative Hypothesis: the average, $\bar{x}$, comes from a normal pdf not having μ as its mean.
To test the hypothesis, compute

$$z_{calc} = \frac{|\bar{x} - \mu|\,\sqrt{N}}{\sigma}$$

If $z_{calc} \le 1.96$, accept the null hypothesis. If $z_{calc} > 1.96$, accept the alternative hypothesis.

4.1 : 4/14

Single-Sided z-Test for One Average

Confidence limits can also be used to test whether the measured average comes from a pdf with a mean greater than or equal to a specified value, $C_0$. The entire 5% uncertainty is put on the low-value side of the pdf. For a normal pdf, F(x) = 0.05 occurs at x = μ - 1.64σ. Now $z_{calc} = (C_0 - \bar{x})\sqrt{N}/\sigma$, which can be negative.

Null hypothesis: $z_{calc} \le 1.64$, the measured average is greater than or equal to the specified value.

Alternative hypothesis: $z_{calc} > 1.64$, the measured average is less than the specified value.

[Figure: normal pdf f(x) versus σ units from $C_0$, with the entire 5% on one side; labels mark "$C_0$" and "$\bar{x}$ + 1.64σ".]

A similar strategy can be used to test whether the measured average is less than a specified value.

4.1 : 5/14

z-Test for Two Averages

It is possible to test statistically whether two experimental averages come from the same pdf. Let $\bar{x}_1$ have the normal pdf $n(\mu, \sigma/N_1^{1/2})$ and $\bar{x}_2$ have the normal pdf $n(\mu, \sigma/N_2^{1/2})$. To test the hypothesis, calculate the difference, D, and test whether it is statistically indistinguishable from 0.

$$D = \bar{x}_1 - \bar{x}_2, \qquad \mu_D = 0$$

$$\sigma_D^2 = \sigma_{\bar{x}_1}^2 + \sigma_{\bar{x}_2}^2 = \frac{\sigma^2}{N_1} + \frac{\sigma^2}{N_2} = \frac{N_1 + N_2}{N_1 N_2}\,\sigma^2, \qquad \sigma_D = \sigma\sqrt{\frac{N_1 + N_2}{N_1 N_2}}$$

$$z_{calc} = \frac{|D - 0|}{\sigma_D} = \frac{|\bar{x}_1 - \bar{x}_2|}{\sigma_D}$$

For $z_{calc} \le 1.96$, the averages are not statistically distinguishable. For $z_{calc} > 1.96$, the averages come from pdfs with different means.

4.1 : 6/14

The t-Test for One Average

When the value of σ is not known and is estimated by s, the confidence limits are determined by the t-parameter. The hypothesis test uses $t_{calc}$ instead of $z_{calc}$.

$$\mu = \bar{x} \pm \frac{t\,s}{\sqrt{N}}, \qquad t_{calc} = \frac{|\bar{x} - \mu|\,\sqrt{N}}{s}$$

To perform the hypothesis test, an appropriate value of $t_{table}$ is found by choosing the confidence level and the degrees of freedom, φ = N - 1.

Null hypothesis: $t_{calc} \le t_{table}$, the average comes from a pdf with a mean indistinguishable from μ.
Alternative hypothesis: $t_{calc} > t_{table}$, the average comes from a pdf with a mean different from μ.

4.1 : 7/14

t-Test Examples

A NIST nickel standard known to contain 6.15 mmol is analyzed by a gravimetric method. Three replicate measurements were obtained: {5.88, 5.68, 6.16 mmol}. The average is 5.91 mmol and the standard deviation is 0.24 mmol.

$$t_{calc} = \frac{|\bar{x} - \mu|\,\sqrt{N}}{s} = \frac{|5.91 - 6.15|\,\sqrt{3}}{0.24} = 1.73, \qquad t_{table}(0.95, \varphi = 2) = 4.30$$

Since $t_{calc} \le t_{table}$, the average is indistinguishable from a pdf having a mean of 6.15 mmol. An A-grade!

A second student ran 10 replicates and obtained an average of 5.46 mmol and a standard deviation of 0.29 mmol.

$$t_{calc} = \frac{|\bar{x} - \mu|\,\sqrt{N}}{s} = \frac{|5.46 - 6.15|\,\sqrt{10}}{0.29} = 7.52, \qquad t_{table}(0.95, \varphi = 9) = 2.26$$

Since $t_{calc} > t_{table}$, the average does not come from a pdf having 6.15 mmol as its mean. The student has to repeat the determination and identify the determinate error.

4.1 : 8/14

Single-Sided t-Test

A pollution regulation requires that the concentration of airborne SO2 be less than 10 ppm. Three measurements are made, {12.64, 11.04, 14.57}, with an average of 12.75 and a standard deviation of 1.77. By analogy with the single-sided z-test on slide 4.1-5 we can write the following.

$$t_{calc} = \frac{(\bar{x} - C_0)\,\sqrt{N}}{s} = \frac{(12.75 - 10.00)\,\sqrt{3}}{1.77} = 2.69$$

A single-sided t-test requires that we use a 90% double-sided table.

φ          2     3     4     9     19    29    39    ∞
t(0.90)    2.92  2.35  2.13  1.83  1.73  1.70  1.69  1.64

Since $t_{calc} \le t_{table}(0.90, 2)$, the average of 12.75 is statistically indistinguishable from the 10 ppm regulation.

4.1 : 9/14

The F-Parameter

The ratio of two experimental variances is called the F-parameter, where $F = s_1^2/s_2^2$. The pdf, f(F), is asymmetric and has to be integrated numerically.

$$f(F) = \frac{\dfrac{\varphi_1}{\varphi_2}\;\Gamma\!\left(\dfrac{\varphi_1 + \varphi_2}{2}\right)\left(\dfrac{\varphi_1}{\varphi_2}F\right)^{(\varphi_1 - 2)/2}}{\Gamma\!\left(\dfrac{\varphi_1}{2}\right)\Gamma\!\left(\dfrac{\varphi_2}{2}\right)\left(1 + \dfrac{\varphi_1}{\varphi_2}F\right)^{(\varphi_1 + \varphi_2)/2}}$$

[Figure: f(F) for several degrees of freedom.] The red line is F(3,3), the blue F(5,5), the green F(10,10), the magenta F(20,20), and the cyan F(50,50).
As both $N_1$ and $N_2$ increase, the pdf becomes symmetric and the mean approaches 1.

4.1 : 10/14

The F-Table

By convention the larger variance is placed in the numerator so that 1 ≤ F < +∞. This simplifies integration by allowing it to start at 0.

$$\int_0^{F_{table}} f(F)\,dF = 0.95$$

The value of F yielding any particular confidence level depends upon the degrees of freedom used to compute both variances. The numerator degrees of freedom, φ1, are in the top row; the denominator degrees of freedom, φ2, are in the left column.

φ2 \ φ1      2      3      4      9     19     29     39      ∞
   2     19.00  19.16  19.25  19.38  19.44  19.46  19.47  19.50
   3      9.55   9.28   9.12   8.81   8.67   8.62   8.60   8.53
   4      6.94   6.59   6.39   6.00   5.81   5.75   5.72   5.63
   9      4.26   3.86   3.63   3.18   2.95   2.87   2.83   2.71
  19      3.52   3.13   2.90   2.42   2.17   2.08   2.03   1.88
  29      3.33   2.93   2.70   2.22   1.96   1.86   1.81   1.64
  39      3.24   2.85   2.61   2.13   1.86   1.76   1.70   1.52
   ∞      3.00   2.61   2.37   1.88   1.59   1.47   1.40   1.00

4.1 : 11/14

Example F-Test

Two students analyzed a steel sample for nickel. The first used a gravimetric method and obtained 3 values, {9.87, 9.95, and 10.01 mmol}, with s = 0.07 mmol. The second used a titrimetric method and obtained 5 values, {9.98, 9.99, 9.99, 10.06, 9.97 mmol}, with s = 0.04 mmol. Although the two standard deviations differ by a factor of ~2, are they statistically different?

$$F_{calc} = \frac{s_{larger}^2}{s_{smaller}^2} = \frac{(0.07)^2}{(0.04)^2} = 3.06$$

Since $F_{calc} \le F_{table}(0.95, \varphi_{larger} = 2, \varphi_{smaller} = 4) = 6.94$, the two standard deviations are statistically indistinguishable.

4.1 : 12/14

t-Test for Two Averages

Two experimental averages can be compared as long as they both come from pdfs having the same standard deviation. Ordinarily the two experimental standard deviations are first checked using the F-test. If the F-test is passed, then the t-test for two averages can proceed. First compute a pooled variance using the following equation, then calculate t.

$$s_p^2 = \frac{(N_1 - 1)\,s_1^2 + (N_2 - 1)\,s_2^2}{N_1 + N_2 - 2}, \qquad t_{calc} = \frac{|\bar{x}_1 - \bar{x}_2|}{s_p}\sqrt{\frac{N_1 N_2}{N_1 + N_2}}$$

The $t_{table}$ value depends upon the confidence level and $N_1 + N_2 - 2$ degrees of freedom.
If $t_{calc} \le t_{table}$, accept the null hypothesis; if $t_{calc} > t_{table}$, accept the alternative hypothesis.

4.1 : 13/14

Example t-Test for Two Averages

Use the previous example where two students analyzed a steel sample for nickel. The first used a gravimetric method and obtained 3 values with $\bar{x}_1$ = 9.94 mmol and $s_1$ = 0.07 mmol. The second used a titrimetric method and obtained 5 values with $\bar{x}_2$ = 10.00 mmol and $s_2$ = 0.04 mmol. The two variances have already been shown to be statistically indistinguishable.

$$s_p^2 = \frac{(N_1 - 1)\,s_1^2 + (N_2 - 1)\,s_2^2}{N_1 + N_2 - 2} = \frac{2 \times 0.07^2 + 4 \times 0.04^2}{6} = 0.052^2$$

$$t_{calc} = \frac{|\bar{x}_1 - \bar{x}_2|}{s_p}\sqrt{\frac{N_1 N_2}{N_1 + N_2}} = \frac{|9.94 - 10.00|}{0.052}\sqrt{\frac{15}{8}} = 1.15 \times 1.37 = 1.58, \qquad t_{table}(0.95, \varphi = 6) = 2.45$$

Since $t_{calc} \le t_{table}$, the null hypothesis is accepted. The two means are statistically indistinguishable.

4.1 : 14/14
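
As a rough check on the worked examples, the following Python sketch recomputes the one-average t-tests for the nickel standard, the F-test, and the pooled t-test for two averages. It uses only the standard library. The function names (`t_one_average`, `f_ratio`, `t_two_averages`) and the small lookup dictionaries are illustrative assumptions and do not come from the slides; the critical values are copied from the tables quoted in the text. The summary statistics are entered as rounded in the slides, so the results follow the slides' arithmetic rather than a calculation from the raw replicates.

```python
import math

# Critical values copied from the tables quoted in the slides (assumed lookup
# structure; only the entries needed for these examples are included).
T_TABLE_95 = {2: 4.30, 6: 2.45, 9: 2.26}   # key: degrees of freedom
F_TABLE_95 = {(2, 4): 6.94}                # key: (numerator dof, denominator dof)


def t_one_average(mean, mu, s, n):
    """t_calc = |mean - mu| * sqrt(n) / s for one average of n replicates."""
    return abs(mean - mu) * math.sqrt(n) / s


def f_ratio(s_a, n_a, s_b, n_b):
    """Return (F_calc, numerator dof, denominator dof), larger variance on top."""
    if s_a >= s_b:
        return (s_a / s_b) ** 2, n_a - 1, n_b - 1
    return (s_b / s_a) ** 2, n_b - 1, n_a - 1


def t_two_averages(mean1, s1, n1, mean2, s2, n2):
    """Return (t_calc, pooled s, dof) for two averages sharing one sigma."""
    dof = n1 + n2 - 2
    s_pooled = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / dof)
    t_calc = abs(mean1 - mean2) / s_pooled * math.sqrt(n1 * n2 / (n1 + n2))
    return t_calc, s_pooled, dof


# One-average examples: nickel standard with mu = 6.15 mmol.
t_student1 = t_one_average(5.91, 6.15, 0.24, 3)
t_student2 = t_one_average(5.46, 6.15, 0.29, 10)
print(f"student 1: t_calc = {t_student1:.2f}, accept null: {t_student1 <= T_TABLE_95[2]}")
print(f"student 2: t_calc = {t_student2:.2f}, accept null: {t_student2 <= T_TABLE_95[9]}")

# Two-average example: check the variances first, then compare the means.
f_calc, dof_num, dof_den = f_ratio(0.07, 3, 0.04, 5)
variances_match = f_calc <= F_TABLE_95[(dof_num, dof_den)]
print(f"F_calc = {f_calc:.2f}, variances indistinguishable: {variances_match}")

if variances_match:
    t_calc, s_p, dof = t_two_averages(9.94, 0.07, 3, 10.00, 0.04, 5)
    means_match = t_calc <= T_TABLE_95[dof]
    print(f"s_p = {s_p:.3f}, t_calc = {t_calc:.2f}, means indistinguishable: {means_match}")
```

Keeping the F-test as a gate in front of the pooled t-test mirrors the order described in the slides: the pooled variance is only meaningful when the two variances are statistically indistinguishable.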
