Statistics: Applications and Advanced Topics
Chapter 13: Hypothesis Testing

Outline
• Basic concepts of hypothesis testing
• How is a hypothesis test carried out?
• The hypothesis testing procedure
• The p-value of a test
• Error probabilities and power
• The power function

One Population Tests
• Mean: Z test (1 & 2 tail), t test (1 & 2 tail)
• Proportion: Z test (1 & 2 tail)
• Variance: χ² test (1 & 2 tail)

t Test for Mean (σ Unknown)
1. Assumptions
   • Population is normally distributed
   • If not normal, only slightly skewed and a large sample (n ≥ 30) taken
2. Parametric test procedure
3. t test statistic: $t = \dfrac{\bar{X} - \mu}{S / \sqrt{n}}$

Two-Tailed t Test: Finding Critical t Values
Given: n = 3; α = .10
df = n - 1 = 2; α/2 = .05 in each tail

Critical Values of t Table (Portion)
v    t.10     t.05     t.025
1    3.078    6.314    12.706
2    1.886    2.920    4.303
3    1.638    2.353    3.182

Critical values: t = ±2.920

Example: Carrying Out a Test
Let [...] and let [...]. At significance level α, test [...] (σ unknown).
With σ unknown, the test statistic we choose is [...]. Given [...], we next compute [...]. If [...], then, since we know [...], we reject H0.

Two-Tailed t Test Example
Does an average box of cereal contain 368 grams of cereal? A random sample of 36 boxes had a mean of 372.5 and a standard deviation of 12 grams. Test at the .05 level of significance.

Two-Tailed t Test Solution
• H0: μ = 368
• Ha: μ ≠ 368
• α = .05
• df = 36 - 1 = 35
• Critical values: t = ±2.030 (area .025 in each tail); reject H0 if |t| > 2.030
• Test statistic: $t = \dfrac{\bar{X} - \mu}{S / \sqrt{n}} = \dfrac{372.5 - 368}{12 / \sqrt{36}} = 2.25$
• Decision: Reject H0 at α = .05
• Conclusion: There is evidence that the population average is not 368

Two-Tailed t Test Thinking Challenge
You work for the FTC. A manufacturer of detergent claims that the mean weight of detergent is 3.25 lb. You take a random sample of 64 containers. You calculate the sample average to be 3.238 lb. with a standard deviation of .117 lb. At the .01 level of significance, is the manufacturer correct?
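The t statistic from the cereal example above can be checked numerically. The following is a minimal sketch, not part of the original slides: the function names `one_sample_t` and `two_tailed_decision` are hypothetical, and the critical value 2.030 is taken from the slide rather than computed.

```python
import math


def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """Return t = (x_bar - mu0) / (s / sqrt(n)) for a one-sample t test."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


def two_tailed_decision(t_stat, critical_value):
    """Reject H0 when |t| exceeds the tabulated critical value."""
    return "reject H0" if abs(t_stat) > critical_value else "do not reject H0"


# Cereal example: n = 36, x_bar = 372.5, s = 12, H0: mu = 368.
# The critical value 2.030 (df = 35, alpha/2 = .025) is copied from the slide.
t_cereal = one_sample_t(372.5, 368, 12, 36)
decision_cereal = two_tailed_decision(t_cereal, 2.030)
print(round(t_cereal, 2), decision_cereal)
```

The same two functions can be applied to the detergent challenge by substituting its sample values and the critical value for df = 63.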
Two-Tailed t Test Solution*
• H0: μ = 3.25
• Ha: μ ≠ 3.25
• α = .01
• df = 64 - 1 = 63
• Critical values: t = ±2.656 (area .005 in each tail); reject H0 if |t| > 2.656
• Test statistic: $t = \dfrac{\bar{X} - \mu}{S / \sqrt{n}} = \dfrac{3.238 - 3.25}{.117 / \sqrt{64}} = -.82$
• Decision: Do not reject H0 at α = .01
• Conclusion: There is no evidence that the average is not 3.25

One-Tailed t Test of Mean (σ Unknown)

One-Tailed t Test Example
Is the average capacity of batteries at least 140 ampere-hours? A random sample of 20 batteries had a mean of 138.47 and a standard deviation of 2.66. Assume a normal distribution. Test at the .05 level of significance.

One-Tailed t Test Solution
• H0: μ = 140
• Ha: μ < 140
• α = .05
• df = 20 - 1 = 19
• Critical value: t = -1.729 (area .05 in the lower tail); reject H0 if t < -1.729
• Test statistic: $t = \dfrac{\bar{X} - \mu}{S / \sqrt{n}} = \dfrac{138.47 - 140}{2.66 / \sqrt{20}} = -2.57$
• Decision: Reject H0 at α = .05
• Conclusion: There is evidence that the population average is less than 140

One-Tailed t Test Thinking Challenge
You're a marketing analyst for Wal-Mart. Wal-Mart had teddy bears on sale last week. The weekly sales ($00) of bears sold in 10 stores were:
8 11 0 4 7 8 10 5 8 3
At the .05 level of significance, is there evidence that the average bear sales per store is more than 5 ($00)?

One-Tailed t Test Solution*
• H0: μ = 5
• Ha: μ > 5
• α = .05
• df = 10 - 1 = 9
• Critical value: t = 1.833 (area .05 in the upper tail); reject H0 if t > 1.833
• Test statistic: $t = \dfrac{\bar{X} - \mu}{S / \sqrt{n}} = \dfrac{6.4 - 5}{3.373 / \sqrt{10}} = 1.31$
• Decision: Do not reject H0 at α = .05
• Conclusion: There is no evidence that the average is more than 5

Testing Errors
• In a test we are forced to choose between the null hypothesis and the alternative hypothesis, just as a judge must decide between a verdict of guilty and not guilty.
• A judge may wrong an innocent person by sending them to prison; a judge may also let a guilty person go free.
• Likewise, in hypothesis testing we may reject H0 when H0 is true, and we may fail to reject H0 when H0 is false.

Testing Errors
                 Do not reject H0      Reject H0
H0 is true       Correct decision      Type I error
H0 is false      Type II error         Correct decision

Type I Error
• The probability of committing a Type I error is denoted by α:
  α = P(Type I error) = P(reject H0 | H0 is true)
• The significance level α is also called the size of the test.

Type II Error
• The probability of committing a Type II error is denoted by β:
  β = P(Type II error) = P(fail to reject H0 | H0 is false).
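To make α and β concrete, here is a minimal sketch that computes β and 1 − β (the power) for a lower-tailed Z test. It borrows the numbers of the Finding Power example in this chapter (μ0 = 368, true mean 360, σ = 15, n = 25, α = .05) and assumes σ is known; the function name `lower_tail_power` is hypothetical.

```python
import math
from statistics import NormalDist


def lower_tail_power(mu0, mu_true, sigma, n, alpha):
    """Return (cutoff, beta, power) for H0: mu >= mu0 vs Ha: mu < mu0.

    Assumes a known population standard deviation, so the sample mean
    is treated as normal with standard error sigma / sqrt(n).
    """
    std_normal = NormalDist()
    standard_error = sigma / math.sqrt(n)
    # Reject H0 when the sample mean falls below this cutoff.
    cutoff = mu0 + std_normal.inv_cdf(alpha) * standard_error
    # Power = P(sample mean < cutoff) when the true mean is mu_true.
    power = std_normal.cdf((cutoff - mu_true) / standard_error)
    return cutoff, 1.0 - power, power


cutoff, beta, power = lower_tail_power(368, 360, 15, 25, 0.05)
print(round(cutoff, 3), round(beta, 3), round(power, 3))
```

The results should agree with the slide values (cutoff 363.065, β about .154, power about .846) up to the rounding of z in a printed Z table.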
• β depends on H0 being false, that is, on the true value of the population parameter.
• If θ denotes the true value of the population parameter, then β is a function of θ: β = β(θ).
• Let π(θ) = 1 − β(θ) = P(reject H0 | H0 is false). This is the probability that the test correctly rejects a false H0; we call π(θ) the power of the test.

[Figure: the probability of a Type I error (α) and the probability of a Type II error (β).]

Power
• Returning to the pharmaceutical company example: as mentioned earlier, at significance level α = 0.01 the rejection region is
  RR = {reject H0 when $\bar{X}$ ≥ 0.08}
• Power: if the true value of the population parameter μ is 0.09, then
  β = P(Type II error) = P(fail to reject H0 | H0 is false) = 0.2776,
  and the power is π = 1 − β = 1 − 0.2776 = 0.7224.

Finding Power
Hypotheses: H0: μ0 ≥ 368, Ha: μ0 < 368, with σ = 15, n = 25, α = .05.
• Step 1: Draw the sampling distribution of $\bar{X}$ under H0 (μ0 = 368) and mark the rejection region (area α = .05) in the lower tail.
• Steps 2 & 3: Specify the 'true' situation, μa = 360 (Ha), and draw the sampling distribution of $\bar{X}$ centered at 360; β is the area in the "do not reject H0" region and 1 − β is the area in the rejection region.
• Step 4: Compute the boundary of the rejection region:
  $\bar{X}_L = \mu_0 - Z \dfrac{\sigma}{\sqrt{n}} = 368 - 1.645 \cdot \dfrac{15}{\sqrt{25}} = 363.065$
• Step 5: From the Z table, with μa = 360 and boundary 363.065: β = .154 and 1 − β = .846.

The Power Function
• From the discussion above, power is a function of the true value of the population parameter, so it is also called the power function.
• The power function is defined as π(m) = P(reject H0 | θ = m).
• Hence π(θ0) = P(reject H0 | θ = θ0) = α. That is, when H0 is true, π(θ0) is the significance level, the probability α of a Type I error.
• If H1 is true, π(m) is the power of the test, the probability 1 − β(m) of not committing a Type II error.

The Trade-off between α and β

[Figure: Power Curves. Three panels, one for each form of H0 (the two one-sided nulls and H0: μ = μ0); vertical axis: power; horizontal axis: possible true values for μa; μ0 = 368 in the example.]

Error Probabilities and Power
• α + β is often used as a measure of how good a test is, and we would like α + β to be as small as possible. Note also that α + β can exceed 1.
• Although we want α + β to be as small as possible, for a fixed sample size n a smaller α means a larger β, and a smaller β means a larger α. That is, there is a trade-off between α and β.
• If we can increase the sample size n, then α and β can both decrease as n grows.

α & β Have an Inverse Relationship
For a fixed sample size, you can't reduce both errors simultaneously!

Factors Affecting β
1. True value of population parameter
   • β increases when the difference with the hypothesized parameter decreases
2. Significance level, α
   • β increases when α decreases
3. Population standard deviation, σ
   • β increases when σ increases
4. Sample size, n
   • β increases when n decreases

Chi-Square (χ²) Test for Variance
1. Tests one population variance or standard deviation
2. Assumes population is approximately normally distributed
3. Null hypothesis is H0: σ² = σ0²
4. Test statistic: $\chi^2 = \dfrac{(n - 1) S^2}{\sigma_0^2}$, where S² is the sample variance and σ0² is the hypothesized population variance

Chi-Square (χ²) Distribution
[Figure: from the population, select a simple random sample of size n and compute s²; then compute χ² = (n − 1)s²/σ0². The astronomical number of possible χ² values forms the sampling distribution, which differs for different sample sizes.]

Finding Critical Value Example
What is the critical χ² value given Ha: σ² > 0.7, n = 3, α = .05?
df = n - 1 = 2; the rejection region is the upper tail with area α = .05.

χ² Table (Portion), by upper tail area
DF    .995     …    .95      …    .05
1     ...      …    0.004    …    3.841
2     0.010    …    0.103    …    5.991

Critical value: χ² = 5.991

Finding Critical Value Example
What is the critical χ² value given Ha: σ² < 0.7, n = 3, α = .05?
What do you do if the rejection region is on the left?
• Upper tail area for the lower critical value = 1 - .05 = .95
• df = n - 1 = 2
• From the same table (df = 2, upper tail area .95), the critical value is χ² = 0.103; reject H0 if χ² < 0.103

Chi-Square (χ²) Test Example
Is the variation in boxes of cereal, measured by the standard deviation, equal to 15 grams? A random sample of 25 boxes had a standard deviation of 17.7 grams. Test at the .05 level of significance.

Chi-Square (χ²) Test Solution
• H0: σ² = 15²
• Ha: σ² ≠ 15²
• α = .05
• df = 25 - 1 = 24
• Critical values: χ² = 12.401 and χ² = 39.364 (area α/2 = .025 in each tail)
• Test statistic: $\chi^2 = \dfrac{(n - 1) S^2}{\sigma_0^2} = \dfrac{(25 - 1)(17.7)^2}{15^2} = 33.42$
• Decision: Do not reject H0 at α = .05
• Conclusion: There is no evidence that σ² is not 15²

Example: Carrying Out a Test
A factory makes a certain type of plastic pipe. Long-run measurement shows that the standard deviation of the diameter is 0.09 mm, and the population is assumed to be normally distributed. A group of new operators was hired recently; the diameters of 10 sampled products had a standard deviation of 0.12 mm. At α = 0.05, are the new operators competent for this work?
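The χ² statistic used in the cereal example, and needed for the plastic pipe example just stated, can be sketched in a few lines. This is an illustration only: the function name `chi_square_statistic` is hypothetical, the critical values are copied from a χ² table rather than computed, and the cereal example is read as hypothesizing a standard deviation of 15 grams, which is what the reported value 33.42 implies.

```python
def chi_square_statistic(n, sample_sd, hypothesized_sd):
    """Return chi^2 = (n - 1) * s^2 / sigma0^2 for a one-sample variance test."""
    return (n - 1) * sample_sd ** 2 / hypothesized_sd ** 2


# Cereal example: two-tailed test, df = 24, tabulated critical values.
chi2_cereal = chi_square_statistic(25, 17.7, 15)
reject_cereal = chi2_cereal < 12.401 or chi2_cereal > 39.364

# Plastic pipe example: upper-tailed test (Ha: sigma^2 > 0.09^2), df = 9.
# 16.919 is the tabulated upper 5% point of chi-square with 9 df.
chi2_pipe = chi_square_statistic(10, 0.12, 0.09)
reject_pipe = chi2_pipe > 16.919

print(round(chi2_cereal, 2), reject_cereal)
print(round(chi2_pipe, 2), reject_pipe)
```

Passing standard deviations rather than variances keeps the units explicit and avoids the ambiguity between σ0 and σ0² in the problem statements.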
Chi-Square (χ²) Test Solution
• H0: σ² = 0.09²
• Ha: σ² > 0.09²
• α = .05
• df = 10 - 1 = 9
• Critical value: χ² = 16.919 (area .05 in the upper tail); reject H0 if χ² > 16.919
• Test statistic: $\chi^2 = \dfrac{(n - 1) S^2}{\sigma_0^2} = \dfrac{(10 - 1)(0.12)^2}{(0.09)^2} = 16$
• Decision: Do not reject H0 at α = .05
• Conclusion: There is no evidence that σ² is greater than 0.09²

Thinking Challenge
How would you try to answer these questions?
• Who gets higher grades: males or females?
• Which program is faster to learn: Word or Excel?

Target Parameters
• Difference between means: μ1 − μ2
• Difference between proportions: p1 − p2
• Ratio of variances: σ1²/σ2²

Possible Estimators
• $(\bar{X}_1 - \bar{X}_2)$ for (μ1 − μ2)
• $(\hat{p}_1 - \hat{p}_2)$ for (p1 − p2)
• $S_1^2 / S_2^2$ for σ1²/σ2²

Test Statistics
• What are the possible test statistics?
• Do we know the sampling distribution?

Two Population Inference
Two Populations
• Mean
  - Independent: Z (large sample), t (small sample)
  - Paired: t (paired sample)
• Proportion: Z
• Variance: F

Comparing Two Independent Means

Sampling Distribution
[Figure: from Population 1 (mean μ1, standard deviation σ1) select a simple random sample of size n1 and compute $\bar{X}_1$; from Population 2 (mean μ2, standard deviation σ2) select a simple random sample of size n2 and compute $\bar{X}_2$. Compute $\bar{X}_1 - \bar{X}_2$ for every pair of samples; the astronomical number of $\bar{X}_1 - \bar{X}_2$ values forms the sampling distribution, centered at μ1 − μ2.]

One Population Case
Based on the CLT:
$\bar{X}_1 \sim N\left(\mu_1, \dfrac{\sigma_1^2}{n_1}\right)$, $\bar{X}_2 \sim N\left(\mu_2, \dfrac{\sigma_2^2}{n_2}\right)$

Two Population Case (independent)
• So far we do not know the sampling distribution of $\bar{X}_1 - \bar{X}_2$.
• If the two populations are independent (i.i.d. samples within each), then $\bar{X}_1 - \bar{X}_2$ should be normal, with
  $E(\bar{X}_1 - \bar{X}_2) = \mu_1 - \mu_2$
  $V(\bar{X}_1 - \bar{X}_2) = V(\bar{X}_1) + V(\bar{X}_2) = \dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}$
  $\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2, \dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}\right)$

Large-Sample Inference for Two Independent Means

Conditions Required for Valid Large-Sample Inferences about μ1 − μ2
Assumptions
• Independent, random samples
• The sampling distribution can be approximated by the normal distribution when n1 ≥ 30 and n2 ≥ 30

Large-Sample Confidence Interval for μ1 − μ2 (Independent Samples)
Confidence interval: $(\bar{X}_1 - \bar{X}_2) \pm Z_{\alpha/2} \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$

Hypotheses for Means of Two Independent Populations
Research question    No difference /     Pop 1 ≥ Pop 2 /     Pop 1 ≤ Pop 2 /
                     Any difference      Pop 1 < Pop 2       Pop 1 > Pop 2
H0                   μ1 − μ2 = 0         μ1 − μ2 ≥ 0         μ1 − μ2 ≤ 0
Ha                   μ1 − μ2 ≠ 0         μ1 − μ2 < 0         μ1 − μ2 > 0

Large-Sample Test for μ1 − μ2 (Independent Samples)
Two independent sample Z-test statistic:
$z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
where (μ1 − μ2) is the hypothesized difference.

Large-Sample Confidence Interval Example
You're a financial analyst for Charles Schwab. You want to estimate the difference in dividend yield between stocks listed on NYSE and NASDAQ. You collect the following data:
           NYSE    NASDAQ
Number     121     125
Mean       3.27    2.53
Std Dev    1.30    1.16
What is the 95% confidence interval for the difference between the mean dividend yields?

Large-Sample Confidence Interval Solution
$(\bar{X}_1 - \bar{X}_2) \pm Z_{\alpha/2} \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}} = (3.27 - 2.53) \pm 1.96 \sqrt{\dfrac{(1.30)^2}{121} + \dfrac{(1.16)^2}{125}}$
.43 < μ1 − μ2 < 1.05

Large-Sample Test Example
You're a financial analyst for Charles Schwab. You want to find out if there is a difference in dividend yield between stocks listed on NYSE and NASDAQ. You collect the following data:
           NYSE    NASDAQ
Number     121     125
Mean       3.27    2.53
Std Dev    1.30    1.16
Is there a difference in average yield (α = .05)?
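The large-sample confidence interval and Z statistic for the NYSE/NASDAQ data can be reproduced with the standard library. This is a minimal sketch under the slides' large-sample assumption, using the sample standard deviations in place of σ1 and σ2; the function names `two_sample_z` and `difference_ci` are hypothetical.

```python
import math
from statistics import NormalDist


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2, hypothesized_diff=0.0):
    """Return (z, standard_error) for two independent large samples."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = (mean1 - mean2 - hypothesized_diff) / standard_error
    return z, standard_error


def difference_ci(mean1, mean2, standard_error, confidence=0.95):
    """Return the confidence interval (low, high) for mu1 - mu2."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = mean1 - mean2
    return diff - z_crit * standard_error, diff + z_crit * standard_error


# NYSE (sample 1) versus NASDAQ (sample 2), values from the slide table.
z, se = two_sample_z(3.27, 1.30, 121, 2.53, 1.16, 125)
low, high = difference_ci(3.27, 2.53, se)
reject = abs(z) > NormalDist().inv_cdf(0.975)  # two-tailed test, alpha = .05

print(round(low, 2), round(high, 2), round(z, 2), reject)
```

With the rounded standard deviations from the table this gives z of about 4.71, while the slides report 4.69 from the variances 1.698 and 1.353; the decision to reject H0 is the same either way.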
Large-Sample Test Solution
• H0: μ1 − μ2 = 0 (μ1 = μ2)
• Ha: μ1 − μ2 ≠ 0 (μ1 ≠ μ2)
• α = .05
• n1 = 121, n2 = 125
• Critical values: z = ±1.96 (area .025 in each tail); reject H0 if |z| > 1.96
• Test statistic: $z = \dfrac{(3.27 - 2.53) - 0}{\sqrt{\dfrac{1.698}{121} + \dfrac{1.353}{125}}} = 4.69$
• Decision: Reject H0 at α = .05
• Conclusion: There is evidence of a difference in means

Large-Sample Test Thinking Challenge
You're an economist for the Department of Education. You want to find out if there is a difference in spending per pupil between urban and rural high schools. You collect the following:
           Urban      Rural
Number     35         35
Mean       $6,012     $5,832
Std Dev    $602       $497
Is there any difference in population means (α = .10)?

Large-Sample Test Solution*
• H0: μ1 − μ2 = 0 (μ1 = μ2)
• Ha: μ1 − μ2 ≠ 0 (μ1 ≠ μ2)
• α = .10
• n1 = 35, n2 = 35
• Critical values: z = ±1.645 (area .05 in each tail); reject H0 if |z| > 1.645
• Test statistic: $z = \dfrac{(6012 - 5832) - 0}{\sqrt{\dfrac{602^2}{35} + \dfrac{497^2}{35}}} = 1.36$
• Decision: Do not reject H0 at α = .10
• Conclusion: There is no evidence of a difference in means

Interval Estimation for Two Independent Samples
• Suppose there are two populations with means μ1 and μ2 and variances σ1² and σ2².
• The parameters of interest are the difference of the two means, μ1 − μ2, or the ratio of the two population variances, σ1²/σ2².
• We can draw two independent random samples of sizes n1 and n2, one from each population, to obtain an interval estimator of μ1 − μ2 or of σ1²/σ2².
• For example: do the average grades of different statistics classes differ?

Example 1: Interval estimator of μ1 − μ2 when σ1² and σ2² are known and the populations are normal
Let [...] be two independent random samples. Then the 100 · (1 − α)% interval estimator of μ1 − μ2 is