AMS7: WEEK 6. CLASS 2
Hypothesis Testing with One Sample
Friday May 08, 2015

Test Statistic
• Value computed from the sample data and used to decide whether to reject the null hypothesis.

PARAMETER                                                  TEST STATISTIC
Population proportion ($p$)                                $z = \frac{\hat{p} - p}{\sqrt{pq/n}}$
Population mean ($\mu$), $\sigma$ known or unknown         $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ or $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
Population standard deviation ($\sigma$)                   $\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$

Hypothesis Testing (Cont.)
• We assume Ho is true and use the test statistic to determine whether there is significant evidence against Ho.
• Critical Region (or Rejection Region): Set of all values of the test statistic that cause us to reject Ho.
• Significance level ($\alpha$): Probability that the test statistic will fall in the critical region when Ho is true. That is, $\alpha$ is the probability of making the mistake of rejecting Ho when Ho is true.

Hypothesis Testing (Cont.)
• Values of $\alpha$: 0.05, 0.01 and 0.10. The most commonly used value is 0.05 (default value).
• Two-tailed test: Critical region is in the two tails.
• Left-tailed test: Critical region is in the left tail.
• Right-tailed test: Critical region is in the right tail.
• Critical Value: Separates the critical region (where we reject Ho) from the values of the test statistic that do not lead to rejection of Ho.

Types of Tests (Assume $\alpha = 0.05$)
Two-tailed Test
[Figure: standard normal curve centered at z = 0, with critical regions of area $\alpha/2 = 0.025$ to the left of z = -1.96 and to the right of z = 1.96]

Types of Tests (Assume $\alpha = 0.05$)
Left-tailed Test
[Figure: standard normal curve centered at z = 0, with a critical region of area $\alpha = 0.05$ to the left of z = -1.645]

Types of Tests (Assume $\alpha = 0.05$)
Right-tailed Test
[Figure: standard normal curve centered at z = 0, with a critical region of area $\alpha = 0.05$ to the right of z = 1.645]

How do we know what kind of test to use?
We need to examine the sign in H1:
• ≠ corresponds to two-tailed tests
• < corresponds to left-tailed tests
• > corresponds to right-tailed tests

Examples: Claim or hypothesis in symbolic form
1) More than one-half of all internet users make on-line purchases.
   Ho: p ≤ 0.5; H1: p > 0.5 (original claim)
   Converted for testing: Ho: p = 0.5; H1: p > 0.5
   Right-tailed test
2) The percentage of viewers tuned to 60 Minutes is equal to 24%.
   Ho: p = 0.24 (original claim); H1: p ≠ 0.24
   Two-tailed test

Examples (Cont.)
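3) Mean IQ of Statistics students is at least 110.
   Ho: $\mu \geq 110$ (original claim); H1: $\mu < 110$
   Converted for testing: Ho: $\mu = 110$; H1: $\mu < 110$
   Left-tailed test
Notes:
- Make Ho the hypothesis that contains the equal sign and H1 the hypothesis that does not contain an equality.
- For practical reasons Ho is converted to a hypothesis with an equal sign.

Examples of Critical Values
Example 1: Right-tailed Test, $\alpha = 0.10$
[Figure: critical region of area $\alpha = 0.10$ to the right of $z_{\alpha} = 1.285$; curve centered at z = 0]
Here 1.285 is a value interpolated from the table; more precisely $z_{0.10} = 1.2816$.

Examples of Critical Values (Cont.)
Example 2: Two-tailed Test, $\alpha = 0.20$
[Figure: critical regions of area $\alpha/2 = 0.10$ to the left of $-z_{\alpha/2} = -1.285$ and to the right of $z_{\alpha/2} = 1.285$; curve centered at z = 0]

Examples of Critical Values (Cont.)
Example 3: Left-tailed Test, $\alpha = 0.05$
[Figure: critical region of area $\alpha = 0.05$ to the left of $-z_{\alpha} = -1.645$; curve centered at z = 0]

Example: Test the claim that more than half of all Internet users make on-line purchases
• Ho: p = 0.5; H1: p > 0.5 (original claim)
• Suppose that in a sample of n = 1025 subjects, 69% said they used the Internet for shopping. (Binomial random variable. Check np ≥ 5 and nq ≥ 5 before using the normal approximation; here np = nq = 512.5.)
• Test statistic: $z = \frac{\hat{p} - p}{\sqrt{pq/n}}$
• Assume Ho is true. This implies that p = 0.5 and q = 1 - p = 0.5. $\hat{p} = 0.69$ (from the sample!).

Example (Cont.)
• $z = \frac{0.69 - 0.5}{\sqrt{\frac{0.5 \times 0.5}{1025}}} = \frac{0.19}{0.0156} = 12.179$ (12.166 if the denominator is not rounded)
• Decision: The test statistic falls in the critical region, so we reject Ho.
[Figure: right-tailed test with $\alpha = 0.10$; critical region to the right of $z_{\alpha} = 1.285$; the observed value 12.179 falls in the critical region]

Example (Cont.)
• Interpretation of the Result: We reject the null hypothesis Ho and support the alternative hypothesis H1, which is the original claim. The sample data provide enough evidence to support the claim that more than 50% of internet users make on-line purchases.

Using the Confidence Interval for Hypothesis Testing
• Confidence Interval for a Proportion
CI: $\hat{p} - E < p < \hat{p} + E$, where $E = z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}}$
In this example: $\alpha = 0.10$, $\hat{p} = 0.69$, $\hat{q} = 1 - 0.69 = 0.31$
Confidence level: $(1 - 2\alpha) \times 100\% = 80\%$, because a one-tailed test at level $\alpha$ corresponds to an interval with area $\alpha$ in each tail.
$E = z_{0.10} \sqrt{\frac{0.69 \times 0.31}{1025}} = 1.285 \times 0.01445 = 0.0186$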
$0.69 - 0.0186 < p < 0.69 + 0.0186$
Confidence Interval: 0.6714 < p < 0.7086
Conclusion: p = 0.5 is not included in the CI, so we reject Ho.

Types of Errors
                      TRUE STATE OF NATURE
DECISION              Ho is TRUE        Ho is FALSE
REJECT Ho             TYPE I ERROR      OK
DO NOT REJECT Ho      OK                TYPE II ERROR

Types of Errors
• Type I Error: Rejecting Ho when Ho is true. $\alpha$ is the probability of a Type I error: $\alpha = P(\text{Reject Ho} \mid \text{Ho is true})$.
• Type II Error: Not rejecting Ho when Ho is false. The Greek letter $\beta$ (beta) is the probability of a Type II error: $\beta = P(\text{Do not reject Ho} \mid \text{Ho is false})$.
Note: $\beta \neq 1 - \alpha$. Normally we select $\alpha$ first! For a given sample size n, a decrease in $\alpha$ will cause an increase in $\beta$, and conversely, an increase in $\alpha$ will cause a decrease in $\beta$.
Power of a test: The probability $1 - \beta$ of rejecting Ho when Ho is false.

The p-value: Another way to make a decision
• The p-value: Probability, computed assuming Ho is true, of getting a value of the test statistic at least as extreme as the one obtained from the sample data.
EXAMPLE: Right-tailed test with $\alpha = 0.10$
z* = observed test statistic; $z_{\alpha}$ = critical value
- If the p-value is greater than $\alpha$, do not reject Ho.
- If the p-value is less than $\alpha$, reject Ho.
[Figure: the p-value is the area to the right of z*; $\alpha = 0.10$ is the area to the right of $z_{\alpha} = 1.285$]

Example of calculating a p-value
• Suppose H1: p > 0.29 and the observed test statistic is z = 1.97. Assume $\alpha = 0.01$.
• P-value: 1 - 0.9756 = 0.0244 (right-tailed test, so the p-value is the area to the right of z = 1.97)
• P-value > 0.01: Do not reject Ho.

More on p-value
• Right-tailed test: p-value is the area to the right of the test statistic z.
• Left-tailed test: p-value is the area to the left of the test statistic z.
• Two-tailed test: p-value is twice the area of the tail region bounded by the test statistic z.

Summary about Hypothesis Testing
Test a claim about a population parameter: hypothesis test procedure

Population proportion ($p$)
1) Check that the binomial distribution of sample proportions can be approximated by a normal distribution (np ≥ 5; nq ≥ 5).
2) Use the test statistic: $z = \frac{\hat{p} - p}{\sqrt{pq/n}}$

Population mean ($\sigma$ is known)
1) Check that the population is normally distributed or n > 30.
2) Use the test statistic: $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

Population mean ($\sigma$ is unknown)
1) Check that the population is normally distributed or n > 30. Use s instead of $\sigma$.
2) Use the test statistic: $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
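
The procedure for a population proportion, together with the critical-value and p-value rules from this class, can be checked numerically. The following is only a minimal sketch: the function names (`proportion_z`, `p_value`, `critical_values`, `reject_h0`) are illustrative and not part of the course material, and it assumes Python 3.8+ so that `statistics.NormalDist` from the standard library can stand in for the z table.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def proportion_z(p_hat, p0, n):
    """z = (p_hat - p0) / sqrt(p0 * q0 / n), computed assuming Ho: p = p0."""
    q0 = 1.0 - p0
    # Requirement for the normal approximation to the binomial.
    if n * p0 < 5 or n * q0 < 5:
        raise ValueError("need n*p >= 5 and n*q >= 5")
    return (p_hat - p0) / sqrt(p0 * q0 / n)


def p_value(z, tail):
    """Tail area for the observed z; tail is 'left', 'right' or 'two'."""
    if tail == "right":
        return 1.0 - STD_NORMAL.cdf(z)
    if tail == "left":
        return STD_NORMAL.cdf(z)
    if tail == "two":
        return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right' or 'two'")


def critical_values(alpha, tail):
    """Critical value(s) of z for significance level alpha, as a tuple."""
    if tail == "right":
        return (STD_NORMAL.inv_cdf(1.0 - alpha),)
    if tail == "left":
        return (STD_NORMAL.inv_cdf(alpha),)
    if tail == "two":
        c = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
        return (-c, c)
    raise ValueError("tail must be 'left', 'right' or 'two'")


def reject_h0(z, alpha, tail):
    """Decision rule from the class: reject Ho when the p-value is below alpha."""
    return p_value(z, tail) < alpha


if __name__ == "__main__":
    # Internet-purchases example: n = 1025, p_hat = 0.69, Ho: p = 0.5, right-tailed.
    z = proportion_z(0.69, 0.5, 1025)
    print(round(z, 2), reject_h0(z, 0.10, "right"))
    # p-value example: z = 1.97, right-tailed, alpha = 0.01.
    print(round(p_value(1.97, "right"), 4), reject_h0(1.97, 0.01, "right"))
```

Because `NormalDist` works with unrounded values, the first example gives z of about 12.17 rather than the 12.179 obtained with a denominator rounded to 0.0156, and `critical_values(0.10, "right")` gives about 1.2816 rather than the interpolated 1.285. The decisions are the same either way. The sketch covers only the z-based tests; the t statistic for a mean with unknown standard deviation would need the Student t distribution, which the standard library does not provide.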