AMS7: WEEK 7. CLASS 3
Hypothesis Testing Examples
Friday May 15, 2015

Example 1: Testing a Claim about a Proportion
• Sect. 7.3, #2: Survey of Drinking: In a Gallup survey, 1087 randomly selected adults were asked whether they used alcoholic beverages. 62% of the subjects said that they did. Test the claim that the majority (more than 50%) of adults use alcoholic beverages, using a 0.05 significance level.

Set the Null and Alternative Hypotheses
• Set the Null and Alternative Hypotheses about $p$ (proportion of adults that use alcoholic beverages).
• Claim: $p > 0.5$
• Since the claim does not contain the equal sign, the claim becomes the Alternative Hypothesis.
• The opposite of the original claim is: $p \le 0.5$.
• Since the opposite of the original claim contains the equal sign, this becomes the Null Hypothesis.
• Finally, set the Null Hypothesis to $p = 0.5$.
• The hypotheses to test are: $H_0: p = 0.5$ vs. $H_1: p > 0.5$
• From the sign of the Alternative Hypothesis we figure out that we have a Right-Tailed test.

Select the test statistic and check requirements for its application
• The sample is assumed to be a simple random sample.
• The random variable is: number of adults that use alcoholic beverages. Under the Null Hypothesis this variable has a Binomial distribution with sample size $n = 1087$ and $p = 0.5$.
• Since $np \ge 5$ and $nq \ge 5$ (here $np = nq = 543.5$), the normal distribution can be used to approximate the binomial distribution.
• The sample proportion is $\hat{p} = 0.62$. The sampling distribution of the sample proportion is approximately normal with mean $p$ and standard deviation $\sqrt{pq/n}$.
• The test statistic $z = \dfrac{\hat{p} - p}{\sqrt{pq/n}}$ has a standard normal distribution.

Critical Region and p-value
• Test statistic: $z = \dfrac{\hat{p} - p}{\sqrt{pq/n}} = \dfrac{0.62 - 0.5}{\sqrt{\dfrac{0.5 \times 0.5}{1087}}} = 7.913$ (Note that $p = 0.5$ is the value proposed under the Null Hypothesis. We assume beforehand that the Null Hypothesis is true!)
• Level of significance of the test: 0.05 (probability of rejecting $H_0$ given that it is true).
• Critical value: $z_{\alpha} = z_{0.05} = 1.645$ (This is the z score corresponding to an area to the left equal to $1 - 0.05 = 0.95$.)
• Critical region (Rejection of $H_0$): all values of the test statistic greater than 1.645.
• P-value: area to the right of the observed test statistic ($z = 7.913$). This area is less than 0.0001.

Take a decision based on the critical region or p-value
• Using the critical region: because the test statistic ($z = 7.913$) is greater than the critical value ($z_{0.05} = 1.645$), we reject the null hypothesis.
• Using the p-value: because the p-value (less than 0.0001) is lower than the significance level (0.05), we reject the null hypothesis.
• Note: We should reach the same conclusion under the two methods!
• CONCLUSION: The sample data support the claim that the majority of adults use alcoholic beverages.

Example 2: Testing a claim about a mean: $\sigma$ known
• At a dam in Oregon, fisheries biologists are studying the length of a particular species of salmon to investigate the population structure of resident fish. They collect a sample of 60 fish and find that the mean length is 15 inches. Assume the population standard deviation is known from a previous study to be 1.5 inches. Use a 0.05 significance level to test the claim that this species of salmon has a mean length different from 14 inches.

Set the Null and Alternative Hypotheses
• Set the Null and Alternative Hypotheses about $\mu$ (mean length of salmon).
• Claim: $\mu \ne 14$
• Since the claim does not contain the equal sign, the claim becomes the Alternative Hypothesis.
• The opposite of the original claim is: $\mu = 14$.
• Since the opposite of the original claim contains the equal sign, this becomes the Null Hypothesis.
• Finally, set the Null Hypothesis to $\mu = 14$.
• The hypotheses to test are: $H_0: \mu = 14$ vs. $H_1: \mu \ne 14$
• From the sign of the Alternative Hypothesis we figure out that we have a Two-Tailed test.
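Both Example 1 and the test being set up for Example 2 rely on a standard normal test statistic. The following is a minimal Python sketch of those calculations, assuming only the standard library; the function names (`right_tail_area`, `z_for_proportion`, `z_for_mean`) are illustrative and not part of the course material.

```python
import math
from statistics import NormalDist


def right_tail_area(z):
    """Area to the right of z under the standard normal curve."""
    # erfc keeps precision for large z, where 1 - cdf(z) would round to 0.
    return 0.5 * math.erfc(z / math.sqrt(2))


def z_for_proportion(p_hat, p0, n):
    """z statistic for a sample proportion, using p0 from the null hypothesis."""
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)


def z_for_mean(x_bar, mu0, sigma, n):
    """z statistic for a sample mean when the population sigma is known."""
    return (x_bar - mu0) / (sigma / math.sqrt(n))


# Example 1: right-tailed test of H0: p = 0.5 vs. H1: p > 0.5
z1 = z_for_proportion(0.62, 0.5, 1087)
crit1 = NormalDist().inv_cdf(1 - 0.05)       # z score with area 0.95 to its left
p1 = right_tail_area(z1)

# Example 2: two-tailed test of H0: mu = 14 vs. H1: mu != 14
z2 = z_for_mean(15, 14, 1.5, 60)
crit2 = NormalDist().inv_cdf(1 - 0.05 / 2)   # z score with area 0.975 to its left
p2 = 2 * right_tail_area(abs(z2))            # both tails

print(f"Example 1: z = {z1:.3f}, critical value = {crit1:.3f}, reject H0: {z1 > crit1}")
print(f"Example 2: z = {z2:.3f}, critical values = +/-{crit2:.3f}, reject H0: {abs(z2) > crit2}")
```

The tail areas `p1` and `p2` are far smaller than 0.0001, which is why a printed z table can only report them as "less than 0.0001".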
Select the test statistic and check requirements for its application
• The sample is assumed to be a simple random sample.
• The random variable is: length of salmon.
• Since the sample size $n > 30$, a normal distribution can be used.
• The sample mean is $\bar{x} = 15$. The sampling distribution of the sample mean is approximately normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.
• The test statistic $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ has a standard normal distribution.

Critical Region and p-value
• Test statistic: $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{15 - 14}{1.5/\sqrt{60}} = 5.164$ (Note that $\mu = 14$ is the value proposed under the Null Hypothesis. We assume beforehand that the Null Hypothesis is true!)
• Level of significance of the test: 0.05
• Critical values: $z_{\alpha/2} = z_{0.025} = 1.96$ (This is the z score corresponding to an area to the left equal to $1 - 0.025 = 0.975$), and $-z_{\alpha/2} = -z_{0.025} = -1.96$
• Critical region (Rejection of $H_0$): all values of the test statistic greater than 1.96 or lower than -1.96.
• P-value: 2 × area to the right of the observed test statistic ($z = 5.164$). This area is less than 0.0001.

Take a decision based on the critical region or p-value
• Using the critical region: because the test statistic ($z = 5.164$) is greater than the critical value ($z_{0.025} = 1.96$), we reject the null hypothesis.
• Using the p-value: because the p-value (less than 0.0001) is lower than the significance level (0.05), we reject the null hypothesis.
• Note: Again, we should reach the same conclusion under the two methods!
• CONCLUSION: The sample data support the claim that salmon have a mean length different from 14 inches.

Example 3: Testing a claim about a mean: $\sigma$ unknown
• Sect. 7.5, #17: Sugar in Cereal: A sample of cereal boxes is randomly selected and the sugar content (grams of sugar per gram of cereal) is recorded. Those amounts are summarized with these statistics: $n = 16$, $\bar{x} = 0.295$, $s = 0.168$. Use a 0.10 significance level to test the claim of a cereal lobbyist that the mean sugar content for all cereals is less than 0.3 g.
Assume that a simple random sample has been selected from a normally distributed population.

Set the Null and Alternative Hypotheses
• Set the Null and Alternative Hypotheses about $\mu$ (mean sugar content of cereals).
• Claim: $\mu < 0.3$
• Since the claim does not contain the equal sign, the claim becomes the Alternative Hypothesis.
• The opposite of the original claim is: $\mu \ge 0.3$.
• Since the opposite of the original claim contains the equal sign, this becomes the Null Hypothesis.
• Finally, set the Null Hypothesis to $\mu = 0.3$.
• The hypotheses to test are: $H_0: \mu = 0.3$ vs. $H_1: \mu < 0.3$
• From the sign of the Alternative Hypothesis we figure out that we have a Left-Tailed test.

Select the test statistic and check requirements for its application
• The sample is assumed to be a simple random sample.
• The random variable is: sugar content in cereals.
• The sample size $n < 30$, but the sample comes from a normally distributed population.
• The population standard deviation is unknown, so we use the sample standard deviation $s$.
• The sample mean is $\bar{x} = 0.295$. The sample mean has mean $\mu$ and estimated standard deviation $s/\sqrt{n}$; once standardized, it follows a Student t distribution with $n - 1$ degrees of freedom.
• The test statistic to be used is $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$ with $n - 1 = 15$ degrees of freedom.

Critical Region and p-value
• Test statistic: $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{0.295 - 0.3}{0.168/\sqrt{16}} = -0.119$ (Note that the value $\mu = 0.3$ is proposed under the Null Hypothesis. We assume beforehand that the Null Hypothesis is true!)
• Level of significance of the test: 0.10
• Critical value: $-t_{\alpha} = -t_{0.10} = -1.341$ (This is the critical t value corresponding to a one-tail area to the left equal to 0.10 and 15 degrees of freedom.)
• Critical region (Rejection of $H_0$): all values of the test statistic lower than -1.341.
• P-value: area to the left of the observed test statistic ($t = -0.119$). This area is greater than 0.10. Using the computer we found that this value is approximately 0.453.
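The t statistic and the computer-based p-value for Example 3 can be reproduced with a short Python sketch. It assumes only the standard library and approximates the left-tail area by integrating the t density with Simpson's rule; the helper names `t_pdf` and `t_left_tail_area` are illustrative, and a statistics package would normally be used instead.

```python
import math


def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def t_left_tail_area(t, df, steps=2000):
    """Area to the left of t, via Simpson's rule on [0, |t|] plus symmetry."""
    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3            # area between 0 and |t|
    return 0.5 + central if t > 0 else 0.5 - central


# Summary statistics and null value from Example 3
n, x_bar, s, mu0, alpha = 16, 0.295, 0.168, 0.3, 0.10
df = n - 1
t_stat = (x_bar - mu0) / (s / math.sqrt(n))
p_value = t_left_tail_area(t_stat, df)
t_crit = -1.341                        # table value for a left-tail area of 0.10, 15 df

print(f"t = {t_stat:.3f}, p-value = {p_value:.3f}")
print(f"area to the left of the critical value: {t_left_tail_area(t_crit, df):.3f}")
print(f"reject H0: {p_value < alpha}")
```

The second printed line is a consistency check: the area to the left of the tabulated critical value should come out close to the significance level 0.10.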
Take a decision based on the critical region or p-value
• Using the critical region: because the test statistic ($t = -0.119$) is greater than the critical value ($-t_{0.10} = -1.341$), we fail to reject the null hypothesis.
• Using the p-value: because the p-value (0.453) is greater than the significance level (0.10), we fail to reject the null hypothesis.
• Note: Again, we should reach the same conclusion under the two methods!
• CONCLUSION: There is not sufficient sample evidence to support the claim that the mean sugar content for all cereals is less than 0.3 g.

Confidence Interval Method of Testing Hypotheses
• For a two-tailed test, construct a confidence interval with a confidence level of $1 - \alpha$.
• For a one-tailed test, construct a confidence interval with a confidence level of $1 - 2\alpha$ (you have to double the level of significance $\alpha$).
• We take the decision based on whether the proposed parameter value falls within the confidence interval limits. If it does, we fail to reject $H_0$; if it does not, we reject $H_0$.

Confidence Interval Method for Example 3
• $\alpha = 0.10$. Construct a confidence interval with confidence level $1 - 2 \times 0.10 = 0.80$.
• Confidence Interval with 80% Confidence Level:
$\bar{x} - 1.341 \times \dfrac{s}{\sqrt{n}} < \mu < \bar{x} + 1.341 \times \dfrac{s}{\sqrt{n}}$
$0.295 - 1.341 \times \dfrac{0.168}{\sqrt{16}} < \mu < 0.295 + 1.341 \times \dfrac{0.168}{\sqrt{16}}$
$0.238678 < \mu < 0.351322$
• CONCLUSION: The value $\mu = 0.3$ does fall in the Confidence Interval. We fail to reject $H_0$ and we reach the same conclusion: There is not sufficient sample evidence to support the claim that the mean sugar content for all cereals is less than 0.3 g.

Additional Note on using Confidence Intervals for hypothesis testing
• When testing a claim about a population mean, the traditional method using the critical region, the p-value method and the confidence interval method are all equivalent and we should expect the same conclusion.
• When testing a claim about a population proportion, the critical region method and the p-value method are equivalent.
The confidence interval method can give different results because the standard deviation of the sample proportion is calculated in different ways:
• Standard Dev. of the sample proportion (Confidence Interval Method): $\sqrt{\hat{p}\hat{q}/n}$, based on the sample proportion $\hat{p}$.
• Standard Dev. of the sample proportion (Critical region or p-value Method): $\sqrt{pq/n}$, based on the value of $p$ proposed under the Null Hypothesis.

Testing a claim about a standard deviation or variance
• REQUIREMENTS
1) Simple random sample
2) Population must have a normal distribution (a stricter requirement than in the tests about means)
3) Test statistic has a chi-square distribution: $\chi^2 = \dfrac{(n-1)s^2}{\sigma^2}$ with $n - 1$ degrees of freedom

Chi-square distribution
• PROPERTIES
1. All values are non-negative
2. Distribution is not symmetric
3. Different distributions for different degrees of freedom
4. Critical values in Table A-4

Example: Testing a claim on a standard deviation
According to the US Department of Agriculture, imports of Canadian-grown potatoes have depressed US sales of potatoes during the last six years. From this sample ($n = 6$) the standard deviation is $s = 2.07$. Assume we know that, in the past, the population standard deviation was 0.79. Use a 0.05 significance level to test the claim that US potato sales have more variation in the last six years than in the past. Assume a simple random sample and a normally distributed population.

Set the Null and Alternative Hypotheses
• Set the Null and Alternative Hypotheses about $\sigma$ (population standard deviation of US sales of potatoes).
• Claim: $\sigma > 0.79$
• Since the claim does not contain the equal sign, the claim becomes the Alternative Hypothesis.
• The opposite of the original claim is: $\sigma \le 0.79$.
• Since the opposite of the original claim contains the equal sign, this becomes the Null Hypothesis.
• Finally, set the Null Hypothesis to $\sigma = 0.79$.
• The hypotheses to test are: $H_0: \sigma = 0.79$ vs. $H_1: \sigma > 0.79$
• From the sign of the Alternative Hypothesis we figure out that we have a Right-Tailed test.
Select the test statistic and check requirements for its application
• The sample is assumed to be a simple random sample.
• The random variable is: US sales of potatoes.
• The sample size is $n = 6 < 30$, but the sample comes from a normally distributed population.
• The test statistic to be used is $\chi^2 = \dfrac{(n-1)s^2}{\sigma^2}$ with $n - 1 = 5$ degrees of freedom.

Critical Region and p-value
• Test statistic: $\chi^2 = \dfrac{(n-1)s^2}{\sigma^2} = \dfrac{5 \times 2.07^2}{0.79^2} = 34.328$ with $n - 1 = 5$ degrees of freedom.
• (Note that the value $\sigma = 0.79$ is proposed under the Null Hypothesis. We assume beforehand that the Null Hypothesis is true!)
• Level of significance of the test: 0.05
• Critical value: $\chi^2_{0.05} = 11.071$ (This value corresponds to the column for $\alpha = 0.05$ and 5 degrees of freedom of Table A-4.)
• Critical region (Rejection of $H_0$): all values of the test statistic greater than 11.071.
• P-value: area to the right of the observed test statistic ($\chi^2 = 34.328$). This area is lower than 0.005. Using the computer we found that this value is approximately 0.000002.

Take a decision based on the critical region or p-value
• Using the critical region: because the test statistic ($\chi^2 = 34.328$) is greater than the critical value ($\chi^2_{0.05} = 11.071$), we reject the null hypothesis.
• Using the p-value: because the p-value (0.000002) is lower than the significance level (0.05), we reject the null hypothesis.
• Note: Again, we should reach the same conclusion under the two methods!
• CONCLUSION: The sample data support the claim that US potato sales have more variation in the last six years than in the past.
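The chi-square statistic and its right-tail area for the potato example can be checked with the following Python sketch. It assumes only the standard library and computes the tail area with the standard recurrence for integer degrees of freedom; the function name `chi2_right_tail_area` is illustrative, and a statistics package would normally be used instead.

```python
import math


def chi2_right_tail_area(x, df):
    """P(X > x) for a chi-square variable with integer df >= 1."""
    half = x / 2.0
    # Closed-form starting points: df = 1 (odd case) or df = 2 (even case).
    if df % 2 == 1:
        area, k = math.erfc(math.sqrt(half)), 1
    else:
        area, k = math.exp(-half), 2
    # Step up two degrees of freedom at a time until df is reached.
    while k < df:
        area += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return area


# Values from the potato example
n, s, sigma0, alpha = 6, 2.07, 0.79, 0.05
df = n - 1
chi2_stat = (n - 1) * s ** 2 / sigma0 ** 2
p_value = chi2_right_tail_area(chi2_stat, df)
chi2_crit = 11.071                     # Table A-4 value for alpha = 0.05, 5 df

print(f"chi-square = {chi2_stat:.3f}, p-value = {p_value:.7f}")
print(f"area to the right of the critical value: {chi2_right_tail_area(chi2_crit, df):.3f}")
print(f"reject H0: {chi2_stat > chi2_crit}")
```

As a consistency check, the area to the right of the tabulated critical value should be close to the significance level 0.05.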