Chapter 10
Hypothesis Test of a Claim Regarding a Population Parameter (p, μ)
Copyright © 2013, 2010 and 2007 Pearson Education, Inc.

Section 10.1 The Language of Hypothesis Testing

A hypothesis is a statement regarding a characteristic of one or more populations. From the Greek, a hypothesis is a statement, or claim, yet to be proved.

In statistics, hypothesis testing is the use of sampling and probability theory to evaluate a claim that someone has made regarding a characteristic of the population.

CLAIMS REGARDING A CHARACTERISTIC OF A POPULATION
1. In 2008, 62% of American adults regularly volunteered their time for charity work. A researcher claims that this percentage is different today.
2. According to a study published in 2006, the mean length of a phone call on a cellular telephone was 3.25 minutes. A researcher claims that call length has increased since then.
3. Using an old manufacturing process, the standard deviation of the amount of wine put in a bottle was 0.23 oz. With new equipment, the quality control manager claims the standard deviation has decreased.

We test these types of hypotheses (claims) using sample data because it is usually impossible or impractical to gain access to the entire population. If we have access to population data, we have no need for inferential statistics to test these claims, because we can just look at the facts.

Steps in Hypothesis Testing
1. Note a claim someone has made regarding the population. Generate two mutually exclusive mathematical hypotheses, one of which contains the claim.
2. Collect evidence (sample data) pertinent to the claim. Make a decision as to which hypothesis is more likely.
3. Analyze the decision to assess the plausibility of the claim.

The null hypothesis, denoted H0, is a statement to be tested. The null hypothesis is assumed true until/unless the evidence (sample) indicates otherwise. (Innocent until proven guilty.)

The alternative hypothesis, denoted H1, is a statement that contradicts H0.

We look at the sample data (evidence) and decide whether to reject or fail to reject H0. We use that decision to make a comment about the plausibility of the claim we have been given to evaluate.

"In Other Words"
The null hypothesis is a statement of status quo or no difference and always contains a statement of equality. The null hypothesis is initially assumed to be true until/unless we have evidence to the contrary. We look to see if there is evidence that rejects the null and therefore supports the alternative hypothesis.

"On Trial"
Null hypothesis: man = innocent, assumed true until/unless we find evidence to the contrary.
Alternative hypothesis: man ≠ innocent.
Evaluate the evidence (sample data) and decide to reject or fail to reject the null. The evidence will cause us to reject or fail to reject the initial assumption of innocence.

In the courtroom…
We are trying Karl for impersonating a Math Professor. He looks guilty, but we must assume he is innocent anyway. We ask 100 of his students whether Karl knows anything about math. For the response, we use a scale of 0 to 10, with 10 meaning he's a math genius and 0 meaning he's a math idiot (not a total idiot). The mean of our sample turns out to be 1.0445.

We conclude that it would be such a rare event (exceptionally small probability) for Karl to actually be a Math Professor and get such a pathetically low sample score of 1.0445 that our initial assumption of innocence was probably wrong, so we reject it (and we send him to jail where he belongs, for impersonating a Math Professor).

Forming Hypotheses
For each of the following claims, determine the null and alternative hypotheses. State whether the test is two-tailed, left-tailed or right-tailed.
a) In 2008, 62% of American adults regularly volunteered their time for charity work. A researcher claims that this percentage is different today.
   H0: p = 0.62 versus H1: p ≠ 0.62, a two-tailed test
b) According to a study published in 2006, the mean length of a phone call on a cellular telephone was 3.25 minutes. A researcher claims that the mean length of a call has increased since then.
   H0: μ = 3.25 versus H1: μ > 3.25, a right-tailed test
c) Using an old manufacturing process, the standard deviation of the amount of wine put in a bottle was 0.23 ounces. With new equipment, the quality control manager claims the standard deviation has decreased.
   H0: σ = 0.23 versus H1: σ < 0.23, a left-tailed test

Types of Errors

                       Actual Truth of H0
Decision               H0 is true          H0 is false
Fail to reject H0      Correct Decision    Type II Error
Reject H0              Type I Error        Correct Decision

Larson & Farber, Elementary Statistics: Picturing the World, 3e

α = P(Type I Error) = P(rejecting H0 when H0 is true)
β = P(Type II Error) = P(not rejecting H0 when H1 is true)

Type I and Type II Errors
For each of the following claims, explain what it would mean to make a Type I error. What would it mean to make a Type II error?
a) In 2008, 62% of American adults regularly volunteered their time for charity work. A researcher claims that this percentage is different today.
b) According to a study published in March 2006, the mean length of a phone call on a cellular telephone was 3.25 minutes. A researcher claims the mean length of a call has increased since then.

Level of Significance
In a hypothesis test, the level of significance is your maximum allowable probability of making a Type I error. It is denoted by α, the Greek letter "alpha". Hypothesis tests are based on α.
The probability of making a Type II error is denoted by β, the Greek letter "beta".
By setting the level of significance at a small value, you are saying that you want the probability of rejecting a true null hypothesis (Type I error) to be small.
Commonly used levels of significance: α = 0.10, α = 0.05, α = 0.01

The level of significance (LOS), α, is the probability of making a Type I error. It is chosen by the researcher before the sample data are collected. For a fixed sample size, as the probability of a Type I error increases, the probability of a Type II error decreases, and vice versa.

We assume the null hypothesis is true. We look at the evidence (sample) and then decide to reject H0 or fail to reject (FTR) H0. This is just like the court system, where we assume the defendant is innocent and then either find him "not guilty" (FTR H0) or "guilty" (Reject H0).

Stating the Conclusion
According to a study published in 2006, the mean length of a phone call on a cell phone was 3.25 minutes. A researcher claims that the mean length of a call has increased since then.
a) Suppose the sample evidence indicates that the null hypothesis should be rejected (Reject H0). State the wording of the conclusion.
b) Suppose the sample evidence indicates that the null hypothesis should not be rejected (Fail to Reject H0). State the wording of the conclusion.

Statistical Tests
After stating the null and alternative hypotheses and specifying the level of significance, a random sample is taken from the population and sample statistics are calculated. The statistic that is compared with the parameter in the null hypothesis is called the test statistic.

Population parameter    Test statistic    Standardized test statistic
μ                       x̄                 z (n ≥ 30), t (n < 30)
p                       p̂                 z
σ²                      s²                χ²

Hypothesis Testing: quick review of the two final steps
6. Decision: Reject or FTR H0.
7. Interpret the decision: sufficient/insufficient evidence to support the claim (when the claim is H1), or sufficient/insufficient evidence to reject the claim (when the claim is H0).

Interpreting a Decision

                    Claim
Decision            Claim is H0                                          Claim is H1
Reject H0           There is enough evidence to reject the claim.        There is enough evidence to support the claim.
Do not reject H0    There is not enough evidence to reject the claim.    There is not enough evidence to support the claim.

Section 10.2 Hypothesis Tests for a Population Proportion (p)

A researcher assumes (based on her prior research) that the population proportion of people who are in favor of banning cell phone use while driving is 0.5. She obtains a random sample of 1000 people and finds that 534 are in favor of banning cell phone use while driving, so p̂ = 534/1000 = 0.534.

Would it be unusual to obtain a sample proportion of 0.534 or higher from a population whose proportion is 0.5? What is convincing, or statistically significant, evidence?

When the observed results are unlikely under the null hypothesis (H0), we say the result is statistically significant and we reject the null hypothesis (which we always assume to be true at the start).

To determine if a sample proportion of 0.534 is statistically significant, we build a probability model.
1. npq = 1000(0.5)(0.5) = 250 ≥ 10
2. The sample size n = 1000 is less than 5% of the population size.
We can therefore use the normal model to describe the sampling distribution of p̂. Its mean is $\mu_{\hat{p}} = 0.5$ and its standard deviation is

$\sigma_{\hat{p}} = \sqrt{\frac{0.5(1 - 0.5)}{1000}} \approx 0.016$

[Figure: sampling distribution of the sample proportion]

Recall that our simple random sample yielded a sample proportion of 0.534, so

$z = \frac{\hat{p} - \mu_{\hat{p}}}{\sigma_{\hat{p}}} = \frac{0.534 - 0.5}{\sqrt{\frac{0.5(1 - 0.5)}{1000}}} = 2.15$

Our sample proportion is 2.15 standard deviations above the hypothesized proportion of 0.5, which is an "unlikely" result. Therefore, we reject the null hypothesis (H0) that the population proportion is 0.50.
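The probability model above can be checked numerically. The following is a minimal sketch using only the Python standard library; the variable names are illustrative assumptions, not part of the original slides.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.5   # population proportion assumed under H0
n = 1000   # sample size
x = 534    # number of people in favor

p_hat = x / n                          # sample proportion
std_error = sqrt(p0 * (1 - p0) / n)    # standard deviation of p_hat under H0
z = (p_hat - p0) / std_error           # standard deviations above p0

# Area to the right of z under the standard normal curve:
# how often a sample proportion this large (or larger) occurs if p = 0.5.
right_tail = 1 - NormalDist().cdf(z)

print(f"p_hat = {p_hat:.3f}")
print(f"standard deviation of p_hat = {std_error:.4f}")
print(f"z = {z:.2f}")
print(f"right-tail area = {right_tail:.4f}")
```

The z value printed here should agree with the 2.15 found by hand; the right-tail area quantifies how "unlikely" the sample is when the null hypothesis is true.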
Why does it make sense to reject the null hypothesis if the sample proportion is more than 2 standard deviations away from the hypothesized proportion?
The area under the standard normal curve to the right of z = 2 is 0.0228, so only about 2.3% of the time would you expect to get a sample proportion of 0.532 or higher when the population proportion is 0.5.

Hypothesis Testing
If the z-score of the sample proportion is too many standard deviations (generally 2 or more) from the proportion stated in the null hypothesis H0 (assumed true), then we abandon that assumption based on the evidence (our sample) and "reject H0".

Testing Hypotheses Regarding a Population Proportion, p
The best point estimate of p, the proportion of the population with a given characteristic, is

$\hat{p} = \frac{x}{n}$

where x is the number of "success" individuals in the sample and n is the sample size.

The sampling distribution of p̂ is approximately normal, with mean $\mu_{\hat{p}} = p$ and standard deviation

$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$

provided that the following requirements are satisfied:
1. The sample is a simple random sample.
2. np(1 − p) ≥ 10 and n ≤ 0.05N (the sample is no more than 5% of the population size N).
3. The sampled values are independent of each other.

A. Critical Value Test or Classical Approach
Step 1: Determine the null and alternative hypotheses. The hypotheses can be structured in one of three ways:
   Two-tailed: H0: p = p0 versus H1: p ≠ p0
   Left-tailed: H0: p = p0 versus H1: p < p0
   Right-tailed: H0: p = p0 versus H1: p > p0
Step 2: Select a level of significance, α, based on the maximum allowable probability of making a Type I error.
Step 3: Compute the standardized test statistic

$z_0 = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}}$

Note: p0 refers to the value of p in the null hypothesis H0.
Use the TI-84 or Table V to determine the critical value: −z_α for a left-tailed test, z_α for a right-tailed test, and ±z_{α/2} for a two-tailed test.
Step 4: Compare the critical z value (based on α) with the "Standardized Test Statistic" or STS (the z-score of your sample). If the STS is outside (more extreme than) the critical z, then your sample falls in the Rejection Region and your decision is to Reject the null (H0). If the STS is not in the Rejection Region, then your decision is to Fail to Reject the null (H0).

B. P-Value Approach
The P-Value Approach has the same objective as the Critical Value Approach, which is to decide whether to Reject or Fail to Reject (FTR) the null H0 based on the LOS (α) of the problem. The BIG advantage is that the TI-84 does all the work of calculating the STS (below), including the standard deviation in its denominator.

$z_0 = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}}$

If the P-value < α, the z of your sample is in the Rejection Region, so your decision is: Reject the null hypothesis H0.
If the P-value ≥ α, the z of your sample is not in the Rejection Region, so your decision is: Fail to Reject H0.
Now, finally, "Interpret the Decision" you just made by making a statement about the claim.

Testing a Hypothesis about a Population Proportion: Large Sample Size
In 1997, 46% of Americans said they did not trust the media "when it comes to reporting the news fully, accurately and fairly". In a 2007 poll of 1010 adults nationwide, 525 stated they did not trust the media. At the α = 0.05 level of significance (LOS), is there evidence to support the claim that the percentage of Americans who do not trust the media has increased since 1997?

We want to evaluate the claim that p > 0.46. First, we must verify that we can perform the hypothesis test using z-scores. For the sampling distribution of p̂ to be approximately normal, we require that npq be at least 10.
1. This is a simple random sample.
2. np0q0 = 1010(0.46)(0.54) ≈ 251 > 10
3. The sample size is less than 5% of the population size, so the assumption of independence is met.
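The requirement check and the two decision rules described above can be sketched together in code. This is a minimal illustration using only the Python standard library, applied to the media-poll numbers from the problem statement; the function name `one_prop_z_test` and its return format are assumptions made for this sketch.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_test(x, n, p0, alpha, tail):
    """One-proportion z-test; tail is 'left', 'right', or 'two'."""
    # Requirement for the normal model: n * p0 * (1 - p0) >= 10
    if n * p0 * (1 - p0) < 10:
        raise ValueError("n*p0*(1-p0) < 10: normal model not justified")
    std_normal = NormalDist()
    p_hat = x / n
    z0 = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # standardized test statistic
    if tail == "left":
        critical = std_normal.inv_cdf(alpha)
        p_value = std_normal.cdf(z0)
        in_rejection_region = z0 < critical
    elif tail == "right":
        critical = std_normal.inv_cdf(1 - alpha)
        p_value = 1 - std_normal.cdf(z0)
        in_rejection_region = z0 > critical
    elif tail == "two":
        critical = std_normal.inv_cdf(1 - alpha / 2)
        p_value = 2 * (1 - std_normal.cdf(abs(z0)))
        in_rejection_region = abs(z0) > critical
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return {
        "z0": z0,
        "critical": critical,            # classical approach
        "p_value": p_value,              # P-value approach
        "reject_classical": in_rejection_region,
        "reject_p_value": p_value < alpha,
    }

# Media poll: 525 of 1010 adults, H0 value p0 = 0.46, claim p > 0.46.
result = one_prop_z_test(x=525, n=1010, p0=0.46, alpha=0.05, tail="right")
for name, value in result.items():
    print(name, value)
```

Both approaches always lead to the same decision, which the two `reject_*` entries make visible. Because the sketch uses the unrounded sample proportion 525/1010, its test statistic can differ slightly from a hand calculation that first rounds p̂ to two decimals.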
Step 1: H0: p ≤ 0.46 versus H1: p > 0.46 (Claim)
Step 2: LOS = α = 0.05
Step 3: The sample proportion is p̂ = 525/1010 ≈ 0.52. The standardized test statistic (STS) is:

$z_0 = \frac{0.52 - 0.46}{\sqrt{\frac{0.46(0.54)}{1010}}} = 3.83$

A. Critical Value Approach
Step 4: This is a right-tailed test, and the critical value at the α = 0.05 level is z0.05 = 1.645 (see Table V).
Step 5: The STS is z0 = 3.83, which is greater than the critical value 1.645. Therefore, the sample lies in the Rejection Region, so we decide to: Reject the null hypothesis.

B. P-Value Approach
Step 4: Since this is a right-tailed test, the P-value is the area to the right of (more extreme than) the test statistic z0 = 3.83.
P-value = normcdf(3.83, 1E99) = 6.4E-5 ≈ 0
Step 5: Since the P-value is less than α = 0.05, we decide to: Reject H0.

"Interpret the Decision"
Step 6: Based on our sample, there is sufficient evidence at the α = 0.05 level of significance to support the claim that the percentage of Americans who do not trust the media has increased since 1997.

TI-84 (Proportion)
STAT:TESTS:5:1-PropZTest
p0 = value of p used in H0
x = number of successes in the sample (must be an integer)
n = sample size
For the prop line: use the H1 symbol
Draw: shows the shaded part of the curve along with z and the P-value.

For this example:
p0 = 0.46
x = 525 (must be an integer)
n = 1010
For the prop line: > p0 (right-tailed test)
Draw: shows z = 3.8133 and p = 1E-4
Note: The TI-84 uses the unrounded p̂ = 525/1010, which is why it reports z = 3.8133 rather than 3.83. The TI-84 does not know the LOS, so it does not know where the Rejection Region is.

Section 10.3 Hypothesis Tests for a Population Mean (μ)

To test hypotheses regarding the population mean when the population standard deviation is unknown, we use the t-distribution rather than the z-distribution. When we replace σ with s,

$t_0 = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$

follows Student's t-distribution with n − 1 degrees of freedom.

Properties of the t-Distribution: Table VI
1. The t-distribution is different for different degrees of freedom.
2. The t-distribution is centered at 0 and is symmetric about 0.
3. The area in the tails of the t-distribution is a little greater than the area in the tails of the standard normal distribution, because using s as an estimate of σ introduces more variability to the t-statistic.
4. As the sample size n increases, the density curve of t gets closer to the standard normal z density curve. This result occurs because as the sample size increases, the values of s get closer to the value of σ by the Law of Large Numbers.

Testing Hypotheses Regarding a Population Mean
To test hypotheses regarding the population mean, we require that:
1. The sample is obtained using simple random sampling.
2. The sample has no outliers, and the population from which the sample is drawn is normally distributed OR the sample size is large (n ≥ 30).
3. The sampled values are independent of each other.

Step 1: Determine the null and alternative hypotheses. They can be structured in one of three ways:
   Two-tailed: H0: μ = μ0 versus H1: μ ≠ μ0
   Left-tailed: H0: μ = μ0 versus H1: μ < μ0
   Right-tailed: H0: μ = μ0 versus H1: μ > μ0

Classical Approach
Step 2: Compute the standardized test statistic

$t_0 = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$

which follows Student's t-distribution with n − 1 degrees of freedom.
Step 3: Use Table VI to determine the critical t value.

P-Value Approach
Step 1: Compute the STS:

$t_0 = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$

Use Table VI or the TI-84 to approximate the P-value. The P-value is the area in the left tail, the right tail, or both tails of the t-distribution, depending on whether the test is left-tailed, right-tailed, or two-tailed.

If the P-value < α, the t of your sample is in the Rejection Region, so your decision is: Reject the null hypothesis H0.
If the P-value ≥ α, the t of your sample is not in the Rejection Region, so your decision is: Fail to Reject H0.
Now, finally, "Interpret the Decision" you just made by making a statement about the claim.

The procedure is robust, which means that minor departures from normality will not adversely affect the results of the test. However, for small samples, if the data have outliers or multiple modes, this procedure should not be used because the sampling distribution will not be approximately normal.

Testing a Hypothesis about a Population Mean, Large Sample
Assume the resting metabolic rate (RMR) of healthy males in complete silence is 5710. Researchers measured the RMR of 45 healthy males who were listening to calm classical music and found their mean RMR to be 5500 with a standard deviation of 992. At the α = 0.05 level of significance, is there evidence to support the researchers' claim that the mean RMR of males listening to calm classical music is not 5710?

Solution
Step 1: H0: μ = 5710 versus H1: μ ≠ 5710 (Claim)
Step 2: The level of significance is α = 0.05.
Step 3: The sample mean is x̄ = 5500 and the sample standard deviation is s = 992. The STS is

$t_0 = \frac{5500 - 5710}{992 / \sqrt{45}} = -1.4201$

Solution: Classical Approach
Step 4: Since this is a two-tailed test, the critical values at the α = 0.05 level with n − 1 = 44 degrees of freedom are ±t0.025 = ±2.021. (This is the tabulated value for 40 degrees of freedom; the exact value for 44 degrees of freedom is about 2.015, which leads to the same decision.)
Step 5: Since the test statistic, t0 = −1.4201, is between the critical values (not in the Rejection Region), we fail to reject the null hypothesis.

TI-84 (T-Test)
STAT:TESTS:2:T-Test
μ0 = 5710 (from H0)
x̄ = 5500
Sx = 992 (from sample)
n = 45
For the μ line: ≠ μ0 (H1 symbol)
Draw: t = −1.4201, p = 0.1626

Solution: P-Value Approach
Step 5: Since the P-value is greater than the LOS (0.1626 > 0.05), our decision is to Fail to Reject H0.
Step 6: There is insufficient evidence at the α = 0.05 level of significance to support the claim that the mean RMR of males listening to calm classical music is not 5710.
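The RMR calculation can be reproduced from the summary statistics alone. The sketch below uses only the Python standard library; since the standard library has no t-distribution, the helper functions `t_pdf` and `t_cdf` are hypothetical names written for this illustration, with the CDF obtained by Simpson's-rule integration of the t density.

```python
from math import sqrt, lgamma, exp, log, pi

def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + t * t / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # P(0 <= T <= |t|)
    return 0.5 + area if t >= 0 else 0.5 - area

# Summary statistics from the RMR example
x_bar, mu0, s, n = 5500, 5710, 992, 45
alpha = 0.05

df = n - 1
t0 = (x_bar - mu0) / (s / sqrt(n))          # standardized test statistic
p_value = 2 * (1 - t_cdf(abs(t0), df))      # two-tailed P-value
reject = p_value < alpha

print(f"t0 = {t0:.4f}, df = {df}")
print(f"P-value = {p_value:.4f}")
print("Reject H0" if reject else "Fail to reject H0")
```

In practice a statistics library or the calculator's T-Test would replace the hand-rolled integral; it is written out here only to keep the example self-contained.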
Testing a Hypothesis about a Population Mean, Small Sample
According to the US Mint, standard quarters weigh 5.67 grams (about 28 g per oz). A researcher suspects the new "US State" quarters are heavier than the normal 5.67 grams. He randomly selects 18 "state" quarters, weighs them and obtains the following data.

5.70 5.67 5.73 5.61 5.70 5.67
5.65 5.62 5.73 5.65 5.79 5.73
5.77 5.71 5.70 5.76 5.73 5.72

At the α = 0.05 level of significance, is there evidence to support the claim that "State" quarters weigh more than 5.67 grams?

Solution
Step 1: H0: μ ≤ 5.67 versus H1: μ > 5.67 (Claim)
Step 2: The level of significance is α = 0.05.
Step 3: From the data, the sample mean is calculated to be x̄ = 5.7022 and the sample standard deviation is s = 0.0497. The STS is

$t_0 = \frac{5.7022 - 5.67}{0.0497 / \sqrt{18}} = 2.75$

TI-84 (n < 30)
STAT:TESTS:2:T-Test
μ0 = 5.67 (from H0)
x̄ = 5.7022
Sx = 0.0497 (from sample)
n = 18
For the μ line: > μ0 (H1 symbol)
Draw: t = 2.7488, p = 0.0069

Solution: P-Value Approach
Step 5: Since the P-value is less than the level of significance (0.0069 < 0.05), our decision is to Reject H0.
Step 6: There is sufficient evidence at the α = 0.05 level of significance to support the claim that the mean weight of the "State" quarters is greater than 5.67 grams.
Note: The critical value is t = 1.740 and the STS is 2.7488, so the sample is in the Rejection Region (Critical Value Method).

Section 10.4 Hypothesis Tests for a Population Standard Deviation (σ)

Chi-Square Distribution
If a random sample of size n is obtained from a normally distributed population with mean μ and standard deviation σ, then

$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$

where s² is the sample variance, has a chi-square distribution with n − 1 degrees of freedom.

Characteristics of the Chi-Square Distribution
1. It is not symmetric.
2. The shape of the chi-square distribution depends on the degrees of freedom, just as with Student's t-distribution.
3. As the number of degrees of freedom increases, the chi-square distribution becomes more nearly symmetric.
4. The values of χ² are always nonnegative (greater than or equal to 0).

Testing Hypotheses about a Population Variance or Standard Deviation
To test hypotheses about the population variance or standard deviation, we can use the following steps, provided that:
1. The sample is obtained using simple random sampling.
2. The population is normally distributed.

Step 1: Determine the null and alternative hypotheses. The hypotheses can be structured in one of three ways:
   Two-tailed: H0: σ = σ0 versus H1: σ ≠ σ0
   Left-tailed: H0: σ = σ0 versus H1: σ < σ0
   Right-tailed: H0: σ = σ0 versus H1: σ > σ0
Compute the standardized test statistic

$\chi_0^2 = \frac{(n - 1)s^2}{\sigma_0^2}$

Use Table VII to determine the critical value using n − 1 degrees of freedom.

Classical Approach
Left-tailed: the critical value is χ²_{1−α}. Right-tailed: the critical value is χ²_α. Two-tailed: the critical values are χ²_{1−α/2} and χ²_{α/2}. (The subscript is the area to the right of the critical value.)

Testing a Hypothesis about a Population Standard Deviation
A can of soda-pop states that the can contains 355 ml (12 oz) of soda. The can-filling machine has a specified maximum standard deviation of 3.2 ml, but a quality control engineer suspects the machine is not calibrated correctly. She wants to verify the machine is not under- or over-filling the cans, so she randomly selects 9 cans of the soda and measures the contents. She obtains the following volume data:

351 360 358 356 359 358 355 361 352

Test her claim that the soda standard deviation, σ, is greater than 3.2 ml at the α = 0.05 level of significance.

Solution
Step 1: H0: σ = 3.2 versus H1: σ > 3.2 (Claim). This is a right-tailed test.
Step 2: The level of significance is α = 0.05.
Step 3: From the data, the sample standard deviation is computed to be s = 3.464. The standardized test statistic is

$\chi_0^2 = \frac{(9 - 1)(3.464)^2}{3.2^2} = 9.374$

Solution: Classical Approach
Step 4: Since this is a right-tailed test, we determine the critical value at the α = 0.05 level with 8 degrees of freedom to be χ²0.05 = 15.507.
Step 5: Since the STS, χ0² = 9.374, is less than the critical value, 15.507, we Fail to Reject H0.
Step 6: There is insufficient evidence at the α = 0.05 LOS to support the claim that the standard deviation of the soda can contents is greater than 3.2 ml.

Section 10.5 Which Method Do I Use?

Section 10.6 The Probability of a Type II Error and the Power of the Test

The Probability of a Type II Error
Step 1: Determine the sample mean that separates the rejection region from the non-rejection region.

Computing the Probability of a Type II Error
Earlier, we tested the hypothesis that the mean trade volume of Apple stock was greater than 35.14 million shares, H0: μ = 35.14 versus H1: μ > 35.14, based upon a random sample of size n = 40 with the population standard deviation, σ, assumed to be 15.07 million shares, at the α = 0.1 level of significance. Compute the probability of a Type II error given that the population mean is μ = 40.62.

Solution
Step 1: Since z0.10 = 1.28, we let z = 1.28, μ0 = 35.14, σ = 15.07, and n = 40 to find the sample mean that separates the rejection region from the non-rejection region:

$\frac{\bar{x} - 35.14}{15.07 / \sqrt{40}} = 1.28 \quad\Rightarrow\quad \bar{x} = 38.19$

For any sample mean less than 38.19, we do not reject the null hypothesis.

Step 3: P(Type II error) = β = P(do not reject H0 given H1 is true) = P(x̄ < 38.19 given that μ = 40.62)

$\beta = P\left(Z < \frac{38.19 - 40.62}{15.07 / \sqrt{40}}\right) = P(Z < -1.02) = 0.1539$

Power of the Test
The probability of rejecting the null hypothesis when the alternative hypothesis is true is 1 − β, which is referred to as the power of the test. The higher the power of the test, the more likely the test will reject the null when the alternative hypothesis is true.

Recall that β = 0.1539. The power of the test is 1 − β = 1 − 0.1539 = 0.8461.
There is an 84.61% chance of rejecting the null hypothesis when the true population mean is 40.62.
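The Type II error and power calculation for the Apple trade-volume example can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the variable names are assumptions. Because it carries unrounded values through each step, its results may differ from the hand calculation in the third or fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 35.14      # mean under H0 (million shares)
mu_true = 40.62  # assumed true mean under H1
sigma = 15.07    # population standard deviation (assumed known)
n = 40           # sample size
alpha = 0.10     # level of significance, right-tailed test

std_error = sigma / sqrt(n)

# Step 1: sample mean separating the rejection and non-rejection regions
z_crit = NormalDist().inv_cdf(1 - alpha)
x_bar_cutoff = mu0 + z_crit * std_error

# Step 3: beta = P(x_bar < cutoff) when the true mean is mu_true
beta = NormalDist(mu_true, std_error).cdf(x_bar_cutoff)
power = 1 - beta

print(f"cutoff sample mean = {x_bar_cutoff:.2f}")
print(f"beta (Type II error) = {beta:.4f}")
print(f"power = {power:.4f}")
```

Changing `mu_true`, `n`, or `alpha` in this sketch shows how the power responds: a true mean farther from 35.14, a larger sample, or a larger α each increase the power.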