Statistical Inference: Hypothesis Testing for Single Populations
Sourabh

What is a Hypothesis?
A hypothesis is a claim (assumption) about a population parameter:
- population mean. Example: The mean monthly cell phone bill of this city is μ = Rs 429
- population proportion. Example: The proportion of adults in this city with cell phones is p = 0.68

What is a Hypothesis? (continued)
A hypothesis is an assumption about the population parameter.
- A parameter is a characteristic of the population, like its mean, proportion, or variance.
- The parameter must be identified before analysis.
Example: "I assume the mean GPA of this class is 3.5!"

Types of Hypotheses
- Research Hypothesis: a statement of what the researcher believes will be the outcome of an experiment or a study.
- Statistical Hypotheses: a more formal structure derived from the research hypothesis.

Example Research Hypotheses
- Older workers are more loyal to a company.
- Companies with more than $1 billion of assets spend a higher percentage of their annual budget on advertising than do companies with less than $1 billion of assets.
- The price of scrap metal is a good indicator of the industrial production index six months later.

Statistical Hypotheses
Two parts:
- Null Hypothesis: nothing new is happening
- Alternative Hypothesis: something new is happening
Notation:
- null: H0
- alternative: H1 or Ha

Null and Alternative Hypotheses
- The Null and Alternative Hypotheses are mutually exclusive. Only one of them can be true.
- The Null and Alternative Hypotheses are collectively exhaustive. They are stated to include all possibilities. (An abbreviated form of the null hypothesis is often used.)
- The Null Hypothesis is assumed to be true.
- The burden of proof falls on the Alternative Hypothesis.

The Null Hypothesis, H0
- States the claim or assertion to be tested. Example: The average number of TV sets in U.S. homes is equal to three (H0: μ = 3)
- Is always about a population parameter, not about a sample statistic: write H0: μ = 3, not H0: X̄ = 3

Null and Alternative Hypotheses: Example
A manufacturer is filling 40 oz. packages with flour. The company wants the package contents to average 40 ounces.
H0: μ = 40 oz
Ha: μ ≠ 40 oz

Hypothesis Testing Process
- Claim: the population mean age is 50. (Null Hypothesis: H0: μ = 50)
- Now select a random sample from the population.
- Suppose the sample mean age is 20: X̄ = 20
- Is X̄ = 20 likely if μ = 50? If not likely, REJECT the Null Hypothesis.

Reason for Rejecting H0
[Figure: sampling distribution of X̄, centered at μ = 50 if H0 is true, with the observed X̄ = 20 far out in the lower tail]
If it is unlikely that we would get a sample mean of this value (X̄ = 20) if in fact μ = 50 were the population mean, then we reject the null hypothesis that μ = 50.

Level of Significance, α
- Defines the unlikely values of the sample statistic if the null hypothesis is true
- Defines the rejection region of the sampling distribution
- Is designated by α (level of significance)
- Typical values are 0.01, 0.05, or 0.10
- Is selected by the researcher at the beginning
- Provides the critical value(s) of the test

Level of Significance and the Rejection Region
Level of significance = α. In each case the rejection region is the shaded tail area, and its boundary is the critical value.
- Two-tail test: H0: μ = 3, H1: μ ≠ 3. Rejection area α/2 in each tail.
- Lower-tail test: H0: μ ≥ 3, H1: μ < 3. Rejection area α in the lower tail.
- Upper-tail test: H0: μ ≤ 3, H1: μ > 3. Rejection area α in the upper tail.

Errors in Making Decisions
Type I Error
- Reject a true null hypothesis
- Considered a serious type of error
- The probability of Type I Error is α
- Called the level of significance of the test
- Set by the researcher in advance

Errors in Making Decisions (continued)
Type II Error
- Fail to reject a false null hypothesis
- The probability of Type II Error is β

Outcomes and Probabilities
Possible Hypothesis Test Outcomes. Key: Outcome (Probability)

Decision            Actual situation: H0 True    Actual situation: H0 False
Do Not Reject H0    No error (1 - α)             Type II Error (β)
Reject H0           Type I Error (α)             No Error (1 - β)

Type I & II Error Relationship
- Type I and Type II errors cannot happen at the same time
- Type I error can only occur if H0 is true
- Type II error can only occur if H0 is false
- If Type I error probability (α) increases, then Type II error probability (β) decreases

Hypothesis Tests for the Mean
- σ Known: Z test
- σ Unknown: t test

Z Test of Hypothesis for the Mean (σ Known)
Convert the sample statistic (X̄) to a Z test statistic. The test statistic is:
\[ Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \]

Critical Value Approach to Testing
For a two-tail test for the mean, σ known:
- Convert the sample statistic (X̄) to a test statistic (Z statistic)
- Determine the critical Z values for a specified level of significance α from a table or computer
- Decision Rule: If the test statistic falls in the rejection region, reject H0; otherwise do not reject H0

Two-Tail Tests
There are two cutoff values (critical values), defining the regions of rejection.
H0: μ = 3
H1: μ ≠ 3
[Figure: standard normal curve with area α/2 shaded in each tail] Reject H0 if Z falls below the lower critical value -Zα/2 or above the upper critical value +Zα/2; do not reject H0 if Z lies between them.

6 Steps in Hypothesis Testing
1. State the null hypothesis, H0, and the alternative hypothesis, H1
2. Choose the level of significance, α, and the sample size, n
3. Determine the appropriate test statistic and sampling distribution
4. Determine the critical values that divide the rejection and nonrejection regions
5. Collect data and compute the value of the test statistic
6. Make the statistical decision and state the managerial conclusion. If the test statistic falls into the nonrejection region, do not reject the null hypothesis H0. If the test statistic falls into the rejection region, reject the null hypothesis.
Express the managerial conclusion in the context of the problem.

Hypothesis Testing Example
Test the claim that the true mean number of TV sets in US homes is equal to 3. (Assume σ = 0.8)
1. State the appropriate null and alternative hypotheses
- H0: μ = 3, H1: μ ≠ 3 (This is a two-tail test)
2. Specify the desired level of significance and the sample size
- Suppose that α = 0.05 and n = 100 are chosen for this test
3. Determine the appropriate technique
- σ is known, so this is a Z test.
4. Determine the critical values
- For α = 0.05 the critical Z values are ±1.96
5. Collect the data and compute the test statistic
- Suppose the sample results are n = 100, X̄ = 2.84 (σ = 0.8 is assumed known). So the test statistic is:
\[ Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{2.84 - 3}{0.8 / \sqrt{100}} = \frac{-0.16}{0.08} = -2.0 \]
6. Is the test statistic in the rejection region?
- With α/2 = 0.025 in each tail, reject H0 if Z < -1.96 or Z > 1.96; otherwise do not reject H0
- Here, Z = -2.0 < -1.96, so the test statistic is in the rejection region
6 (continued).
Reach a decision and interpret the result: with α/2 = 0.025 in each tail, the critical values are -1.96 and +1.96. Since Z = -2.0 < -1.96, we reject the null hypothesis and conclude that there is sufficient evidence that the mean number of TVs in US homes is not equal to 3.

Connection to Confidence Intervals
For X̄ = 2.84, σ = 0.8 and n = 100, the 95% confidence interval is:
\[ 2.84 - (1.96)\frac{0.8}{\sqrt{100}} \quad \text{to} \quad 2.84 + (1.96)\frac{0.8}{\sqrt{100}} \]
2.6832 ≤ μ ≤ 2.9968
Since this interval does not contain the hypothesized mean (3.0), we reject the null hypothesis at α = 0.05.

One-Tail Tests
In many cases, the alternative hypothesis focuses on a particular direction.
- H0: μ ≥ 3, H1: μ < 3. This is a lower-tail test since the alternative hypothesis is focused on the lower tail below the mean of 3.
- H0: μ ≤ 3, H1: μ > 3. This is an upper-tail test since the alternative hypothesis is focused on the upper tail above the mean of 3.

Lower-Tail Tests
H0: μ ≥ 3
H1: μ < 3
There is only one critical value, since the rejection area is in only one tail. Reject H0 if Z < -Zα; otherwise do not reject H0.

Upper-Tail Tests
H0: μ ≤ 3
H1: μ > 3
There is only one critical value, since the rejection area is in only one tail. Reject H0 if Z > Zα; otherwise do not reject H0.

Example: Upper-Tail Z Test for Mean (σ Known)
A phone industry manager thinks that customer monthly cell phone bills have increased, and now average over $52 per month. The company wishes to test this claim. (Assume σ = 10 is known)
Form hypothesis test:
H0: μ ≤ 52 (the average is not over $52 per month)
H1: μ > 52 (the average is greater than $52 per month, i.e., sufficient evidence exists to support the manager's claim)

Example: Find Rejection Region (continued)
Suppose that α = 0.10 is chosen for this test.
Find the rejection region: Reject H0 if Z > 1.28; otherwise do not reject H0.

Review: One-Tail Critical Value
What is Z given α = 0.10?
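The critical value asked for here can also be computed rather than read from a printed table. The following is a minimal sketch, assuming Python 3.8 or later; it uses the standard library's `statistics.NormalDist`, and the helper name `z_critical` and its `tail` argument are illustrative choices, not part of the original slides.

```python
from statistics import NormalDist


def z_critical(alpha, tail="upper"):
    """Critical Z value for a given level of significance (hypothetical helper).

    tail="upper": rejection area alpha in the upper tail
    tail="lower": rejection area alpha in the lower tail
    tail="two":   rejection area alpha/2 in each tail (returns the positive value)
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "upper":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "lower":
        return std_normal.inv_cdf(alpha)
    if tail == "two":
        return std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'upper', 'lower', or 'two'")


print(round(z_critical(0.10, "upper"), 2))  # upper-tail test at alpha = 0.10
print(round(z_critical(0.05, "two"), 2))    # two-tail test at alpha = 0.05
```

Rounded to two decimals, these calls give 1.28 and 1.96, the same critical values the slides obtain from the table.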
Standardized Normal Distribution Table (Portion)
An upper-tail area of α = 0.10 leaves a cumulative area of 0.90 below the critical value.

Z      .07     .08     .09
1.1    .8790   .8810   .8830
1.2    .8980   .8997   .9015
1.3    .9147   .9162   .9177

The table entry closest to 0.90 is .8997, in row 1.2 and column .08, so Critical Value = 1.28.

Example: Test Statistic (continued)
Obtain a sample and compute the test statistic. Suppose a sample is taken with the following results: n = 64, X̄ = 53.1 (σ = 10 was assumed known)
- Then the test statistic is:
\[ Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{53.1 - 52}{10 / \sqrt{64}} = 0.88 \]

Example: Decision (continued)
Reach a decision and interpret the result: with α = 0.10, reject H0 if Z > 1.28.
Do not reject H0 since Z = 0.88 ≤ 1.28, i.e., there is not sufficient evidence that the mean bill is over $52.

Problem:
A survey of CPAs across the United States found that the average net income for sole proprietor CPAs is $74,914. Because this survey is now more than seven years old, an accounting researcher wants to test the figure by taking a random sample of 112 sole proprietor accountants in the United States, whose average net income is $78,695, to determine whether the net income figure has changed. The researcher could use the steps of hypothesis testing to do so. Assume the population standard deviation of net incomes for sole proprietor CPAs is $14,530.

Solution
HYPOTHESIS: Because the researcher is testing to determine whether the figure has changed, the alternative hypothesis is that the average net income is not $74,914. The null hypothesis is that the mean still equals $74,914. These hypotheses follow:
H0: μ = $74,914
H1: μ ≠ $74,914

Test: Determine the appropriate statistical test and sampling distribution. Because the sample size is large (n = 112) and the researcher is using the sample mean as the statistic, the Z formula is the appropriate test statistic, i.e.,
\[ z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \]

Critical Region: Specify the Type I error rate, or alpha (α), which is 0.05 in this problem.
Because the test is two-tailed and α = 0.05, there is α/2 or 0.025 area in each of the tails of the distribution. Thus, the rejection region is in the two ends of the distribution with 2.5% of the area in each. There is a 0.475 area between the mean and each of the critical values that separate the tails of the distribution (the rejection region) from the nonrejection region. By using this 0.475 area and the Standard Normal table, the critical value of z can be obtained: Zα/2 = ±1.96

The 112 CPAs who respond produce a sample mean of $78,695. The value of the test statistic is calculated by using X̄ = $78,695, n = 112, σ = $14,530, and μ = $74,914:
\[ z = \frac{78{,}695 - 74{,}914}{14{,}530 / \sqrt{112}} = 2.75 \]
Because z = 2.75 is greater than the upper critical value of 1.96, the test statistic falls in the rejection region (area .025 in each tail) and the null hypothesis is rejected.

BUSINESS IMPLICATION: What does this result mean? Statistically, the researcher has enough evidence to reject the figure of $74,914 as the true national average net income for sole proprietor CPAs. Although the researcher conducted a two-tailed test, the evidence gathered indicates that the national average may have increased. The sample mean of $78,695 is $3,781 higher than the national mean being tested. The researcher can conclude that the national average is more than before, but because the $78,695 is only a sample mean, it offers no guarantee that the national average for all sole proprietor CPAs is $3,781 more. If a confidence interval were constructed with the sample data, $78,695 would be the point estimate. Other samples might produce different sample means. Managerially, this statistical finding may mean that CPAs will be more expensive to hire either as full-time employees or as consultants. It may mean that consulting services have gone up in price. For new accountants, it may mean the potential for greater earning power. If the sample mean of $78,695 is the actual new population average for the year 2002, it would represent an increase of $3,781 over a seven-year period.
This increase may or may not be substantive depending on one's point of view.

Problem (B)
In the CPA net income example, suppose only 600 sole proprietor CPAs practice in the United States. A sample of 112 CPAs taken from a population of only 600 CPAs is 18.67% of the population and therefore is much more likely to be representative of the population than a sample of 112 CPAs taken from a population of 20,000 CPAs (0.56% of the population). The finite correction factor takes this difference into consideration and allows for an increase in the observed value of z. The observed z value would change to
\[ z = \frac{\bar{X} - \mu}{\dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}} = \frac{78{,}695 - 74{,}914}{\dfrac{14{,}530}{\sqrt{112}} \sqrt{\dfrac{600 - 112}{600 - 1}}} = \frac{3{,}781}{1{,}239.2} = 3.05 \]
Use of the finite correction factor increased the observed z value from 2.75 to 3.05. The decision to reject the null hypothesis does not change with this new information. However, on occasion, the finite correction factor can make the difference between rejecting and failing to reject the null hypothesis.

Question:
Generally Electric has developed a new bulb whose design specifications call for a light output of 960 lumens compared to an earlier model that produced only 750 lumens. The company's data indicate that the standard deviation of light output for this type of bulb is 18.4 lumens. From a sample of 20 bulbs, the testing committee found an average light output of 954 lumens per bulb. At a 0.05 significance level, can Generally Electric conclude that its new bulb is producing the specified 960 lumen output?

Question:
Maxwell's Hot Chocolate is concerned about the effect of the recent yearlong coffee advertising campaign on hot chocolate sales. The average weekly hot chocolate sales two years ago was 984.7 pounds and the standard deviation was 72.6 pounds. Maxwell's has randomly selected 30 weeks from the past year and found average sales of 912.1 pounds.
(a) State appropriate hypotheses for testing whether hot chocolate sales have decreased.
(b) At the 2 percent significance level, test this hypothesis.

Question:
Atlas Sporting Goods has implemented a special trade promotion for its propane stove and feels that the promotion should result in a price change for the consumer. Atlas knows that before the promotion began, the average retail price of the stove was $44.95 and the standard deviation was $5.75. Atlas samples 25 of its retailers after the promotion begins and finds the average price for the stove is now $42.95. At a 0.02 significance level, does Atlas have reason to believe that the average retail price to the consumer has decreased?

t Test of Hypothesis for the Mean (σ Unknown)
Convert the sample statistic (X̄) to a t test statistic. When σ is known, a Z test is used; when σ is unknown, a t test is used. The test statistic, with n - 1 degrees of freedom, is:
\[ t_{n-1} = \frac{\bar{X} - \mu}{S / \sqrt{n}} \]

Example: Two-Tail Test (σ Unknown)
The average cost of a hotel room in New York is said to be $168 per night. A random sample of 25 hotels resulted in X̄ = $172.50 and S = $15.40. Test at the α = 0.05 level.
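The t statistic defined above can be sketched in a few lines of code. This is a minimal illustration, assuming only the Python standard library: the function names `t_statistic` and `two_tail_decision` are hypothetical, and because the standard library has no t distribution, the critical value is passed in as an argument taken from a t table (2.0639 for 24 degrees of freedom and α/2 = 0.025).

```python
import math


def t_statistic(sample_mean, hypothesized_mean, sample_sd, n):
    """t = (X-bar - mu) / (S / sqrt(n)), with n - 1 degrees of freedom."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


def two_tail_decision(t_value, t_critical):
    """Apply the two-tail decision rule for a positive critical value from a t table."""
    if abs(t_value) > t_critical:
        return "reject H0"
    return "do not reject H0"


# Hotel example: X-bar = 172.50, S = 15.40, n = 25, hypothesized mean 168.
t_value = t_statistic(172.50, 168, 15.40, 25)
print(round(t_value, 2), two_tail_decision(t_value, 2.0639))
```

Supplying the critical value by hand keeps the sketch dependency-free; a statistics package with a t distribution could compute it instead.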
H0: μ = 168
H1: μ ≠ 168
(Assume the population distribution is normal)

Example Solution: Two-Tail Test
H0: μ = 168
H1: μ ≠ 168
α = 0.05, n = 25
σ is unknown, so use a t statistic.
Critical Value: t24 = ±2.0639 (area α/2 = .025 in each tail)
\[ t_{n-1} = \frac{\bar{X} - \mu}{S / \sqrt{n}} = \frac{172.50 - 168}{15.40 / \sqrt{25}} = 1.46 \]
Since -2.0639 < 1.46 < 2.0639, do not reject H0: there is not sufficient evidence that the true mean cost is different from $168.

Connection to Confidence Intervals
For X̄ = 172.5, S = 15.40 and n = 25, the 95% confidence interval is:
\[ 172.5 - (2.0639)\frac{15.4}{\sqrt{25}} \quad \text{to} \quad 172.5 + (2.0639)\frac{15.4}{\sqrt{25}} \]
166.14 ≤ μ ≤ 178.86
Since this interval contains the hypothesized mean (168), we do not reject the null hypothesis at α = 0.05.

Question:
Given a sample mean of 94.3, a sample standard deviation of 8.4, and a sample size of 6, test the hypothesis that the value of the population mean is 100 against the alternative hypothesis that it is less than 100. Use the 0.05 significance level.

Question:
The data-processing department at a large life insurance company has installed new color video display terminals to replace the monochrome units it previously used. The 95 operators trained to use the new machines averaged 7.2 hours before achieving a satisfactory level of performance. Their sample variance was 16.2 squared hours. Long experience with operators on the old monochrome terminals showed that they averaged 8.1 hours on the machines before their performances were satisfactory. At the 0.01 significance level, should the supervisor of the department conclude that the new terminals are easier to learn to operate?

Question:
As the bottom fell out of the oil market in early 1986, educators in Texas worried about how the resulting loss of state revenues (estimated to be about $100 million for each $1 decrease in the price of a barrel of oil) would affect their budgets.
The state board of education felt the situation would not be critical as long as they could be reasonably certain that the price would stay above $18 per barrel. They surveyed 13 randomly chosen oil economists and asked them to predict how low the price would go before it bottomed out. The 13 predictions averaged $21.60, and the sample standard deviation was $4.65. At α = 0.01, is the average prediction significantly higher than $18.00? Should the board conclude that a budget crisis is unlikely? Explain.

Hypothesis Tests for Proportions
- Involves categorical variables
- Two possible outcomes
  - "Success" (possesses a certain characteristic)
  - "Failure" (does not possess that characteristic)
- Fraction or proportion of the population in the "success" category is denoted by p

z Test of Population Proportion
\[ z = \frac{\hat{p} - p}{\sqrt{\dfrac{p\,q}{n}}} \]
where:
p̂ = sample proportion
p = population proportion
q = 1 - p
The test requires n·p ≥ 5 and n·q ≥ 5.

Proportions (continued)
Sample proportion in the success category is denoted by p̂:
\[ \hat{p} = \frac{X}{n} = \frac{\text{number of successes in sample}}{\text{sample size}} \]
When both np and n(1 - p) are at least 5, the distribution of p̂ can be approximated by a normal distribution with mean and standard deviation
\[ \mu_{\hat{p}} = p, \qquad \sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}} \]

Example: Z Test for Proportion
A marketing company claims that it receives 8% responses from its mailing. To test this claim, a random sample of 500 were surveyed with 25 responses. Test at the α = 0.05 significance level.
Check: np = (500)(.08) = 40 and n(1 - p) = (500)(.92) = 460, so both are at least 5.

Z Test for Proportion: Solution
H0: p = 0.08
H1: p ≠ 0.08
α = 0.05, n = 500, p̂ = 25/500 = 0.05
Critical Values: ±1.96 (area .025 in each tail)
Test Statistic:
\[ Z = \frac{\hat{p} - p}{\sqrt{\dfrac{p(1 - p)}{n}}} = \frac{.05 - .08}{\sqrt{\dfrac{.08(1 - .08)}{500}}} = -2.47 \]
Decision: Reject H0 at α = 0.05, since Z = -2.47 < -1.96.
Conclusion: There is sufficient evidence to reject the company's claim of an 8% response rate.

Problem:
Feronetics specializes in the use of gene-splicing techniques to produce new pharmaceutical compounds.
It has recently developed a nasal spray containing interferon, which it believes will limit the transmission of the common cold within families. In the general population, 15.1 percent of all individuals will catch a rhinovirus-caused cold once another family member contracts such a cold. The interferon spray was tested on 180 people, one of whose family members subsequently contracted a rhinovirus-caused cold. Only 17 of the test subjects developed similar colds.
1. At a significance level of 0.05, should Feronetics conclude that the new spray effectively reduces transmission of colds?
2. What should it conclude at α = 0.02?

Problem:
Rick Douglas, the new manager of Food Barn, is interested in the percentage of customers who are totally satisfied with the store. The previous manager had 86 percent of the customers totally satisfied, and Rick claims the same is true today. Rick sampled 187 customers and found 157 were totally satisfied. At the 1 percent significance level, is there evidence that Rick's claim is valid?

Problem:
From a total of 10,200 loans made by a state employees' credit union in the most recent 5-year period, 350 were sampled to determine what proportion was made to women. This sample showed that 39 percent of the loans were made to women employees. A complete census of loans 5 years ago showed that 41 percent of the borrowers then were women. At a significance level of 0.02, can you conclude that the proportion of loans made to women has changed significantly in the past 5 years?

Problem:
A television documentary on overeating claimed that Americans are about 10 pounds overweight on average. To test this claim, eighteen randomly selected individuals were examined; their average excess weight was found to be 12.4 pounds, and the sample standard deviation was 2.7 pounds. At a significance level of 0.01, is there any reason to doubt the validity of the claimed 10-pound value?
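The worked Z-test examples in this deck (the TV-set mean, the CPA net income with and without the finite correction factor, and the mailing response proportion) can be re-checked with a short script. What follows is a sketch rather than a definitive implementation: it assumes Python 3.8 or later and the standard library only, and the function names and the optional `population_size` argument are hypothetical choices made for illustration.

```python
import math
from statistics import NormalDist


def z_for_mean(sample_mean, mu0, sigma, n, population_size=None):
    """Z statistic for a mean when sigma is known.

    If population_size (N) is given, the standard error is multiplied by
    the finite correction factor sqrt((N - n) / (N - 1)).
    """
    standard_error = sigma / math.sqrt(n)
    if population_size is not None:
        standard_error *= math.sqrt((population_size - n) / (population_size - 1))
    return (sample_mean - mu0) / standard_error


def z_for_proportion(successes, n, p0):
    """Z statistic for a proportion, using the normal approximation."""
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("normal approximation needs n*p0 >= 5 and n*(1 - p0) >= 5")
    p_hat = successes / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)


def two_tail_reject(z, alpha):
    """True if z falls in the two-tail rejection region at level alpha."""
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical


examples = [
    ("TV sets", z_for_mean(2.84, 3, 0.8, 100)),
    ("CPA income", z_for_mean(78695, 74914, 14530, 112)),
    ("CPA income, N = 600", z_for_mean(78695, 74914, 14530, 112, population_size=600)),
    ("Mailing responses", z_for_proportion(25, 500, 0.08)),
]
for label, z in examples:
    print(f"{label}: Z = {z:.2f}, reject H0 at 0.05: {two_tail_reject(z, 0.05)}")
```

Run as written, the four Z values print as -2.00, 2.75, 3.05 and -2.47, matching the slides, and each falls in the two-tail rejection region at α = 0.05. The same helpers could serve as a starting point for the two-tail practice problems above; the one-tail problems need a one-sided critical value instead.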