Dr. Ka-fu Wong
ECON1003 Analysis of Economic Data
Ka-fu Wong © 2003

Central Limit Theorem #1
5 balls in the bag: 0 1 2 3 4
Draw a sample of 50 balls with replacement and compute the sample mean; repeat this 1000 times. Plot a relative frequency histogram (empirical probability histogram) of the 1000 sample means.
The Central Limit Theorem says:
1. The empirical histogram looks like a normal density.
2. Expected value (mean of the normal distribution) = 2.
3. Variance of the sample means = 2/50 = 0.04.

Confidence Interval #1
Five numbered balls in the bag: ? ? ? ? ?
Draw one sample of 50 balls with replacement. Compute the sample mean and sample standard deviation. Suppose the sample mean is 10 and the sample standard deviation is 0.04. Can you tell us the range of possible values the population mean may take, at the 95% confidence level?

Hypothesis testing #1
Five numbered balls in the bag: ? ? ? ? ?
Draw one sample of 50 balls with replacement. Compute the sample mean and sample standard deviation. Suppose the sample mean is 10 and the sample standard deviation is 0.04. Do you think the balls in this bag have a mean of 2?

Chapter Eight
One-Sample Tests of Hypothesis
GOALS
1. Define a hypothesis and hypothesis testing.
2. Describe the five step hypothesis testing procedure.
3. Distinguish between a one-tailed and a two-tailed test of hypothesis.
4. Conduct a test of hypothesis about a population mean.
5. Conduct a test of hypothesis about a population proportion.
6. Define Type I and Type II errors.
7. Compute the probability of a Type II error.

What is a Hypothesis?
A Hypothesis is a statement about the value of a population parameter developed for the purpose of testing.
Examples of hypotheses made about a population parameter are:
The mean monthly income for systems analysts is $3,625.
Twenty percent of all customers at Bovine’s Chop House return for another meal within a month.

What is Hypothesis Testing?
Hypothesis testing is a procedure, based on sample evidence and probability theory, used to determine whether the hypothesis is a reasonable statement and should not be rejected, or is unreasonable and should be rejected.

Hypothesis Testing
Step 1: State null and alternative hypothesis.
Step 2: Select a level of significance.
Step 3: Identify the test statistic.
Step 4: Formulate a decision rule.
Step 5: Take a sample, arrive at a decision: either do not reject the null, or reject the null and accept the alternative.

Definitions
Null Hypothesis H0: A statement about the value of a population parameter.
Alternative Hypothesis H1: A statement that is accepted if the sample data provide evidence that the null hypothesis is false.
Level of Significance: The probability of rejecting the null hypothesis when it is actually true.

Objectivity in formulating a hypothesis
In court, the defendant is presumed innocent until proven beyond reasonable doubt to be guilty of stated charges.
The “null hypothesis”, i.e. the denial of our theory, is presumed true until we prove beyond reasonable doubt that it is false.
“Beyond reasonable doubt” means that the probability of claiming that our theory is true when it is not (null hypothesis true) is less than an a priori set significance level (usually 5% or 1%).
Is the defendant guilty?
Null: the defendant is not guilty.
Alternative: the defendant is guilty.

Definitions
Type I Error: Conclude the defendant guilty when the defendant did not commit the crime. Level of significance is also the maximum probability of committing a Type I error. We want to limit this Type I Error to some small number.
Type II Error: Conclude the defendant not guilty when the defendant actually committed the crime.
                              Committed crime
Court decision                Yes                 No
Guilty                        Correct decision    Type I error
Not guilty                    Type II error       Correct decision

Definitions
Type I Error: Rejecting the null hypothesis when it is actually true. Level of significance is also the maximum probability of committing a Type I error. We want to limit this Type I Error to some small number.
Type II Error: Accepting the null hypothesis when it is actually false.

                              State of nature
Decision based on the
sample statistic              Null true           Null false
Don’t reject the null         Correct decision    Type II error
Reject the null               Type I error        Correct decision

Definitions
Test statistic: A value, determined from sample information, used to determine whether or not to reject the null hypothesis.
Critical value: The dividing point between the region where the null hypothesis is rejected and the region where it is not rejected.

One-Tailed Tests of Significance
A test is one-tailed when the alternate hypothesis, H1, states a direction, such as:
H1: The mean yearly commissions earned by full-time realtors is more than $35,000. (μ > $35,000)
H1: The mean speed of trucks traveling on I-95 in Georgia is less than 60 miles per hour. (μ < 60)
H1: Less than 20 percent of the customers pay cash for their gasoline purchase. (π < .20)

Sampling Distribution for the Statistic Z for a One Tailed Test, .05 Level of Significance
[Figure: standard normal curve with .95 probability to the left of the critical value z = 1.65 and .05 probability in the upper-tail rejection region.]
Reject the null if the test statistic falls into this region.

Two-Tailed Tests of Significance
A test is two-tailed when no direction is specified in the alternate hypothesis H1, such as:
H1: The mean amount spent by customers at the WalMart in Georgetown is not equal to $25. (μ ≠ $25)
H1: The mean price for a gallon of gasoline is not equal to $1.54. (μ ≠ $1.54)
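
The critical value z = 1.65 quoted for the one-tailed test at the .05 level can be recovered from the standard normal quantile function. The following is a minimal sketch, assuming Python's standard-library statistics.NormalDist; the function name critical_values is illustrative and does not come from the slides. It also computes the two-tailed counterpart, where the .05 is split evenly between the two tails.

```python
from statistics import NormalDist


def critical_values(alpha):
    """Return (one_tailed, two_tailed) upper critical z values for level alpha.

    Hypothetical helper for illustration; not part of the original slides.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # One-tailed: all of alpha sits in a single tail.
    one_tailed = std_normal.inv_cdf(1 - alpha)
    # Two-tailed: alpha is split evenly, alpha/2 in each tail.
    two_tailed = std_normal.inv_cdf(1 - alpha / 2)
    return one_tailed, two_tailed


one_tailed, two_tailed = critical_values(0.05)
print(f"one-tailed: reject if z > {one_tailed:.3f}")
print(f"two-tailed: reject if |z| > {two_tailed:.3f}")
```

The one-tailed value is slightly below the table-rounded 1.65 used in the slides; the difference only matters for test statistics that fall very close to the boundary.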

Sampling Distribution for the Statistic Z for a Two Tailed Test, .05 Level of Significance
[Figure: standard normal curve with .95 probability between the critical values z = -1.96 and z = 1.96, and .025 probability in each of the two tail rejection regions.]
Reject the null if the test statistic falls into these two regions.

Copyright © 2002 by The McGraw-Hill Companies, Inc. All rights reserved.

Testing for the Population Mean: Large Sample, Population Standard Deviation Known
When testing for the population mean from a large sample and the population standard deviation is known, the test statistic is given by:
\[ z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \]

EXAMPLE 1
The processors of Fries’ Catsup indicate on the label that the bottle contains 16 ounces of catsup. The standard deviation of the process is 0.5 ounces. A sample of 36 bottles from last hour’s production revealed a mean weight of 16.12 ounces per bottle. At the .05 significance level is the process out of control? That is, can we conclude that the mean amount per bottle is different from 16 ounces?

EXAMPLE 1 continued
Step 1: State the null and the alternative hypotheses: H0: μ = 16; H1: μ ≠ 16
Step 2: Select the level of significance. In this case we selected the .05 significance level.
Step 3: Identify the test statistic. Because we know the population standard deviation, the test statistic is z.

EXAMPLE 1 continued
Step 4: State the decision rule: Reject H0 if z > 1.96 or z < -1.96
Step 5: Compute the value of the test statistic and arrive at a decision.
\[ z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{16.12 - 16.00}{0.5 / \sqrt{36}} = 1.44 \]
Do not reject the null hypothesis. We cannot conclude the mean is different from 16 ounces.

p-Value in Hypothesis Testing
A p-Value is the probability, assuming that the null hypothesis is true, of finding a value of the test statistic at least as extreme as the computed value for the test.
The “critical probability” for our decision to reject the null.
If the p-Value is smaller than the significance level, H0 is rejected.
If the p-Value is larger than the significance level, H0 is not rejected.

Computation of the p-Value
One-Tailed Test: p-Value = P{z ≥ absolute value of the computed test statistic value}
Two-Tailed Test: p-Value = 2P{z ≥ absolute value of the computed test statistic value}
From EXAMPLE 1, z = 1.44, and because it was a two-tailed test, the p-Value = 2P{z ≥ 1.44} = 2(.5 - .4251) = .1498. Because .1498 > .05, do not reject H0.

Testing for the Population Mean: Large Sample, Population Standard Deviation Unknown
Here σ is unknown, so we estimate it with the sample standard deviation s.
As long as the sample size n ≥ 30, z can be approximated with:
\[ z = \frac{\bar{X} - \mu}{s / \sqrt{n}} \]

EXAMPLE 2
Roder’s Discount Store chain issues its own credit card. Lisa, the credit manager, wants to find out if the mean monthly unpaid balance is more than $400. The level of significance is set at .05. A random check of 172 unpaid balances revealed the sample mean to be $407 and the sample standard deviation to be $38. Should Lisa conclude that the population mean is greater than $400, or is it reasonable to assume that the difference of $7 ($407 - $400) is due to chance?

EXAMPLE 2 continued
Step 1: H0: μ ≤ $400, H1: μ > $400
Step 2: The significance level is .05
Step 3: Because the sample is large we can use the z distribution as the test statistic.
Step 4: H0 is rejected if z > 1.65
Step 5: Perform the calculations and make a decision.
\[ z = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{\$407 - \$400}{\$38 / \sqrt{172}} = 2.42 \]
H0 is rejected. Lisa can conclude that the mean unpaid balance is greater than $400.

Testing for a Population Mean: Small Sample, Population Standard Deviation Unknown
The test statistic is the t distribution.
The test statistic for the one sample case is given by:
\[ t = \frac{\bar{X} - \mu}{s / \sqrt{n}} \]

Example 3
The current rate for producing 5 amp fuses at Neary Electric Co. is 250 per hour. A new machine has been purchased and installed that, according to the supplier, will increase the production rate. A sample of 10 randomly selected hours from last month revealed the mean hourly production on the new machine was 256 units, with a sample standard deviation of 6 per hour. At the .05 significance level can Neary conclude that the new machine is faster?

Example 3 continued
Step 1: State the null and the alternate hypothesis. H0: μ ≤ 250; H1: μ > 250
Step 2: Select the level of significance. It is .05.
Step 3: Find a test statistic. It is the t distribution because the population standard deviation is not known and the sample size is less than 30.

Example 3 continued
Step 4: State the decision rule. There are 10 - 1 = 9 degrees of freedom. The null hypothesis is rejected if t > 1.833.
Step 5: Make a decision and interpret the results.
\[ t = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{256 - 250}{6 / \sqrt{10}} = 3.162 \]
The null hypothesis is rejected. The mean number produced is more than 250 per hour.

Tests Concerning Proportion
A Proportion is the fraction or percentage that indicates the part of the population or sample having a particular trait of interest.
The sample proportion is denoted by p and is found by:
\[ p = \frac{\text{Number of successes in the sample}}{\text{Number sampled}} \]

Test Statistic for Testing a Single Population Proportion
\[ z = \frac{p - \pi}{\sqrt{\pi (1 - \pi) / n}} \]
The sample proportion is p and π is the population proportion.

EXAMPLE 4
In the past, 15% of the mail order solicitations for a certain charity resulted in a financial contribution. A new solicitation letter that has been drafted is sent to a sample of 200 people and 45 responded with a contribution.
At the .05 significance level can it be concluded that the new letter is more effective?

Example 4 continued
Step 1: State the null and the alternate hypothesis. H0: π ≤ .15; H1: π > .15
Step 2: Select the level of significance. It is .05.
Step 3: Find a test statistic. The z distribution is the test statistic.

Example 4 continued
Step 4: State the decision rule. The null hypothesis is rejected if z is greater than 1.65.
Step 5: Make a decision and interpret the results.
\[ z = \frac{p - \pi}{\sqrt{\pi (1 - \pi) / n}} = \frac{\frac{45}{200} - .15}{\sqrt{.15 (1 - .15) / 200}} = 2.97 \]
The null hypothesis is rejected. More than 15 percent are responding with a pledge. The new letter is more effective.

Chapter Eight
One-Sample Tests of Hypothesis
- END -
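
The test statistics in Examples 1 through 4 can be re-derived numerically as a check on the arithmetic. The following is a minimal sketch, assuming only Python's standard library; the helper names mean_statistic and proportion_statistic are illustrative and do not come from the slides. The critical values 1.96, 1.65 and 1.833 are copied from the decision rules stated in the examples rather than computed, since the standard library has no t distribution.

```python
from math import sqrt
from statistics import NormalDist


def mean_statistic(sample_mean, hypothesized_mean, std_dev, n):
    """(sample mean - hypothesized mean) / (std_dev / sqrt(n)).

    The same ratio serves as z (Examples 1 and 2) and as t (Example 3);
    only the reference distribution differs.
    """
    return (sample_mean - hypothesized_mean) / (std_dev / sqrt(n))


def proportion_statistic(successes, n, hypothesized_pi):
    """z statistic for a single population proportion (Example 4)."""
    p = successes / n  # sample proportion
    return (p - hypothesized_pi) / sqrt(hypothesized_pi * (1 - hypothesized_pi) / n)


# Example 1: two-tailed z test, population standard deviation known.
z1 = mean_statistic(16.12, 16.00, 0.5, 36)
p1 = 2 * (1 - NormalDist().cdf(abs(z1)))  # two-tailed p-value
# Example 2: one-tailed z test, large sample, s used in place of sigma.
z2 = mean_statistic(407, 400, 38, 172)
# Example 3: one-tailed t test with 9 degrees of freedom.
t3 = mean_statistic(256, 250, 6, 10)
# Example 4: one-tailed z test for a proportion.
z4 = proportion_statistic(45, 200, 0.15)

print(f"Example 1: z = {z1:.2f}, p-value = {p1:.4f}, reject H0: {abs(z1) > 1.96}")
print(f"Example 2: z = {z2:.2f}, reject H0: {z2 > 1.65}")
print(f"Example 3: t = {t3:.3f}, reject H0: {t3 > 1.833}")
print(f"Example 4: z = {z4:.2f}, reject H0: {z4 > 1.65}")
```

The computed p-value for Example 1 may differ from the slide's .1498 in the last digit, because the slide reads the normal area from a four-decimal table.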