Chapter 10 Hypothesis Testing Using a Single Sample

Section 10.1 Hypotheses & Test Procedures

Hypothesis
In statistics, a hypothesis is a statement about a population characteristic.

Sharing prescription drugs with others can be dangerous. Is this a common occurrence among teens?
OR
The National Association of Colleges and Employers stated that the average starting salary for students graduating with a bachelor’s degree in 2010 is $48,351. Is this true for your college?
How do we answer questions like these using sample data?

In Chapter 9, we used sample data to estimate the value of an unknown population characteristic. In this chapter, we will use sample data to test some claim or hypothesis about the population characteristic to see if it is plausible. To do this, we use a test of hypotheses or test procedure.

What is a test of hypotheses?
A test of hypotheses is a method that uses sample data to decide between two competing claims (hypotheses) about the population characteristic.
Is the value of the sample statistic . . .
• a random occurrence due to natural variation? (Is it one of the values of the sample statistic that are likely to occur?)
OR
• a value that would be considered surprising? (Is it one that isn’t likely to occur?)

FORMAL STRUCTURE
Hypothesis tests are based on a reductio ad absurdum form of argument. Specifically, we make an assumption and then attempt to show that the assumption leads to an absurdity or contradiction, hence the assumption is wrong.

Hypothesis statements:
The null hypothesis, denoted by H0, is a claim about a population characteristic that is initially assumed to be true. (You are usually trying to determine if this claim is believable.)
The alternative hypothesis, denoted by Ha, is the competing claim.
The hypothesis statements are ALWAYS about the population, NEVER about a sample!
To determine what the alternative hypothesis should be, you need to keep the research objectives in mind.
Formal Structure
Rejecting the null hypothesis amounts to accepting the alternative hypothesis: assume H0 is true and attempt to show that this leads to an absurdity, so that H0 is judged false and Ha is supported.

Typically one assumes the null hypothesis to be true, and then one of the following conclusions is drawn.
1. Reject H0: the data provide convincing evidence that Ha is correct.
2. Fail to reject H0: we have failed to show a statistically significant deviation from the claim of the null hypothesis. This is not the same as saying that the null hypothesis is true.

Let’s consider a murder trial . . .
What is the null hypothesis? H0: the defendant is innocent. (This is what you assume is true before you begin.)
What is the alternative hypothesis? Ha: the defendant is guilty. (You are trying to determine if the evidence supports this claim.)
To determine which hypothesis is correct, the jury will listen to the evidence. Only if there is “evidence beyond a reasonable doubt” would the null hypothesis be rejected in favor of the alternative hypothesis. If there is not convincing evidence, then we would “fail to reject” the null hypothesis.
So we will make one of two decisions:
• Reject the null hypothesis
• Fail to reject the null hypothesis
Remember that the actual verdict that is returned is “GUILTY” or “NOT GUILTY”. We never end up determining that the null hypothesis is true, only that there is not enough evidence to say it’s not true.

The Form of Hypotheses:
Null hypothesis
H0: population characteristic = hypothesized value
The null hypothesis always includes the equal case.
Alternative hypothesis (one of three forms; the form that uses ≠ is considered a two-tailed test because you are interested in both directions):
Ha: population characteristic > hypothesized value
Ha: population characteristic < hypothesized value
Ha: population characteristic ≠ hypothesized value
The hypothesized value is a specific number determined by the context of the problem. The sign in the alternative is also determined by the context of the problem. The forms that use > or < are considered one-tailed tests because you are only interested in one direction.
Notice that the alternative hypothesis uses the same population characteristic and the same hypothesized value as the null hypothesis.

Let’s practice writing hypothesis statements.

Sharing prescription drugs with others can be dangerous. A survey of a representative sample of 592 U.S. teens ages 12 to 17 reported that 118 of those surveyed admitted to having shared a prescription drug with a friend. Is this sufficient evidence that more than 10% of teens have shared prescription medication with friends?
What is the population characteristic of interest? The true proportion p of teens who have shared prescription medication with friends.
What is the hypothesized value? What words indicate the direction of the alternative hypothesis?
State the hypotheses:
H0: p = .1
Ha: p > .1

Compact fluorescent (cfl) lightbulbs are much more energy efficient than regular incandescent lightbulbs. Ecobulb brand 60-watt cfl lightbulbs state on the package “Average life 8000 hours”. People who purchase this brand would be unhappy if the bulbs lasted less than 8000 hours. A sample of these bulbs will be selected and tested.
What is the population characteristic of interest? The true mean (μ) life of the cfl lightbulbs.
What is the hypothesized value? What words indicate the direction of the alternative hypothesis?
State the hypotheses:
H0: μ = 8000
Ha: μ < 8000

Because of variation in the manufacturing process, tennis balls produced by a particular machine do not have the same diameters. Suppose the machine was initially calibrated to achieve the specification of μ = 3 inches.
However, the manager is now concerned that the diameters no longer conform to this specification. If the mean diameter is not 3 inches, production will have to be halted.
What is the population characteristic of interest? The true mean diameter μ of the tennis balls.
What words indicate the direction of the alternative hypothesis?
State the hypotheses:
H0: μ = 3
Ha: μ ≠ 3

Peaches Example
Do the “16 ounce” cans of peaches canned and sold by DelMonte meet the claim on the label (on the average)? Notice that the real concern would be selling the consumer less than 16 ounces of peaches.
H0: μ = 16
Ha: μ < 16

Light Bulb Example
Do two brands of light bulbs have the same mean lifetime?
H0: μ(Brand A) = μ(Brand B)
Ha: μ(Brand A) ≠ μ(Brand B)

Milling Example
Do parts produced by two different milling machines have the same variability in diameters?
H0: σ1 = σ2
Ha: σ1 ≠ σ2
or equivalently
H0: σ1² = σ2²
Ha: σ1² ≠ σ2²

Caution
When you set up a hypothesis test, the result is either
1. Strong support for the alternative hypothesis (if the null hypothesis is rejected), or
2. Insufficient evidence to refute the claim of the null hypothesis (you are stuck with it, because there is a lack of strong evidence against the null hypothesis).

For each pair of hypotheses, indicate which are not legitimate and explain why.
a) H0: μ = 15, Ha: μ ≥ 15. Not legitimate: the alternative must be only greater than!
b) (The hypotheses for this part, stated in terms of x̄, did not survive in this copy.) Not legitimate: hypotheses must use a population characteristic, and x̄ is a sample statistic.
c) H0: p = 0.1, Ha: p ≠ 0.1. Legitimate.
d) H0: μ = 2.3, Ha: μ > 3.2. Not legitimate: Ha must use the same number as H0!
e) H0: p ≠ 0.5, Ha: p = 0.5. Not legitimate: H0 MUST be “=”.

Section 10.2 Errors in Hypothesis Testing

When you perform a hypothesis test you make a decision: reject H0 or fail to reject H0.
When you make one of these decisions, there is a possibility that you could be wrong, that you made an error! Each could possibly be a wrong decision; therefore, there are two types of errors.

Type I error
• The error of rejecting H0 when H0 is true
• The probability of a Type I error is denoted by α.
α is called the significance level of the test. (α is the lower-case Greek letter “alpha”.) Thus, a test with α = 0.01 is said to have a level of significance of 0.01 or to be a level 0.01 test.

Type II error
• The error of failing to reject H0 when H0 is false
• The probability of a Type II error is denoted by β. (This is the lower-case Greek letter “beta”.)

Here is another way to look at the types of errors. For each combination, ask what type of decision was made: H0 is true and we reject it; H0 is true and we fail to reject it; H0 is false and we reject it; H0 is false and we fail to reject it.

                      H0 is true      H0 is false
Reject H0             Type I error    Correct
Fail to reject H0     Correct         Type II error

Error Analogy
Consider a medical test where the hypotheses are equivalent to
H0: the patient has a specific disease
Ha: the patient doesn’t have the disease
Then,
Type I error is equivalent to a false negative (i.e., saying the patient does not have the disease when, in fact, he does).
Type II error is equivalent to a false positive (i.e., saying the patient has the disease when, in fact, he does not).

The U.S. Bureau of Transportation Statistics reports that for 2009, 72% of all domestic passenger flights arrived on time (meaning within 15 minutes of the scheduled arrival time). Suppose that an airline with a poor on-time record decides to offer its employees a bonus if, in an upcoming month, the airline’s proportion of on-time flights exceeds the overall 2009 industry rate of .72.
State the hypotheses.
H0: p = .72
Ha: p > .72
State a Type I error in context.
Type I error: the airline decides to reward the employees when the proportion of on-time flights doesn’t exceed .72.
State a Type II error in context.
Type II error: the airline employees do not receive the bonus when they deserve it.
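To make the airline setting concrete, the sketch below computes the exact probability of a Type I error for an upper-tailed large-sample test of H0: p = .72. The monthly flight count n = 200, the level α = .05, and the helper name type_one_error_rate are illustrative assumptions; none of them comes from the example itself.

```python
from math import comb, sqrt
from statistics import NormalDist

def type_one_error_rate(n, p0, alpha):
    """Exact chance that the upper-tailed z test rejects H0: p = p0 when H0 is true."""
    z_crit = NormalDist().inv_cdf(1 - alpha)   # cutoff for an upper-tailed test
    se = sqrt(p0 * (1 - p0) / n)               # standard deviation of p-hat under H0
    rate = 0.0
    for k in range(n + 1):                     # k = number of on-time flights
        z = (k / n - p0) / se
        if z > z_crit:                         # this outcome would reject H0
            rate += comb(n, k) * p0**k * (1 - p0)**(n - k)  # binomial probability
    return rate

# n = 200 flights and alpha = .05 are assumed values for illustration only.
rate = type_one_error_rate(n=200, p0=0.72, alpha=0.05)
print(f"P(Type I error) = {rate:.4f}")
```

The printed rate should land near .05 rather than exactly on it, because the number of on-time flights is discrete and the z test is only an approximation.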
In 2004, Vertex Pharmaceuticals, a biotechnology company, issued a press release announcing that it had filed an application with the FDA to begin clinical trials on an experimental drug, VX-680, that had been found to reduce the growth rate of pancreatic and colon cancer tumors in animal studies.
Data resulting from the planned clinical trials can be used to test:
Let μ = the true mean growth rate of tumors for patients taking the experimental drug
H0: μ = mean growth rate of tumors for patients not taking the experimental drug
Ha: μ < mean growth rate of tumors for patients not taking the experimental drug

State a Type I error in the context of this problem.
A Type I error would be to incorrectly conclude that the experimental drug is effective in slowing the growth rate of tumors.
What is a potential consequence of this error?
A potential consequence of making a Type I error would be that the company would continue to devote resources to the development of the drug when it really is not effective.

State a Type II error in the context of this problem. What is a potential consequence of this error?
A potential consequence of making a Type II error would be that the company might abandon development of a drug that was effective.
A Type II error would be to conclude that the drug is ineffective when in fact the mean growth rate of tumors is reduced.

The relationship between α and β
The ideal test procedure would result in both α = 0 (probability of a Type I error) and β = 0 (probability of a Type II error). This is impossible to achieve since we must base our decision on sample data.
Selecting a significance level α = .05 results in a test procedure that, used over and over with different samples, rejects a true H0 about 5 times in 100.
So why not always choose a small α (like α = .05 or α = .01)?
Standard test procedures allow us to select α, the significance level of the test, but we have no direct control over β.

Relationships Between α and β
Generally, with everything else held constant, decreasing one type of error causes the other to increase.
The only way to decrease both types of error simultaneously is to increase the sample size.
No matter what decision is reached, there is always the risk of one of these errors.

The relationship between α and β
Let’s consider hypotheses about a proportion p with hypothesized value .5. Suppose a normal curve represents the sampling distribution of p̂ when the null hypothesis is true. The upper-tail part of that curve represents α, the probability of a Type I error. If the null hypothesis is false and the alternative hypothesis is true, then the true proportion is believed to be greater than .5, so the curve should really be shifted to the right. The tail of the shifted curve below the rejection cutoff would represent β, the probability of failing to reject a false H0.
H0: p = .5
Ha: p > .5
Let α = .05
(Figure: normal curve centered at .5 with the upper-tail area α shaded.)

The relationship between α and β
Now take the same hypotheses, H0: p = .5 and Ha: p > .5, but let α = .01. If the null hypothesis is false and the alternative hypothesis is true, then the true proportion is greater than .5, so the curve should really be shifted to the right; the tail of the shifted curve below the cutoff represents β, the probability of failing to reject a false H0.
Notice that as α gets smaller, β gets larger!

How does one decide what α level to use?
After assessing the consequences of Type I and Type II errors, identify the largest α that is tolerable for the problem. Then employ a test procedure that uses this maximum acceptable value, rather than anything smaller, as the level of significance.
Remember, using a smaller α increases β.

The EPA has adopted what is known as the Lead and Copper Rule, which defines drinking water as unsafe if the concentration of lead is 15 parts per billion (ppb) or greater or if the concentration of copper is 1.3 parts per million (ppm) or greater.
The manager of a community water system might use lead level measurements from a sample of water specimens to test the following hypotheses:
H0: μ = 15 versus Ha: μ < 15
State a Type I error in context. What is a consequence of a Type I error?
A Type I error leads to the conclusion that a water source meets EPA standards when the water is really unsafe. There are possible health risks to the community.
State a Type II error in context. What is a consequence of a Type II error?
A Type II error leads to the conclusion that a water source does NOT meet EPA standards when the water is really safe. The community might lose a good water source.
Which type of error has a more serious consequence?
Since most people would consider the consequence of the Type I error more serious, we would want to keep α small, so select a smaller significance level such as α = .01.
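The trade-off between α and β for H0: p = .5 versus Ha: p > .5 can be checked numerically. The sketch below is one way to do so with the normal approximation; the sample size n = 100, the "true" proportion .6, and the function name beta_upper_tailed are assumptions made only for illustration, since the slides give no specific values.

```python
from math import sqrt
from statistics import NormalDist

def beta_upper_tailed(p0, p_true, n, alpha):
    """Approximate P(Type II error) for H0: p = p0 vs Ha: p > p0 (normal approximation)."""
    z = NormalDist()
    # Smallest sample proportion that leads to rejecting H0 at level alpha.
    cutoff = p0 + z.inv_cdf(1 - alpha) * sqrt(p0 * (1 - p0) / n)
    # Standard deviation of p-hat when the true proportion is p_true.
    se_true = sqrt(p_true * (1 - p_true) / n)
    # beta = chance that p-hat falls below the cutoff, so H0 is not rejected.
    return z.cdf((cutoff - p_true) / se_true)

# n = 100 and p_true = 0.6 are hypothetical values chosen for illustration.
beta_05 = beta_upper_tailed(p0=0.5, p_true=0.6, n=100, alpha=0.05)
beta_01 = beta_upper_tailed(p0=0.5, p_true=0.6, n=100, alpha=0.01)
print(f"alpha = .05: beta = {beta_05:.3f}")
print(f"alpha = .01: beta = {beta_01:.3f}")
```

Under these assumed values, the β for α = .01 comes out larger than the β for α = .05, which is the pattern the two curves illustrate.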
Section 10.3 Large-Sample Hypothesis Test for a Population Proportion

Large-Sample Hypothesis Test for a Population Proportion
The fundamental idea behind hypothesis testing is: we reject H0 if the observed sample is very unlikely to occur if H0 is true.

Test Statistic
A test statistic is the function of sample data on which a conclusion to reject or fail to reject H0 is based.

Recall the General Properties for Sampling Distributions of p̂
1. $\mu_{\hat{p}} = p$
2. $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$, as long as the sample size is less than 10% of the population
3. When n is large, the sampling distribution of p̂ is approximately normal.
These three properties imply that the standardized variable
$z = \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}}$
has an approximately standard normal distribution when n is large.

P-value
The P-value (also called the observed significance level) is a measure of inconsistency between the hypothesized value for a population characteristic and the observed sample.
The P-value is the probability, assuming that H0 is true, of obtaining a test statistic value at least as inconsistent with H0 as what actually resulted.

Computing P-values
The calculation of the P-value depends on the form of the inequality in the alternative hypothesis.
• Ha: p > hypothesized value: P-value = area in the upper tail of the z curve, to the right of the calculated z
• Ha: p < hypothesized value: P-value = area in the lower tail of the z curve, to the left of the calculated z
• Ha: p ≠ hypothesized value: P-value = sum of the areas in the two tails of the z curve, beyond the calculated z and –z

Using P-values to make a decision:
To decide whether or not to reject H0, we compare the P-value to the significance level α.
If the P-value > α, we “fail to reject” the null hypothesis.
If the P-value ≤ α, we “reject” the null hypothesis.
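The three P-value rules and the decision rule above can be sketched in a few lines. The function name z_test_p_value and the level α = .05 are assumptions for illustration; the data are the teen prescription-drug survey from Section 10.1 (118 of 592, H0: p = .1, Ha: p > .1).

```python
from math import sqrt
from statistics import NormalDist

def z_test_p_value(z, alternative):
    """P-value for a z statistic; alternative is 'greater', 'less', or 'two-sided'."""
    std_normal = NormalDist()
    if alternative == "greater":
        return std_normal.cdf(-z)            # upper-tail area, by symmetry
    if alternative == "less":
        return std_normal.cdf(z)             # lower-tail area
    if alternative == "two-sided":
        return 2 * std_normal.cdf(-abs(z))   # sum of both tail areas
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

# Teen survey: 118 of 592 shared a prescription drug; H0: p = .1, Ha: p > .1.
n, p0 = 592, 0.1
p_hat = 118 / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = z_test_p_value(z, "greater")

alpha = 0.05  # assumed significance level; the original question does not state one
decision = "reject H0" if p_value <= alpha else "fail to reject H0"
print(f"z = {z:.2f}, P-value = {p_value:.4g}, decision: {decision}")
```

A P-value this small is effectively zero, so H0 would be rejected at any conventional significance level.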
Summary of the Large-Sample z Test for p
Null hypothesis: H0: p = hypothesized value
Test statistic: $z = \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}}$, where p is the hypothesized value

Alternative hypothesis          P-value
Ha: p > hypothesized value      Area to the right of calculated z
Ha: p < hypothesized value      Area to the left of calculated z
Ha: p ≠ hypothesized value      2(area to the right of z) if z is positive, or 2(area to the left of z) if z is negative

In June 2006, an Associated Press survey was conducted to investigate how people use the nutritional information provided on food packages. Interviews were conducted with 1003 randomly selected adult Americans, and each participant was asked a series of questions, including the following two:
Question 1: When purchasing packaged food, how often do you check the nutritional labeling on the package?
Question 2: How often do you purchase food that is bad for you, even after you’ve checked the nutrition labels?
It was reported that 582 responded “frequently” to the question about checking labels and 441 responded “very often” or “somewhat often” to the question about purchasing bad foods even after checking the labels.
Based on these data, is it reasonable to conclude that a majority of adult Americans frequently check nutritional labels when purchasing packaged foods?

Nutritional Labels Continued . . .
H0: p = .5
Ha: p > .5
p = true proportion of adult Americans who frequently check nutritional labels
We use p > .5 to test for a majority of adult Americans who frequently check nutritional labels.
For this sample: $\hat{p} = \frac{582}{1003} = .58$
This observed sample proportion is greater than .5. Is it plausible that a sample proportion of p̂ = .58 occurred as a result of chance variation, or is it unusual to observe a sample proportion this large when p = .5?
We will create a test statistic using $z = \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}}$:
$z = \frac{.58 - .5}{\sqrt{\frac{.5(.5)}{1003}}} = 5.08$
A test statistic indicates how many standard deviations the sample statistic (p̂) is from the population characteristic (p).
Next we find the P-value for this test statistic (z = 5.08 for H0: p = .5 versus Ha: p > .5, with p̂ = 582/1003 = .58).
The P-value is the probability of obtaining a test statistic at least as inconsistent with H0 as was observed, assuming H0 is true.
In the standard normal curve, seeing a value of 5.08 or larger is unlikely. Its probability is approximately 0.
P-value ≈ 0
Since the P-value is so small, we reject H0. There is convincing evidence to suggest that the majority of adult Americans frequently check the nutritional labels on packaged foods.

A report states that nationwide, 61% of high school graduates go on to attend a two-year or four-year college the year after graduation. Suppose a random sample of 1500 high school graduates in 2009 from a particular state estimated the proportion of high school graduates that attend college the year after graduation to be 58%. Can we reasonably conclude that the proportion of this state’s high school graduates in 2009 who attended college the year after graduation is different from the national figure? Use α = .01.
State the hypotheses.
H0: p = .61
Ha: p ≠ .61
where p is the proportion of all 2009 high school graduates in this state who attended college the year after graduation

College Attendance Continued . . .
Assumptions:
• Given a random sample of 1500 high school graduates
• Since 1500(.61) > 10 and 1500(.39) > 10, the sample size is large enough.
• The population size is much larger than the sample size.

College Attendance Continued . . .
Test statistic:
$z = \frac{.58 - .61}{\sqrt{\frac{.61(.39)}{1500}}} = -2.38$
The area to the left of -2.38 is approximately .0087.
P-value = 2(.0087) = .0174
Use α = .01
Since P-value > α, we fail to reject H0. The evidence does not suggest that the proportion of 2009 high school graduates in this state who attended college the year after graduation differs from the national value.
What potential error could you have made? Type II

County Judge Example
A county judge has agreed that he will give up his county judgeship and run for a state judgeship unless there is evidence at the 0.10 level that more than 25% of his party is in opposition. A random sample of 800 party members included 217 who opposed him. Please advise this judge.

County Judge Example Continued
p = proportion of his party that is in opposition
H0: p = 0.25
Ha: p > 0.25
α = 0.10
Note: hypothesized value = 0.25
$n = 800, \quad \hat{p} = \frac{217}{800} = 0.27125$
$z = \frac{0.27125 - 0.25}{\sqrt{\frac{0.25(0.75)}{800}}} = 1.39$

County Judge Example Continued
$\text{P-value} = P(z > 1.39) = 1 - 0.9177 = 0.0823$
At a level of significance of 0.10, there is sufficient evidence to support the claim that the true percentage of the party members that oppose him is more than 25%.
Under these circumstances, I would advise him not to run.

In December 2009, a county-wide water conservation campaign was conducted in a particular county. In 2010, a random sample of 500 homes was selected and water usage was recorded for each home in the sample. Suppose the sample results were that 220 households had reduced water consumption. The county supervisors wanted to know if their data supported the claim that fewer than half the households in the county reduced water consumption.
State the hypotheses.
H0: p = .5
Ha: p < .5
where p is the proportion of all households in the county with reduced water usage
Calculate p̂.
$\hat{p} = \frac{220}{500} = .44$

Water Usage Continued . . .
Verify assumptions
1. p̂ is from a random sample of households
2. Sample size n is large because np = 250 > 10 and n(1-p) = 250 > 10
3.
It is reasonable to assume that there are more than 5000 (10n) households in the county.

Water Usage Continued . . .
Calculate the test statistic and P-value.
$z = \frac{.44 - .5}{\sqrt{\frac{.5(.5)}{500}}} = -2.68$
Look this value up in the table of z curve areas.
P-value = .0037
Use α = .01
Since P-value < α, we reject H0. There is convincing evidence that the proportion of households with reduced water usage is less than half.
What potential error could you have made? Type I

Water Usage Continued . . .
Let’s create a confidence interval with this data. What is the appropriate confidence level to use?
Confidence intervals are two-tailed. Since we are testing Ha: p < .5 with α = .01, .01 goes in the lower tail, and we need to put .01 in the upper tail as well (since the curve is symmetrical). With .01 in each tail, that puts .98 in the middle, so 98% is the appropriate confidence level.
Compute a 98% confidence interval:
$.44 \pm 2.326\sqrt{\frac{.44(.56)}{500}} = (.388, .492)$
Notice that the hypothesized value of .5 is NOT in the 98% confidence interval and that we “rejected” H0!

College Attendance Revisited . . .
H0: p = .61
Ha: p ≠ .61
where p is the proportion of all 2009 high school graduates in this state who attended college the year after graduation
Use α = .01
Since P-value > α, we fail to reject H0.
Let’s compute a confidence interval for this problem. This is a two-tailed test, so α gets split evenly into both tails, leaving 99% in the middle.
$.58 \pm 2.576\sqrt{\frac{.58(.42)}{1500}} = (.547, .613)$
Notice that the hypothesized value of .61 IS in the 99% confidence interval and that we “failed to reject” H0!
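The two confidence intervals above can be reproduced with a short sketch. The function name proportion_ci is an assumption for illustration; the inputs (p̂, n, confidence level) are the ones used in the water usage and college attendance examples.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(p_hat, n, confidence):
    """Large-sample confidence interval for a population proportion."""
    # z critical value that leaves (1 - confidence)/2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

water = proportion_ci(0.44, 500, 0.98)      # one-tailed test with alpha = .01
college = proportion_ci(0.58, 1500, 0.99)   # two-tailed test with alpha = .01

print(f"Water usage 98% CI: ({water[0]:.3f}, {water[1]:.3f})")
print(f"College 99% CI:     ({college[0]:.3f}, {college[1]:.3f})")

# Does each interval contain its hypothesized value?
print("0.5 inside water interval:", water[0] <= 0.5 <= water[1])
print("0.61 inside college interval:", college[0] <= 0.61 <= college[1])
```

The containment checks mirror the two decisions: the hypothesized value falls outside the interval where H0 was rejected and inside it where H0 was not rejected.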
Section 10.4 Hypothesis Tests for a Population Mean

Let’s review the assumptions for a confidence interval for a population mean
1) x̄ is the sample mean from a random sample,
2) the sample size n is large (n ≥ 30), and
3) σ, the population standard deviation, is known or unknown
The assumptions are the same for a large-sample hypothesis test for a population mean.

This is the test statistic when σ is known:
$z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}}}$
P-value is area under the z curve.
This is the test statistic when σ is unknown:
$t = \frac{\bar{x} - \mu}{\frac{s}{\sqrt{n}}}$
P-value is area under the t curve with df = n-1.

The One-Sample t-test for a Population Mean
Null hypothesis: H0: μ = hypothesized value
Test statistic: $t = \frac{\bar{x} - \mu}{\frac{s}{\sqrt{n}}}$, where μ is the hypothesized value

Alternative hypothesis          P-value
Ha: μ > hypothesized value      Area to the right of calculated t with df = n-1
Ha: μ < hypothesized value      Area to the left of calculated t with df = n-1
Ha: μ ≠ hypothesized value      2(area to the right of t) if t is positive, or 2(area to the left of t) if t is negative

The One-Sample t-test for a Population Mean Continued . . .
Assumptions:
1. x̄ and s are the sample mean and sample standard deviation from a random sample
2. The sample size n is large (n ≥ 30) or the population distribution is at least approximately normal.

A study conducted by researchers at Pennsylvania State University investigated whether time perception, an indication of a person’s ability to concentrate, is impaired during nicotine withdrawal. After a 24-hour smoking abstinence, 20 smokers were asked to estimate how much time had elapsed during a 45-second period. Researchers wanted to see whether smoking abstinence had a negative impact on time perception, causing elapsed time to be overestimated. Suppose the resulting data on perceived elapsed time (in seconds) were as follows:
69 65 72 73 59 55 39 52 67 57 56 50 70 47 56 45 70 64 67 53
What are the mean and standard deviation of the sample?
x̄ = 59.30   s = 9.84   n = 20
Smoking Abstinence Continued . . .
69 65 72 73 59 55 39 52 67 57 56 50 70 47 56 45 70 64 67 53
x̄ = 59.30   s = 9.84   n = 20
State the hypotheses.
H0: μ = 45
Ha: μ > 45
where μ is the true mean perceived elapsed time for smokers who have abstained from smoking for 24 hours
Verify assumptions.
1) It is reasonable to believe that the sample of smokers is representative of all smokers.
2) Since the sample size is not at least 30, we must determine if it is plausible that the population distribution is approximately normal. To do this, we need to graph the data using a boxplot or normal probability plot.
(Figure: boxplot of perceived elapsed time, axis from 40 to 70 seconds.)
Since the boxplot is approximately symmetrical, it is plausible that the population distribution is approximately normal.

Smoking Abstinence Continued . . .
Compute the test statistic and P-value.
Test statistic:
$t = \frac{59.30 - 45}{\frac{9.84}{\sqrt{20}}} = 6.50$
P-value ≈ 0
Use α = .05
Since P-value < α, we reject H0. There is convincing evidence that the mean perceived elapsed time is greater than the actual elapsed time of 45 seconds.

Smoking Abstinence Continued . . .
Compute the appropriate confidence interval.
Since this is a one-tailed test, α = .05 goes in the upper tail. .05 also goes in the lower tail, leaving .90 in the middle.
$59.30 \pm 1.729\left(\frac{9.84}{\sqrt{20}}\right) = (55.497, 63.103)$
Notice that the hypothesized value of 45 is NOT in the 90% confidence interval and that we “rejected” H0!

A growing concern of employers is time spent in activities like surfing the Internet and emailing friends during work hours.
The San Luis Obispo Tribune summarized the findings of a large survey of workers in an article that ran under the headline “Who Goofs Off More than 2 Hours a Day? Most Workers, Survey Says” (August 3, 2006). Suppose that the CEO of a large company wants to determine whether the average amount of wasted time during an 8-hour day for employees of her company is less than the reported 120 minutes. Each person in a random sample of 10 employees was contacted and asked about daily wasted time at work. The resulting data are the following:
108 112 117 130 111 131 113 113 105 128
What are the mean and standard deviation of the sample?
x̄ = 116.80   s = 9.45   n = 10

Surfing Internet Continued . . .
State the hypotheses.
H0: μ = 120
Ha: μ < 120
where μ is the true mean daily wasted time for employees of this company
Verify the assumptions.
1) The given sample was a random sample of employees.
2) Since the sample size is not at least 30, we must determine if it is plausible that the population distribution is approximately normal.
(Figure: boxplot of daily wasted time, axis from 110 to 130 minutes.)
The boxplot reveals some skewness, but there are no outliers. It is plausible that the population distribution is approximately normal.

Surfing Internet Continued . . .
Compute the test statistic and P-value.
Test statistic:
$t = \frac{116.80 - 120}{\frac{9.45}{\sqrt{10}}} = -1.07$
P-value = .150
Use α = .05
Since P-value > α, we fail to reject H0. There is not sufficient evidence to conclude that the mean daily wasted time for employees of this company is less than 120 minutes.
What potential error could we have made? Type II

Bolt Example
A manufacturer of a special bolt requires that this type of bolt have a mean shearing strength in excess of 110 lb.
To determine if the manufacturer’s bolts meet the required standards, a sample of 25 bolts was obtained and tested. The sample mean was 112.7 lb and the sample standard deviation was 9.62 lb. Use this information to perform an appropriate hypothesis test with a significance level of 0.05.

Bolt Example Continued
μ = the mean shearing strength of this specific type of bolt
The hypotheses to be tested are
H0: μ = 110 lb
Ha: μ > 110 lb
The significance level to be used for the test is α = 0.05.
The test statistic is $t = \frac{\bar{x} - 110}{\frac{s}{\sqrt{n}}}$

Bolt Example Continued
$\bar{x} = 112.7, \quad s = 9.62, \quad n = 25, \quad df = 24$
$\text{P-value} = P\left(t > \frac{112.7 - 110}{\frac{9.62}{\sqrt{25}}}\right) = P(t > 1.4) \approx 0.087$

Bolt Example Continued
Because P-value = 0.087 > 0.05 = α, we fail to reject H0.
At a level of significance of 0.05, there is insufficient evidence to conclude that the mean shearing strength of this brand of bolt exceeds 110 lb.

Charm Example
A jeweler is planning on manufacturing gold charms. His design calls for a particular piece to contain 0.08 ounces of gold. The jeweler would like to know if the pieces that he makes contain (on the average) 0.08 ounces of gold. To test whether the pieces contain 0.08 ounces of gold, he made a sample of 16 of these particular pieces and obtained the following data.
0.0773 0.0779 0.0756 0.0792 0.0777 0.0713 0.0818 0.0802
0.0802 0.0785 0.0764 0.0806 0.0786 0.0776 0.0793 0.0755
Use a level of significance of 0.01 to perform an appropriate hypothesis test.

Charm Example Continued
The population characteristic being studied is μ = true mean gold content for this particular type of charm.
H0: μ = 0.08 oz
Ha: μ ≠ 0.08 oz
α = 0.01
$t = \frac{\bar{x} - \text{hypothesized mean}}{\frac{s}{\sqrt{n}}} = \frac{\bar{x} - 0.08}{\frac{s}{\sqrt{n}}}$

Charm Example Continued
Minitab was used to create a normal plot along with a graphical display of the descriptive statistics for the sample data.
The result of this display is that it is reasonable to assume that the population of gold contents of this type of charm is normally distributed. We can see that, with the exception of one outlier, the data are reasonably symmetric and mound shaped, indicating that the population of amounts of gold for this particular charm can reasonably be assumed to be normally distributed.

Descriptive Statistics, Variable: Gold

Anderson-Darling Normality Test
  A-Squared:     0.363
  P-Value:       0.396

  Mean           7.80E-02
  StDev          2.51E-03
  Variance       6.32E-06
  Skewness       -1.10922
  Kurtosis       2.23191
  N              16

  Minimum        7.13E-02
  1st Quartile   7.66E-02
  Median         7.82E-02
  3rd Quartile   8.00E-02
  Maximum        8.18E-02

  95% Confidence Interval for Mu       7.66E-02   7.93E-02
  95% Confidence Interval for Sigma    1.86E-03   3.89E-03
  95% Confidence Interval for Median   7.71E-02   7.95E-02

7. Computations: n = 16, x̄ = 0.077981, s = 0.0025143
\[ t = \frac{0.077981 - 0.08}{0.0025143/\sqrt{16}} = -3.2 \]
This is a two-tailed test. Looking up t = 3.2 with df = 15 in the table of tail areas for t curves, we see the table entry is 0.003, so P-value = 2(0.003) = 0.006.

Since P-value = 0.006 < 0.01 = α, we reject H0 at the 0.01 level of significance. At the 0.01 level of significance there is convincing evidence that the true mean gold content of this type of charm is not 0.08 ounces.

Actually, when rejecting a null hypothesis for the ≠ alternative, a one-tailed claim is supported. In this case, at the 0.01 level of significance, there is convincing evidence that the true mean gold content of this type of charm is less than 0.08 ounces.
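The three t tests worked above (wasted time, bolts, charms) can be checked numerically. The following is a minimal sketch using only the Python standard library. The helper names `t_pdf`, `t_upper_tail`, and `one_sample_t` are hypothetical, and the tail area is approximated by Simpson's rule as a stand-in for the printed t table or for Minitab. Because the table is entered with a rounded t value, the P-values computed here can differ slightly from the table-based ones (for example, about .156 rather than .150 for the wasted-time data).

```python
import math
import statistics


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 via Simpson's rule on [0, t].

    Illustrative helper (an assumption of this sketch), not a library routine.
    steps must be even.
    """
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def one_sample_t(xbar, s, n, mu0, alternative):
    """Return (t, P-value) for H0: mu = mu0.

    alternative is 'less', 'greater', or 'two-sided'.
    """
    t = (xbar - mu0) / (s / math.sqrt(n))
    tail = t_upper_tail(abs(t), n - 1)
    if alternative == "two-sided":
        p = 2 * tail
    elif alternative == "greater":
        p = tail if t >= 0 else 1 - tail
    else:  # 'less'
        p = tail if t <= 0 else 1 - tail
    return t, p


wasted = [108, 112, 117, 130, 111, 131, 113, 113, 105, 128]
gold = [0.0773, 0.0779, 0.0756, 0.0792, 0.0777, 0.0713, 0.0818, 0.0802,
        0.0802, 0.0785, 0.0764, 0.0806, 0.0786, 0.0776, 0.0793, 0.0755]

# Wasted time: Ha: mu < 120
t_wasted, p_wasted = one_sample_t(
    statistics.mean(wasted), statistics.stdev(wasted), len(wasted), 120, "less")
# Bolts (summary statistics only): Ha: mu > 110
t_bolt, p_bolt = one_sample_t(112.7, 9.62, 25, 110, "greater")
# Charms: Ha: mu != 0.08
t_gold, p_gold = one_sample_t(
    statistics.mean(gold), statistics.stdev(gold), len(gold), 0.08, "two-sided")

for name, t, p in [("wasted time", t_wasted, p_wasted),
                   ("bolts", t_bolt, p_bolt),
                   ("charms", t_gold, p_gold)]:
    print(f"{name}: t = {t:.2f}, P-value = {p:.3f}")
```

The decisions match the slides: the first two P-values exceed their significance levels, while the charm P-value falls below 0.01.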
Section 10.5 Power and Probability of Type II Error

Recall Type I and Type II Errors:
For each combination (H0 is true or false; we reject it or fail to reject it), what type of decision was made?

                     H0 is true      H0 is false
Reject H0            Type I error    Correct (power)
Fail to reject H0    Correct         Type II error

The probability that we correctly reject H0 is called the power of the test.

Power and Probability of Type II Error
The power of a test is the probability of rejecting the null hypothesis. When H0 is false, the power is the probability that the null hypothesis is rejected. Specifically, power = 1 − β.

Comments on Power
Calculating β (hence power) depends on knowing the true value of the population characteristic being tested. Since the true value is not known, generally, one calculates β for a number of possible “true” values of the characteristic under study and then sketches a power curve.

Suppose that the student body president at a university is interested in studying the amount of money that students spend on textbooks each semester. The director of financial aid services believes that the average amount spent on textbooks is $500 each semester, and uses this to determine the amount of financial aid for which a student is eligible. The student body president plans to ask each student in a random sample how much he or she spent on books this semester and use the data to test (using α = .05) the following hypotheses:
H0: μ = 500 versus Ha: μ > 500

The power of a test depends on the true value of the mean! Because the actual value of μ is unknown, we cannot know the power for the actual value of μ. BUT, we can gain insight into the power of the test by investigating some “what if” scenarios.

If the true mean is greater than $500, then we should reject H0. BUT, if the true mean is ONLY a little greater, say $505, then the sample mean might look like what we would expect if the true mean were $500. Thus we wouldn’t have convincing evidence to reject H0. However, if the true mean was $525, it is less likely that the sample would be mistaken for a sample from the population if the mean were $500. So, it is more likely that we will correctly reject H0.

Let’s consider a one-sided, upper tail test. If the null hypothesis is false, then μ > hypothesized value.
[Figure: sampling distributions centered at μ0 and μa, with the “Fail to Reject H0” and “Reject H0” regions marked; the areas α, β, and Power = 1 − β are shaded.]

Textbooks Continued . . .
H0: μ = 500
Ha: μ > 500
Suppose that σ = $85 and n = 100. (Since n is large, the sampling distribution of x̄ is approximately normal.)

What is the probability of committing a Type I error? α = .05

If μ = 500 is true, for what values of the sample mean would you reject the null hypothesis? Use the z value with .95 area to its left:
\[ 1.645 = \frac{\bar{x} - 500}{85/\sqrt{100}} \]
What is the value of this x̄? We would reject H0 for x̄ > 513.98.

If the null hypothesis is false, then μ > 500. What is the probability of a Type II error (β)? What if μ = 520? The area to the left of x̄ = 513.98 under the curve centered at 520 is β.
\[ z = \frac{513.98 - 520}{85/\sqrt{100}} = -.708 \]
Look this up in the table of areas for z curves: β = .24

What is the power of the test if μ = 520? Power is the probability of correctly rejecting H0.
Power = 1 − .24 = .76
Notice that power is in the SAME curve as β.

If the null hypothesis is false, then μ > 500. Find β and power. What if μ = 530?
β = .03
power = .97
Notice that, as the distance between the null hypothesized value for μ and our alternative value for μ increases, β decreases AND power increases.

Find β and power. What if μ = 510?
β = .68
power = .32
Notice that, as the distance between the null hypothesized value for μ and our alternative value for μ decreases, β increases AND power decreases.

What happens if we use α = .01? β will increase and power will decrease.

What happens to α, β, & power when the sample size is increased? The standard deviation of the sampling distribution will decrease, making the curve taller and skinnier. The significance level (α) remains the same, so the value where the rejection region begins must move. β decreases and power increases!

Effects of Various Factors on the Power of a Test
• The larger the size of the discrepancy between the hypothesized value and the actual value of the population characteristic, the higher the power.
• The larger the significance level α, the higher the power of a test.
• The larger the sample size, the higher the power of a test.

β and Power for a t Test
When using a t test, the population standard deviation is unknown. β not only depends on α, n, and the actual value of μ, but β also depends on σ, so we must have an estimate of σ. The β curves (on the next slide) can be used to estimate β and the power of a test based upon the value of d.
\[ d = \frac{|\text{alternative value} - \text{hypothesized value}|}{\sigma} \]

β curves
Consider testing H0: μ = 100 versus Ha: μ > 100 and focus on the alternative value μ = 110. Suppose that σ = 10, n = 7, and α = .01.
Calculate d:
\[ d = \frac{110 - 100}{10} = 1 \]
Use the df = 6 curve to estimate β:
β ≈ .6
Power ≈ .4

When the population distribution is normal, the t test for testing hypotheses about μ has smaller β than does any other test procedure that has the same significance level α.
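The β and power values in the textbook example (H0: μ = 500 versus Ha: μ > 500, σ = 85, n = 100) can be reproduced with a short calculation. This is a minimal sketch for the upper-tail z test only, assuming Python 3.8+ for `statistics.NormalDist`; the function name `upper_tail_power` is hypothetical. It does not cover the t-test β curves, which would require the noncentral t distribution.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def upper_tail_power(mu0, mu_alt, sigma, n, alpha):
    """Return (cutoff, beta, power) for H0: mu = mu0 versus Ha: mu > mu0.

    cutoff is the sample mean above which H0 is rejected;
    beta is P(fail to reject H0) when the true mean is mu_alt.
    """
    se = sigma / sqrt(n)                                # std. dev. of the sample mean
    cutoff = mu0 + STD_NORMAL.inv_cdf(1 - alpha) * se   # rejection region: xbar > cutoff
    beta = STD_NORMAL.cdf((cutoff - mu_alt) / se)       # area left of cutoff under mu_alt
    return cutoff, beta, 1 - beta


# "What if" scenarios from the slides, alpha = .05
for mu_alt in (510, 520, 530):
    cutoff, beta, power = upper_tail_power(500, mu_alt, 85, 100, 0.05)
    print(f"mu = {mu_alt}: reject for xbar > {cutoff:.2f}, "
          f"beta = {beta:.2f}, power = {power:.2f}")

# Smaller alpha lowers power; larger n raises it (both at mu = 520)
_, beta_alpha01, power_alpha01 = upper_tail_power(500, 520, 85, 100, 0.01)
_, beta_n400, power_n400 = upper_tail_power(500, 520, 85, 400, 0.05)
print(f"alpha = .01: power = {power_alpha01:.2f}")
print(f"n = 400:     power = {power_n400:.2f}")
```

Varying `mu_alt` over a grid of values and plotting the resulting power gives the power curve described under Comments on Power.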