
Chapter 9 Hypothesis Testing

Note:
• In Chapter 8 we used methods of estimating the value of a parameter.
• In this chapter, we draw inferences about a parameter by making decisions concerning its value.

Two Hypotheses:
• Null hypothesis $H_0$: This is the statement that is under investigation or being tested. Usually the null hypothesis represents a statement of “no effect,” “no difference,” or, put another way, “things haven’t changed.”
• Alternate hypothesis $H_1$: This is the statement you will adopt in the situation in which the evidence (data) is so strong that you reject $H_0$. A statistical test is designed to assess the strength of the evidence (data) against the null hypothesis.

Real Life Example:
• You are deciding whether to break up with your significant other. Let $\mu$ be the mean of your feelings for him/her.
• Null hypothesis $H_0: \mu = 0$ (meaning you have no negative feelings for your significant other)
• Alternate hypothesis $H_1: \mu < 0$ (meaning you have some negative feelings for your significant other)
• If you go through the “data” and do not reject the null hypothesis, then you should stay together.
• If you go through the “data” and reject the null hypothesis (accept the alternate hypothesis), then you should break up.

Actual Statistics Example:
• Ford advertises that its new Fusion gets 47 miles per gallon. Let $\mu$ be the mean of the mileage distribution for these cars. You assume that the manufacturer will not underrate the car, but you suspect that the mileage might be overrated.
• A) What is the null hypothesis?
• B) What is the alternate hypothesis?

Answer
• A) Null hypothesis:
  – $H_0: \mu = 47$ mpg
• B) Alternate hypothesis:
  – $H_1: \mu < 47$ mpg
  – We have every reason to believe that the advertised mileage is too high. If the mean is not 47 mpg, then it is less than 47 mpg.

Group Work
• A company manufactures ball bearings for precision machines.
The average diameter of a certain type of ball bearing should be 6.0 mm. To check that the average diameter is correct, the company formulates a statistical test.
• A) What is the null hypothesis? (Hint: what is the company trying to test?)
• B) What is the alternate hypothesis? (Hint: If it’s not precise, then it’s in trouble)

Answer
• A) $H_0: \mu = 6.0$ mm
• B) $H_1: \mu \neq 6.0$ mm

Group Work
• A computer manufacturer averages 7% defective parts. To check whether this average is correct, the company formulates a statistical test.
• A) What is the null hypothesis?
• B) What is the alternate hypothesis?

Answer
• A) $H_0: \mu = 7\%$
• B) $H_1: \mu \neq 7\%$

Note:
• Whether to use <, >, or ≠ in the alternate hypothesis depends on the problem. Read it and interpret it! It should be logical.

Types of statistical tests
• Assume that $H_0: \mu = k$.
• A statistical test is:
  – Left-tailed if $H_1$ states that the parameter is less than the value claimed in $H_0$ ($H_1: \mu < k$)
  – Right-tailed if $H_1$ states that the parameter is greater than the value claimed in $H_0$ ($H_1: \mu > k$)
  – Two-tailed if $H_1$ states that the parameter is different from (or not equal to) the value claimed in $H_0$ ($H_1: \mu \neq k$)

Hypothesis tests of $\mu$, given x is normal and $\sigma$ is known
• Given that x has a normal distribution with known standard deviation $\sigma$, the test statistic is
• $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
• where $\bar{x}$ = mean of a simple random sample
• $\mu$ = value stated in $H_0$
• n = sample size

Example:
• Rosie is a dog. Let x be a random variable that represents Rosie’s heart rate. From past experience, the vet knows that x has a normal distribution with $\sigma = 12$. The vet checked the manual and found that, for dogs, $\mu = 115$ beats per minute.
• Over the past six weeks, Rosie’s heart rates are: 93 109 110 89 112 117
• The vet is concerned that Rosie’s heart rate may be slowing. Do the data indicate that this is the case?
• A) What are the null and alternate hypotheses?
• B) Compute the probability.
(You have to find $\bar{x}$, then use the formula and charts to find the probability.)
• C) What’s the conclusion?

Answer
• A) $H_0: \mu = 115$
• $H_1: \mu < 115$
• B) We find that $\bar{x} = 105.0$
• $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{105.0 - 115}{12/\sqrt{6}} \approx -2.04$
• Using the z table, $P(\bar{x} < 105.0) = P(z < -2.04) = .0207$
• C) If $H_0: \mu = 115$ is in fact true, the probability of getting a sample mean of $\bar{x} \le 105.0$ is about 2%. Because this probability is small, we reject the null hypothesis and conclude the alternate hypothesis, $\mu < 115$. The average heart rate seems to be slowing.
• Even though the probability is small, this does not necessarily prove that the null is false and the alternate is true.

P-Value
• Assuming $H_0$ is true, the probability that the test statistic will take on values as extreme as or more extreme than the observed test statistic (computed from sample data) is called the P-value of the test. The smaller the P-value computed from sample data, the stronger the evidence against $H_0$.
• (Look in the book and copy the graphs on pp. 405 and 406.)

Types of error
• There are two types of error: Type I and Type II.

| Truth of $H_0$ | Our decision: do not reject $H_0$ | Our decision: reject $H_0$ |
|---|---|---|
| $H_0$ is true | Correct decision; no error | Type I error |
| $H_0$ is false | Type II error | Correct decision; no error |

Level of significance $\alpha$
• This is the probability of rejecting $H_0$ when it is true. This is the probability of a type I error.
• Important!!!
• If P-value $\le \alpha$, we reject the null hypothesis and say the data are statistically significant at the level $\alpha$.
• If P-value $> \alpha$, we do not reject the null hypothesis.
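
The Rosie heart-rate calculation and the P-value decision rule above can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function name `z_test_mean` and its `tail` argument are hypothetical, the normal-curve area comes from the standard library's `statistics.NormalDist` instead of a printed z table, and `alpha = 0.05` is an assumed level because the example does not state one.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test_mean(sample, mu0, sigma, tail):
    """One-sample z test for a mean when sigma is known.

    tail is "left", "right", or "two", matching H1: mu < k, mu > k, mu != k.
    Returns the pair (z, p_value).
    """
    n = len(sample)
    x_bar = mean(sample)
    z = (x_bar - mu0) / (sigma / sqrt(n))
    left_area = NormalDist().cdf(z)  # area to the left of z under N(0, 1)
    if tail == "left":
        p_value = left_area
    elif tail == "right":
        p_value = 1 - left_area
    elif tail == "two":
        p_value = 2 * min(left_area, 1 - left_area)
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return z, p_value


# Rosie's heart rates; H0: mu = 115, H1: mu < 115, sigma = 12.
rates = [93, 109, 110, 89, 112, 117]
alpha = 0.05  # assumed for illustration; the example does not state alpha
z, p_value = z_test_mean(rates, mu0=115, sigma=12, tail="left")
decision = "reject H0" if p_value <= alpha else "do not reject H0"
print(f"z = {z:.2f}, P-value = {p_value:.4f}, {decision}")
```

Because z is not rounded to −2.04 before the area is looked up, the computed P-value can differ from the table value of .0207 in the last digit; the decision is the same.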
Probabilities associated with a statistical test

| Truth of $H_0$ | Our decision: do not reject $H_0$ | Our decision: reject $H_0$ |
|---|---|---|
| $H_0$ is true | Correct decision, with corresponding probability $1 - \alpha$ | Type I error, with corresponding probability $\alpha$, called the level of significance of the test |
| $H_0$ is false | Type II error, with corresponding probability $\beta$ | Correct decision, with corresponding probability $1 - \beta$, called the power of the test |

Power of the test $1 - \beta$
• This is the probability of rejecting $H_0$ when it is, in fact, false.

Note:
• We usually choose $\alpha$ first.
• 1) Increasing $\alpha$ (level of significance) increases $1 - \beta$ (power of the test), meaning we are more likely to reject the null hypothesis when it is false.
• 2) Increasing $\alpha$ (level of significance) also increases the probability of a type I error. Usually we want to use a small $\alpha$. It means that we are usually more willing to make an error by failing to reject a claim ($H_0$) than to make an error by accepting another claim ($H_1$) that is false.

Example:
• Consider the ball bearing problem.
• $H_0: \mu = 6.0$ mm
• $H_1: \mu \neq 6.0$ mm
• A) Suppose the manufacturer requires a 1% level of significance. Describe a type I error, its consequence, and its probability.
• B) Discuss a type II error and its consequences.

Answer
• A) A type I error occurs when we reject the null when, in fact, the average diameter of the ball bearings being produced is 6.0 mm. A type I error will cause a needless adjustment and delay of the manufacturing process. The probability is 1% because $\alpha = .01$.
• B) A type II error occurs when we fail to reject the null when it is, in fact, false. It means that the bearings are either too large or too small to meet specifications. It could lead to product recalls.

Example:
• Let x be a random variable representing dividend yield of Australian bank stocks. We may assume that x has a normal distribution with $\sigma = 2.4\%$. A random sample of 10 Australian bank stocks gave the following yields.
• 5.7 4.8 6.0 4.9 4.0 3.4 6.5 7.1 5.3 6.1
• For the entire Australian stock market, the mean dividend yield is $\mu = 4.7\%$. Do these data indicate that the dividend yield of all Australian bank stocks is higher than 4.7%? Use $\alpha = .01$

Answer
• First identify your variables:
• $\alpha = .01$; $H_0: \mu = 4.7\%$; $H_1: \mu > 4.7\%$; right-tailed
• Then find $\bar{x}$:
• $\bar{x} = 5.38\%$, so $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{5.38 - 4.7}{2.4/\sqrt{10}} \approx .90$
• For z = .90 the table gives a left-tail area of .8159. Since the test is right-tailed, we subtract from 1, so P-value = .1841.
• Because .1841 > .01, we fail to reject the null hypothesis.
• It means that there is insufficient evidence at the 0.01 level to reject the claim that the average yield for bank stocks equals the average yield for all stocks.

Group Work
• You are investigating the weight of a cereal box. A random sample of six cereal boxes gave the following weights in grams:
• 3.7 2.9 3.8 4.2 4.8 3.1
• Let x be a random variable representing the weights of all the cereal boxes. We assume that x has a normal distribution and $\sigma = 0.70$ gram. It is known that the mean weight of the cereal box is $\mu = 4.55$. Do the data indicate that the mean weight of these cereal boxes is less than 4.55 grams? Use $\alpha = 0.01$

Answer
• First identify your variables:
• $\alpha = .01$; $H_0: \mu = 4.55$; $H_1: \mu < 4.55$; left-tailed
• Then find $\bar{x}$:
• $\bar{x} = 3.75$, so $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{3.75 - 4.55}{.70/\sqrt{6}} \approx -2.80$
• For z = −2.80 the table gives a left-tail area of .0026. Since the test is left-tailed, we keep the number: P-value = .0026.
• Because .0026 < .01, we reject the null hypothesis.
• It means that there is sufficient evidence at the 0.01 level to reject the null of 4.55 grams and accept the alternate hypothesis that the cereal boxes have a lower average weight.

Group Work
• Water usually contains ammonia nitrogen. For many years, the concentration has been 2.1 mg/l. Due to acid rain, residents are worried that the level of ammonia nitrogen has increased. Let x be a random variable representing ammonia nitrogen concentration.
Based on recent studies of the water, we can assume that x has a normal distribution with $\sigma = .27$. Recently, a random sample of eight water tests gave the following results:
• 2.5 2.7 3.1 2.8 3.0 2.2 2.9 2.5
• Do the data indicate that the mean concentration is greater than 2.1 mg/l? Use $\alpha = .01$

Group Work
• Nationally, about 43% of all car accidents are caused by teenagers. An insurance company is studying damage claims (in %) in California. A random sample of 12 gave the following data:
• 50 64 34 26 53 27 24 79 42 43 13 54
• Assume that x has a normal distribution and $\sigma = 8\%$
• Do these data indicate that the percentage of car accidents in California is different from the national mean? Use $\alpha = .05$

Homework Practice
• Pg 411 #1-14 even

TESTING THE MEAN $\mu$

Summary so far…
• 1) We first state the proposed value for a population parameter in the null hypothesis $H_0$. The alternate hypothesis $H_A$ states alternative values of the parameter, either <, >, or ≠ the value proposed in $H_0$. We also set the level of significance $\alpha$. This is the risk we are willing to take of committing a type I error. That is, $\alpha$ is the probability of rejecting $H_0$ when it is, in fact, true.
• 2) We then use the corresponding sample statistic to challenge the statement in $H_0$. We convert the sample statistic to a test statistic, which is the corresponding value of the appropriate sampling distribution.
• 3) We compute the P-value of the statistic. The P-value is the probability of getting a sample statistic as extreme as or more extreme than the observed statistic from our random sample.
• 4) Conclusion. If the P-value is very small, we have evidence to reject $H_0$ and adopt $H_A$. Specifically, if P-value $\le \alpha$, we say we have evidence to reject $H_0$ and adopt $H_A$. Otherwise, we say that the sample evidence is insufficient to reject $H_0$.
• 5) Interpret the result.

Example: Testing $\mu$, $\sigma$ known (you should know how to do this)
• Let x be a random variable representing the number of sunspots observed in a four-week period.
A random sample of 40 such periods from Spanish colonial times gave the following data:
• 12.5 14.1 27.4 53.5 65 134.7 45.3 61.0 37.6 73.9 114.0 39.0 48.3 67.3 70.0 43.8 56.5 59.7 24.0 12.0 104.0 54.6 4.4 177.3 70.1 54.0 28.0 13.0 72.7 81.2 24.1 20.4 13.3 9.4 25.7 50.0 12.0 7.25 11.3
• $\bar{x} = 47.0$. Previous studies of sunspot activity during this period indicate that $\sigma = 35$. It is thought that for thousands of years, the mean number of sunspots per four-week period was about $\mu = 41$. Do the data indicate that the mean sunspot activity during the Spanish colonial period was higher than 41? Use $\alpha = 0.05$

Answer
• $H_0: \mu = 41$
• $H_A: \mu > 41$
• $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{47 - 41}{35/\sqrt{40}} \approx 1.08$
• P-value = $P(z > 1.08) \approx 0.1401$
• Since .1401 > .05, we do not reject $H_0$.
• At the 5% level of significance, the evidence is not sufficient to reject $H_0$. Based on the sample data, we cannot conclude that the average sunspot activity during the Spanish colonial period was higher than the long-term mean.

Testing $\mu$ when $\sigma$ is unknown
• 1) In the context of the application, state the null and alternate hypotheses and set the level of significance $\alpha$.
• 2) If you can assume that x has a normal distribution or simply has a mound-shaped symmetric distribution, then any sample size n will work. If you cannot assume this, then use a sample size $n \ge 30$. Compute the test statistic
• $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$ with degrees of freedom d.f. = n − 1
• 3) Use Student’s t distribution and the type of test, one-tailed or two-tailed, to find (or estimate) the P-value corresponding to the test statistic.
• 4) Conclude the test. If P-value $\le \alpha$, we say we have evidence to reject $H_0$ and adopt $H_A$. Otherwise, we say that the sample evidence is insufficient to reject $H_0$.
• 5) Interpret your conclusion.

Key:
• Use one-tail area to estimate P-value for left-tailed tests
• Use one-tail area to estimate P-value for right-tailed tests
• Use two-tail area to estimate P-value for two-tailed tests

Example:
• The drug 6-mP is used to treat leukemia.
The following data represent the remission times (in weeks) for a random sample of 21 patients using 6-mP.
• 10 7 32 23 22 6 16 34 32 25 11 20 19 6 17 35 6 13 9 6 10
• Assume the x distribution is mound-shaped and symmetric. A previously used drug treatment had a mean remission time of $\mu = 12.5$ weeks.
• Do the data indicate that the mean remission time using the drug 6-mP is different from 12.5 weeks? Use $\alpha = 0.01$

Answer
• $H_0: \mu = 12.5$
• $H_A: \mu \neq 12.5$
• $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} \approx \dfrac{17.1 - 12.5}{10.0/\sqrt{21}} \approx 2.108$
• d.f. = 21 − 1 = 20
• The sample statistic t = 2.108 falls between 2.086 and 2.528 in the t table.
• 0.02 < P-value < 0.05; since the P-value is greater than 0.01, we do not reject the null.
• Even though the table does not give us the specific P-value, it does give a range that contains it. As the diagram shows, the entire range is greater than $\alpha$. This means we cannot reject $H_0$.
• So at the 1% level of significance, the evidence is not sufficient to reject the null. We cannot say that the drug 6-mP provides a different average remission time than the previous drug.

Group Work
• Suppose the projectile points at a certain archaeological site have mean length $\mu = 2.6$ cm. A random sample of 11 recently discovered projectile points in an adjacent cliff dwelling gave the following lengths:
• 3.1 4.1 1.8 2.1 2.2 1.3 1.7 3.0 3.7 2.3 2.6
• Do these data indicate that the mean length of projectile points in the adjacent cliff dwelling is longer than 2.6 cm? Use $\alpha = .01$

Group Work
• USA Today reported that the state with the longest mean life span is Hawaii, where the population mean life span is 81 years. A random sample of 15 obituaries gave the following information about the life span of the residents: 72 68 91 85 80 68 56 93 47 86 97 77 69 87 47
• Does the information indicate that the population mean life span is less than 81 years? Use a 5% level of significance.

Group Work
• A good way to measure the value of a company is the P/E or price-to-earnings ratio. A high P/E may indicate a stock is overpriced.
For the S&P Stock Index of all major stocks, the mean P/E ratio is $\mu = 19.4$. A random sample of 36 pharmaceutical stocks gave a P/E ratio of $\bar{x} = 17.5$ with $s = 6.1$. Does this indicate that the mean P/E ratio of all pharmaceutical stocks is different from the mean of S&P stocks? Use $\alpha = .05$

Testing $\mu$ Using Critical Regions (Traditional Method)
• It is very similar to what we have been doing, except that we compare the test statistic with critical values instead of comparing the P-value with $\alpha$.

Hypothesis Testing, Critical Values $z_0$
• $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

| Level of significance | $\alpha = 0.05$ | $\alpha = 0.01$ |
|---|---|---|
| Critical value $z_0$ for a left-tailed test | −1.645 | −2.33 |
| Critical value $z_0$ for a right-tailed test | 1.645 | 2.33 |
| Critical values $\pm z_0$ for a two-tailed test | ±1.96 | ±2.58 |

Continue
• A) For a left-tailed test,
  – i. If sample test statistic ≤ critical value, reject $H_0$
  – ii. If sample test statistic > critical value, fail to reject $H_0$
• B) For a right-tailed test,
  – i. If sample test statistic ≥ critical value, reject $H_0$
  – ii. If sample test statistic < critical value, fail to reject $H_0$
• C) For a two-tailed test,
  – i. If sample test statistic lies beyond the critical values, reject $H_0$
  – ii. If sample test statistic lies between the critical values, fail to reject $H_0$

Example: (from previous example)
• Let x be a random variable representing the number of sunspots observed in a four-week period. A random sample of 40 such periods from Spanish colonial times gave the following data:
• 12.5 14.1 27.4 53.5 65 134.7 45.3 61.0 37.6 73.9 114.0 39.0 48.3 67.3 70.0 43.8 56.5 59.7 24.0 12.0 104.0 54.6 4.4 177.3 70.1 54.0 28.0 13.0 72.7 81.2 24.1 20.4 13.3 9.4 25.7 50.0 12.0 7.25 11.3
• $\bar{x} = 47.0$. Previous studies of sunspot activity during this period indicate that $\sigma = 35$. It is thought that for thousands of years, the mean number of sunspots per four-week period was about $\mu = 41$. Do the data indicate that the mean sunspot activity during the Spanish colonial period was higher than 41? Use $\alpha = 0.05$

Answer
• This is a right-tailed test, and we found z = 1.08.
Since 1.08 < 1.645, we fail to reject $H_0$.

TI 83/TI 84 Calculator
• In your calculator, press STAT, select TESTS, and use option 1:Z-Test when $\sigma$ is known or option 2:T-Test when $\sigma$ is unknown.

Homework Practice
• Pg 426 #1-21 odd

TESTING A PROPORTION $\rho$

Intro
• Many situations arise that call for tests of proportions or percentages rather than means. For example, a college registrar may want to determine if the proportion of students wanting 3-week intensive courses has increased.

Note:
• In this section, we will assume that the situations we are dealing with satisfy the conditions underlying the binomial distribution. r is the number of successes out of n trials, and $\hat{p} = r/n$ is the sample proportion. q represents the population probability of failure. We also assume $n\rho > 5$ and $nq > 5$.

Proportion $\rho$
• $z = \dfrac{\hat{p} - \rho}{\sqrt{\rho q / n}}$
• $\hat{p} = r/n$ is the sample test statistic
• n = number of trials
• $\rho$ = proportion specified in $H_0$
• $q = 1 - \rho$

The different tails
• Left-tailed test
• $H_0: \rho = k$
• $H_A: \rho < k$
• Right-tailed test
• $H_0: \rho = k$
• $H_A: \rho > k$
• Two-tailed test
• $H_0: \rho = k$
• $H_A: \rho \neq k$

Example:
• A team of eye surgeons has developed a new technique for a risky eye operation to restore the sight of people blinded from a certain disease. Under the old method, it is known that only 30% of the patients who undergo this operation recover their eyesight.
• Suppose that surgeons in various hospitals have performed a total of 225 operations using the new method and that 88 have been successful. Can we justify the claim that the new method is better than the old one? Use a 1% level of significance.

Answer
• $H_0: \rho = .30$
• $H_A: \rho > .30$
• $\alpha = .01$
• $\hat{p} = \dfrac{88}{225} \approx .39$
• $\rho = .30$, $q = .70$, $n = 225$
• $z = \dfrac{\hat{p} - \rho}{\sqrt{\rho q / n}} = \dfrac{.39 - .30}{\sqrt{.30(.70)/225}} \approx 2.95$
• $P(z > 2.95) \approx 0.0016$
• Since 0.0016 < .01, we reject the null and accept the alternate.
• At the 1% level of significance, the evidence shows that the population probability of success for the new surgery technique is higher than that of the old technique.

Group Work
• A botanist has produced a new variety of hybrid wheat that is better able to withstand drought than other varieties. The botanist knows that for the parent plants, the proportion of seeds germinating is 80%. The proportion of seeds germinating for the hybrid variety is unknown, but the botanist claims it is 80%. To test this claim, 400 seeds from the hybrid plant are tested, and it is found that 312 germinate. Use a 5% level of significance to test the claim that the proportion germinating for the hybrid is 80%.

Group Work
• A recent study claims that 60% of SEC investigations will be dropped. Suppose 500 cases have been reported and 384 cases were dropped. Can we justify that these cases were dropped more often than previously claimed? Use a 5% level of significance.

Group Work
• If you watched Dragon Ball Z, you would know that Super Saiyans are a rare event. It is claimed that 2% of the population on Planet Vegeta are Super Saiyans. In a recent study of 600 families, 30 claimed that their son/daughter is a Super Saiyan. Do these claims signal that families these days produce more Super Saiyans than the stated population proportion? Use a 5% level of significance.

Note:
• The central question in hypothesis testing is whether or not you think the value of the sample test statistic is too far away from the value of the population parameter proposed in $H_0$ to occur by chance alone.
• When you reject the null, are you absolutely certain that you are making a correct decision? The answer is NO! You are simply willing to take a chance that you are making a type I error.

Note cont.
• 1) What if the P-value is so close to $\alpha$ that we “barely” reject or fail to reject the null? In such cases, researchers might attempt to clarify the results by
• Increasing the sample size
• Controlling the experiment to reduce the standard deviation

Note cont. 2
• 2) How reliable is the study and the measurements in the sample?
• When reading the results of a statistical study, be aware of the source of the data and the reliability of the organization doing the study.
• Is the study sponsored by an organization that might profit or benefit from the stated conclusions? If so, look at the study carefully to ensure that the measurements, sampling technique, and handling of data are proper and meet professional standards.

Homework Practice
• Pg 437 #1-22 eoe

TESTS INVOLVING PAIRED DIFFERENCES (DEPENDENT SAMPLES)

Note:
• Many statistical applications use paired data samples to draw conclusions about the difference between two population means. Data pairs occur very naturally in “before and after” situations, where the same object or item is measured both before and after a treatment.
• Examples: psychological studies of identical twins; biological studies of plant growth on plots of land matched for soil type, moisture, and sun, etc.

Example:
• A shoe manufacturer claims that among the general population of adults in the United States, the average length of the left foot is longer than that of the right. To compare the average length of the left foot with that of the right, we can take a random sample of 15 U.S. adults and measure the length of the left foot and then the length of the right foot for each person in the sample. Is there a natural way of pairing the measurements? How many pairs will we have?
• Answer: We can pair each left foot measurement with the same person’s right foot measurement. The person serves as the “matching link” between the two distributions. We will have 15 pairs of measurements.

How to Test Paired Differences Using the Student’s t distribution
• Obtain a simple random sample of n matched data pairs A, B. Let d be a random variable representing the difference between the values in a matched data pair. Compute the sample mean $\bar{d}$ and sample standard deviation $s_d$.
• 1) Use the null hypothesis of no difference, $H_0: \mu_d = 0$.
In the context of the application, choose the alternate hypothesis to be $H_A: \mu_d > 0$, $\mu_d < 0$, or $\mu_d \neq 0$. Set the level of significance $\alpha$.
• 2) If you can assume that d has a normal distribution or simply has a mound-shaped symmetric distribution, then any sample size n will work. If you cannot assume this, then use a sample size $n \ge 30$. Use $\bar{d}$, $s_d$, n, and $\mu_d = 0$ from the null hypothesis to compute the sample test statistic.
• $t = \dfrac{\bar{d} - 0}{s_d/\sqrt{n}} = \dfrac{\bar{d}\sqrt{n}}{s_d}$
• with d.f. = n − 1
• 3) Determine the tail
• 4) Conclude the test by comparing the P-value to $\alpha$
• 5) Interpret

Example:
• A team of heart surgeons at Saint Ann’s Hospital knows that many patients who undergo corrective heart surgery have a dangerous buildup of anxiety before their scheduled operations. The staff psychiatrist at the hospital has started a new counseling program intended to reduce this anxiety. A test of anxiety is given to patients who know they must undergo heart surgery. Then each patient participates in a series of counseling sessions with the staff psychiatrist. At the end of the counseling sessions, each patient is retested to determine anxiety level. Higher scores mean higher levels of anxiety. From the given data, can we conclude that the counseling sessions reduce anxiety? Use a 1% level of significance.

| Patient | B: score before counseling | A: score after counseling | d = B − A (difference) |
|---|---|---|---|
| 1 | 121 | 76 | 45 |
| 2 | 93 | 93 | 0 |
| 3 | 105 | 64 | 41 |
| 4 | 115 | 117 | −2 |
| 5 | 130 | 82 | 48 |
| 6 | 98 | 80 | 18 |
| 7 | 142 | 79 | 63 |
| 8 | 118 | 67 | 51 |
| 9 | 125 | 89 | 36 |

Answer
• $H_0: \mu_d = 0$
• $H_A: \mu_d > 0$ (remember, a positive difference means reduced anxiety)
• $\alpha = 0.01$
• $\bar{d} \approx 33.33$
• $s_d \approx 22.92$
• $t = \dfrac{\bar{d} - 0}{s_d/\sqrt{n}} \approx \dfrac{33.33}{22.92/\sqrt{9}} \approx 4.363$, with d.f. = 9 − 1 = 8
• 0.0005 < P-value < 0.005
• Since the P-value lies between those two numbers, it is less than .01; therefore we reject the null hypothesis and accept the alternate hypothesis.
• At the 1% level of significance, we conclude that the counseling sessions reduce anxiety: we reject the claim that the mean difference between the before and after scores is 0 and accept the alternate hypothesis that it is greater than 0.

Group Work
• Do educational toys make a difference in the age at which a child learns to read? To study this question, researchers designed an experiment in which one group of preschool children spent 2 hours each day in a room well supplied with educational toys. A control group of children spent 2 hours a day in a noneducational toy room. It was anticipated that IQ differences and home environment might be uncontrollable factors unless identical twins could be used. Here is the chart. Use a 1% level of significance. Ages are in months.

| Twin pair | Experimental group, B = reading age | Control group, A = reading age | Difference d = B − A |
|---|---|---|---|
| 1 | 58 | 60 | |
| 2 | 61 | 64 | |
| 3 | 53 | 52 | |
| 4 | 60 | 65 | |
| 5 | 71 | 75 | |
| 6 | 62 | 63 | |

Group Work
• Athersys is a company that uses Multistem to treat patients. Does Multistem really reduce inflammation in a disease (IBD or UC)? The company designed an experiment in which one group receives the Multistem treatment and a control group receives a placebo. Here are the results of 6 patients. Use a 5% level of significance. The Mayo Score runs from 0 to 5; 0 is the best and 5 is the worst.

| Trial | Experimental group, B = Multistem | Control group, A = placebo | Difference d = B − A |
|---|---|---|---|
| 1 | 2 | 3 | |
| 2 | 1 | 4 | |
| 3 | 2 | 2 | |
| 4 | 3 | 4 | |
| 5 | 3 | 3 | |
| 6 | 0 | 2 | |

Group Work
• Are America’s top CEOs really worth all that money? One way to answer this question is to look at the annual company percentage increase in revenue (B) versus the CEO’s annual percentage salary increase (A). Do these data indicate that the population mean percentage increase in corporate revenue is different from the population mean percentage increase in CEO salary?
Use a 5% level of significance.
• B: 24 23 25 18 6 4 21 37
• A: 21 25 20 14 -4 19 15 30

Homework Practice
• Pg 449 #1-19 eoe

TESTING $\mu_1 - \mu_2$ AND $\rho_1 - \rho_2$ (INDEPENDENT SAMPLES)

Note:
• Last section we talked about how to test two DEPENDENT samples. In this section, we turn our attention to tests of differences of means from INDEPENDENT samples, which require new techniques.

Note:
• There will be three situations
  – Testing $\mu_1 - \mu_2$ when $\sigma_1$ and $\sigma_2$ are known
  – Testing $\mu_1 - \mu_2$ when $\sigma_1$ and $\sigma_2$ are unknown
  – Testing $\rho_1 - \rho_2$

Group Work
• What is the difference between independent samples and dependent samples?

Group Work
• Determine if the situation is dependent or independent.
• A teacher wishes to compare the effectiveness of two teaching methods. Students are randomly divided into two groups. The first group is taught by direct instruction. The second group is taught by student-directed learning. At the end of the course, a comprehensive exam is given to all students, and the mean score $\bar{x}_1$ is compared with $\bar{x}_2$. Are the samples independent or dependent? Why?

Answer
• Independent, because the students were randomly divided into two groups.

Group Work
• Determine if the situation is dependent or independent.
• A shoe manufacturer claimed that for the general population of adult US citizens, the average length of the left foot is longer than the average length of the right foot. To study this claim, the manufacturer gathers data in this fashion: sixty adult US citizens are drawn at random, and for these 60 people, both left and right feet are measured. Let $\bar{x}_1$ be the mean length of the left feet and $\bar{x}_2$ be the mean length of the right feet. Are the samples independent or dependent?

Answer
• Dependent: a person’s left foot is related to the same person’s right foot, so the measurements are paired.

How to test $\mu_1 - \mu_2$ when $\sigma_1$ and $\sigma_2$ are known
• Let $\sigma_1$ and $\sigma_2$ be the population standard deviations of populations 1 and 2.
Obtain two independent random samples from populations 1 and 2, where
  – $\bar{x}_1$ and $\bar{x}_2$ are sample means from populations 1 and 2
  – $n_1$ and $n_2$ are the sample sizes from populations 1 and 2
• 1. In the context of the application, state the null and alternate hypotheses and set the level of significance. It is customary to use
  – $H_0: \mu_1 - \mu_2 = 0$
• 2. If you can assume that both population distributions 1 and 2 are normal, any sample sizes $n_1$ and $n_2$ will work. If you cannot assume this, then use sample sizes greater than 30 for both samples.
  – $z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
• 3. Use the standard normal distribution and the type of test, one-tailed or two-tailed, to find the P-value
• 4. Conclude
• 5. Interpret

How to test $\mu_1 - \mu_2$ when $\sigma_1$ and $\sigma_2$ are unknown
• Obtain two independent random samples from populations 1 and 2, where
  – $\bar{x}_1$ and $\bar{x}_2$ are sample means from populations 1 and 2
  – $s_1$ and $s_2$ are sample standard deviations from populations 1 and 2
  – $n_1$ and $n_2$ are the sample sizes from populations 1 and 2
• 1. In the context of the application, state the null and alternate hypotheses and set the level of significance. It is customary to use
  – $H_0: \mu_1 - \mu_2 = 0$
• 2. If you can assume that both population distributions 1 and 2 are normal, any sample sizes $n_1$ and $n_2$ will work. If you cannot assume this, then use sample sizes greater than 30 for both samples.
  – $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$, d.f. = the smaller of $n_1 - 1$ and $n_2 - 1$
• 3. Use Student’s t distribution and the type of test, one-tailed or two-tailed, to find the P-value
• 4. Conclude
• 5. Interpret

Note:
• Null hypotheses
• $H_0: \mu_1 - \mu_2 = 0$ or $H_0: \mu_1 = \mu_2$

Note:
• Alternate hypotheses and the type of test
• $H_A: \mu_1 - \mu_2 < 0$ or $H_A: \mu_1 < \mu_2$: left-tailed test
• $H_A: \mu_1 - \mu_2 > 0$ or $H_A: \mu_1 > \mu_2$: right-tailed test
• $H_A: \mu_1 - \mu_2 \neq 0$ or $H_A: \mu_1 \neq \mu_2$: two-tailed test

Special situation: pooled 2-sample procedure
• In this situation, the samples are independent, but you have reason to believe $\sigma_1 = \sigma_2$.
• An example is the weight of Asians in Asia vs Asians in the U.S.
• $t = \dfrac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$, with d.f. = $n_1 + n_2 - 2$
• The pooled standard deviation s is
• $s = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$

How to test a difference of proportions $\rho_1 - \rho_2$
• Consider two independent binomial experiments.
• Binomial experiment (for both experiment 1 and experiment 2)
• $n_{1,2}$ = number of trials
• $r_{1,2}$ = number of successes
• $\hat{p}_{1,2} = \dfrac{r_{1,2}}{n_{1,2}}$ = sample proportion of successes
• $\rho_{1,2}$ = population probability of success on a single trial
• 1. Use the null hypothesis of no difference, $H_0: \rho_1 - \rho_2 = 0$, and set the level of significance.
• 2. The pooled best estimates for the population probability of success and failure are
  – $\bar{p} = \dfrac{r_1 + r_2}{n_1 + n_2}$ and $\bar{q} = 1 - \bar{p}$
  – $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\bar{p}\bar{q}}{n_1} + \dfrac{\bar{p}\bar{q}}{n_2}}}$
  – Remember: $n_1\bar{p}$, $n_1\bar{q}$, $n_2\bar{p}$, and $n_2\bar{q}$ all have to be greater than 5.
• 3. Determine the tail
• 4. Conclude
• 5. Interpret

Example: (when $\sigma$ is known)
• A consumer group is testing camp stoves. To test the heating capacity of a stove, it measures the time required to bring 2 quarts of water from 50°F to boiling. Two competing models are under consideration. Ten stoves of the first model and 12 stoves of the second model are tested. The following results are obtained:
• Model 1: $\bar{x}_1 = 11.4$ min; $\sigma_1 = 2.5$ min; $n_1 = 10$
• Model 2: $\bar{x}_2 = 9.9$ min; $\sigma_2 = 3.0$ min; $n_2 = 12$
• Assume that the time required to bring water to a boil is normally distributed for each stove. Is there any difference between the performances of these two models? Use a 5% level of significance.

Group Work
• A teacher wishes to compare two teaching methods.
The first group consists of 49 students with a mean score of 74.8 points. The second group has 50 students with a mean score of 81.3 points. The teacher claims that the second method will increase the mean score on the exam. Is this claim justified at the 5% level of significance? Earlier research for the two methods indicates that $\sigma_1 = 14$ points and $\sigma_2 = 15$ points.

Example: (when $\sigma$ is unknown)
• Two competing headache remedies claim to give fast-acting relief. An experiment was performed to compare the mean lengths of time required for bodily absorption of brand A and brand B headache remedies.
• 12 people were randomly selected and given brand A; another 12 were randomly selected and given brand B. The lengths of time in minutes for the drugs to reach a specified level in the blood were recorded.
• Brand A: $\bar{x}_1 = 21.8$ min; $s_1 = 8.7$ min; $n_1 = 12$
• Brand B: $\bar{x}_2 = 18.9$ min; $s_2 = 7.5$ min; $n_2 = 12$
• Past experience with the drug composition of the two remedies permits researchers to assume that both distributions are normal. Let us use a 5% level of significance to test the claim that there is no difference in the mean time for bodily absorption. Find the P-value and evaluate the two drugs.

Group Work
• Suppose the experiment to measure the times in minutes for the headache remedies to enter the bloodstream yielded sample means, sample standard deviations, and sample sizes as follows:
• Brand A: $\bar{x}_1 = 20.1$ min; $s_1 = 8.7$ min; $n_1 = 12$
• Brand B: $\bar{x}_2 = 13.4$ min; $s_2 = 7.6$ min; $n_2 = 8$
• Brand B claims to be faster. Is this claim justified at the 1% level of significance?

Example: Difference of proportions
• CCHS wants to improve student involvement. One method under consideration is to send reminders through texts to all students in the school to participate in school events. As part of the pilot study to determine if this method will actually improve student involvement, a random sample of 1250 students is taken. It is then divided into two groups:
• Group 1: 625 students.
No reminder of school events. The number of participants is 125.
• Group 2: 625 students. Reminders were sent through texts. The number of participants is 501.
• ABS claims that the proportion of students who participated was significantly greater in group 2. Use a 5% level of significance to test the claim that the proportion of students who participate is greater in group 2, the group that received texts.

Group Work
• A sample of 1100 voters was randomly divided into two groups.
• Group 1: 500 voters; no reminders sent; 200 voted
• Group 2: 600 voters; reminders sent; 330 voted
• Do the data support the claim that the proportion of voters who voted was greater in the group that received reminders than in the group that did not? Use a 1% level of significance.

TI 83/TI 84
• Press STAT and select TESTS. Use 2-SampZTest, 2-SampTTest, or 2-PropZTest, as appropriate.

Homework Practice
• Pg 470 #1-20 odd
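
For checking work by computer, the pooled difference-of-proportions procedure from this section can be sketched in Python, applied here to the CCHS texting example (125 of 625 participants without reminders, 501 of 625 with reminders). This is a hedged illustration, not part of the slides or the textbook: the function name `two_proportion_z` is hypothetical, and the left-tailed P-value assumes the alternate hypothesis $H_A: \rho_1 < \rho_2$ (group 2 participates more).

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(r1, n1, r2, n2):
    """Pooled z statistic for H0: rho1 - rho2 = 0 (two independent samples)."""
    p1_hat = r1 / n1
    p2_hat = r2 / n2
    p_bar = (r1 + r2) / (n1 + n2)  # pooled estimate of success probability
    q_bar = 1 - p_bar
    # Condition from the slides: n1*p, n1*q, n2*p, n2*q must all exceed 5.
    if min(n1 * p_bar, n1 * q_bar, n2 * p_bar, n2 * q_bar) <= 5:
        raise ValueError("sample sizes too small for the normal approximation")
    std_error = sqrt(p_bar * q_bar / n1 + p_bar * q_bar / n2)
    return (p1_hat - p2_hat) / std_error


# CCHS example: group 1 = no reminders, group 2 = text reminders.
z = two_proportion_z(r1=125, n1=625, r2=501, n2=625)
p_value = NormalDist().cdf(z)  # left-tailed, assuming HA: rho1 < rho2
alpha = 0.05
reject_h0 = p_value <= alpha
print(f"z = {z:.2f}, reject H0: {reject_h0}")
```

The same function can be reused for the voter group work by changing the four counts; for a right-tailed or two-tailed claim, the P-value line would need to use the right-tail area or twice the smaller tail area instead.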
