Chapter 10 Exercises

1. A sample of 30 observations has $\bar{x} = 137$ and $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2} = 30$. Assuming the data follow a normal distribution, use these values to test whether or not the population mean is different from 150.

2. A variable $X$ follows a normal distribution with unknown mean $\mu$. A sample of 60 observations of $X$ is obtained and the following sample statistics are found:

$$\bar{x} = 915, \qquad s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2} = 80.$$

Use the data to test the following hypotheses:

$$H_0: \mu \le 900 \quad \text{vs.} \quad H_1: \mu > 900.$$

3. A variable $Y$ follows a normal distribution with unknown mean $\mu$ but a known standard deviation of 6. A sample of 5 observations of $Y$ is obtained and the sample average is $\bar{y} = 25$.

(a) Use the data to test the following hypotheses: $H_0: \mu > 30$ vs. $H_1: \mu \le 30$.

(b) Find the p-value of the test and use it to comment on the error probability in relation to the conclusion of the test.

(c) Suppose the sample standard deviation $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})^2} = 5.5$ is also known; repeat the test in (a) using this piece of information. Which of the two tests, (a) or (c), is better?

4. To determine whether a coin is fair, an experiment is carried out in which the coin is tossed 15 times. Out of these 15 tosses, heads is observed 13 times and tails 2 times. Assume these tosses are independent of each other and each toss has the same probability of showing heads. Formulate the hypothesis test, find its exact p-value and draw a conclusion.

5. In an experiment to test the effectiveness of bed-nets in a malaria-infested region, 500 households are provided with bed-nets, and out of these, 100 households reported new infections in the month. The historic rate of new infections in the region is 1 in 4 households every month. Based on the data, is there sufficient evidence to suggest bed-nets are effective in reducing new malaria infections?

6. A student just signed up for a new internet service.
Using the new service, it took him 1.3 days to download a movie, which is shorter than the average download time of 1.55 days using the old service. Assuming download time follows an exponential distribution, if he uses the data to claim that the average download time is shorter under the new service, what is the probability that his claim is false?

7. Tofu, one of the great delicacies from Asia, was invented during the Han Dynasty (206 BC - 220 AD) in China and it was introduced to Japan during the Nara period (710 - 794 AD). Tofu is solidified soy milk made by pouring soy milk over a coagulant. Interestingly, some of the best tofus nowadays are found in Nara, Japan, where tofu shops still make tofu in the same way it was done a few centuries ago. A young man has just inherited a tofu shop from his father and he wants to experiment with a new type of coagulant. Traditional Japanese tofu uses nigari, which is extracted from sea water. The young man made two lots of tofu, one using nigari as the coagulant and the other using the new coagulant. Samples from both lots of tofu were offered to customers and each customer was asked to state his/her preference.

Let $x_1, ..., x_{800}$ be IID Bernoulli($p$) preferences ($X$) from his customers, where $X = 1$ if a customer prefers tofu using the new coagulant and $X = 0$ if the customer prefers tofu using nigari; $p$ represents the proportion of preference for tofu using the new coagulant. Let $Y = \sum_{i=1}^{800} x_i$. Suppose 376 customers prefer tofu using the new coagulant.

(a) State the MLE for $p$, based on the given data.

(b) How many observations of $X$ are there?

(c) What does $Y$ represent and what distribution does $Y$ follow? How many observations of $Y$ are there and what are their values?

(d) The new coagulant is cheaper and easier to obtain, so he would be happy with the new coagulant if customers have no preference between the two. The relevant hypotheses are: $H_0: p = 0.5$ vs. $H_1: p \ne 0.5$. Are the hypotheses 1-sided or 2-sided? State your reason.
(e) Find the p-value of the test. Write one sentence to explain what this p-value represents. Does the p-value you obtain suggest $H_0$ may be untrue?

(f) What is a 5% significance test? In a 5% significance test, what is the chance of a type-one error?

(g) Based on the p-value, what is the conclusion of your test, if a 5% significance level is to be used?

(h) Another way to carry out a 5% significance test is to use a test statistic. Repeat the 5% significance test using a test statistic.

8. Visitors to Nara love to visit its onsens, or hot springs. However, the onsen business is very competitive, especially during economic recessions. The owner of an onsen suspects his business is not doing well but he has no idea how to confirm his hypothesis, so he approaches his daughter, who has just finished a degree in Statistics. She advises him to collect some data and carry out a statistical test. The number of guests followed a Poisson distribution with a rate of 5 per day in the past. Let $\lambda$ be the rate right now. Then the hypotheses of interest are: $H_0: \lambda = 5$ vs. $H_1: \lambda < 5$.

(a) Are the hypotheses 1-sided or 2-sided? State your reason.

(b) On the day they carried out the test, only 1 guest arrived. How many observations are there?

(c) Find the p-value of the test. Write one sentence to explain what this p-value represents. Does the p-value suggest $H_0$ may be untrue?

(d) Based on the p-value, what is the conclusion of the test, if a 5% significance test is to be used?

9. To improve business, they introduce a new package so guests can enjoy unlimited free transportation to the surrounding hillsides and a nine-course Kaiseki dinner. Let $x_1, ..., x_{50}$ be the daily revenue ($X$, in 1000 ¥) in the 50 days following the introduction of the new package. Suppose $X \sim N(\mu, \sigma^2)$, where $\mu$ is the average daily revenue and $\sigma^2$ is the variance of daily revenue. Let

$$\bar{x} = \frac{1}{50}\sum_{i=1}^{50} x_i = 1060 \quad \text{and} \quad s^2 = \frac{1}{50}\sum_{i=1}^{50}(x_i - \bar{x})^2 = 51529.$$
(a) State the MLE for $(\mu, \sigma^2)$, based on the given data.

(b) How many observations of $X$ are there?

(c) Their interest is to determine whether the average revenue has improved compared to the past, so the appropriate hypotheses are: $H_0: \mu < 1000$ vs. $H_1: \mu > 1000$. Are the hypotheses 1-sided or 2-sided? State your reason.

(d) Find the p-value of the test. Write one sentence to explain what this p-value represents. Does the p-value suggest $H_0$ may be untrue?

(e) Based on the p-value, what is the conclusion of the test, if a 5% significance level is to be used?

(f) Another way to carry out a 5% significance test is to use a test statistic. Repeat the 5% significance test using a test statistic.

(g) Repeat (f) using a "t-test" by choosing the appropriate critical value from the table below:

df = n − 1        29      30      40      60      >120
critical value    1.699   1.697   1.684   1.671   1.64

10. Six thousand miles from Nara, in the county of Nottinghamshire, England, the blue cheese makers are facing a different problem from that of the tofu maker in Nara. Since the outbreak of Mad Cow Disease in the 1980s, the government has banned the use of unpasteurized milk in making cheeses. One maker wants to confirm the hypothesis that pasteurization degrades the flavour of her cheeses. So she made two batches of cheeses, one using raw milk and another using pasteurized milk. She then held a cheese tasting for her best customers in a secret location.

Let $x_1, ..., x_{100}$ be the preferences ($X$) between the two types of cheese in 100 customers, where $X = 1$ if the customer liked the cheese made with pasteurized milk and $X = 0$ if he/she preferred the cheese made with raw milk. Suppose $x_1, ..., x_{100}$ are IID Bernoulli($p$), where $p$ represents the proportion among all customers who prefer cheese made with pasteurized milk. Let $\bar{x} = \frac{1}{100}\sum_{i=1}^{100} x_i = \frac{40}{100}$.

(a) State the MLE for $p$, based on the given data.

(b) How many observations of $X$ are there?
(c) The cheese maker's interest is to test the hypotheses: $H_0: p \ge 0.5$ vs. $H_1: p < 0.5$. Are the hypotheses 1-sided or 2-sided? State your reason.

(d) Find the p-value of the test. Write one sentence to explain what this p-value represents. Does the p-value suggest $H_0$ may be untrue?

(e) Based on the p-value, what is the conclusion of your test, if a 5% significance level is to be used?

(f) Another way to carry out a 5% significance test is to use a test statistic. Repeat the 5% significance test using a test statistic.

11. Another cheese maker in the region decided to use a different method to determine whether customers prefer cheeses made with raw milk over those made with pasteurized milk. He recorded the time $T$ (in weeks) between purchases of his raw milk cheeses from 40 of his customers: $t_1, ..., t_{40}$. Suppose $t_1, ..., t_{40}$ are IID Exp($\lambda$), where $\frac{1}{\lambda}$ represents the mean time between purchases. Let $\bar{t} = \frac{1}{40}\sum_{i=1}^{40} t_i = 0.24$ (in weeks).

(a) State the MLE for $\frac{1}{\lambda}$, based on the given data.

(b) How many observations of $T$ are there?

(c) The cheese maker noticed that the average time between purchases for his pasteurized milk cheeses is 0.3 week, so he is interested in testing the hypotheses:

$$H_0: \frac{1}{\lambda} \ge 0.3 \quad \text{vs.} \quad H_1: \frac{1}{\lambda} < 0.3.$$

Are the hypotheses 1-sided or 2-sided? State your reason.

(d) Find the p-value of the test. Write one sentence to explain what this p-value represents. Does the p-value you obtain suggest $H_0$ may be untrue? (You may use the fact that $\operatorname{var}\!\left(\widehat{1/\lambda}\right) = \frac{1}{n\lambda^2}$; however, you should ask yourself why this is so.)

(e) Based on the p-value, what is the conclusion of your test, if a 5% significance level is to be used?

(f) Another way to carry out a 5% significance test is to use a test statistic. Repeat the 5% significance test using a test statistic.

12. A type of cheese that has the texture of tofu is mozzarella cheese. The best mozzarella cheese comes from Italy, where, by law, it has to be made from buffalo milk.
The cost of making mozzarella cheese is very high because an average buffalo produces about half the milk of a cow and harvesting buffalo milk cannot be easily automated. Buffaloes are not indigenous to Italy and in fact, they are found widely in many parts of Asia. An Indian entrepreneur is venturing into the business of making mozzarella cheese in India. There are two possible sources of buffalo milk, either through a co-operative or from the farmers directly. He wants to determine whether there is a difference in the supply of milk from the two sources.

Let $X$ be the daily amount of milk from the co-operative and $Y$ be that from the farmers. Assume $X \sim N(\mu_X, \sigma_X^2)$ and $Y \sim N(\mu_Y, \sigma_Y^2)$, where $(\mu_X, \sigma_X^2)$ and $(\mu_Y, \sigma_Y^2)$ are unknown. Over a course of $n = 30$ days, he obtained the following data on the amount (in 1000 liters) of milk from the two sources:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = 11.6, \qquad s_x^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 = 27.6,$$

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i = 12.7, \qquad s_y^2 = \frac{1}{n}\sum_{i=1}^{n}(y_i - \bar{y})^2 = 32.4.$$

Assume all data are independent of each other.

(a) He is interested in testing the hypotheses: $H_0: \mu_X - \mu_Y = 0$ vs. $H_1: \mu_X - \mu_Y \ne 0$. Are the hypotheses 1-sided or 2-sided? State your reason.

(b) Using the results from Question 5 from Chapter 8 Exercises, write down the test statistic for testing the hypotheses and carry out a 5% significance test.

(c) Repeat (b) using a "t-test" [1] by choosing the appropriate critical value from the table below. In this case, use df $= (n - 1) + (n - 1) = 2n - 2$:

df = 2n − 2       29      30      40      60      >120
critical value    2.045   2.042   2.021   2.000   1.96

(d) He notices that the fluctuations of milk production from both sources are large, which is reflected in the large values of $s_x^2$ and $s_y^2$. These large fluctuations are due to the fact that on some days, such as a holiday or when there is a village wedding, milk production is very low and following such days, production is unusually high.
His interest is in the difference between the mean production from the two sources, not the daily fluctuations. In fact, the fluctuations are a distraction. To gain a better insight into the difference between the two sources, he takes the (daily) difference $x_i - y_i = d_i$, $i = 1, ..., 30$, and arrives at the following:

$$\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i = \frac{1}{n}\sum_{i=1}^{n}(x_i - y_i) = 11.6 - 12.7 = -1.1, \qquad s_d^2 = \frac{\sum_{i=1}^{n}(d_i - \bar{d})^2}{n-1} = 7.$$

He also realises $\mu_X - \mu_Y$ is really the same as $\mu_D$, where $\mu_D$ represents the mean of the daily differences between the two sources in the long run. Therefore, the hypotheses in (a) can be re-written as $H_0: \mu_D = 0$ vs. $H_1: \mu_D \ne 0$. Treating $d_1, ..., d_{30}$ as a sample of observations of $D$, the daily difference, write down the test statistic for the re-written hypotheses and carry out a 5% significance test.

(e) Repeat (d) using a "t-test" [2] by choosing the appropriate critical value from the table below. In this case, use df $= n - 1$ because the test is based on 30 observations of $D$:

df = n − 1        29      30      40      60      >120
critical value    2.045   2.042   2.021   2.000   1.96

[1] This test is often called a two-sample t-test.
[2] This test is often called a paired t-test.

ANSWERS

(1) We first set up the hypotheses. Since the question asks whether the mean ($\mu$) is or is not equal to 150, without specifying the direction of any difference from 150, the hypotheses are two-sided: $H_0: \mu = 150$ vs. $H_1: \mu \ne 150$. We use the following test statistic:

$$Z^* = \frac{\bar{x} - \mu_0}{\sqrt{\hat{\sigma}^2/n}} = \frac{137 - 150}{\sqrt{30^2/30}} = -2.37.$$

Since $\sigma$ is replaced by an estimate $\hat{\sigma} = s$, we compare $Z^*$ to a critical value from the t-table. For $n = 30$, df $= n - 1 = 30 - 1 = 29$, which gives a critical value of 2.045. Since $|Z^*| > 2.045$, $H_0$ should be rejected.

(2) We use the following test statistic:

$$Z^* = \frac{\bar{x} - \mu_0}{\sqrt{\hat{\sigma}^2/n}} = \frac{915 - 900}{\sqrt{80^2/60}} = 1.45.$$

Since $\sigma$ is replaced by an estimate $\hat{\sigma} = s$, we compare $Z^*$ to a critical value from the t-table.
For $n = 60$, df $= n - 1 = 60 - 1 = 59$, but the table has no entry for df $= 59$, so we choose the next lowest available df $= 40$, which gives a critical value of 1.684. Since $|Z^*| < 1.684$, $H_0$ cannot be rejected.

(3a) We use the following test statistic:

$$Z^* = \frac{\bar{y} - \mu_0}{\sqrt{\sigma^2/n}} = \frac{25 - 30}{\sqrt{6^2/5}} = -1.86.$$

Since $|Z^*| = 1.86$ is bigger than 1.64, the critical value for a one-sided test, $H_0$ should be rejected.

(b) Since the test is one-sided, the p-value of the test is $P(Z > 1.86) = 0.0314$. Hence the probability of wrongly rejecting $H_0$ is 3.14%.

(c) We use the following test statistic:

$$Z^* = \frac{\bar{y} - \mu_0}{\sqrt{\hat{\sigma}^2/n}} = \frac{25 - 30}{\sqrt{5.5^2/5}} = -2.03.$$

Since $\sigma$ is replaced by an estimate $\hat{\sigma} = s$, we use a critical value from the t-table. For $n = 5$, df $= n - 1 = 5 - 1 = 4$, which gives a critical value of 2.132. Since $|Z^*| = 2.03 < 2.132$, $H_0$ is not rejected.

We notice that the conclusions from tests (a) and (c) are different. The difference arises because, if we discard $\sigma = 6$ in favour of an estimate derived from a small sample, we must make allowance for the uncertainty in that estimate, which results in a more conservative test and a different conclusion. In practice, if known population values are given ($\sigma = 6$ in this case), we should use them and choose methods that leverage these known values. Hence (a) is the preferred test here.

(4) We first set up the hypotheses. Let $p$ be the probability of heads in a toss of the coin. A fair coin has equal probability of showing heads or tails in a toss, and hence $p_0 = 0.5$ for a fair coin. Since the question asks whether the coin is fair ($p_0 = 0.5$) or not, without specifying the direction of any difference from a fair coin, the hypotheses are two-sided: $H_0: p = 0.5\ (= p_0)$ vs. $H_1: p \ne 0.5$.
Under the null hypothesis, if $X$ represents the total number of heads in $n = 15$ tosses, then $X$ is a random variable that follows a Binomial($n = 15$, $p = 0.5$) distribution. The p-value for a two-sided test is

$$2 \times P(X \ge 13) = 2 \times [P(X = 13) + P(X = 14) + P(X = 15)]$$

$$= 2 \times \left[\binom{15}{13}(0.5)^{13}(1 - 0.5)^{2} + \binom{15}{14}(0.5)^{14}(1 - 0.5)^{1} + \binom{15}{15}(0.5)^{15}(1 - 0.5)^{0}\right] \approx 2 \times 0.00369 \approx 0.0074.$$

Based on this p-value, if the coin is fair, there is a probability of 0.0074 (or about once in 136 times) that an outcome at least as extreme as the one observed (13 heads and 2 tails) occurs. Since this probability is quite small, we are inclined to believe this event is the result of an unfair coin and we reject the hypothesis that the coin is fair.

(5) We first set up the hypotheses. Let $p$ be the probability of new infections in a month after using bed-nets. The historic rate is 1 in 4 households, which implies a probability of $p_0 = 0.25$. Since the interest is in finding out whether bed-nets are effective, this implies a one-sided set of hypotheses: $H_0: p \ge 0.25\ (= p_0)$ vs. $H_1: p < 0.25$. Since the sample size $n = 500$ is rather large, we employ the one-sample test for proportions. The test statistic is:

$$Z^* = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}} = \frac{100/500 - 0.25}{\sqrt{0.25(1 - 0.25)/500}} \approx -2.58.$$

Since $|Z^*| = 2.58$ is much larger than the one-sided critical value of 1.64, there is sufficient evidence to say that bed-nets reduce new infection rates.

(6) If $T$ is the download time under the new service and it follows an Exp($\lambda$) distribution, then the average download time is $E(T) = 1/\lambda$. The probability of error can be framed in a hypothesis testing setting. First, the claim that the new service is faster can be tested by setting up the hypotheses as follows:

$$H_0: \frac{1}{\lambda} \ge 1.55 \quad \text{vs.} \quad H_1: \frac{1}{\lambda} < 1.55,$$

where $H_0$ states that the average download time is no better than under the old service whereas $H_1$ states that the average download time is now shorter. The observed time under the new service is 1.3 days.
To use this observation to test the hypotheses, we assume $H_0$ is true and then determine whether the outcome of 1.3 days is unlikely to be observed, in which case we argue that $H_0$ may not be true. We re-write the hypotheses as

$$H_0^*: \frac{1}{\lambda} = 1.55 \quad \text{vs.} \quad H_1: \frac{1}{\lambda} < 1.55.$$

Under $H_0^*: \frac{1}{\lambda} = 1.55$, outcomes that are at least as unusual as the observed time are all times $T$ that are shorter than 1.3 days. The probability associated with these unusual events is $P(T < 1.3 \text{ days})$. To evaluate this probability, we need to recognise that $T \sim$ Exp($\lambda = 1/1.55$) under $H_0^*$. Hence

$$P(T < 1.3) = \int_0^{1.3} \frac{1}{1.55} e^{-t/1.55}\, dt = 1 - e^{-1.3/1.55} \approx 0.57.$$

Therefore, the p-value of the test is 0.57. The p-value can be interpreted as follows: even if the new service is no better than the old service, there is a 57% chance that a particular download would take less than 1.3 days! This means there is no reason to believe the new service is faster on average. If he insists on claiming that the new service is faster, he has a 57% chance of making a false claim.

(7a) MLE of $p$ is $\hat{p} = \bar{X} = \frac{Y}{n} = \frac{376}{800} = 0.47$.

(b) Each $x_i$, $i = 1, ..., 800$, is an observation of $X$. Therefore, there are $n = 800$ observations.

(c) $Y$ is the total number of customers who prefer the tofu using the new coagulant. $Y \sim$ Bin($n = 800$, $p$). There is only one observation of $Y$ and its value is 376.

(d) The hypotheses are 2-sided because $H_1$ does not specify whether $p > 0.5$ or $p < 0.5$.

(e) The MLE of $p$, $\hat{p}$, is 0.47. We need to find out how unusual the observed value of $\hat{p} = 0.47$ is, if $H_0: p = 0.5$ is true. According to the CLT, in a random sample of size $n$,

$$\hat{p} \sim N\!\left(p,\ \operatorname{var}(\hat{p}) = \frac{p(1-p)}{n}\right) = N\!\left(0.5,\ \frac{0.5(1 - 0.5)}{800} = 0.0003125\right), \quad \text{if } p = 0.5.$$

We can standardize the observed $\hat{p}$ as a Z-score:

$$z^* = \frac{\frac{376}{800} - 0.5}{\sqrt{0.0003125}} = -1.697.$$

If $H_0: p = 0.5$ is true, then outcomes that are at least as unusual as the observed data are those with a $\hat{p}$ further away from 0.5, or those with a Z-score that is more extreme than $-1.697$, which are $Z \le -1.697$ and $Z \ge 1.697$. The associated probabilities are

$$P(Z \ge 1.697) + P(Z \le -1.697) = 2 \times P(Z > 1.697) = \underbrace{2 \times 0.0455}_{\text{from normal table}} = 0.091.$$

Therefore, if $p = 0.5$, the probability is 0.091 (the p-value) that we will observe a value of $\hat{p}$ at least as unusual as $\frac{376}{800}$. Since 0.091 is not a very small value, there is only moderate evidence against $H_0: p = 0.5$.

(f) A 5% significance test:
    Rejects $H_0$ if p-value < 0.05
    Rejects $H_1$ if p-value ≥ 0.05
In a 5% significance test, the probability of a type-one error is 0.05.

(g) Since p-value $= 0.091 > 0.05$, $H_0$ is NOT rejected.

(h) We have already calculated the test statistic in (e):

$$z^* = \frac{\frac{376}{800} - 0.5}{\sqrt{0.0003125}} = -1.697.$$

In a 2-sided 5% significance test, the rule is to:
    Reject $H_0$ if $|z^*| > 1.96$
    Reject $H_1$ if $|z^*| \le 1.96$
Since $|z^*| = 1.697 < 1.96$, we reject $H_1$.

(8a) The hypotheses are 1-sided because in $H_1$, we are interested only in whether $\lambda < 5$.

(b) There is one observation. The value of the observation is 1, which is the number of guests in one day.

(c) The hypotheses of interest are: $H_0: \lambda = 5$ vs. $H_1: \lambda < 5$. We have a single observation, with value 1. We want to find out how unusual this observation is, if $H_0: \lambda = 5$ is true. We can calculate the probabilities of various outcomes under a Poisson($\lambda = 5$) distribution:

k           0        1        2        3        4        5        6        7        8        9        ≥ 10
P(X = k)    0.00674  0.03369  0.08422  0.14037  0.17547  0.17547  0.14622  0.10444  0.06528  0.03627  0.03182

Therefore, if $\lambda = 5$, the probability of outcomes at least as unusual as the observed value of 1 is:

$$P(X \le 1) = 0.00674 + 0.03369 = 0.04043 = \text{p-value}.$$

Since 0.04043 is a moderately small value, there is some evidence against $H_0: \lambda = 5$.
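The tail probability in (8c) can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names `poisson_pmf` and `poisson_lower_tail` are illustrative and not part of the original exercise.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)


def poisson_lower_tail(k_obs: int, lam: float) -> float:
    """P(X <= k_obs): the 1-sided p-value when H1 is 'lambda is smaller'."""
    return sum(poisson_pmf(k, lam) for k in range(k_obs + 1))


# Values taken from Question 8: historic rate 5 per day, 1 guest observed.
p_value = poisson_lower_tail(1, 5.0)
print(round(p_value, 5))  # 0.04043
```

Because the p-value is computed from the exact Poisson probabilities, no normal approximation is needed, which matters here since there is only one observation.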
(d) A 5% significance test:
    Rejects $H_0$ if p-value < 0.05
    Rejects $H_1$ if p-value ≥ 0.05
Since p-value $= 0.04043 < 0.05$, $H_0$ is rejected. They can conclude that there are fewer guests per day than before.

(9a) The MLEs of $\mu$ and $\sigma^2$ are $\hat{\mu} = \bar{x} = \frac{1}{50}\sum_{i=1}^{50} x_i = 1060$ and $\hat{\sigma}^2 = s^2 = \frac{1}{50}\sum_{i=1}^{50}(x_i - \bar{x})^2 = 51529$.

(b) Each $x_i$, $i = 1, ..., 50$, is an observation of $X$. Therefore, there are $n = 50$ observations.

(c) The hypotheses are 1-sided because in $H_1$, we are interested only in whether $\mu > 1000$.

(d) As we discussed in class, we can test the following hypotheses: $H_0^*: \mu = 1000$ vs. $H_1: \mu > 1000$. We want to determine how unusual the observed $\hat{\mu} = 1060$ is, if $H_0^*: \mu = 1000$ is true. Using the CLT,

$$\hat{\mu} \sim N\!\left(\mu,\ \operatorname{var}(\hat{\mu}) = \frac{\sigma^2}{n}\right) \approx N\!\left(1000,\ \frac{\hat{\sigma}^2}{50}\right) = N\!\left(1000,\ \frac{51529}{50}\right), \quad \text{if } \mu = 1000.$$

We can express the observed $\hat{\mu} = 1060$ as a Z-score:

$$z^* = \frac{1060 - 1000}{\sqrt{\frac{51529}{50}}} = 1.869.$$

Outcomes at least as unusual as the observed data are those with a Z-score at least as large, which means $Z \ge 1.869$ (since this is a 1-sided test). The probability of these outcomes is $P(Z \ge 1.869) = 0.0308 =$ p-value. Since 0.0308 is a small value, there is some evidence against $H_0^*: \mu = 1000$ (and against $H_0$).

(e) A 5% significance test:
    Rejects $H_0$ if p-value < 0.05
    Rejects $H_1$ if p-value ≥ 0.05
Since p-value $= 0.0308 < 0.05$, $H_0^*$ (and $H_0$) is rejected.

(f) From (d), the test statistic is:

$$z^* = \frac{1060 - 1000}{\sqrt{\frac{51529}{50}}} = 1.869.$$

In a 1-sided 5% significance test, the rule is to:
    Reject $H_0$ if $|z^*| > 1.64$
    Reject $H_1$ if $|z^*| \le 1.64$
Since $|z^*| = 1.869 > 1.64$, we reject $H_0^*$ (and $H_0$).

(g) Using the t-test, we need to determine df $= n - 1 = 49$. However, the table does not give a critical value corresponding to df $= 49$. In that case, we can choose the critical value corresponding to df $= 40$, which is 1.684 [3]. Since the test statistic is $1.869 > 1.684$, the conclusion is the same as (f).
It is not surprising that we obtain the same conclusions in (f) and (g), since the critical values in (f) and (g) are very similar. This problem highlights the fact that, unless $n$ is really small and the test statistic is borderline significant, using (f) is often sufficient.

[3] A general rule is: when we cannot find a df in the table that corresponds to the calculated df, we should choose the critical value corresponding to the next lowest df that is available.

(10a) MLE of $p$ is $\hat{p} = \bar{x} = \frac{40}{100} = 0.4$.

(b) Each $x_i$, $i = 1, ..., 100$, is an observation of $X$. Therefore, there are $n = 100$ observations.

(c) The hypotheses are 1-sided because in $H_1$, we are interested only in whether $p < 0.5$.

(d) As we discussed in class, we can test the following hypotheses: $H_0^*: p = 0.5$ vs. $H_1: p < 0.5$. The MLE of $p$, $\hat{p}$, is 0.4. We need to find out how unusual $\hat{p} = 0.4$ is, if $H_0^*: p = 0.5$ is true. According to the CLT, in a random sample,

$$\hat{p} \sim N\!\left(p,\ \operatorname{var}(\hat{p}) = \frac{p(1-p)}{n}\right) = N\!\left(0.5,\ \frac{0.5(1 - 0.5)}{100} = 0.0025\right), \quad \text{if } p = 0.5.$$

We can standardize the observed $\hat{p}$ as a Z-score:

$$z^* = \frac{\frac{40}{100} - 0.5}{\sqrt{0.0025}} = -2.$$

If $H_0^*: p = 0.5$ is true, then outcomes that are at least as unusual as the observed data are those with a $\hat{p}$ further below 0.5, or those with a Z-score that is more extreme than $-2$, which are $Z \le -2$ (recall this is a 1-sided test). The associated probability is

$$P(Z \le -2) = P(Z > 2) = \underbrace{0.0228}_{\text{from normal table}}.$$

Therefore, if $p = 0.5$, the probability is 0.0228 (the p-value) that we will observe a value of $\hat{p}$ at least as unusual as $\frac{40}{100} = 0.4$. Since 0.0228 is a small value, there is evidence against $H_0^*: p = 0.5$ (and against $H_0$).

(e) A 5% significance test:
    Rejects $H_0$ if p-value < 0.05
    Rejects $H_1$ if p-value ≥ 0.05
Since p-value $= 0.0228 < 0.05$, $H_0^*$ (and $H_0$) is rejected.

(f) We have already calculated the test statistic in (d):

$$z^* = \frac{\frac{40}{100} - 0.5}{\sqrt{0.0025}} = -2.$$

In a 1-sided 5% significance test, the rule is to:
    Reject $H_0$ if $|z^*| > 1.64$
    Reject $H_1$ if $|z^*| \le 1.64$
Since $|z^*| = 2 > 1.64$, we reject $H_0^*$ (and $H_0$).

(11a) MLE of $\frac{1}{\lambda}$ is $\widehat{1/\lambda} = \bar{t} = 0.24$.

(b) Each $t_i$, $i = 1, ..., 40$, is an observation of $T$. Therefore, there are $n = 40$ observations.

(c) The hypotheses are 1-sided because in $H_1$, we are interested only in whether $\frac{1}{\lambda} < 0.3$.

(d) As we discussed in class, we can test the following hypotheses:

$$H_0^*: \frac{1}{\lambda} = 0.3 \quad \text{vs.} \quad H_1: \frac{1}{\lambda} < 0.3.$$

The MLE of $\frac{1}{\lambda}$ is $\widehat{1/\lambda} = 0.24$. We need to find out how unusual $\widehat{1/\lambda} = 0.24$ is, if $H_0^*: \frac{1}{\lambda} = 0.3$ is true. According to the CLT, in a random sample,

$$\widehat{1/\lambda} \sim N\!\left(\frac{1}{\lambda},\ \operatorname{var}\!\left(\widehat{1/\lambda}\right) = \frac{1}{n\lambda^2}\right) = N\!\left(0.3,\ \frac{0.3^2}{40} = 0.00225\right), \quad \text{if } \frac{1}{\lambda} = 0.3.$$

Note: We determine $\operatorname{var}\!\left(\widehat{1/\lambda}\right)$ as follows:

$$\operatorname{var}\!\left(\widehat{1/\lambda}\right) = \operatorname{var}(\bar{T}) = \operatorname{var}\!\left(\frac{T_1 + ... + T_n}{n}\right) = \frac{1}{n^2}\{\operatorname{var}(T_1) + ... + \operatorname{var}(T_n)\} = \underbrace{\frac{n \operatorname{var}(T)}{n^2}}_{T_1, ..., T_n \text{ iid}} = \underbrace{\frac{1}{n\lambda^2}}_{\operatorname{var}(T) = 1/\lambda^2 \text{ for } T \sim \text{Exp}(\lambda)}$$

We can standardize the observed $\widehat{1/\lambda}$ as a Z-score:

$$z^* = \frac{0.24 - 0.3}{\sqrt{0.00225}} = -1.2649.$$

If $H_0^*: \frac{1}{\lambda} = 0.3$ is true, then outcomes that are at least as unusual as the observed data are those with a $\widehat{1/\lambda}$ further below 0.3, or those with a Z-score that is more extreme than $-1.2649$, which are $Z \le -1.2649$ (this is a 1-sided test). The associated probability is

$$P(Z \le -1.2649) = P(Z > 1.2649) = \underbrace{0.102}_{\text{from normal table}}.$$

Therefore, if $\frac{1}{\lambda} = 0.3$, the probability is 0.102 (the p-value) that we will observe a value of $\widehat{1/\lambda}$ at least as unusual as 0.24. Since 0.102 is a rather big value, there is no evidence against $H_0^*: \frac{1}{\lambda} = 0.3$ (or against $H_0$).

(e) A 5% significance test:
    Rejects $H_0$ if p-value < 0.05
    Rejects $H_1$ if p-value ≥ 0.05
Since p-value $= 0.102 > 0.05$, $H_0^*$ (and $H_0$) is not rejected.

(f) From (d):

$$z^* = \frac{0.24 - 0.3}{\sqrt{0.00225}} = -1.2649.$$

In a 1-sided 5% significance test, the rule is to:
    Reject $H_0$ if $|z^*| > 1.64$
    Reject $H_1$ if $|z^*| \le 1.64$
Since $|z^*| = 1.2649 < 1.64$, we do not reject $H_0^*$ (and $H_0$).

(12a) The test is a 2-sided test since $H_1$ does not specify the direction of the difference.
(b) From Question 5 in Chapter 8 Exercises, we found that the MLE for $\mu_X - \mu_Y$ is $\bar{x} - \bar{y}$. Furthermore, if we let $\hat{\mu}_{X-Y}$ be the MLE of $\mu_X - \mu_Y$, we showed that $\operatorname{var}(\hat{\mu}_{X-Y}) = \frac{\sigma_X^2}{n} + \frac{\sigma_Y^2}{n}$, since the two samples have the same size. Therefore, from the CLT for MLE, and replacing the unknown variances by their estimates $s_x^2$ and $s_y^2$, we can deduce that a test statistic for the hypotheses is:

$$\frac{\bar{x} - \bar{y}}{\sqrt{\frac{\sigma_X^2}{n} + \frac{\sigma_Y^2}{n}}} \approx \frac{11.6 - 12.7}{\sqrt{\frac{27.6}{30} + \frac{32.4}{30}}} = \frac{-1.1}{\sqrt{2}} \approx -0.7778.$$

Since this is a 2-sided test, the critical value is 1.96 for a 5% significance test. Since $|-0.7778| < 1.96$, there is no evidence of any difference between the two sources.

(c) Since $n = 30$, df $= n + n - 2 = 58$. However, there is no corresponding df in the table. Therefore, we use the critical value corresponding to df $= 40$, which is 2.021. Since $|-0.7778| < 2.021$, the conclusion is identical to that in (b).

(d) Based on the data, $d_1, ..., d_{30}$ can be seen as 30 observations of $D$, the daily difference. His interest is to determine whether the long run average of $D$, which is $\mu_D$, is zero or not. Thus, the hypotheses can be tested using the same type of test statistic we have been using, viz.:

$$\frac{\bar{d}}{s_d/\sqrt{n}} = \frac{-1.1}{\sqrt{7/30}} \approx -2.2772.$$

This statistic gives a p-value < 0.05 since $|-2.2772| > 1.96$; therefore $H_0$ is rejected.

The reason for the difference between the test statistics in (b) and (d) is as follows. Under (b), the hypotheses are tested by separately estimating $\mu_X$ and $\mu_Y$ by $\bar{x}$ and $\bar{y}$, both of which are imprecise estimates because there is high variation in the production of milk. Therefore, using them to test $\mu_X - \mu_Y$ is a poor choice. On the other hand, after taking paired differences, the large variations disappear (note that $s_d^2$ is much smaller than $s_x^2$ and $s_y^2$). Therefore, using $\bar{d}$ gives a much more sensitive test of the hypotheses.

There are three conditions for carrying out a test similar to the one in (d):

1. The sample size in both samples must be equal, i.e., $n$ for both samples.
2. Each observation in one sample is uniquely paired in some meaningful way to an observation in the second sample. In this case, production from the two sources on the same day is paired.
3. The samples should be positively correlated. This can be determined by visual inspection or by observing that $s_x^2$ and $s_y^2$ are large relative to the value of $s_d^2$.

We notice that the test in (d) is essentially a one-sample test using the $n$ observed differences $d_i$. So this test is no different in concept from the ones that we have been using, for example in Questions 1 and 3-5.

(e) There are 30 pairs of differences, so $n = 30$ and df $= n - 1 = 29$. Therefore, we use the critical value 2.045. Since $|-2.2772| > 2.045$, the conclusion is identical to that in (d).
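The contrast between the two-sample statistic in (12b) and the paired statistic in (12d) can be reproduced from the summary statistics alone. The following is a rough sketch using only the Python standard library; the helper `two_sided_normal_p` and the variable names are illustrative assumptions, and the p-values use the normal approximation (critical value 1.96) rather than the t-table.

```python
from math import erfc, sqrt


def two_sided_normal_p(z: float) -> float:
    """2-sided p-value P(|Z| >= |z|) for a standard normal Z."""
    return erfc(abs(z) / sqrt(2))


# Summary statistics from Question 12 (amounts in 1000 liters).
n = 30
x_bar, y_bar = 11.6, 12.7
s2_x, s2_y = 27.6, 32.4   # sample variances of the two sources
s2_d = 7.0                # sample variance of the daily differences

# (12b): two-sample statistic, treating the samples separately.
z_two_sample = (x_bar - y_bar) / sqrt(s2_x / n + s2_y / n)

# (12d): paired statistic, based on the 30 daily differences.
z_paired = (x_bar - y_bar) / sqrt(s2_d / n)

print(round(z_two_sample, 4), round(two_sided_normal_p(z_two_sample), 3))
print(round(z_paired, 4), round(two_sided_normal_p(z_paired), 3))
```

Both statistics share the numerator $\bar{x} - \bar{y} = \bar{d}$; only the denominator changes, which is why removing the shared day-to-day fluctuation makes the paired test so much more sensitive.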
