MA238 Assignment 5 Solutions

(i) Single sample tests.

Question 1.

(a) H0: μ = 1250 sq. ft   HA: μ < 1250 sq. ft
(b) H0: μ = 32 mpg   HA: μ > 32 mpg
(c) H0: μ = 5 mm   HA: μ ≠ 5 mm

Question 2.

(i) What are the null and alternative hypotheses for this study?

H0: μ = 3   HA: μ ≠ 3

(ii) In the context of this study, interpret making a Type I error; interpret making a Type II error.

A Type I error (i.e. reject the null hypothesis H0 when it was in fact true) in this study would amount to deciding that the process is out of control when it is in fact working fine, i.e. assign the problem solving team to work on a problem that is not actually there.

A Type II error (i.e. do not reject the null hypothesis H0 when it was in fact false) in this study would amount to deciding that the process is fine when in fact it isn't, i.e. don't assign the problem solving team when there is actually a problem there for them to fix.

(iii) This is the cost of a Type I error given the answer to part (ii) above.

(iv) As the sample size is greater than 30 the CLT applies. As the level of significance was specified as α = 0.05, the critical value for this two-tailed test is $z_{\alpha/2} = z_{0.025} = 1.96$, resulting in a critical region of z > 1.96 or z < −1.96.

(v)
$$z = \frac{3.005 - 3}{0.023/\sqrt{100}} = 2.174$$
and since z > 1.96, we reject the null hypothesis at α = 0.05, and assign the problem solving team.

Recall that the probability, computed assuming that H0 is true (i.e. that the machine is on target), that the test statistic would take a value as extreme or more extreme than that actually observed, is called the p-value for this test. The smaller the p-value, the stronger the evidence against H0 provided by the data.

In this question the p-value is calculated as twice P(z* ≥ 2.174), which from the Normal tables is estimated as 2(1 − 0.9850) = 0.03. Recall that we specified a two-sided test at the outset and hence we need to double the probability, as the rejection region is split across both tails of the distribution. This is interpreted as a 3% chance of seeing a value of z* as extreme as the one we observed if the null hypothesis is actually true (i.e. a 3 in a hundred chance of getting a test statistic as extreme as we did if indeed the null hypothesis was true). As this is less than the specified α of 0.05 (i.e. any value of z more 'extreme' than 5 in a hundred will lead us to reject H0) we reject the null hypothesis.

(vi) A 95% C.I. for the true mean diameter is calculated as
$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
where $z_{\alpha/2} = 1.96$ as a 95% C.I. is required. This works out as
$$3.005 \pm 1.96 \times \frac{0.023}{\sqrt{100}} = [3.0005,\ 3.0095].$$
Note that this interval does not contain the value 3mm, the mean value when the machine is on target as specified in the null hypothesis. As the interval estimate for the true mean is not in agreement with the null hypothesis we have further evidence that the null hypothesis is false. It is worth noting that the machine is not a lot off target, i.e. on average between 0.0005mm and 0.0095mm.

(vii)
$$z = \frac{2.997 - 3}{0.023/\sqrt{100}} = -1.30$$
and since −1.96 < z < 1.96, we do not reject the null hypothesis at α = 0.05, and we do not assign the problem solving team.

In this question the p-value is calculated as twice P(z* ≤ −1.30), which from the Normal tables is estimated as 2(1 − 0.9032) = 0.19. This is interpreted as a 19% chance of seeing a value of z* as extreme as the one we observed if the null hypothesis is actually true. As this is greater than the specified α of 0.05 we do not reject the null hypothesis, and claim that the data we collected are quite consistent with the null hypothesis and that the difference between the sample mean and the hypothesised value of 3mm is due to natural sampling variation.

Question 3.

The information given in the question is as follows:

μ (the hypothesised population mean) = 10.5
n (the sample size) = 120
x̄ (the sample mean) = 8.9
s (the sample standard deviation) = 5.2

and you are asked to decide, based on the sample statistics provided, whether the true mean age of delinquents is strictly less than 10.5 or not. Using the strategy outlined in the lectures:

1. State the Null and Alternative Hypotheses

H0: μ = 10.5
HA: μ < 10.5

Note that the sociologist is only interested in testing for a mean strictly less than 10.5 and therefore we have a one-sided test. This will become important when deciding on an appropriate Critical Region.

2. Calculate an appropriate test statistic (TS)

As the sample size is greater than 30 the Central Limit Theorem applies and a suitable test statistic is
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
which, for this example, works out as
$$z = \frac{8.9 - 10.5}{5.2/\sqrt{120}} = -3.37$$
Note that as the sample size is large enough we substitute s (the sample standard deviation) for σ (the population standard deviation).

3. Distribution of TS if H0 true: As the sample size is large enough, we know from the Central Limit Theorem that if H0 is true, z has a Normal distribution with mean 0 and variance 1, i.e. if H0 is true, we would expect z to be somewhere around 0 and not expect it to be too far in either direction from 0.

4. Decide on the 'Significance Level' α

A significance level was not specified by the question so it is up to you to choose one! Let's choose a significance level of α = 0.05 (i.e. you have a 5% chance of rejecting the null hypothesis when it is in fact true, a so called 'Type 1 Error'). From your knowledge of the normal distribution you know that 5% of all observations having a N(0,1) distribution are to the left of −1.65 (this is given to you in the formula sheet but you should be able to read this off the Z table), so a suitable decision rule would be to decide that any values of z less than −1.65 (i.e. to the left of it) are unlikely to occur due to sampling variation alone and therefore represent an extreme result which is not in keeping with what we would expect if the null hypothesis was indeed true.

Our critical region therefore (using α = 0.05) comprises any value of z that is < −1.65 (see the figure below).

[Figure: distribution of z scores; critical region (5% of z scores) to the left of −1.65, acceptance region (95% of z scores) to the right.]

5. Check whether the value of the TS is in the critical region and make a decision.

In this example, z = −3.37 which is considerably less than −1.65, and hence there is convincing evidence that the null hypothesis is false.

In this question the p-value is calculated as P(z* ≤ −3.37). Recall that we specified a one-sided test at the outset and hence we need only consider the probability in one tail of the distribution. From the Normal tables this is estimated as (1 − 0.9996) = 0.0004. This is interpreted as a 4 in 10,000 chance of seeing a value of z* as extreme as the one we observed if the null hypothesis is actually true. As this is much smaller than the specified α of 0.05 we have very strong evidence against the null hypothesis: we reject it and claim that the data we collected are not at all consistent with the null hypothesis, and that the difference between the sample mean and the hypothesised value of 10.5 is not due to natural sampling variation alone.

Conclusion. On the basis of the hypothesis test above, there is strong evidence (at α = 0.05) that the mean age of bicycle thieves is actually less than 10.5 years and not equal to 10.5 years as stated by the police chief.

Question 4.

The information given in the question is as follows:

μ (the hypothesised population mean) = €108.65
n (the sample size) = 100
x̄ (the sample mean) = €114.27
s (the sample standard deviation) = €8.30

and you are asked to decide, based on the sample statistics provided, whether the claim that the true average size of a delinquent charge account is different from €108.65 is true or not. Using the strategy outlined in the lectures:

1. State the Null and Alternative Hypotheses

H0: μ = 108.65
HA: μ ≠ 108.65

Note that the retail credit association is interested in testing for a mean delinquent charge account different from 108.65 and therefore we have a two-sided test. This will become important when deciding on an appropriate Critical Region.

2. Calculate an appropriate test statistic (TS)

As the sample size is greater than 30 the Central Limit Theorem applies and a suitable test statistic is
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
which, for this example, works out as
$$z = \frac{114.27 - 108.65}{8.3/\sqrt{100}} = 6.77$$
Note that as the sample size is large enough we substitute s (the sample standard deviation) for σ (the population standard deviation).

3. Distribution of TS if H0 true: If H0 is true then z has a Normal distribution with mean 0 and variance 1 (i.e. N(0,1)).

4. Decide on the 'Significance Level' α

A significance level of α = 0.05 is specified in the question (i.e. you have a 5% chance of rejecting the null hypothesis when it is in fact true, a so called 'Type 1 Error'). Remember that you are interested in testing for a difference in the population mean in both directions (i.e. the true mean could be bigger than that stated in the null hypothesis or less than that stated in the null hypothesis) and you want to have only a 5% chance of being wrong. From your knowledge of the normal distribution you know that 2.5% of all observations having a N(0,1) distribution are to the left of −1.96 and 2.5% are to the right of +1.96 (this is given to you in the formula sheet), so a suitable decision rule would be to decide that any values of z less than −1.96 (i.e. to the left of it) or greater than +1.96 (i.e. to the right of it) are unlikely to occur due to sampling variation alone and therefore represent an extreme result which is not in keeping with what we would expect if the null hypothesis was indeed true. Notice that you have spread the 0.05 over the two tails of the distribution to give you an overall significance level of 0.05.

Our critical region therefore (using α = 0.05) comprises any value of z that is < −1.96 or > +1.96 (see the figure below).

[Figure: distribution of z scores; acceptance region (95% of z scores) between −1.96 and 1.96, with 2½% of z scores in each tail.]

5. Check whether the value of the TS is in the critical region and make a decision.

In this example, z = 6.77 which is considerably more than 1.96, and hence there is convincing evidence that the null hypothesis is false.

Conclusion. On the basis of the hypothesis test above, there is strong evidence (at α = 0.05) that the true average size of a delinquent charge account is not €108.65 as claimed.

The second part of the question asks you to provide a guess as to what you think the true average size of a delinquent charge account is likely to be, given that you have just disputed the claim that it is €108.65. To do this you need to provide a guess at the true unknown mean using a confidence interval. You are specifically asked to calculate a 95% C.I. which, as the sample size is > 30, is calculated as follows (see assignment 2):
$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
where $z_{\alpha/2} = 1.96$ as a 95% C.I. is required. This works out as
$$114.27 \pm 1.96 \times \frac{8.3}{\sqrt{100}} = [112.64,\ 115.90].$$
Hence, we can claim that it is quite likely that the true average size of a delinquent charge account is somewhere between €112.64 and €115.90. Notice the agreement between the confidence interval and the hypothesis test, in that the parameter value specified in the null hypothesis (€108.65) is not contained in the interval that we claim is likely to contain the true value.

Question 5.

The information given in the question is as follows:

μ (the hypothesised population mean) = 196 pins
n (the sample size) = 50
x̄ (the sample mean) = 188 pins
s (the sample standard deviation) = 24.9

and you are asked to decide, based on the sample statistics provided, whether the true average number of pins is less than 196 or not (i.e. has his average score worsened). Using the strategy outlined in the lectures:

1. State the Null and Alternative Hypotheses

H0: μ = 196
HA: μ < 196

Note that the player is interested in testing for a mean bowling score less than 196 and therefore we have a one-sided test. This will become important when deciding on an appropriate Critical Region.

2. Calculate an appropriate test statistic (TS)

As the sample size is greater than 30 the Central Limit Theorem applies and a suitable test statistic is
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
which, for this example, works out as
$$z = \frac{188 - 196}{24.9/\sqrt{50}} = -2.27$$
Note that as the sample size is large enough we substitute s (the sample standard deviation) for σ (the population standard deviation).

3. Distribution of TS if H0 true: If H0 is true then z has a Normal distribution with mean 0 and variance 1.

4. Decide on the 'Significance Level' α

A significance level of α = 0.01 is specified by the question (i.e. you have a 1% chance of rejecting the null hypothesis when it is in fact true, a so called 'Type 1 Error'). Remember that you are interested in testing for a difference in the population mean in one direction only (i.e. the true mean is less than stated in the null hypothesis). From your knowledge of the normal distribution you know that 1% of all observations having a N(0,1) distribution are to the left of −2.33 (this is given to you in the formula sheet), so a suitable decision rule would be to decide that any values of z less than −2.33 (i.e. to the left of it) are unlikely to occur due to sampling variation alone and therefore represent an extreme result which is not in keeping with what we would expect if the null hypothesis was indeed true.

Our critical region therefore (using α = 0.01) comprises any value of z that is < −2.33 (see the figure below).
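The one-sample calculations above (machine diameter, age, charge account, bowling score) all follow the same recipe, so a short Python sketch can be used to check the arithmetic. This is only an illustration under stated assumptions: the function names `one_sample_z`, `critical_value` and `z_interval` are made up for this example and are not part of the assignment, and the quantiles come from Python's standard `statistics.NormalDist` rather than from the printed Normal tables, so the one-sided 5% cut-off appears as 1.645 where the tables round to 1.65.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # N(0, 1), the distribution of z when H0 is true


def one_sample_z(xbar, mu0, s, n, alternative="two-sided"):
    """Return (z, p_value) for H0: mu = mu0.

    s is substituted for sigma, which the text allows because n > 30.
    alternative is "two-sided", "less" or "greater" (hypothetical labels).
    """
    z = (xbar - mu0) / (s / sqrt(n))
    if alternative == "less":
        p = STD_NORMAL.cdf(z)               # lower tail only
    elif alternative == "greater":
        p = 1 - STD_NORMAL.cdf(z)           # upper tail only
    else:
        p = 2 * (1 - STD_NORMAL.cdf(abs(z)))  # both tails
    return z, p


def critical_value(alpha, alternative="two-sided"):
    """Positive cut-off c; reject if z < -c, z > c, or |z| > c as appropriate."""
    tail = alpha / 2 if alternative == "two-sided" else alpha
    return STD_NORMAL.inv_cdf(1 - tail)


def z_interval(xbar, s, n, confidence=0.95):
    """Large-sample confidence interval xbar +/- z * s / sqrt(n)."""
    half_width = critical_value(1 - confidence) * s / sqrt(n)
    return xbar - half_width, xbar + half_width


# Machine diameter, two-sided test at alpha = 0.05
z, p = one_sample_z(3.005, 3, 0.023, 100)
print(round(z, 3), round(p, 2), round(critical_value(0.05), 2))

# Bowling score, one-sided (lower tail) test at alpha = 0.01
z, p = one_sample_z(188, 196, 24.9, 50, alternative="less")
print(round(z, 2), round(critical_value(0.01, "less"), 2))

# 95% interval for the true mean diameter
low, high = z_interval(3.005, 0.023, 100)
print(round(low, 4), round(high, 4))
```

Rounded to the precision used in the text, the statistics agree with the values quoted above (2.174, −3.37, 6.77 and −2.27). Keeping the tail choice as an explicit argument mirrors the point made in step 1 of each question: whether the test is one-sided or two-sided decides both the critical value and how the p-value is computed.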
[Figure: distribution of z scores; critical region (1% of z scores) to the left of −2.33, acceptance region (99% of z scores) to the right.]

5. Check whether the value of the TS is in the critical region and make a decision.

In this example, z = −2.27 which is not less than −2.33, and hence there is no convincing evidence that the null hypothesis is false. Note that the z score is nearly in the critical region, so even though we cannot claim that the null hypothesis is false (while having only a 1% chance of being wrong), the result is quite worrying in that the z score was quite extreme (although not extreme enough to reject H0). The bowler should keep account of a few more scores and then reanalyze the data and see how extreme his z score is then.

Conclusion. On the basis of the hypothesis test above, there is no evidence (at α = 0.01) that the bowler's average number of pins has reduced significantly from 196 (i.e. that the new ball is affecting his performance on average).

MA238 Assignment 5 Solutions (part b)

(ii) Two Sample Tests

Question 6.

The information given in the question is as follows:

                   Sample size (n)   Sample mean (x̄)   Sample standard deviation (s)
Group 1 (25mg)     50                7                  6
Group 2 (50mg)     50                10                 7

and you are asked to decide, based on the sample statistics provided, whether there is evidence to suggest a difference in the average weight gain for all patients on the 25mg dose compared to all patients on the 50mg dose (i.e. does it look like one group gains more weight on average compared to the other group). Looking at the sample statistics alone it looks like the 50mg group has a higher average weight gain than the 25mg group (i.e. the 50mg group appear to have gained 3 pounds more on average) but this may be just due to sampling variation and consequently a formal hypothesis test is needed. Using the strategy outlined in the lectures:

1. State the Null and Alternative Hypotheses

H0: μ1 = μ2 (i.e. there is no difference in the average weight gain for patients on the 25mg dose compared to patients on the 50mg dose)
HA: μ1 ≠ μ2 (i.e. there is a difference in the average weight gain for patients on the 25mg dose compared to patients on the 50mg dose)

Note that you are interested in testing for a mean difference in both directions (i.e. the mean for the 25mg group could be bigger or smaller than that for the 50mg group) and therefore we have a two-sided test. This will become important when deciding on an appropriate Critical Region.

2. Calculate an appropriate test statistic (TS)

As both the samples are of size greater than 30 the Central Limit Theorem applies and a suitable test statistic is
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
which, for this example, works out as
$$z = \frac{7 - 10}{\sqrt{\dfrac{6^2}{50} + \dfrac{7^2}{50}}} = -2.30$$
Note that as the sample size is large enough we substitute s1, s2 (the sample standard deviations) for σ1 and σ2 (the population standard deviations).

3. Distribution of TS if H0 true: As the sample size is large enough, we know from the Central Limit Theorem that if H0 is true, z has a Normal distribution with mean 0 and variance 1, i.e. if H0 is true, we would expect z to be somewhere around 0 and not expect it to be too far in either direction from 0.

4. Decide on the 'Significance Level' α

A significance level was not specified by the question so it is up to you to choose one! Let's choose a significance level of α = 0.05 (i.e. you have a 5% chance of rejecting the null hypothesis when it is in fact true, a so called 'Type 1 Error').

From your knowledge of the normal distribution you know that 2.5% of all observations having a N(0,1) distribution are to the left of −1.96 and that 2.5% of all observations are to the right of 1.96 (this is given to you in the formula sheet but you should be able to read this off the Z table), so a suitable decision rule would be to decide that any values of z less than −1.96 or greater than 1.96 are unlikely to occur due to sampling variation alone and therefore represent an extreme result which is not in keeping with what we would expect if the null hypothesis was indeed true.

Our critical region therefore (using α = 0.05) comprises any value of z that is < −1.96 or > 1.96 (see the figure below).

[Figure: distribution of z scores; acceptance region (95% of z scores) between −1.96 and 1.96, with 2½% of z scores in each tail.]

5. Check whether the value of the TS is in the critical region and make a decision.

In this example, z = −2.30 which is less than −1.96, and hence there is convincing evidence that the null hypothesis is false (i.e. convincing evidence that the 50mg group has in fact a greater average weight gain compared to the 25mg group).

Conclusion. On the basis of the hypothesis test above, there is strong evidence (at α = 0.05) that the 50mg group has in fact a greater average weight gain compared to the 25mg group over the 12 week period.

As before, a hypothesis test will only indicate whether there is evidence of a significant difference (i.e. a departure from the null hypothesis that is not due to sampling variation alone) but will not provide you with an estimate of what the difference is likely to be. In order to do this you need to calculate a confidence interval (using a required degree of confidence). We saw in lectures that
$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\,\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
will provide you with a 100(1−α)% confidence interval for the difference in two population means (e.g. α equal to 0.05 will give you a 95% C.I., α equal to 0.01 will give you a 99% C.I. etc.). You were asked in this question to calculate a 95% confidence interval (i.e. use α = 0.05, consequently $z_{\alpha/2} = 1.96$) which amounts to evaluating
$$(7 - 10) \pm 1.96 \times \sqrt{\frac{6^2}{50} + \frac{7^2}{50}} = [-5.56,\ -0.44].$$
We interpret this interval as follows: our best guess at the true difference in average weight gains for the 25mg group minus the 50mg group (make sure you are clear about the order!) is likely to be between −5.56 pounds and −0.44 pounds (i.e. the 50mg group are likely to gain on average between 0.44 and 5.56 pounds more compared to the 25mg group over the 12 week period). Notice that this interval does not contain 0 (which would represent 'no difference' in average weight gain between the two groups) and is in agreement with the hypothesis test.

Question 7.

The information given in the question is as follows:

             Sample size (n)   Sample mean (x̄)   Sample standard deviation (s)
Rocket 1     8                 36                 15
Rocket 2     10                52                 18

and you are asked to decide, based on the sample statistics provided, whether there is evidence to suggest that the second kind of rocket is worse than the first in terms of its mean target error (i.e. on the basis of the sample data provided, does it look like Rocket 2 is worse than Rocket 1 in terms of mean target error). Looking at the sample statistics alone it looks like Rocket 2 has a mean target error 16 units higher than Rocket 1, but this may be just due to sampling variation and consequently a formal hypothesis test is needed. Using the strategy outlined in the lectures:

1. State the Null and Alternative Hypotheses

H0: μ1 = μ2 (i.e. there is no difference in the mean target error for Rocket 1 compared to Rocket 2)
HA: μ1 < μ2 (i.e. the mean target error for Rocket 1 is strictly less than that for Rocket 2)

Note that you are interested in testing for a mean difference in one direction only and therefore we have a one-sided test.
This will become important when deciding on an appropriate Critical Region.

2. Calculate an appropriate test statistic (TS)

As neither of the samples is of size greater than 30, the Central Limit Theorem does not apply and we need to use the t-distribution. Remember that there are two crucial assumptions we have to make in order to use the t-distribution in this context and they are as follows:

1. Do both samples come from populations that are normally distributed?
2. Are the variances of both populations equal?

We can assume that assumption 1 is true as it is likely that the mean target error should be normally distributed given the nature of the measurement. We cannot be sure about the variance assumption but we do know (from the lectures) that if the sample sizes are similar then this assumption is not too important.

Given these decisions we can use
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
as a suitable test statistic in order to compare two sample means to make statements about two population means, resulting in
$$t = \frac{36 - 52}{\sqrt{\dfrac{15^2}{8} + \dfrac{18^2}{10}}} = -2.06$$

3. Distribution of TS if H0 true: We know that if H0 is true, t has a t-distribution with min(n1 − 1, n2 − 1) degrees of freedom (i.e. if H0 is true, we would expect t to be somewhere around 0 and not expect it to be too far in either direction from 0).

4. Decide on the 'Significance Level' α

A significance level of α = 0.05 was specified in the question (i.e. you have a 5% chance of rejecting the null hypothesis when it is in fact true, a so called 'Type 1 Error'. This would amount to you saying that Rocket 2 was worse than Rocket 1 when in fact there was no difference and you were just analysing two 'extreme' samples). From your knowledge of the t-distribution tables you know that 5% of all observations having a t-distribution with 7 (i.e. min(8 − 1, 10 − 1)) degrees of freedom are to the left of −1.895 (this is given to you in the t tables by looking down the 0.05 column and across the 7 df row), so a suitable decision rule would be to decide that any values of t less than −1.895 are unlikely to occur due to sampling variation alone and therefore represent an extreme result which is not in keeping with what we would expect if the null hypothesis was indeed true.

Our critical region therefore (using α = 0.05) comprises any value of t that is < −1.895 (see the figure below).

[Figure: distribution of t scores; critical region (5% of t scores) to the left of −1.895, acceptance region (95% of t scores) to the right.]

5. Check whether the value of the TS is in the critical region and make a decision.

In this example, t = −2.06 which is less than −1.895, and hence there is convincing evidence that the null hypothesis is false (i.e. convincing evidence that Rocket 2 has a higher mean target error compared to Rocket 1).

Conclusion. On the basis of the hypothesis test above, there is strong evidence (at α = 0.05) that Rocket 2 has a higher mean target error when compared to Rocket 1, and Rocket 1 should be used in practice as it is more accurate.

Note you were not asked to provide a confidence interval in this question but you can make a simple guess of the difference in the mean target error by using the sample means, i.e. the mean target error for Rocket 2 is probably around 16 units more than that for Rocket 1.

Question 8.

(i)

1. State the Null and Alternative Hypotheses

H0: μA = μB (i.e. there is no difference in the population mean talking time between the two batteries)
HA: μA ≠ μB (i.e. there is a difference in the population mean talking time between the two batteries)

Note that you are interested in testing for a mean difference in both directions (i.e. the population mean for the nickel-cadmium battery could be bigger or smaller than that for the nickel-metal hydride battery) and therefore we have a two-sided test.

2. Calculate an appropriate test statistic (TS)

As neither of the samples is of size greater than 30, the Central Limit Theorem does not apply and the t-distribution is used instead. There are two crucial assumptions to make in order to use the t-distribution in this context and these are as follows:

1. Do both samples come from populations that are normally distributed?
2. Are the variances of both populations equal?

Assume that the mean talking time is normally distributed for both batteries and, given that the sample sizes and sample variances are similar, assume that the variance assumption is valid.

Given these decisions use
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
as a suitable test statistic in order to compare two sample means to make statements about two population means, which, for this example, works out as
$$t = \frac{70.75 - 79.23}{\sqrt{\dfrac{13.99^2}{25} + \dfrac{15.03^2}{25}}} = -2.06$$

3. Distribution of TS if H0 true: We know that if H0 is true, t has a t-distribution with min(n1 − 1, n2 − 1) degrees of freedom (i.e. if H0 is true, we would expect t to be somewhere around 0 and not expect it to be too far in either direction from 0 under a t-distribution with min(25 − 1, 25 − 1) = 24 degrees of freedom).

4. Decide on the 'Significance Level' α

A significance level of α = 0.01 was specified in the question (i.e. you have a 1% chance of rejecting the null hypothesis when it is in fact true, a so called 'Type 1 Error'. This would amount to you saying that there was a difference in the average talking time when in fact there was not and you were just analysing two 'extreme' samples). From your knowledge of the t-distribution tables you know that half a percent (i.e. α/2 = 0.005) of all observations having a t-distribution with 24 degrees of freedom are to the left of −2.797 and that 0.5% of all observations are to the right of 2.797 (this is given to you in the t tables by looking down the 0.005 column and across the 24 df row), so a suitable decision rule would be to decide that any values of t less than −2.797 or greater than 2.797 are unlikely to occur due to sampling variation alone and therefore represent an extreme result which is not in keeping with what we would expect if the null hypothesis was indeed true.

The critical region therefore (using α = 0.01) comprises any value of t that is < −2.797 or > 2.797 (see the figure below).

[Figure: t distribution (24 df); acceptance region (99% of t scores) between −2.797 and 2.797, with ½% of t scores in each tail.]

5. Check whether the value of the TS is in the critical region and make a decision.

In this example, t = −2.06 which is not in the critical region, and hence there is no convincing evidence (at significance level α = 0.01) that the null hypothesis is false, i.e. it is quite plausible that we could get such a difference in sample means due to sampling variation alone if H0 was indeed true.

Conclusion. No evidence of a significant difference (at α = 0.01) in the true average talking time between the two battery types.

(ii) As neither of the samples is of size greater than 30, the Central Limit Theorem does not apply and the t-distribution is used instead. There are two crucial assumptions to make in order to use the t-distribution in this context and these are as follows:

1. Do both samples come from populations that are normally distributed?
2. Are the variances of both populations equal?

(iii) An approximate 100(1−α)% confidence interval for the true population mean difference is calculated as
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\,\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
where t has min(n1 − 1, n2 − 1) degrees of freedom. Consequently, an approximate 99% confidence interval for the true population mean difference is calculated as
$$(70.75 - 79.23) \pm 2.797 \times \sqrt{\frac{13.99^2}{25} + \frac{15.03^2}{25}} = [-19.97,\ 3.006].$$
Notice that this interval contains 0 (which would represent 'no difference' in the true average talking time between the two batteries) and is therefore in agreement with the hypothesis test. We interpret this interval as follows: our best guess at the true population mean difference in average talking time between the two batteries is likely to be between 19.97 units more on average for the nickel-metal hydride battery and 3.006 units more on average for the nickel-cadmium battery.

Note that the interval is loaded towards being strictly negative, suggesting that nickel-metal hydride batteries may indeed have a higher mean talking time than nickel-cadmium batteries, which this study may not have enough power to detect.

(iv) A Type II error (i.e. do not reject the null hypothesis H0 when it was in fact false) in this study would amount to deciding that the mean life time for the two batteries was the same when in fact they were different, i.e. the new battery outperforms the old. The likely consequence is a loss of income from not producing a longer life battery, and the time used in production so far.

Question 9.

(i) The boxplots were missing from the assignment, my apologies! The summary statistics suggest that the mean angular velocity is higher in the Skilled group compared to the Novice group. The boxplots would give an indication as to whether the data plausibly arose from a normal distribution (one of the assumptions necessary to carry out the hypothesis test given the small samples). The standard deviations are not equal in each group but are 'similar'.

(ii) "Histograms of the data for each group suggested that there were no outliers present and that the data were reasonably symmetric". This suggests that the mean is a useful measure to use to compare the two samples. If there were outliers present and a lack of symmetry you should consider using the median.

(iii) This is an Observational study. We are observing two types of rowers. An experimental study would be one where we took a sample of rowers and randomly assigned them to two training methods and compared the improvement in fitness across the two methods.

(iv)

1. State the Null and Alternative Hypotheses

H0: μS = μN (i.e. there is no difference in the population mean angular velocity between the two categories of rowers)
HA: μS ≠ μN (i.e. there is a difference in the population mean angular velocity between the two categories of rowers)

2. Calculate an appropriate test statistic (TS)

As neither of the samples is of size greater than 30, the Central Limit Theorem does not apply and the t-distribution is used instead. There are two crucial assumptions to make in order to use the t-distribution in this context and these are as follows:

1. Do both samples come from populations that are normally distributed?
2. Are the variances of both populations equal?

From the evidence provided by the histograms we can assume that the angular velocity is normally distributed for both categories of rower and, given that the sample sizes and sample variances are similar, assume that the variance assumption is valid.

Given these decisions use
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
as a suitable test statistic in order to compare two sample means to make statements about two population means, which, for this example, works out as
$$t = \frac{4.18 - 3.01}{\sqrt{\dfrac{0.48^2}{10} + \dfrac{0.51^2}{10}}} = 5.28$$

3. Distribution of TS if H0 true: We know that if H0 is true, t has a t-distribution with min(n1 − 1, n2 − 1) degrees of freedom (i.e. if H0 is true, we would expect t to be somewhere around 0 and not expect it to be too far in either direction from 0 under a t-distribution with min(10 − 1, 10 − 1) = 9 degrees of freedom).

4. Decide on the 'Significance Level' α

A significance level of α = 0.05 was specified in the question (i.e. you have a 5% chance of rejecting the null hypothesis when it is in fact true, a so called 'Type 1 Error'. This would amount to you saying that there was a difference in the average angular velocity when in fact there was not and you were just analysing two 'extreme' samples). From your knowledge of the t-distribution tables you know that 2.5% (i.e. α/2 = 0.025) of all observations having a t-distribution with 9 degrees of freedom are to the left of −2.262 and that 2.5% of all observations are to the right of 2.262, so a suitable decision rule would be to decide that any values of t less than −2.262 or greater than 2.262 are unlikely to occur due to sampling variation alone and therefore represent an extreme result which is not in keeping with what we would expect if the null hypothesis was indeed true.

The critical region therefore (using α = 0.05) comprises any value of t that is < −2.262 or > 2.262 (see the figure below).

[Figure: t distribution (9 df); acceptance region (95% of t scores) between −2.262 and 2.262, with 2½% of t scores in each tail.]

5. Check whether the value of the TS is in the critical region and make a decision.

In this example, t = 5.28 which is in the critical region, and hence there is convincing evidence (at significance level α = 0.05) that the null hypothesis is false, i.e. it is unlikely that we could get such a difference in sample means due to sampling variation alone if H0 was indeed true.

Conclusion. Evidence of a significant difference (at α = 0.05) in the true mean angular velocity between the two categories of rowers.

(v) An approximate 100(1−α)% confidence interval for the true population mean difference is calculated as
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\,\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
where t has min(n1 − 1, n2 − 1) degrees of freedom. Consequently, an approximate 95% confidence interval for the true population mean difference is calculated as
$$(4.18 - 3.01) \pm 2.262 \times \sqrt{\frac{0.48^2}{10} + \frac{0.51^2}{10}} = [0.67,\ 1.67].$$
Notice that this interval does not contain 0 (which would represent 'no difference' in the true mean angular velocity between the two categories of rowers) and is therefore in agreement with the hypothesis test.
We interpret this interval as follows: our best guess at the true population mean difference between the two categories of rowers is likely to be between 0.67 to 1.67 units more on average for Skilled rowers compared to the Novice rowers. It appears that the angular velocity is an important characteristic in rowing. 2½ % of t scores Acceptance Region 2.262 –2.262 19 0.482 0.512 + 10 10 20 H0: μΑ = μΒ (i.e. there is no difference in the mean amount charged between the two plans) distribution are to left of –1.96 and that 2.5% of all observations are to the right of 1.96 (this is given to you in the formula sheet but you should be able to read this off the Z table) so a suitable decision rule would be to decide that any values of z less than –1.65 or greater than 1.96 are unlikely to occur due to sampling variation alone and therefore represent an extreme result which is not in keeping with what we would expect if the null hypothesis was indeed true. HA: μΑ ≠ μΒ (i.e. there is a difference in the mean amount charged between the two plans) Our critical region therefore (using α=0.05) comprises of any value of z that is < -1.96 or greater than 1.96. Question 10. 1. State the Null and Alternative Hypotheses Note that you are interested in testing for a mean difference in both directions and therefore we have a two-sided test. In this example, z = -1.48, which is in the acceptance region and there is no convincing evidence that the null hypothesis is false. 2. Calculate an appropriate test statistic (TS) As both the samples are of size greater than 30 the Central Limit Theorem applies and a suitable test statistic is x1 − x 2 z= σ1 2 n1 + σ2 2 n2 which, for this example, works out as z= 5. Check whether the value of the TS is in the critical region and make a decision. 1987 − 2056 3922 4132 + 150 150 = −1.48 Note that as the sample size is large enough we substitute s1, s2 (the sample standard deviations) for σ1 and σ2 (the population standard deviations). 3. 
Distribution of TS if H0 true: As the sample size is large enough, we know from the Central Limit Theorem that if H0 is true, z has a Normal distribution with mean 0 and variance 1 i.e. if H0 is true, we would expect z to be somewhere around 0 and not expect it to be too far in either direction from 0. 6. Calculate the P-value. Remember that the P-value is defined as the probability, computed assuming that H0 is true, that the test statistic would take a value as extreme or more extreme than that actually observed is called the P-value of the test. The smaller the P-value, the stronger the evidence against H0 provided by the data. In this question the p- value is calculated as twice the P(z * ≥ −1.48) = 2(1-0.9306) = 0.14. Recall that we specified a two-sided test at the onset and hence we need to double the pvalue. We could not know the value of the test statistic before we collected the data. This is interpreted as there is at least a 14% chance of seeing a value of z* as large as we observed if the null hypothesis is actually true. As this is not less than specified α of 0.05 we do not reject the null hypothesis. Conclusion. On the basis of the hypothesis test above, there is no evidence (at α=0.05) of a significant difference in the mean amount charged between the two plans. The result is not significant—there is no clear evidence that one proposal is a better incentive than the other. So we can just go with the one that is easier and cheaper to implement. But if there is no practical difference in cost to the bank, we might choose proposal B, since the data did lean a bit in that direction. (b) 4. Decide on the ‘Significance Level’ α A significance level was not specified by the question so it is up to you to choose one! Let’s choose a significance level of α=0.05 (i.e. you have a 5% chance of rejecting the null hypothesis when it is in fact true, a so called ‘Type 1 Error’). 
From your knowledge of the normal distribution you know that 2.5% of all observations following a N(0,1) 21 Because the sample sizes are equal and large, the Central Limit Theorem applies and the test should be reliable in spite of some skewness. 22
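The arithmetic in the two-sample solutions above can be checked with a short script. What follows is only a sketch under stated assumptions, not part of the original solutions: the function names (`unpooled_se`, `two_sample_stat`, `two_sample_ci`, `two_sided_p_normal`) are hypothetical helpers introduced for illustration, the summary statistics are the ones quoted in the solutions, and the critical values 2.262 and 2.797 are taken from the text as table look-ups rather than computed.

```python
import math
from statistics import NormalDist


def unpooled_se(s1, n1, s2, n2):
    """Standard error of (x1bar - x2bar) using separate sample variances."""
    return math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


def two_sample_stat(x1, x2, s1, n1, s2, n2):
    """Test statistic (x1bar - x2bar) / SE; used as t or z depending on sample size."""
    return (x1 - x2) / unpooled_se(s1, n1, s2, n2)


def two_sample_ci(x1, x2, s1, n1, s2, n2, crit):
    """Interval (x1bar - x2bar) +/- crit * SE, with crit read from tables."""
    diff = x1 - x2
    half_width = crit * unpooled_se(s1, n1, s2, n2)
    return diff - half_width, diff + half_width


def two_sided_p_normal(z):
    """Two-sided P-value under N(0, 1): twice the upper tail beyond |z|."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Rowers: Skilled vs Novice, n = 10 each, critical value 2.262 (9 df, from the text).
t_rowers = two_sample_stat(4.18, 3.01, 0.48, 10, 0.51, 10)
ci_rowers = two_sample_ci(4.18, 3.01, 0.48, 10, 0.51, 10, crit=2.262)

# Batteries: n = 25 each, critical value 2.797 (from the text).
ci_batteries = two_sample_ci(70.75, 79.23, 13.99, 25, 15.03, 25, crit=2.797)

# Question 10: plans A and B, n = 150 each, large-sample z test.
z_plans = two_sample_stat(1987, 2056, 392, 150, 413, 150)
p_plans = two_sided_p_normal(z_plans)

print(f"rowers: t = {t_rowers:.2f}, CI = ({ci_rowers[0]:.2f}, {ci_rowers[1]:.2f})")
print(f"batteries: CI = ({ci_batteries[0]:.2f}, {ci_batteries[1]:.3f})")
print(f"plans: z = {z_plans:.2f}, two-sided P-value = {p_plans:.2f}")
```

The P-value here uses the exact normal CDF rather than the rounded table entry 0.9306, so it may differ from the hand calculation in the third decimal place while still rounding to 0.14.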