MA238 Assignment 5 Solutions
α = 0.05, and assign the problem solving team.
Recall that the probability, computed assuming that H0 is true (i.e. that the
machine is on target), that the test statistic would take a value as extreme or more
extreme than that actually observed, is called the p-value for this test. The smaller
the p-value, the stronger the evidence against H0 provided by the data.
(i) Single sample tests.
Question 1.
(a)
H0: μ = 1250 sq. ft
HA: μ < 1250 sq. ft
(b)
H0: μ = 32 mpg
HA: μ > 32 mpg
In this question the p-value is calculated as twice P(z* ≥ 2.174), which from
the Normal tables is estimated as 2(1 − 0.9850) = 0.03. Recall that we specified a
two-sided test at the outset and hence we need to double the probability, as the
rejection region is split across both tails of the distribution. This is interpreted as a
3% chance of seeing a value of z* at least as extreme as the one we observed if the null
hypothesis is actually true (i.e. a 3 in a hundred chance of getting a test statistic this
extreme if indeed the null hypothesis was true). As this is less than the specified α of 0.05
(i.e. any value of z so extreme that it would occur less than 5 times in a hundred under H0
leads us to reject H0) we reject the null hypothesis.
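The table lookup above can be reproduced numerically. The following is a minimal Python sketch, assuming the standard library's NormalDist as a stand-in for the Normal tables; the function name two_sided_p_value is illustrative and not part of the assignment.

```python
from statistics import NormalDist

def two_sided_p_value(z: float) -> float:
    """Two-sided p-value for a z statistic, assuming z ~ N(0, 1) under H0."""
    upper_tail = 1.0 - NormalDist().cdf(abs(z))  # area beyond |z| in one tail
    return 2.0 * upper_tail                      # doubled because both tails count

p = two_sided_p_value(2.174)
print(round(p, 2))   # about 0.03, matching the table-based estimate
print(p < 0.05)      # True means H0 is rejected at alpha = 0.05
```

Taking the absolute value of z means the same function serves whether the observed statistic falls in the upper or the lower tail.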
(c)
H0: μ = 5 mm
HA: μ ≠ 5 mm
Question 2.
(i)
What are the null and alternative hypotheses for this study?
H0: μ = 3
HA: μ ≠ 3
(vi)
A 95% C.I. for the true mean diameter is calculated as
$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 3.005 \pm 1.96 \times \frac{0.023}{\sqrt{100}}$
(ii)
In the context of this study, interpret making a Type I error; interpret making a
Type II error.
(iv)
As the sample size is greater than 30 the CLT applies. As the level of
significance was specified as α = 0.05, the critical value for this two-tailed test
is z_{α/2} = z_{0.025} = 1.96, resulting in a critical region of z > 1.96 or z < −1.96.
(v)
z=
Note that this interval does not contain the value 3mm, the mean value when the
machine is on target as specified in the null hypothesis. As the interval estimate
for the true mean is not in agreement with the null hypothesis we have further
evidence that the null hypothesis is false. It is worth noting that the machine is
not a lot off target, i.e. on average between 0.0005mm and 0.0095mm above it.
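The agreement between the interval and the test can be checked numerically. The following is a minimal Python sketch, assuming the sample values used for this interval (sample mean 3.005, s = 0.023, n = 100); the function name z_confidence_interval is illustrative and not part of the assignment.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, s, n, confidence=0.95):
    """Large-sample confidence interval for a population mean (s replaces sigma)."""
    alpha = 1.0 - confidence
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for 95%
    margin = z_crit * s / sqrt(n)
    return sample_mean - margin, sample_mean + margin

lower, upper = z_confidence_interval(3.005, 0.023, 100)
print(round(lower, 4), round(upper, 4))

# If the hypothesised mean lies outside the interval, the two-sided test
# at the matching significance level rejects H0.
contains_target = lower <= 3.0 <= upper
print(contains_target)
```

The check on contains_target mirrors the reasoning above: a 95% interval that excludes 3mm corresponds to rejecting H0 at α = 0.05 in a two-sided test.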
(vii)
$z = \frac{2.997 - 3}{0.023/\sqrt{100}} = -1.30$ and since −1.96 < z < 1.96, we do not reject the null
hypothesis at α = 0.05, and we do not assign the problem solving team.
$z = \frac{3.005 - 3}{0.023/\sqrt{100}} = 2.174$ and since z > 1.96, we reject the null hypothesis at
α = 0.05, and assign the problem solving team.
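The two test statistics above follow the same recipe, which can be sketched in Python as follows. This is an illustrative sketch only: the helper names z_statistic and reject_two_sided are assumptions, and 1.96 is the two-sided critical value for α = 0.05 quoted in these solutions.

```python
from math import sqrt

def z_statistic(sample_mean, mu0, s, n):
    """One-sample z statistic, with s substituted for sigma (large n)."""
    return (sample_mean - mu0) / (s / sqrt(n))

def reject_two_sided(z, critical=1.96):
    """Two-sided decision rule: reject H0 when z falls in either tail."""
    return abs(z) > critical

z_high = z_statistic(3.005, 3.0, 0.023, 100)  # sample mean above target
z_low = z_statistic(2.997, 3.0, 0.023, 100)   # sample mean below target

print(round(z_high, 3), reject_two_sided(z_high))
print(round(z_low, 2), reject_two_sided(z_low))
```

Only the first sample leads to rejection, so only in that case would the problem solving team be assigned.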
$3.005 \pm 1.96 \times \frac{0.023}{\sqrt{100}} = [3.0005, 3.0095]$.
A Type II error (i.e. do not reject the null hypothesis H0 when it was in fact false)
in this study would amount to deciding that the process is fine when in fact it isn’t
fine i.e. don’t assign the problem solving team when there is actually a problem
there for them to fix.
This is the cost of a Type I error given the answer to part (ii) above.
where $z_{\alpha/2} = 1.96$ as a 95% C.I. is required.
A Type I error (i.e. reject the null hypothesis H0 when it was in fact true) in this
study would amount to deciding that the process is out of control when it is in fact
working fine i.e. assign the problem solving team to work on a problem that is not
actually there.
(iii)
σ
2
In this question the p-value is calculated as twice P(z* ≤ −1.30), which from
the Normal tables is estimated as 2(1 − 0.9032) = 0.19. This is interpreted as a 19%
chance of seeing a value of z* at least as extreme as the one we observed if the null
hypothesis is actually true. As this is greater than the specified α of 0.05 we do not reject
the null hypothesis and claim that the data we collected are quite consistent with the null
hypothesis and that the difference between the sample mean and the hypothesised
value of 3mm is due to natural sampling variation.
Note that as the sample size is large enough we substitute s (the sample standard
deviation) for σ (the population standard deviation).
3. Distribution of TS if H0 true:
As the sample size is large enough, we know from the Central Limit Theorem that if H0 is
true, z has a Normal distribution with mean 0 and variance 1 i.e. if H0 is true, we would
expect z to be somewhere around 0 and not expect it to be too far in either direction from
0.
Question 3.
4. Decide on the ‘Significance Level’ α
The information given in the question is as follows:
A significance level was not specified by the question so it is up to you to choose one!
Let’s choose a significance level of α=0.05 (i.e. you have a 5% chance of rejecting the
null hypothesis when it is in fact true, a so called ‘Type 1 Error’). From your knowledge
of the normal distribution you know that 5% of all observations having a N(0,1)
distribution are to left of –1.65 (this is given to you in the formula sheet but you should
be able to read this off the Z table) so a suitable decision rule would be to decide that any
values of z less than –1.65 (i.e. to the left of it) are unlikely to occur due to sampling
variation alone and therefore represent an extreme result which is not in keeping with
what we would expect if the null hypothesis was indeed true.
μ (the population mean) = 10.5
n (the sample size) = 120
x̄ (the sample mean) = 8.9
s (the sample standard deviation) = 5.2
and you are asked to decide, based on the sample statistics provided, whether the true
mean age of delinquents is strictly less than 10.5 or not.
Our critical region therefore (using α=0.05) comprises any value of z that is < -1.65
(see the figure below).
Using the strategy outlined in the lectures:
1. State the Null and Alternative Hypotheses
H0: μ = 10.5
HA: μ < 10.5
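[Figure: N(0,1) curve; 95% of z scores lie in the acceptance region to the right of –1.65 and 5% of z scores lie in the critical region to its left]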
Note that the sociologist is only interested in testing for a mean strictly less than 10.5
and therefore we have a one-sided test. This will become important when deciding on an
appropriate Critical Region.
2. Calculate an appropriate test statistic (TS)
As the sample size is greater than 30 the Central Limit Theorem applies and a suitable
test statistic is
$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
which, for this example, works out as
$z = \frac{8.9 - 10.5}{5.2/\sqrt{120}} = -3.37$
5. Check whether the value of the TS is in the critical region and make a decision.
2. Calculate an appropriate test statistic (TS)
In this example, z = -3.37 which is considerably less than –1.65, and hence there is
convincing evidence that the null hypothesis is false.
As the sample size is greater than 30 the Central Limit Theorem applies and a suitable
test statistic is
$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
In this question the p-value is calculated as P(z* ≤ −3.37). Recall that we specified
a one-sided test at the outset and hence we need only consider the probability in one tail
of the distribution. From the Normal tables this is estimated as (1 − 0.9996) = 0.0004.
This is interpreted as a 4 in 10,000 chance of seeing a value of z* as small as the one we
observed if the null hypothesis is actually true. As this is much smaller than the specified α
of 0.05 we have very strong evidence against the null hypothesis; we reject it and claim that the
data we collected are not at all consistent with the null hypothesis and that the difference between
the sample mean and the hypothesised value of 10.5 is not due to natural sampling
variation alone.
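The one-sided calculation can be sketched in Python as below. This is a minimal illustration assuming the values given for this question (sample mean 8.9, hypothesised mean 10.5, s = 5.2, n = 120); the function name lower_tail_z_test is an assumption made for the example.

```python
from math import sqrt
from statistics import NormalDist

def lower_tail_z_test(sample_mean, mu0, s, n, alpha=0.05):
    """One-sided (lower-tail) z test of H0: mu = mu0 against HA: mu < mu0."""
    z = (sample_mean - mu0) / (s / sqrt(n))
    p_value = NormalDist().cdf(z)  # only the lower tail counts for HA: mu < mu0
    return z, p_value, p_value < alpha

z, p_value, reject_h0 = lower_tail_z_test(8.9, 10.5, 5.2, 120)
print(round(z, 2), round(p_value, 4), reject_h0)
```

Because the alternative points in one direction only, the p-value is a single tail area and is not doubled.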
which, for this example, works out as
$z = \frac{114.27 - 108.65}{8.3/\sqrt{100}} = 6.77$.
Note that as the sample size is large enough we substitute s (the sample standard
deviation) for σ (the population standard deviation).
Conclusion.
On the basis of the hypothesis test above, there is strong evidence (at α=0.05) that the
mean age of bicycle thieves is actually less than 10.5 years and not equal to 10.5 years
as stated by the police chief.
3. Distribution of TS if H0 true:
If H0 is true then z has a Normal distribution with mean 0 and variance 1 (i.e. N(0,1)).
4. Decide on the ‘Significance Level’ α
Question 4.
The information given in the question is as follows:
μ (the population mean) = €108.65
n (the sample size) = 100
x̄ (the sample mean) = €114.27
s (the sample standard deviation) = €8.30
and you are asked to decide, based on the sample statistics provided, whether the claim
that the true average size of a delinquent charge account is different from €108.65 is true
or not.
Using the strategy outlined in the lectures:
1. State the Null and Alternative Hypotheses
A significance level of α=0.05 is specified in the question (i.e. you have a 5% chance of
rejecting the null hypothesis when it is in fact true, a so called ‘Type 1 Error’).
Remember that you are interested in testing for a difference in the population mean in
both directions (i.e. the true mean could be bigger than that stated in the null hypothesis
or less than that stated in the null hypothesis) and you want to have only a 5% chance of
being wrong. From your knowledge of the normal distribution you know that 2.5% of all
observations having a N(0,1) distribution are to the left of –1.96 and 2.5% are to the right of +1.96
(this is given to you in the formula sheet) so a suitable decision rule would be to decide
that any values of z less than –1.96 (i.e. to the left of it) or greater than +1.96 (i.e. to the
right of it) are unlikely to occur due to sampling variation alone and therefore represent
an extreme result which is not in keeping with what we would expect if the null
hypothesis was indeed true. Notice that you have spread the 0.05 over the two tails of the
distribution to give you an overall significance level of 0.05. Our critical region therefore
(using α=0.05) comprises any value of z that is < -1.96 or > +1.96 (see the figure
below).
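[Figure: N(0,1) curve; 95% of z scores lie in the acceptance region between –1.96 and 1.96, with 2½% of z scores in each tail]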
H0: μ = 108.65
HA: μ ≠ 108.65
Note that the retail credit association is interested in testing for a mean delinquent charge
account different from 108.65 and therefore we have a two-sided test. This will become
important when deciding on an appropriate Critical Region.
5. Check whether the value of the TS is in the critical region and make a decision.
Using the strategy outlined in the lectures:
In this example, z = 6.77 which is considerably more than 1.96, and hence there is
convincing evidence that the null hypothesis is false.
1. State the Null and Alternative Hypotheses
H0: μ = 196
HA: μ < 196
Conclusion.
On the basis of the hypothesis test above, there is strong evidence (at α=0.05) that the
true average size of a delinquent charge account is not €108.65 as claimed.
The second part of the question asks you to provide a guess as to what you think the true
average size of a delinquent charge account is likely to be given that you have just
disputed the fact that it is €108.65. To do this you need to provide a guess at the true
unknown mean using a confidence interval. You are specifically asked to calculate a 95%
C.I. which, as the sample size is >30, is calculated as follows (see assignment 2):
$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
where $z_{\alpha/2} = 1.96$ as a 95% C.I. is required. This works out as
$114.27 \pm 1.96 \times \frac{8.3}{\sqrt{100}}$
Note that the player is interested in testing for a mean bowling score less than 196 and
therefore we have a one-sided test. This will become important when deciding on an
appropriate Critical Region.
2. Calculate an appropriate test statistic (TS)
As the sample size is greater than 30 the Central Limit Theorem applies and a suitable
test statistic is
$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
which, for this example, works out as
$114.27 \pm 1.96 \times \frac{8.3}{\sqrt{100}} = [112.64, 115.90]$.
Hence, we can claim that it is quite likely that the true average size of a delinquent
charge account is somewhere between €112.64 and €115.90. Notice the agreement
between the confidence interval and the hypothesis test in that the parameter value
specified in the null hypothesis (€108.65) is not contained in the interval that we claim is
likely to contain the true value.
$z = \frac{188 - 196}{24.9/\sqrt{50}} = -2.27$.
Note that as the sample size is large enough we substitute s (the sample standard
deviation) for σ (the population standard deviation).
3. Distribution of TS if H0 true:
If H0 is true then z has a Normal distribution with mean 0 and variance 1.
4. Decide on the ‘Significance Level’ α
Question 5.
The information given in the question is as follows:
μ (the population mean) = 196 pins
n (the sample size) = 50
x̄ (the sample mean) = 188 pins
s (the sample standard deviation) = 24.9
and you are asked to decide, based on the sample statistics provided, whether the true
average number of pins is less than 196 or not (i.e. has his average score
worsened).
A significance level of α=0.01 is specified by the question (i.e. you have a 1% chance of
rejecting the null hypothesis when it is in fact true, a so called ‘Type 1 Error’).
Remember that you are interested in testing for a difference in population mean in one
direction only (i.e. the true mean is less than stated in the null hypothesis).
From your knowledge of the normal distribution you know that 1% of all observations
having a N(0,1) distribution are to the left of –2.33 (this is given to you in the formula sheet)
so a suitable decision rule would be to decide that any values of z less than –2.33 (i.e. to
the left of it) are unlikely to occur due to sampling variation alone and therefore represent
an extreme result which is not in keeping with what we would expect if the null
hypothesis was indeed true. Our critical region therefore (using α=0.01) comprises
any value of z that is < -2.33 (see the figure overleaf).
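The critical values quoted from the formula sheet can be recovered from the inverse of the standard Normal distribution function. The following is a minimal Python sketch using the standard library; the two function names are illustrative and not part of the assignment.

```python
from statistics import NormalDist

def lower_tail_critical_value(alpha: float) -> float:
    """z value with probability alpha to its left under N(0, 1) (one-sided test)."""
    return NormalDist().inv_cdf(alpha)

def two_sided_critical_value(alpha: float) -> float:
    """Positive z value leaving alpha/2 in each tail of N(0, 1) (two-sided test)."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

print(round(lower_tail_critical_value(0.05), 3))  # -1.645, tabulated as -1.65
print(round(lower_tail_critical_value(0.01), 3))  # -2.326, tabulated as -2.33
print(round(two_sided_critical_value(0.05), 3))   # 1.96
```

The small differences from the tabulated values are rounding only and do not change any of the decisions in these solutions.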
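[Figure: N(0,1) curve; 99% of z scores lie in the acceptance region to the right of –2.33 and 1% of z scores lie in the critical region to its left]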
and you are asked to decide, based on the sample statistics provided, whether there is
evidence to suggest a difference in the average weight gain for all patients on 25mg dose
compared to all patients on 50mg dose (i.e. does it look like one group gains more weight
on average compared to the other group). Looking at the sample statistics alone it looks
like the 50mg group has a higher average weight gain than the 25mg group (i.e. the 50mg
group appears to have gained 3 pounds more on average) but this may be just due to
sampling variation and consequently a formal hypothesis test is needed.
Using the strategy outlined in the lectures:
1. State the Null and Alternative Hypotheses
5. Check whether the value of the TS is in the critical region and make a decision.
In this example, z = - 2.27 which is not less than –2.33, and hence there is no convincing
evidence that the null hypothesis is false. Note that the z score is nearly in the critical
region so even though we cannot claim that the null hypothesis is true (while having only
a 1% chance of being wrong), the result is quite worrying in that the z score was quite
extreme (although not extreme enough to reject H0). The bowler should keep account of a
few more scores and then reanalyze the data and see how extreme his z score is then.
1. H0: μ1 = μ2
(i.e. there is no difference in the average weight gain for patients
on 25mg dose compared to patients on 50mg dose)
2. HA: μ1 ≠ μ2
(i.e. there is a difference in the average weight gain for patients
on 25mg dose compared to patients on 50mg dose)
Note that you are interested in testing for a mean difference in both directions (i.e. the
mean for the 25mg group could be greater than or less than that for the 50mg group) and
therefore we have a two-sided test. This will become important when deciding on an
appropriate Critical Region.
2. Calculate an appropriate test statistic (TS)
Conclusion.
On the basis of the hypothesis test above, there is no evidence (at α=0.01) that the
bowler’s average number of pins has reduced significantly from 196 (i.e. that the new
ball is affecting his performance on average).
As both the samples are of size greater than 30 the Central Limit Theorem applies and a
suitable test statistic is
$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
MA238 Assignment 5 Solutions (part b)
(ii) Two Sample Tests
Question 6.
For this example, the test statistic works out as
$z = \frac{7 - 10}{\sqrt{\frac{6^2}{50} + \frac{7^2}{50}}} = -2.30$
Note that as the sample size is large enough we substitute s1, s2 (the sample standard
deviations) for σ1 and σ2 (the population standard deviations).
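The two-sample statistic can be sketched in Python as below, assuming the summary statistics given for the 25mg and 50mg groups. The function name two_sample_z is illustrative, and 1.96 is the two-sided critical value for α = 0.05 used in this question.

```python
from math import sqrt

def two_sample_z(mean1, s1, n1, mean2, s2, n2):
    """Large-sample z statistic for H0: mu1 = mu2 (s1, s2 replace sigma1, sigma2)."""
    standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
    return (mean1 - mean2) / standard_error

# Group 1 (25mg): n = 50, mean = 7, s = 6; Group 2 (50mg): n = 50, mean = 10, s = 7
z = two_sample_z(7, 6, 50, 10, 7, 50)
reject_h0 = abs(z) > 1.96  # two-sided test at alpha = 0.05
print(round(z, 2), reject_h0)
```

The sign of z depends on the order of subtraction (Group 1 minus Group 2), which is why the statistic is negative here.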
The information given in the question is as follows:
                                 Group 1 (25mg)   Group 2 (50mg)
Sample size (n)                        50               50
Sample mean (x̄)                        7                10
Sample standard deviation (s)          6                7
3. Distribution of TS if H0 true:
As the sample size is large enough, we know from the Central Limit Theorem that if H0
is true, z has a Normal distribution with mean 0 and variance 1 i.e. if H0 is true, we would
expect z to be somewhere around 0 and not expect it to be too far in either direction from
0.
4. Decide on the ‘Significance Level’ α
A significance level was not specified by the question so it is up to you to choose one!
Let’s choose a significance level of α=0.05 (i.e. you have a 5% chance of rejecting the
null hypothesis when it is in fact true, a so called ‘Type 1 Error’). From your knowledge
of the normal distribution you know that 2.5% of all observations having a N(0,1)
distribution are to the left of –1.96 and that 2.5% of all observations are to the right of 1.96
(this is given to you in the formula sheet but you should be able to read this off the Z
table) so a suitable decision rule would be to decide that any values of z less than –1.96
or greater than 1.96 are unlikely to occur due to sampling variation alone and therefore
represent an extreme result which is not in keeping with what we would expect if the null
hypothesis was indeed true.
Our critical region therefore (using α=0.05) comprises any value of z that is < -1.96 or
greater than 1.96 (see the figure below).
[Figure: N(0,1) curve; 95% of z scores lie in the acceptance region between –1.96 and 1.96, with 2½% of z scores in each tail]
$\bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
will provide you with a 100(1-α)% confidence interval for the difference in two
population means (e.g. α equal to 0.05 will give you a 95% C.I., α equal to 0.01 will give
you a 99% C.I. etc.). You were asked in this question to calculate a 95% confidence
interval (i.e. use α = 0.05, consequently $z_{\alpha/2} = 1.96$) which amounts to evaluating
$(7 - 10) \pm 1.96 \times \sqrt{\frac{6^2}{50} + \frac{7^2}{50}} = [-5.56, -0.44]$.
We interpret this interval as follows: our best guess at the true difference in average
weight gains for 25mg group – 50mg group (make sure you are clear about the order!) is
likely to be between –5.56 pounds and –0.44 pounds (i.e. the 50mg group are likely to
gain on average between .44 and 5.56 pounds more compared to the 25mg group over the
12 week period). Notice that this interval does not contain 0 (which would represent ‘no
difference’ in average weight gain between the two groups) and is in agreement with the
hypothesis test.
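The interval for the difference in means can be reproduced with a short Python sketch, assuming the same summary statistics for the two dose groups. The function name two_sample_z_interval is illustrative and not part of the assignment.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_interval(mean1, s1, n1, mean2, s2, n2, confidence=0.95):
    """Large-sample confidence interval for mu1 - mu2 (order matters)."""
    alpha = 1.0 - confidence
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for 95%
    margin = z_crit * sqrt(s1**2 / n1 + s2**2 / n2)
    diff = mean1 - mean2
    return diff - margin, diff + margin

# 25mg group minus 50mg group
lower, upper = two_sample_z_interval(7, 6, 50, 10, 7, 50)
print(round(lower, 2), round(upper, 2))
contains_zero = lower <= 0.0 <= upper  # zero would mean "no difference" is plausible
print(contains_zero)
```

Swapping the two groups in the call flips the signs of both endpoints, which is why being clear about the order of subtraction matters when interpreting the interval.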
Question 7.
The information given in the question is as follows:
5. Check whether the value of the TS is in the critical region and make a decision.
Sample size (n)
Sample mean ( x )
Sample standard deviation (s)
In this example, z = -2.30 which is less than –1.96, and hence there is
convincing evidence that the null hypothesis is false (i.e. convincing evidence that the
50mg group has in fact a greater average weight gain compared to the 25mg group).
Conclusion.
On the basis of the hypothesis test above, there is strong evidence (at α=0.05) that the
50mg group has in fact a greater average weight gain compared to the 25mg group over
the 12 week period.
As before, a hypothesis test will only indicate whether there is evidence of a significant
difference (i.e. a departure from the null hypothesis that is not due to sampling variation
alone) but will not provide you with an estimate of what the difference is likely to be. In
order to do this you need to calculate a confidence interval (using a required degree of
confidence). We saw in lectures that
                                 Rocket 1   Rocket 2
Sample size (n)                      8         10
Sample mean (x̄)                     36         52
Sample standard deviation (s)       15         18
and you are asked to decide, based on the sample statistics provided, whether there is
evidence to suggest that the second kind of rocket is worse than the first in terms of its
mean target error (i.e. on the basis of the sample data provided, does it look like Rocket 2
is worse than Rocket 1 in terms of mean target error). Looking at the sample statistics
alone it looks like Rocket 2 has a mean target error 16 units higher than Rocket 1
but this may be just due to sampling variation and consequently a formal hypothesis test
is needed.
Using the strategy outlined in the lectures:
1. State the Null and Alternative Hypotheses
H0: μ1 = μ2
(i.e. there is no difference in the mean target error for Rocket
1 compared to Rocket 2)
HA: μ1 < μ2
(i.e. the mean target error for Rocket
1 is strictly less than that for Rocket 2)
Note that you are interested in testing for a mean difference in one direction only and
therefore we have a one-sided test. This will become important when deciding on an
appropriate Critical Region.
2. Calculate an appropriate test statistic (TS)
As neither of the samples is of size greater than 30, we cannot rely on the Central Limit
Theorem and we need to use the t-distribution. Remember that there are
two crucial assumptions we have to make in order to use the t-distribution in this context
and they are as follows:
1. Do both samples come from populations that are normally distributed?
2. Are the variances of both populations equal?
knowledge of the t-distribution tables you know that 5% of all observations having a
t-distribution with 7 (i.e. min(8-1, 10–1)) degrees of freedom are to the left of –1.895
(this is given to you in the t tables by looking down the 0.05 column and across the
7 df row) so a suitable decision rule would be to decide that any values of t less than
–1.895 are unlikely to occur due to sampling variation alone and therefore represent an
extreme result which is not in keeping with what we would expect if the null hypothesis
was indeed true.
Our critical region therefore (using α=0.05) comprises any value of t that is < -1.895
(see the figure below).
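[Figure: t distribution with 7 df; 95% of t scores lie in the acceptance region to the right of –1.895 and 5% of t scores lie in the critical region to its left]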
We can assume that assumption 1 is true as it is likely that the mean target error should
be normally distributed given the nature of the measurement. We cannot be sure about
the variance assumption but we do know (from the lectures) that if the sample sizes are
similar then this assumption is not too important.
Given these decisions we can use
$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
as a suitable test statistic in order to compare two sample means to make statements
about two population means,
5. Check whether the value of the TS is in the critical region and make a decision.
resulting in
Conclusion.
$t = \frac{36 - 52}{\sqrt{\frac{15^2}{8} + \frac{18^2}{10}}} = -2.06$
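The small-sample calculation can be sketched in Python as below, assuming the summary statistics given for the two rockets. The function name unpooled_t is illustrative; the critical value –1.895 is copied from the t tables as quoted in these solutions rather than computed, since the standard library has no t distribution.

```python
from math import sqrt

def unpooled_t(mean1, s1, n1, mean2, s2, n2):
    """t statistic with separate variances and the min(n1-1, n2-1) df rule."""
    standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
    t = (mean1 - mean2) / standard_error
    df = min(n1 - 1, n2 - 1)  # degrees of freedom as used in these solutions
    return t, df

T_CRITICAL_7DF_05 = -1.895  # lower 5% point of t with 7 df, taken from the t tables

t, df = unpooled_t(36, 15, 8, 52, 18, 10)
reject_h0 = t < T_CRITICAL_7DF_05  # one-sided test of HA: mu1 < mu2
print(round(t, 2), df, reject_h0)
```

Using the smaller of the two sample sizes for the degrees of freedom gives a larger critical value than a pooled approach would, so the decision errs on the side of not rejecting H0.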
3. Distribution of TS if H0 true:
We know that if H0 is true, t has a t-distribution with min(n1-1, n2 –1) degrees of freedom
(i.e. if H0 is true, we would expect t to be somewhere around 0 and not expect it to be too
far in either direction from 0).
In this example, t = -2.06 which is less than –1.895, and hence there is convincing
evidence that the null hypothesis is false (i.e. convincing evidence that Rocket 2 has a
higher mean target error compared to Rocket 1).
On the basis of the hypothesis test above, there is strong evidence (at α=0.05) that Rocket
2 has a higher mean target error when compared to Rocket 1 and Rocket 1 should be used
in practice as it is more accurate. Note you were not asked to provide a confidence
interval in this question but you can make a simple guess of the difference in the mean
target error by using the sample means i.e. the mean target error for Rocket 2 is probably
around 16 units more than that for Rocket 1.
Question 8.
4. Decide on the ‘Significance Level’ α
(i) 1. State the Null and Alternative Hypotheses
A significance level of α = 0.05 was specified in the question (i.e. you have a 5% chance
of rejecting the null hypothesis when it is in fact true, a so called ‘Type 1 Error’. This
would amount to you saying that Rocket 2 was worse than Rocket 1 when in fact there
was no difference and you were just analysing two ‘extreme’ samples). From your
H0: μΑ = μΒ
(i.e. there is no difference in the population mean talking time
between the two batteries)
HA: μΑ ≠ μΒ
(i.e. there is a difference in the population mean talking time
between the two batteries)
Note that you are interested in testing for a mean difference in both directions (i.e. the
population mean for the nickel-cadmium battery could be greater than or less than that
for the nickel-metal hydride battery) and therefore we have a two-sided test.
2. Calculate an appropriate test statistic (TS)
As neither of the samples is of size greater than 30, we cannot rely on the Central Limit
Theorem and we use the t-distribution instead. There are two crucial assumptions to make in order to
use the t-distribution in this context and these are as follows:
1. Do both samples come from populations that are normally distributed?
2. Are the variances of both populations equal?
Assume that the talking time is normally distributed for both batteries and, given
that the sample sizes and sample variances are similar, assume that the variance
assumption is valid.
Given these decisions use
t=
4. Decide on the ‘Significance Level’ α
A significance level of α = 0.01 was specified in the question (i.e. you have a 1% chance
of rejecting the null hypothesis when it is in fact true, a so called ‘Type 1 Error’. This
would amount to you saying that there was a difference in the average talking time when
in fact there was not and you were just analysing two ‘extreme’ samples). From your
knowledge of the t-distribution tables you know that half a percent (i.e. α/2 = 0.005) of all
observations having a t-distribution with 24 degrees of freedom are to the left of
–2.797 and that 0.5% of all observations are to the right of 2.797 (this is given to you in
the t tables by looking down the 0.005 column and across the 24 df row) so a suitable
decision rule would be to decide that any values of t less than –2.797 or greater than
2.797 are unlikely to occur due to sampling variation alone and therefore represent an
extreme result which is not in keeping with what we would expect if the null hypothesis
was indeed true. The critical region therefore (using α=0.01) comprises any value of t
that is < –2.797 or > 2.797 (see the figure below).
[Figure: t distribution with 24 df; 99% of t scores lie in the acceptance region between –2.797 and 2.797, with ½% of t scores in each tail]
$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
as a suitable test statistic in order to compare two sample means to make statements about
two population means, which, for this example, works out as
$t = \frac{70.75 - 79.23}{\sqrt{\frac{13.99^2}{25} + \frac{15.03^2}{25}}} = -2.06$.
3. Distribution of TS if H0 true:
We know that if H0 is true, t has a t-distribution with min(n1-1, n2 –1) degrees of freedom
(i.e. if H0 is true, we would expect t to be somewhere around 0 and not expect it to be too
far in either direction from 0 under a t distribution with min(25-1, 25–1) = 24 degrees of
freedom).
5. Check whether the value of the TS is in the critical region and make a decision.
In this example, t = -2.06 which is not in the critical region, and hence there is no
convincing evidence (at significance level α=0.01) that the null hypothesis is false i.e. it
is quite plausible that we could get such a difference in sample means due to sampling
variation alone if H0 was indeed true.
Conclusion.
No evidence of a significant difference (at α=0.01) in the true average talking time
between the two battery types.
(ii)
As neither of the samples is of size greater than 30, we cannot rely on the Central Limit
Theorem and we use the t-distribution instead. There are two crucial assumptions to make in order to
use the t-distribution in this context and these are as follows:
1. Do both samples come from populations that are normally distributed?
2. Are the variances of both populations equal?
(iii)
An approximate 100(1-α)% confidence interval for the true population mean difference is
calculated as
$\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
where t has min(n1-1, n2 –1) degrees of freedom.
Consequently, an approximate 99% confidence interval for the true population mean
difference is calculated as
Question 9.
(i) The boxplots were missing from the assignment, my apologies! The summary
statistics suggest that the mean angular velocity is higher in the Skilled group
compared to the Novice group. The boxplots would give an indication as to whether
the data plausibly arose from a normal distribution (one of the assumptions necessary
to carry out the hypothesis test given the small samples). The standard deviations are
not equal in each group but are ‘similar’.
(ii) “Histograms of the data for each group suggested that there were no outliers present
and that the data were reasonably symmetric”. This suggests that the mean is a
useful measure to use to compare the two samples. If there were outliers present and
a lack of symmetry you should consider using the median.
(iii) This is an observational study. We are observing two types of rowers. An
experimental study would be one where we took a sample of rowers and randomly
assigned them to two training methods and compared the improvement in fitness
across the two methods.
(iv)
$(70.75 - 79.23) \pm 2.797 \times \sqrt{\frac{13.99^2}{25} + \frac{15.03^2}{25}} = [-19.97, 3.006]$.
Notice that this interval contains 0 (which would represent ‘no difference’ in the true
average talking time between the two batteries) and is therefore in agreement with the
hypothesis test.
We interpret this interval as follows: the true population mean difference in talking time
between the two batteries is likely to lie somewhere between 19.97
units more on average for the nickel-metal hydride battery and 3.006 units more on
average for the nickel-cadmium battery.
Note that most of the interval lies below zero, suggesting that nickel-metal hydride
batteries may indeed have a higher mean talking time than nickel-cadmium
batteries, a difference which this study may not have enough power to detect.
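The 99% interval for the batteries can be reproduced with the short Python sketch below. It assumes the summary statistics used in the calculation above; the function name t_interval is illustrative, and the critical value 2.797 is copied from the t tables (0.005 column, 24 df row) as quoted in these solutions rather than computed.

```python
from math import sqrt

T_CRITICAL_24DF_005 = 2.797  # upper 0.5% point of t with 24 df, from the t tables

def t_interval(mean1, s1, n1, mean2, s2, n2, t_crit):
    """Approximate confidence interval for mu1 - mu2 using separate variances."""
    margin = t_crit * sqrt(s1**2 / n1 + s2**2 / n2)
    diff = mean1 - mean2
    return diff - margin, diff + margin

lower, upper = t_interval(70.75, 13.99, 25, 79.23, 15.03, 25, T_CRITICAL_24DF_005)
print(round(lower, 2), round(upper, 2))
contains_zero = lower <= 0.0 <= upper  # True agrees with not rejecting H0 at alpha = 0.01
print(contains_zero)
```

Passing the critical value in as an argument keeps the table lookup explicit, so the same function can be reused with the 9 df value needed for the rowers.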
1. State the Null and Alternative Hypotheses
H0: μS = μN
(i.e. there is no difference in the population mean angular velocity
between the two categories of rowers)
HA: μS ≠ μN
(i.e. there is a difference in the population mean angular velocity
between the two categories of rowers)
2. Calculate an appropriate test statistic (TS)
As neither of the samples is of size greater than 30, we cannot rely on the Central Limit
Theorem and we use the t-distribution instead. There are two crucial assumptions to make in order to
use the t-distribution in this context and these are as follows:
1. Do both samples come from populations that are normally distributed?
2. Are the variances of both populations equal?
(iv)
A Type II error (i.e. do not reject the null hypothesis H0 when it was in fact false)
in this study would amount to deciding that the mean life time for the two batteries was
the same when in fact they were different i.e. the new battery outperforms the old. The
likely consequence is a loss of income on not producing a longer life battery and the time
used in production so far.
From the evidence provided by the histograms we can assume that the angular velocity
is normally distributed for both categories of rower and, given that the sample sizes and
sample variances are similar, assume that the variance assumption is valid.
Given these decisions use
$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
as a suitable test statistic in order to compare two sample means to make statements about
two population means, which, for this example, works out as
$t = \frac{4.18 - 3.01}{\sqrt{\frac{0.48^2}{10} + \frac{0.51^2}{10}}} = 5.28$.
5. Check whether the value of the TS is in the critical region and make a decision.
In this example, t = 5.28 which is in the critical region, and hence there is convincing
evidence (at significance level α=0.05) that the null hypothesis is false i.e. it is unlikely
that we could get such a difference in sample means due to sampling variation alone if H0
was indeed true.
Conclusion.
Evidence of a significant difference (at α=0.05) in the true mean angular velocity
between the two categories of rowers.
(v)
3. Distribution of TS if H0 true:
We know that if H0 is true, t has a t-distribution with min(n1-1, n2 –1) degrees of freedom
(i.e. if H0 is true, we would expect t to be somewhere around 0 and not expect it to be too
far in either direction from 0 under a t distribution with min(10-1, 10–1) =9 degrees of
freedom.
4. Decide on the ‘Significance Level’ α
An approximate 100(1-α)% confidence interval for the true population mean difference is
calculated as
x1 − x 2 ± tα ∗
2
s12 s2 2
+
n1 n 2
where t has min(n1-1, n2 –1) degrees of freedom.
A significance level of α = 0.05 was specified in the question (i.e. you have a 5% chance
of rejecting the null hypothesis when it is in fact true, a so called ‘Type 1 Error’. This
would amount to you saying that there was a difference in the average angular velocity
when in fact there was not and you were just analysing two ‘extreme’ samples). From
your knowledge of the t-distribution tables you know that half a percent (i.e. α = 0.025)
of all observations having a t-distribution with 9 degrees of freedom distribution are to
left of –2.262 and that 2.5% of all observations are to the right of 2.262 so a suitable
decision rule would be to decide that any values of t less than –2.262 or greater than
2.262 are unlikely to occur due to sampling variation alone and therefore represent an
extreme result which is not in keeping with what we would expect if the null hypothesis
was indeed true. The critical region therefore (using α=0.05) comprises of any value of t
that is < –2.262 or > 2.262 (see the figure below).
t (9 df)
95% of t scores
2½ % of t scores
Consequently, an approximate 95% confidence interval for the true population mean
difference is calculated as
$$(4.18 - 3.01) \pm 2.262 \sqrt{\dfrac{0.48^2}{10} + \dfrac{0.51^2}{10}} = [0.67, 1.67].$$
Notice that this interval does not contain 0 (which would represent ‘no difference’ in the
true mean angular velocity between the two categories of rowers) and is therefore in
agreement with the hypothesis test.
We interpret this interval as follows: the true population mean difference between the
two categories of rowers is likely to be between 0.67 and 1.67 units more on average for
Skilled rowers compared to the Novice rowers. It appears that angular velocity is an
important characteristic in rowing.
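As a check on the arithmetic for the rowers example, the test statistic, the decision and the confidence interval can be sketched in Python. This is a minimal sketch and not part of the original solution: the helper name two_sample_t is hypothetical, the summary statistics are the ones quoted in the solution, and the critical value 2.262 is read from the t tables (9 degrees of freedom) rather than computed.

```python
import math


def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t statistic, standard error) for the two-sample formula
    t = (mean1 - mean2) / sqrt(sd1^2/n1 + sd2^2/n2)."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se, se


# Summary statistics quoted in the solution (Skilled vs Novice rowers).
t_stat, se = two_sample_t(4.18, 0.48, 10, 3.01, 0.51, 10)

# Two-sided critical value for alpha = 0.05 with min(10-1, 10-1) = 9 df,
# taken from the t tables as in the solution (an input, not computed here).
T_CRIT = 2.262

# Decision rule: reject H0 if t falls in either tail of the critical region.
reject_h0 = abs(t_stat) > T_CRIT

# Approximate 95% confidence interval for the difference in means.
diff = 4.18 - 3.01
ci = (diff - T_CRIT * se, diff + T_CRIT * se)

print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
print(f"95% CI: [{ci[0]:.2f}, {ci[1]:.2f}]")
```

Running this reproduces t = 5.28 and the interval [0.67, 1.67], so the test and the interval lead to the same conclusion.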
Question 10.
1. State the Null and Alternative Hypotheses
H0: μA = μB
(i.e. there is no difference in the mean amount charged
between the two plans)
HA: μA ≠ μB
(i.e. there is a difference in the mean amount charged
between the two plans)
Note that you are interested in testing for a mean difference in both directions and
therefore we have a two-sided test.
2. Calculate an appropriate test statistic (TS)
As both the samples are of size greater than 30 the Central Limit Theorem applies and a
suitable test statistic is
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
which, for this example, works out as
$$z = \frac{1987 - 2056}{\sqrt{\dfrac{392^2}{150} + \dfrac{413^2}{150}}} = -1.48.$$
Note that as the sample size is large enough we substitute s1, s2 (the sample standard
deviations) for σ1 and σ2 (the population standard deviations).
3. Distribution of TS if H0 true:
As the sample size is large enough, we know from the Central Limit Theorem that if H0
is true, z has a Normal distribution with mean 0 and variance 1, i.e. if H0 is true, we would
expect z to be somewhere around 0 and not expect it to be too far in either direction from
0.
4. Decide on the ‘Significance Level’ α
A significance level was not specified by the question so it is up to you to choose one!
Let’s choose a significance level of α=0.05 (i.e. you have a 5% chance of rejecting the
null hypothesis when it is in fact true, a so-called ‘Type 1 Error’). From your knowledge
of the normal distribution you know that 2.5% of all observations following a N(0,1)
distribution are to the left of –1.96 and that 2.5% of all observations are to the right of 1.96
(this is given to you in the formula sheet but you should be able to read this off the Z
table), so a suitable decision rule would be to decide that any values of z less than –1.96
or greater than 1.96 are unlikely to occur due to sampling variation alone and therefore
represent an extreme result which is not in keeping with what we would expect if the null
hypothesis was indeed true.
Our critical region therefore (using α=0.05) comprises any value of z that is < –1.96 or
greater than 1.96.
5. Check whether the value of the TS is in the critical region and make a decision.
In this example, z = –1.48, which is in the acceptance region, and there is no convincing
evidence that the null hypothesis is false.
6. Calculate the P-value.
Remember that the P-value is defined as the probability, computed assuming that H0 is
true, that the test statistic would take a value as extreme or more extreme than that
actually observed. The smaller the P-value, the stronger the evidence against H0
provided by the data.
In this question the p-value is calculated as twice P(z* ≤ –1.48), which by symmetry
equals 2(1-0.9306) = 0.14. Recall that we specified a two-sided test at the outset and
hence we need to double the tail probability. We could not know the value of the test
statistic before we collected the data. This is interpreted as there being about a 14%
chance of seeing a value of z* as extreme as the one we observed if the null hypothesis is
actually true. As this is not less than the specified α of 0.05 we do not reject the null
hypothesis.
Conclusion.
On the basis of the hypothesis test above, there is no evidence (at α=0.05) of a significant
difference in the mean amount charged between the two plans. The result is not
significant: there is no clear evidence that one proposal is a better incentive than the
other. So we can just go with the one that is easier and cheaper to implement. But if there
is no practical difference in cost to the bank, we might choose proposal B, since the data
did lean a bit in that direction.
(b)
Because the sample sizes are equal and large, the Central Limit Theorem applies and the
test should be reliable in spite of some skewness.
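The large-sample calculation for Question 10 can be checked in the same spirit. The following is a minimal sketch, not part of the original solution: the helper name two_sample_z is hypothetical, the two samples are simply labelled 1 and 2 in the order they appear in the formula, and the Normal probabilities come from Python's standard-library statistics.NormalDist instead of the printed Z table, so the unrounded values differ slightly from the table lookup.

```python
import math
from statistics import NormalDist


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Large-sample z statistic, with sample SDs substituted for sigma."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se


# Summary statistics quoted in the solution.
z_stat = two_sample_z(1987, 392, 150, 2056, 413, 150)

alpha = 0.05
std_normal = NormalDist()  # N(0, 1)

# Two-sided critical value: 2.5% of N(0, 1) lies beyond it in each tail.
z_crit = std_normal.inv_cdf(1 - alpha / 2)

# Two-sided P-value: double the tail probability beyond |z|.
p_value = 2 * (1 - std_normal.cdf(abs(z_stat)))

reject_h0 = abs(z_stat) > z_crit

print(f"z = {z_stat:.2f}, critical value = {z_crit:.2f}")
print(f"P-value = {p_value:.2f}, reject H0: {reject_h0}")
```

The sketch gives z = -1.48, a critical value of 1.96 and a P-value of about 0.14, so H0 is not rejected at α = 0.05.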