Transcript
```UNIT 8: STATISTICAL HYPOTHESIS TESTING
8.1. Introduction
This chapter describes the statistical procedure for testing the hypotheses, which is a
very standard procedure that is commonly used by professionals in a wide variety of
disciplines. The two major activities of inferential statistics are the estimation of
population parameters and hypothesis testing. Hypothesis testing is important because
it provides an objective framework for making decisions using probabilistic methods
rather than relying on subjective impressions.
Definition 8.1.1
A statistical hypothesis is a claim or statement about the property of a population. It
is an assertion or conjecture concerning one or more populations. A hypothesis test
(or test of significance) is a standard procedure for testing a claim about the property
of a population.
Example 8.1.2
The following statements are typical of the hypotheses (claims) that can be tested by
the procedures we will develop later in this chapter:
(i) A computer engineer claims that the life span of computers produced by a
certain Zambian company is less than 10 years.
(ii) Medical researchers claim that the mean body temperature of healthy adults is
less than 38℃
(iii) A food company produces packets of peanuts weighing 336g (on average). Periodically,
the quality control department takes samples of peanut packets to determine
whether the packaging process is under control.
(iv) Mendel claims that under certain circumstances, the percentage of off-spring
peas with yellow pods exceeds 25%.
Before beginning to study this chapter, we should bear the following rule in mind:
Lecture Notes on Statistical Hypothesis Testing/Compiled by Angel Mukuka (PhD)
Rare Event Rule for Inferential Statistics
If, under a given assumption, the probability of a particular observed event is exceptionally
small, we conclude that the assumption is probably not correct.
Following this rule, we test the claim by analyzing the sample data in an attempt to
distinguish between results that can easily occur by chance and results that are highly
unlikely to occur by chance.
8.2. Basics of Hypothesis Testing
In this section, we describe the formal components used in hypothesis testing: null
hypothesis, alternative hypothesis, test statistic, critical region, significance level, critical
value, p-value, type I error, and type II error. In other words, the objectives of this
section are as follows:

• Given a claim, identify the null hypothesis and the alternative hypothesis, and
  express both in symbolic form.
• Given a claim and sample data, calculate the value of the test statistic.
• Given a significance level, identify the critical value(s).
• Given a value of the test statistic, identify the p-value.
• State the conclusion of a hypothesis test in simple, non-technical terms.
• Identify the type I error and type II error that could be made when testing a given
  claim.
Null and Alternative Hypotheses
Usually, a hypothesis takes the form of a claim, a belief or suspicion.
The null hypothesis (denoted by 𝐻0 ) is a statement that the value of a population
parameter (such as proportion, mean or standard deviation) is equal to some claimed
value. The symbolic statement of the null hypothesis usually contains an equal (=) sign.
In this sense, we can say that the null hypothesis is a statement asserting no change, no
effect or no difference.
The alternative hypothesis (denoted by $H_1$ or $H_a$) is the statement that the parameter
has a value that somehow differs from the null hypothesis. For the methods of this chapter,
the symbolic form of the alternative hypothesis must use one of the symbols <, > or ≠.
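Below are some typical null and alternative hypotheses in relation to Example 8.1.2.
(i) $H_0: \mu = 10$ against $H_1: \mu < 10$
(ii) $H_0: \mu = 38\,^{\circ}\mathrm{C}$ against $H_1: \mu < 38\,^{\circ}\mathrm{C}$
(iii) $H_0: \mu = 336\,\mathrm{g}$ against $H_1: \mu \neq 336\,\mathrm{g}$
(iv) $H_0: P = 0.25$ against $H_1: P > 0.25$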
It should be noted that (i), (ii) and (iv) are examples of one-tailed (one-sided) tests, while
(iii) is a two-tailed (two-sided) test. In two-tailed tests, the level of
significance, $\alpha$, is divided equally between the two tails that constitute the critical region.
Caution!
1. Although some textbooks use symbols such as ≤ and ≥ in the null hypothesis
($H_0$), most professional journals use only the equal sign, and that is what is
recommended in this text. We conduct the hypothesis test by assuming that the
proportion, mean or variance is equal to some specified value so that we can work
with a single distribution having a specific value.
2. If you are conducting a study and want to use a hypothesis test to support your
claim, the claim must be worded so that it becomes the alternative hypothesis. This
requires that your claim must be expressed using the symbols < 𝑜𝑟 > 𝑜𝑟 ≠. You
cannot use a hypothesis test to support a claim that some parameter is equal to
some specified value.
Test Statistic
The test statistic is a value computed from the sample data, and it is used in making the
decision about the rejection of the null hypothesis. It is found by converting the sample
statistic (such as the sample proportion, $\hat{p}$, the sample mean, $\bar{x}$, or the sample variance,
$s^2$) to a score such as $z$, $t$ or $\chi^2$ with the assumption that the null hypothesis is true. The
test statistic can therefore be used for determining whether there is significant evidence
against the null hypothesis.
Critical region
The critical (rejection) region is the set of all values of the test statistic that cause us to
reject the null hypothesis. The critical region is bounded by the critical value at
a given level of significance (and with specified degrees of freedom in the case of the t-test,
chi-square test or F-test). A critical value is any value that separates the critical
region (where we reject the null hypothesis) from the values of the test statistic that do
not lead to the rejection of the null hypothesis. The critical value, also called the table
value, depends on the nature of the null hypothesis, the sampling distribution that applies,
and the significance level.
When a decision is made about the null hypothesis, there are two possible errors that
may be committed.
(i) Type I error: rejecting the null hypothesis when it is actually true.
(ii) Type II error: failing to reject the null hypothesis when it is actually false.
The table below gives a summary of the possible situations:

                              True state of nature (reality)
Decision                      $H_0$ is true             $H_0$ is false
We decide to reject $H_0$     Type I error committed    Correct decision
We fail to reject $H_0$       Correct decision          Type II error committed

Level of significance
The level of significance (denoted by $\alpha$) is the probability that the test statistic will fall in
the critical region when the null hypothesis is actually true. In other words, $\alpha = P(\text{type I error})$,
and the probability of a type II error is $\beta = P(\text{type II error})$. The quantity $1 - \beta$ is the
probability of rejecting $H_0$ when it is false, and it is called the power of the test. If the level
of significance is not specified, it is customary to use 5%.
Probability value (p – value)
This is the probability of getting a value of the test statistic that is at least as extreme as
the one representing the sample data, assuming that the null hypothesis is true. The null
hypothesis is rejected if the p-value is very small (i.e. when p-value < $\alpha$).
Decisions and conclusions
From example 8.1.2, we have seen that the original claim sometimes becomes the
alternative hypothesis. However, the standard procedure of hypothesis testing requires
that we always test the null hypothesis and so, the initial conclusion will always be one of
the following:
(i)
Reject the null hypothesis or
(ii)
Fail to reject the null hypothesis
The decision to reject or fail to reject the null hypothesis is usually made using either
the traditional (classical) method, p – value method or based on confidence intervals.
In recent years however, use of the traditional method has been declining partly
because statistical software packages are often designed for the p – value method.
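(a) Traditional method: Reject $H_0$ if the test statistic falls within the critical region. For
z and t tests, this happens when the absolute value of the test statistic is greater than the
critical value (and, for a one-tailed test, the statistic lies in the tail specified by $H_1$).
(b) P-value method: Reject $H_0$ if p-value < $\alpha$.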
(c) Confidence intervals: Because a confidence interval estimate of a population
parameter contains the likely values of that parameter, reject a claim that the
population parameter has a value that is not included in the confidence interval.
A conclusion stated as rejecting or failing to reject $H_0$ is fine for those who have taken a
statistics course, but it is important to use simple and non-technical terms in stating what the
conclusion really means. The figure below gives a summary of the wording of the final
conclusion.
Figure: Wording of the final conclusion (flowchart)
Start: Does the original claim contain the condition of equality?
  Yes:
    Do you reject $H_0$?
      Yes: "There is sufficient evidence to warrant rejection of the claim that ..."
      No (fail to reject $H_0$): "There is not sufficient evidence to warrant rejection of
        the claim that ..."
  No (the original claim does not contain equality, so it becomes $H_1$):
    Do you reject $H_0$?
      Yes: "The sample data support the claim that ..."
      No (fail to reject $H_0$): "There is not sufficient sample evidence to support the
        claim that ..."
8.3. Tests about one population parameter
Now that we have understood the meanings of key concepts regarding hypothesis testing,
we can discuss the various tests that can be carried out regarding one population mean,
one population proportion and one population variance. The table below gives a summary
of the test statistics used in each case.
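Parameter: Population mean
  Hypotheses: $H_0: \mu = \mu_0$ against $H_1: \mu \neq \mu_0$ or $\mu > \mu_0$ or $\mu < \mu_0$
  (a) Condition: population variance known, or population variance unknown with $n \geq 30$
      Test statistic: $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ or $z = \dfrac{\bar{x} - \mu_0}{S/\sqrt{n}}$
      Decision: reject $H_0$ if $|z| > z_{\alpha}$ for a 1-tailed test, and reject $H_0$ if
      $|z| > z_{\alpha/2}$ for a 2-tailed test
  (b) Condition: unknown population variance with $n < 30$
      Test statistic: $t = \dfrac{\bar{x} - \mu_0}{S/\sqrt{n}}$
      Decision: reject $H_0$ if $|t| > t_{\alpha,\,n-1}$ for a 1-tailed test, and reject $H_0$ if
      $|t| > t_{\alpha/2,\,n-1}$ for a 2-tailed test

Parameter: Population proportion
  Hypotheses: $H_0: P = p$ against $H_1: P \neq p$ or $P > p$ or $P < p$
  Test statistic: $z = \dfrac{\hat{p} - p}{\sqrt{pq/n}}$, where $q = 1 - p$
  Decision: reject $H_0$ if $|z| > z_{\alpha}$ for a 1-tailed test, and reject $H_0$ if
  $|z| > z_{\alpha/2}$ for a 2-tailed test

Parameter: Population variance
  Hypotheses: $H_0: \sigma^2 = \sigma_0^2$ against $H_1: \sigma^2 > \sigma_0^2$ or $\sigma^2 < \sigma_0^2$ or $\sigma^2 \neq \sigma_0^2$
  Test statistic: $\chi^2 = \dfrac{(n-1)S^2}{\sigma_0^2}$
  Decision: reject $H_0$ if $\chi^2 > \chi^2_{\alpha,\,n-1}$ for a 1-tailed test, and reject $H_0$ if
  $\chi^2 > \chi^2_{\alpha/2,\,n-1}$ for a 2-tailed test (because the chi-square distribution is not
  symmetric, a left-tailed alternative uses the lower critical value $\chi^2_{1-\alpha,\,n-1}$ instead)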
If one needs to make a decision using the p-value method, then it should be noted that, for a z-test,
$p\text{-value} = P(Z > |z|)$ for a one-tailed test and $p\text{-value} = 2P(Z > |z|)$ for a two-tailed test.
In the case of a t-test,
$p\text{-value} = P(T_{n-1} > |t|)$ for a one-tailed test and $p\text{-value} = 2P(T_{n-1} > |t|)$ for a two-tailed test.
For a chi-square test,
$p\text{-value} = P(\chi^2_{n-1} > \chi^2)$ for a one-tailed test and $p\text{-value} = 2P(\chi^2_{n-1} > \chi^2)$ for a two-tailed test.
Example 8.3.1
1. A random sample of 100 deaths recorded in the US during the past year showed
an average life span of 71.8 years. Assuming a population standard deviation of
8.9 years, does this seem to indicate that the average life span today is greater
than 70 years? Use 0.05 level of significance.
Working
Hypotheses;
𝐻0 : 𝜇 = 70, 𝑎𝑛𝑑 𝐻1 : 𝜇 > 70
Level of significance;
𝛼 = 0.05
Test statistic;
Since the sample size is greater than 30 and the population variance is known, we use z as
the test statistic.
$z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{71.8 - 70}{8.9/\sqrt{100}} = 2.022$
Critical value;
$Z_{\alpha} = Z_{0.05} = 1.645$
P-value (optional);
$p\text{-value} = P(Z > |z|) = P(Z > 2.02) = 0.0217$
Decision;
Since $|z| > Z_{\alpha}$, we reject $H_0$.
Note that we could have made the decision using the p-value method (i.e. since the p-value is
less than the level of significance, we reject $H_0$).
Conclusion;
Rejecting $H_0$ at the 5% level of significance indicates that the sample data support the claim
that the average life span today is greater than 70 years.
2. The Zambian Heart Association recommends that an individual’s cholesterol level
be under 200mg per 100ml. The following are the cholesterol readings of 16
women selected randomly from the Kitwe Heart Study:
233 197 192 179 174 217 188 209
196 167 186 221 238 179 196 191
At the 10% level of significance, do these readings suggest that women in Kitwe
have cholesterol readings below 200 mg on average? What assumptions are
required?
Working
From the given data set, $\bar{x} = 197.6875$ and $s = 20.7066$.
Hypotheses;
$H_0: \mu = 200$ and $H_1: \mu < 200$
Level of significance;
$\alpha = 0.1$
Test statistic;
Since the sample size is less than 30 and the population variance is unknown, we use t as the
test statistic.
$t = \dfrac{\bar{x} - \mu_0}{S/\sqrt{n}} = \dfrac{197.6875 - 200}{20.7066/\sqrt{16}} = -0.4467 \Rightarrow |t| = 0.4467$
Critical value;
$t_{\alpha,\,n-1} = t_{0.1,\,15} = 1.341 \Rightarrow |t| < t_{\alpha,\,n-1}$
P-value (optional);
$p\text{-value} = P(T_{15} > |t|) = P(T_{15} > 0.4467)$, giving $p\text{-value} > 0.25 \Rightarrow p\text{-value} > 0.1$,
and so $p\text{-value} > \alpha$.
Decision;
Based on either the critical value (traditional) method or the p-value method, we fail to reject
$H_0$.
Conclusion;
Failure to reject $H_0$ at the 10% level of significance indicates that there is not sufficient sample
evidence to support the claim that women in Kitwe district have cholesterol readings
below 200mg on average.
3. A manufacturer of sports equipment has developed a new synthetic fishing line
that he claims has a mean breaking strength of 8kg with a standard deviation of
0.5kg. A random sample of 50 lines is tested and found to have a mean breaking
strength of 7.8kg. Is the claim valid at the 1% level of significance?
Working
From the given data, $\bar{x} = 7.8$, $\mu_0 = 8$, $\sigma = 0.5$ and $n = 50$.
Hypotheses;
$H_0: \mu = 8$ and $H_1: \mu \neq 8$
Level of significance;
$\alpha = 0.01$
Test statistic;
Since the sample size is greater than 30 and the population variance is known, we use
$z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{7.8 - 8}{0.5/\sqrt{50}} = -2.83 \Rightarrow |z| = 2.83$
Critical value;
$Z_{\alpha/2} = Z_{0.005} = 2.576 \Rightarrow |z| > Z_{\alpha/2}$
P-value (optional);
$p\text{-value} = 2P(Z > |z|) = 2P(Z > 2.83) = 0.0046 \Rightarrow p\text{-value} < 0.01$, and so $p\text{-value} < \alpha$.
Decision;
Based on either the critical value (traditional) method or the p-value method, we reject
$H_0$.
Conclusion;
Rejecting $H_0$ at the 1% level of significance indicates that there is sufficient evidence
to warrant rejection of the claim that the mean breaking strength is 8kg.
4. A distributor of cigarettes claims that 20% of the smokers in Myami prefer Kent
cigarettes. To test the claim, 20 smokers are selected at random and asked what
brand they prefer. If 6 of the 20 named Kent as their preference, what conclusion
do we draw?
Working
In this case, we need to carry out a test concerning a proportion.
$p = 20\% = 0.2 \Rightarrow q = 1 - 0.2 = 0.8$ and $\hat{p} = \dfrac{6}{20} = 0.3$
Hypotheses;
$H_0: P = 0.2$ and $H_1: P \neq 0.2$
Level of significance;
$\alpha = 0.05$ (not specified in the question, so 5% is used)
Test statistic;
$z = \dfrac{\hat{p} - p}{\sqrt{pq/n}} = \dfrac{0.3 - 0.2}{\sqrt{(0.2)(0.8)/20}} = 1.12$
Critical value;
$Z_{\alpha/2} = Z_{0.025} = 1.96 \Rightarrow |z| < Z_{\alpha/2}$
P-value (optional);
$p\text{-value} = 2P(Z > |z|) = 2P(Z > 1.12) = 0.2628 \Rightarrow p\text{-value} > 0.05$, and so $p\text{-value} > \alpha$.
Decision;
Based on either the critical value (traditional) method or the p-value method, we fail to reject
$H_0$.
Conclusion;
Failure to reject $H_0$ at the 5% level of significance indicates that there is not sufficient evidence
to warrant rejection of the claim that 20% of the smokers in Myami prefer Kent cigarettes.
5. The caffeine content of a certain brand of tea is claimed to be normally distributed with
a variance of 1.3 mg². Test this claim using a random sample of 8 packets of tea with
a standard deviation of 1.8mg at the 5% level of significance.
Working
In this case, $\sigma_0^2 = 1.3$, $S = 1.8$ and $n = 8$.
Hypotheses;
$H_0: \sigma^2 = 1.3$ and $H_1: \sigma^2 \neq 1.3$
Level of significance;
$\alpha = 0.05$
Test statistic;
$\chi^2 = \dfrac{(n-1)S^2}{\sigma_0^2} = \dfrac{(7)(1.8)^2}{1.3} = 17.446$
Critical value;
$\chi^2_{\alpha/2,\,n-1} = \chi^2_{0.025,\,7} = 16.013 \Rightarrow \chi^2 > \chi^2_{\alpha/2,\,n-1}$
Decision;
Since $\chi^2 > \chi^2_{\alpha/2,\,n-1}$, we reject $H_0$.
Conclusion;
Rejecting $H_0$ at the 5% level of significance indicates that the variance is not equal to 1.3.
8.4. Confidence intervals and hypothesis testing
Testing $H_0: \mu = \mu_0$ against $H_1: \mu \neq \mu_0$ at the $\alpha$ level of significance is equivalent to
computing a $100(1 - \alpha)\%$ confidence interval for $\mu$. In this case, $H_0$ is rejected if $\mu_0$ is not
inside the confidence interval. If $\mu_0$ is inside the confidence interval, then the null
hypothesis is not rejected.
If we consider question 3 in Example 8.3.1, then
$H_0: \mu = 8$ and $H_1: \mu \neq 8$
At the 1% level of significance, we can construct a 99% confidence interval as follows:
$\bar{x} \pm Z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} = 7.8 \pm Z_{0.005}\dfrac{0.5}{\sqrt{50}} = 7.8 \pm 2.576\,\dfrac{0.5}{\sqrt{50}} = (7.62,\ 7.98)$
Since $8 \notin (7.62,\ 7.98)$, we reject $H_0$ at the 1% level of significance and conclude that the
manufacturer’s claim is not valid.
8.5. Tests about two population parameters
In this section, we shall only concentrate on the test statistics for each of the three
parameters (mean, proportion and variance). The procedure for carrying out the test
remains the same as we did in the previous section.
8.5.1. Two population means
Test statistics and critical regions for two population means can be summarised as
follows:
To test $H_0: \mu_1 - \mu_2 = \mu_0$ against $H_1: \mu_1 - \mu_2 \neq \mu_0$ or $H_1: \mu_1 - \mu_2 > \mu_0$ or
$H_1: \mu_1 - \mu_2 < \mu_0$, use:
(i) $Z = \dfrac{\bar{x}_1 - \bar{x}_2 - \mu_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$ if both $\sigma_1^2$ and $\sigma_2^2$ are known, and
$Z = \dfrac{\bar{x}_1 - \bar{x}_2 - \mu_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$ if $\sigma_1^2$ and $\sigma_2^2$ are unknown and both sample sizes are greater than 30.
(ii) $t = \dfrac{\bar{x}_1 - \bar{x}_2 - \mu_0}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$ if $\sigma_1^2$ and $\sigma_2^2$ are unknown but assumed to be equal.
In this case, the degrees of freedom are $\gamma = n_1 + n_2 - 2$
and the pooled variance is $S_p^2 = \dfrac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$.
(iii) $t = \dfrac{\bar{x}_1 - \bar{x}_2 - \mu_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$ if $\sigma_1^2$ and $\sigma_2^2$ are unknown and not equal.
In this case, the degrees of freedom are given by
$\gamma = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$
For paired observations, the mean difference is tested using a t-test with the test value
given by
$t = \dfrac{\bar{D} - \mu_0}{S_d/\sqrt{n}}$, with the degrees of freedom given by $\gamma = n - 1$.
Example 8.5.1
1. An experiment was performed to compare the abrasive wear of two laminated
materials. 12 pieces of material 1 gave an average (coded) wear of 85 units with
a standard deviation of 4 while 10 pieces of material 2 gave an average of 81 and
a standard deviation of 5. Can we conclude that the abrasive wear of material 1
exceeds that of material 2 by more than 2 units? Assume that the populations are
approximately normal with equal variances.
2. Business schools A and B reported the following summary of GMAT (Graduate
Management Aptitude Test) verbal scores.

School    Sample size (n)    Sample mean    Sample variance
A         201                34.75          48.59
B         115                33.74          30.68

At the 5% level of significance, is there sufficient evidence to believe that there is a
difference in the GMAT scores of the two schools? Calculate the p-value
associated with this test.
Working
1. Let $\mu_1$ and $\mu_2$ be the population means for materials 1 and 2 respectively. Then
$H_0: \mu_1 - \mu_2 = 2$ and $H_1: \mu_1 - \mu_2 > 2$
Since the population variances are unknown but assumed to be equal,
$t = \dfrac{\bar{x}_1 - \bar{x}_2 - \mu_0}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$, where $S_p^2 = \dfrac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2} = \dfrac{(11)(16) + (9)(25)}{20} = 20.05$
$t = \dfrac{85 - 81 - 2}{\sqrt{20.05\left(\dfrac{1}{12} + \dfrac{1}{10}\right)}} = 1.043$
$t_{0.05,\,20} = 1.725 \Rightarrow |t| < t_{0.05,\,20}$
In this case, we fail to reject $H_0$ at the 5% level of significance and conclude that there is not
sufficient evidence to say that the abrasive wear of material 1 exceeds that of material 2
by more than two units.
2. Let $\mu_1$ and $\mu_2$ be the population mean scores for schools A and B respectively. Then
$H_0: \mu_1 - \mu_2 = 0$ and $H_1: \mu_1 - \mu_2 \neq 0$
In this case, both sample sizes are greater than 30, so we use
$Z = \dfrac{\bar{x}_1 - \bar{x}_2 - \mu_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{34.75 - 33.74 - 0}{\sqrt{\dfrac{48.59}{201} + \dfrac{30.68}{115}}} = 1.42$ and $Z_{\alpha/2} = Z_{0.025} = 1.96$
Since $|Z| < Z_{\alpha/2}$, we fail to reject $H_0$ at the 5% level of significance and conclude that
there is not sufficient evidence to believe that the two schools performed differently.
$p\text{-value} = 2P(Z > 1.42) = 2(0.0778) = 0.1556$
8.5.2. Tests about two population proportions
Tests about two population proportions are based on the sampling distribution of
$\hat{p}_1 - \hat{p}_2$.
1. For $H_0: P_1 - P_2 = 0$ against $H_1: P_1 - P_2 \neq 0$, or $P_1 - P_2 > 0$ or $P_1 - P_2 < 0$, the
following test statistic is used.
$Z = \dfrac{\hat{p}_1 - \hat{p}_2 - 0}{\sqrt{PQ\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$, where $P = \dfrac{x_1 + x_2}{n_1 + n_2}$ is the pooled sample proportion and $Q = 1 - P$.
2. For $H_0: P_1 - P_2 = P$ against $H_1: P_1 - P_2 \neq P$, or $P_1 - P_2 > P$ or $P_1 - P_2 < P$, the
following test statistic is used (here $P$ denotes the hypothesized difference, not the pooled proportion).
$Z = \dfrac{\hat{p}_1 - \hat{p}_2 - P}{\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}}$
Example 8.5.2
A poll is taken to compare the proportion of town and county voters favoring the proposal
of constructing a chemical plant. Is the proportion of town voters favoring the proposal
higher than the proportion of county voters if 120 of 200 town voters and 240 of 500
county voters favour the proposal? Use 2.5% level of significance.
Working
$H_0: P_1 - P_2 = 0$ (i.e. $P_1 = P_2$) and $H_1: P_1 - P_2 > 0$ (i.e. $P_1 > P_2$)
In this case,
$Z = \dfrac{\hat{p}_1 - \hat{p}_2 - 0}{\sqrt{PQ\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$, where $P = \dfrac{x_1 + x_2}{n_1 + n_2} = \dfrac{120 + 240}{200 + 500} = 0.51$,
$\hat{p}_1 = \dfrac{120}{200} = 0.6$ and $\hat{p}_2 = \dfrac{240}{500} = 0.48$
$\Rightarrow Z = \dfrac{0.6 - 0.48 - 0}{\sqrt{(0.51)(0.49)\left(\dfrac{1}{200} + \dfrac{1}{500}\right)}} = 2.87$
$Z_{0.025} = 1.96 \Rightarrow |Z| > Z_{\alpha}$, and so we reject $H_0$.
Rejecting $H_0$ at the 2.5% level of significance implies that there is sufficient sample evidence
to support the claim that the proportion of town voters favoring the proposal is higher than
the proportion of county voters.
8.5.3. Tests about the difference in two population variances
To test $H_0: \sigma_1^2 = \sigma_2^2$ against $H_1: \sigma_1^2 \neq \sigma_2^2$ or $\sigma_1^2 > \sigma_2^2$ or $\sigma_1^2 < \sigma_2^2$, use
$F = \dfrac{S_1^2}{S_2^2}$
and reject $H_0$ if $F > f_{\alpha}(n_1 - 1,\ n_2 - 1)$ for a one-tailed test and $F > f_{\alpha/2}(n_1 - 1,\ n_2 - 1)$ for a
two-tailed test.
Example 8.5.3
In the abrasive wear example, we assumed that the two unknown population variances
were equal. Were we justified in making that assumption? Use 0.10 level of significance.
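Material 1: $n_1 = 12$, $\bar{x}_1 = 85$ and $s_1 = 4$
Material 2: $n_2 = 10$, $\bar{x}_2 = 81$ and $s_2 = 5$
$H_0: \sigma_1^2 = \sigma_2^2$ against $H_1: \sigma_1^2 \neq \sigma_2^2$
In this case,
$F = \dfrac{S_1^2}{S_2^2} = \dfrac{16}{25} = 0.64$ and $f_{0.05}(11,\ 9) = \dfrac{3.14 + 3.07}{2} = 3.105$
(the critical value is interpolated between the tabulated values for 10 and 12 numerator degrees of freedom).
Since $F < f_{0.05}(11,\ 9)$, we fail to reject $H_0$ and conclude that the assumption of equal
unknown population variances is justified.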
Activity
1. The average life of 6 car batteries is 30 months with a standard deviation of 4 months.
The manufacturer claims that the average life of his batteries is 3 years, and a customer
claims that the manufacturer is exaggerating. If you were in the position of the customer,
would you believe the manufacturer’s claim? Test the claim at the 5% level of significance.
2. A test was given to a large group of boys who scored on average 64.5. The same test
was given to a group of 400 boys who scored on average 62.5 with standard deviation
of 12.5. Examine if the difference is significant at 5% level of significance.
3. A poultry farmer is investigating ways of improving the profitability of his operation.
Using a standard diet turkeys grow to a mean mass of 4.5kg at age 4 months. A sample
of 20 turkeys which were given a special enriched diet had an average mass of 4.8kg
after 4 months. The sample standard deviation was 0.5kg. Using 5% level of
significance, test whether the new diet is effectively increasing the mass of the turkeys.
4. Aircrew escape systems are powered by a solid propellant. The burning rate of this
propellant is an important product characteristic. Specifications require that the mean
burning rate must be 50 centimeters per second. We know that the standard deviation
of burning rate is 𝜎 = 2 centimeters per second. The experimenter decides to specify
a type I error probability of 0.05 and selects a random sample of 25 and obtains a
sample average burning rate of 51.3 centimeters per second. What conclusions should
be drawn?
5. The mean water temperature downstream from a power plant cooling tower discharge
pipe should be no more than 100°F. Past experience has indicated that the standard
deviation of temperature is 2°F. The water temperature is measured on nine randomly
chosen days, and the average temperature is found to be 98°F.
(a) Should the water temperature be judged acceptable with 5% level of significance?
(b) What is the P-value for this test?
6. The means of two large samples of sizes 2000 and 1000 are 68.0 and 67.5
respectively. Can the two samples be regarded as drawn from the same population with a
standard deviation of 2.25?
7. To study the effect of a special study programme, 14 students were selected and
paired according to IQ and scholastic performance. One student from each pair was
randomly selected to participate in the special programme, while the other student
participated in the standard programme. Shortly thereafter, the students took the
national exam and obtained the following scores:
Special programme:   66 82 96 72 78 82 67
Standard programme:  60 79 92 73 75 80 69
Is there any difference in the mean scores under the two programmes? Why?
8. An administrator at a large university stated that there was a difference in the mean
grade point average of graduating males and females. A random sample of 45
graduating males gave a mean grade point average of 2.10 and a variance of 0.64,
while a random sample of 50 graduating females gave a mean grade point average of
2.45 and a variance of 0.70. By constructing a 95% confidence interval or testing the
hypothesis at 5% level of significance, would you conclude that the data support the
9. Two varieties of maize are being tested in a developing country. Twelve test plots are given
identical treatment. Six plots are sown with variety 1 and the other six plots with variety 2,
in an experiment in which the crop scientists hope to determine whether there is a
significant difference between the yields, using the 5% level of significance. The results
were:
Variety 1:
Variety 2:
1.5
1.6
1.9
1.8
1.2
2.0
1.4
1.8
2.3 1.3
2.3
One of the plots planted with variety 2 was accidentally given an extra dose of fertilizer,
so the result was discarded. Would you conclude that there was a significant difference
in the yield between the two varieties of maize? Assume equal population variances.
10. A coin is tossed 256 times and 132 heads are observed. Is there sufficient evidence to
conclude that the coin is biased?
11. Twenty people were affected by cholera and out of them only 18 survived. Would you
reject the hypothesis that the survival rate if affected by cholera is 85% in favour of the
hypothesis that it is more at 5% level of significance?
12. A manufacturing company claims that at least 95% of the products it supplies conform
to the specification. Out of a sample of 200 items, 18 are found to be defective.
Test the claim at the 5% level of significance.
13. A certain geneticist is interested in the proportion of males and females in the
population that have a certain minor blood disorder. In a random sample of 1000 males,
250 are found to be afflicted, whereas 275 of 1000 females tested appeared to have a
disorder. Is there sufficient evidence to conclude that the proportion of females with a
minor blood disorder was higher than that of males? Test the hypothesis at 1% level
of significance
14. In a random sample of 1000 persons from city A, 400 are found to be consumers of
wheat. In another sample of 800 persons from city B, 400 are found to be consumers
of wheat. Do these data reveal a significant difference in the proportion of consumers
of wheat between the two cities?
```
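
The one-sample calculations in Example 8.3.1 can be checked numerically. The following is a minimal sketch in Python using only the standard library; the function names (`p_value_from_z`, `z_test_mean`, `z_test_proportion`, `t_statistic`) are hypothetical helpers introduced here for illustration and are not part of the notes. Because the standard library has no t distribution, only the t statistic is computed for question 2; the critical value still has to come from tables. Small differences from the worked answers arise because the notes round z to two decimal places before reading the table.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def p_value_from_z(z, tail):
    """p-value of a z statistic; tail is 'greater', 'less' or 'two'."""
    if tail == "greater":          # H1: parameter > claimed value
        return 1 - STD_NORMAL.cdf(z)
    if tail == "less":             # H1: parameter < claimed value
        return STD_NORMAL.cdf(z)
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))  # H1: parameter != claimed value


def z_test_mean(xbar, mu0, sigma, n, tail):
    """z test for one mean with known population standard deviation."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    return z, p_value_from_z(z, tail)


def z_test_proportion(successes, n, p0, tail):
    """z test for one proportion (normal approximation)."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    return z, p_value_from_z(z, tail)


def t_statistic(sample, mu0):
    """t statistic for one mean with unknown population variance."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(n))


# Question 1: life span, H1: mu > 70
z1, p1 = z_test_mean(71.8, 70, 8.9, 100, "greater")
# Question 2: cholesterol readings, H1: mu < 200 (statistic only)
cholesterol = [233, 197, 192, 179, 174, 217, 188, 209,
               196, 167, 186, 221, 238, 179, 196, 191]
t2 = t_statistic(cholesterol, 200)
# Question 3: fishing line, H1: mu != 8
z3, p3 = z_test_mean(7.8, 8, 0.5, 50, "two")
# Question 4: Kent cigarettes, H1: P != 0.2
z4, p4 = z_test_proportion(6, 20, 0.2, "two")

for label, stat, p in [("Q1 z", z1, p1), ("Q3 z", z3, p3), ("Q4 z", z4, p4)]:
    print(f"{label} = {stat:.3f}, p-value = {p:.4f}")
print(f"Q2 t = {t2:.4f} (compare with the tabulated t value for 15 d.f.)")
print(f"z critical value for alpha = 0.05, one tail: {STD_NORMAL.inv_cdf(0.95):.3f}")
```

The `tail` argument names the direction of the alternative hypothesis explicitly, which avoids relying on the absolute value of the statistic in one-tailed tests.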
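
The two-sample statistics of Section 8.5 can be reproduced in the same way. This is a sketch under stated assumptions: the helper names (`pooled_t`, `two_sample_z`, `two_proportion_z`) are hypothetical and chosen only for illustration, the inputs are the summary figures quoted in Examples 8.5.1 to 8.5.3, and critical values for the t and F distributions are still taken from tables because the Python standard library does not provide them.

```python
from math import sqrt
from statistics import NormalDist


def pooled_t(x1, s1, n1, x2, s2, n2, d0=0.0):
    """Pooled-variance t statistic for H0: mu1 - mu2 = d0 (equal variances assumed)."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    t = (x1 - x2 - d0) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, sp2, n1 + n2 - 2  # statistic, pooled variance, degrees of freedom


def two_sample_z(x1, var1, n1, x2, var2, n2, d0=0.0):
    """Large-sample z statistic for H0: mu1 - mu2 = d0, using sample variances."""
    return (x1 - x2 - d0) / sqrt(var1 / n1 + var2 / n2)


def two_proportion_z(k1, n1, k2, n2):
    """z statistic for H0: P1 - P2 = 0, using the pooled sample proportion."""
    pooled = (k1 + k2) / (n1 + n2)
    return (k1 / n1 - k2 / n2) / sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))


# Example 8.5.1, question 1: abrasive wear, H1: mu1 - mu2 > 2
t_wear, sp2_wear, df_wear = pooled_t(85, 4, 12, 81, 5, 10, d0=2)

# Example 8.5.1, question 2: GMAT scores, H1: mu1 - mu2 != 0
z_gmat = two_sample_z(34.75, 48.59, 201, 33.74, 30.68, 115)
p_gmat = 2 * (1 - NormalDist().cdf(abs(z_gmat)))  # two-tailed p-value

# Example 8.5.2: town versus county voters, H1: P1 - P2 > 0
z_poll = two_proportion_z(120, 200, 240, 500)

# Example 8.5.3: ratio of sample variances for the abrasive wear data
f_ratio = 4**2 / 5**2

print(f"pooled variance = {sp2_wear:.2f}, t = {t_wear:.3f}, d.f. = {df_wear}")
print(f"GMAT z = {z_gmat:.3f}, two-tailed p-value = {p_gmat:.4f}")
print(f"poll z = {z_poll:.3f}")
print(f"F = {f_ratio:.2f}")
```

The GMAT p-value computed here differs slightly from 0.1556 in the worked answer because the notes round Z to 1.42 before using the normal table.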