3.1 The Binomial Distribution
The binomial distribution applies to a series of independent trials satisfying the following conditions:
1. A trial has only two possible mutually exclusive outcomes. These outcomes may be called 'success'
and 'failure' or 'defective' and 'not defective' or any other convenient terms.
2. The probability, p, that a trial results in a success, is constant and does not change from one trial to the
next. (the probability that a trial results in a failure is q = 1-p)
3. The outcomes of successive trials are mutually independent. This requirement is approximately met,
for example, when items are selected from a large batch and classified as defective or not defective.
When the above conditions are met and we carry out n trials then the binomial distribution states that the
probabilities of obtaining r successes (where r = 0, 1, 2, ..., n) in the n trials are given by the terms in the binomial
expansion of (p + q)^n.
P(r, n), the probability of getting r successes in n trials, is given by

P(r, n) = n!/((n − r)! r!) · p^r q^(n−r) = n!/((n − r)! r!) · p^r (1 − p)^(n−r)
Example 6. It is known that 10% of the components produced by a certain process are defective. From a large
batch of components 5 are selected at random. Find the probabilities of obtaining 0,1,2,3,4 and 5 defective
components.
Answer. Let p = probability a component is defective. Then p = 0.1, so that q = 1 − p = 0.9 is the probability the
component is not defective. Selecting the 5 components is the same as saying the number of trials n = 5.

(q + p)^5 = q^5 + 5q^4 p + 10q^3 p^2 + 10q^2 p^3 + 5q p^4 + p^5

Each term in the above expansion gives a probability; for example 10q^2 p^3 tells us the probability of having 2 'not
defective' and 3 'defective' components out of the 5 selected.
Notice that since q + p = 1 the sum of all the n + 1 individual probability terms in the expansion must also add to 1.
We can now write down the answers required (in each term the power of p is equal to the number of 'defectives'):

1 = (q + p)^5 = q^5 + 5q^4 p + 10q^3 p^2 + 10q^2 p^3 + 5q p^4 + p^5

P(no defectives) is given by the term q^5 = (0.9)^5 = 0.59049
P(1 defective) is given by the term 5q^4 p = 5(0.9)^4(0.1) = 0.32805
P(2 defectives) is given by the term 10q^3 p^2 = 0.07290
P(3 defectives) is given by the term 10q^2 p^3 = 0.00810
P(4 defectives) is given by the term 5q p^4 = 0.00045 and finally
P(5 defectives) is given by the term p^5 = 0.00001.
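The probabilities in Example 6 can be checked numerically. A minimal sketch in Python (the function name binom_pmf is illustrative; the binomial coefficient comes from the standard library's math.comb):

```python
from math import comb

def binom_pmf(r, n, p):
    """P(r, n): probability of exactly r successes in n independent trials."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

# Example 6: n = 5 components, p = 0.1 probability of a defective
probs = [binom_pmf(r, 5, 0.1) for r in range(6)]
print(probs)       # ≈ 0.59049, 0.32805, 0.07290, 0.00810, 0.00045, 0.00001
print(sum(probs))  # the n + 1 terms add to 1
```

Since q + p = 1, summing the returned probabilities over r = 0, ..., n always gives 1, which is a useful sanity check.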
For the binomial distribution the mean number of successes in n trials is np and the standard deviation of the
number of successes is √(npq).
3.2 The Poisson Distribution
In problems dealing with rare, isolated events, which occur independently and randomly in time or space, it is
common practice to model the probabilities using the Poisson distribution. For example the number, X, of
telephone calls arriving at a switchboard in one minute is sometimes modeled by the Poisson distribution.
X is Poisson if

P(X = x) = e^(−λ) λ^x / x! ,  where x = 0, 1, 2, ...    (12)

λ is the mean (or average) value of X.
The number of cars passing a point on a road each minute, the number of accidents happening in a town each day
and the number of defective items in a large batch may also be modeled with the Poisson distribution:
Example 7
A process is known to produce on average 1% of defective items. A sample of 200 items is selected at random
from a large batch of these items. Find the probability of obtaining 0,1,2 and 3 defective items in the sample
assuming that the number of defectives is given by a Poisson distribution.
Answer. Assume the number of defective items in the sample of 200 is X.
The mean value of X is 1% of 200 = 2. Put λ = 2 defectives per 200 items in (12), then:

P(X = 0) = e^(−2) 2^0 / 0! = 0.1353
P(X = 1) = e^(−2) 2^1 / 1! = 0.2706
P(X = 2) = e^(−2) 2^2 / 2! = 0.2706
P(X = 3) = e^(−2) 2^3 / 3! = 0.1804
3.1 Mean and Variance of a Poisson Distribution.
The mean value of X is λ. The variance of X is also λ.
3.2 Questions
Question 1.
A sales manager receives 6 telephone calls on average between 9.30 am and 10.30 am on a weekday. Use the
Poisson distribution to find the probability that:
(a) he will receive 2 or more calls between 9.30 and 10.30 on Tuesday.
(b) he will receive exactly 2 calls between 9.30 and 9.40 am on Wednesday.
Answer (a) 0.9826 (b) 0.1839
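The quoted answers can be verified with the same Poisson formula; a sketch (for part (b) the 10-minute interval is a sixth of an hour, so λ = 1):

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

# (a) lam = 6 calls per hour; P(X >= 2) = 1 - P(0) - P(1)
p_a = 1 - poisson_pmf(0, 6) - poisson_pmf(1, 6)
# (b) 10 minutes is one sixth of an hour, so lam = 6/6 = 1; P(X = 2)
p_b = poisson_pmf(2, 1)
print(round(p_a, 4), round(p_b, 4))   # 0.9826 0.1839
```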
Question 2.
A batch of mass-produced integrated circuits contains a certain proportion of defectives. The batch is tested for
quality by selecting a random sample of 15 circuits; the batch is rejected if this sample contains more than 2
defective IC's. Find the probability that
(a) a batch containing 5% defective circuits will be rejected.
(b) a batch containing 20% defective circuits will be accepted.
Answer (a) 0.0405 (b) 0.4232
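The quoted answers for Question 2 match a Poisson approximation to the binomial with λ = np, which is natural for this section; a sketch of the check under that assumption:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

def poisson_cdf(k, lam):
    """P(X <= k) for a Poisson variable with mean lam."""
    return sum(poisson_pmf(x, lam) for x in range(k + 1))

# (a) 5% defective: lam = 15 * 0.05 = 0.75; batch rejected if more than 2 defectives
p_reject = 1 - poisson_cdf(2, 0.75)
# (b) 20% defective: lam = 15 * 0.20 = 3; batch accepted if 2 or fewer defectives
p_accept = poisson_cdf(2, 3)
print(round(p_reject, 4), round(p_accept, 4))   # 0.0405 0.4232
```

The exact binomial values differ slightly (e.g. about 0.036 for part (a)); the tabulated answers correspond to the Poisson approximation above.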
3.3 Total Probability for a Poisson distribution.
In this section we verify that the total probability is indeed equal to 1 for a Poisson Distribution.
Suppose X follows a Poisson distribution, we know that X can only take one value when it is measured and the
possible values for X are X = 0, X = 1, X = 2,...etc.
Let us now calculate the total probability of all possible outcomes,
Total Probability = P(X = 0) + P(X = 1) + P(X = 2) + ........
Use (12) to substitute for the probabilities:

Total Probability = e^(−λ) (1 + λ + λ²/2! + λ³/3! + ...)

Recall that there is a power series for e^λ in the form

e^λ = 1 + λ + λ²/2! + λ³/3! + ...

Thus Total Probability = e^(−λ) e^λ = 1 as required.
STATISTICS
The subject of statistics originally dealt with the collection of data and its organisation and presentation in the form
of charts and tables. Often the data was collected in the form of information about the Nation State, hence the
name 'statistic'. The data was often collected from the people or 'population' of the state and the word 'population'
has stuck; it is now applied to mean the set of all things being considered, thus we may have a population of
resistors, for example.
The raw data collected was usually processed and used to calculate new numbers, such as the mean or average
value of the data. Any number calculated from the data is called a 'statistic'. Important statistics include mean,
median and mode, which are measures of central tendency; and standard deviation and variance, which are
measures of dispersion. We shall also meet statistics called t, F and χ² (chi-squared).
Statistics have been used for thousands of years but Probability only came on the scene much later in the 17th
century. By incorporating the ideas of probability into statistics it became possible to use statistics for decision
making and forecasting.
In Engineering, statistics is used in Quality Control to help manufacturers identify systematic and random variations
in the product of a repetitive manufacturing process. In Telecommunications it is used in Traffic Engineering in
the design of telecommunication networks, and Power Stations use statistics to monitor demand for power at
different times of the day.
4. Populations and samples.
Often in practice we are interested in drawing valid conclusions about a large group of individuals or objects.
Instead of examining the entire group, called the population, which may be difficult or impossible to do, one may
arrive at the idea of examining a small part of this population, which is called a sample. The aim of the exercise is
to infer certain facts about the population from results found in the sample. This process is known as statistical
inference and the process of obtaining samples is called sampling.
We may wish for example to find the mean resistance of a large batch containing thousands of nominal 100 Ω
resistors. We may take a sample of, say, 30 resistors, measure the resistance of each one and calculate the
mean value of the sample of 30 resistors. As long as the sample is random there is a good chance that the
sample mean is not much different from the mean of the large batch. The large batch constitutes the population
and the 30 resistors are the sample. The sample mean is a statistic (a number calculated from the data), the true
unknown mean of the large batch is called a population parameter. The sample mean statistic is used to estimate
the unknown population parameter.
4.1 Random samples.
Clearly the reliability of conclusions drawn about a population depends on whether the sample is properly chosen
so as to represent the population sufficiently well.
One way to do this for finite populations is to make sure that each member of the population has the same chance
of being included in the sample, which is then called a random sample.
4.2 Population parameters.
The mean  and the standard deviation  for the whole population are called population parameters. Population
parameters are usually given Greek letters. For any given population there is only one mean and one standard
deviation, both are usually unknown quantities.
4.3 Sample statistics.
We can take any number of random samples from the population and then use these samples to obtain values
which serve to estimate the population parameters. Any quantity obtained from a sample for the purpose of
estimating a population parameter is called a sample statistic. Any given sample will only have one mean and one
standard deviation but the values of sample mean and sample standard deviation will differ from sample to sample.
4.4 Population mean, variance and standard deviation.
Consider a population of N members x₁, x₂, x₃, ..., x_N (where N may be infinite). By definition the population
mean, μ, and variance, σ², are given by:

μ = (1/N) Σ_{i=1}^{N} x_i    (13)

and

σ² = (1/N) Σ_{i=1}^{N} (x_i − μ)²    (14).

The population variance σ² is the square of the population standard deviation, σ.
4.5 Sample mean, variance and standard deviation.
Consider a sample of n members taken from a large population. The sample mean, m, and variance, s², are given by:

m = (1/n) Σ_{i=1}^{n} x_i    (15)

and

s² = (1/(n − 1)) Σ_{i=1}^{n} (x_i − m)²    (16).

In (16) notice that we have divided by n − 1 instead of n. This is because it can then be shown that s² is an unbiased
estimator of σ², as will be explained in class.
Note: some calculators have a key σ_n for calculating σ and another key σ_{n−1} for calculating s.
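Formulas (15) and (16) can be sketched directly; note the n − 1 divisor in the sample variance (the function names are illustrative; Python's own statistics.variance uses the same n − 1 divisor):

```python
def sample_mean(xs):
    return sum(xs) / len(xs)                        # equation (15)

def sample_variance(xs):
    m = sample_mean(xs)
    # equation (16): divide by n - 1 so s^2 is an unbiased estimator of sigma^2
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

xs = [2, 3, 6, 8, 11]                               # a small illustrative sample
print(sample_mean(xs), sample_variance(xs))         # 6.0 13.5
```

Dividing the same sum of squares by n instead would give the population formula (14), which for this data yields 10.8 rather than 13.5.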
5. The normal population. (revision)
The figure below shows the normal probability density function with mean μ = 3 and standard deviation σ = 1.

The equation of the normal curve is given by

f(x, μ, σ) = (1/(σ√(2π))) e^(−(x − μ)²/(2σ²))

and it can be shown that the total area under the curve is 1. Most of the area lies between μ − 3σ and μ + 3σ. In fact Table 3 shows that
99.73% of the area is contained within these limits.
In the next figure we compare two normal distributions having the same mean, μ = 3, but having different σ's,
σ = 1 and σ = 0.5:
Note that the area under each curve is still 1 but the curve with the bigger value of σ is fatter (more spread out)
than the other.
Example 8: Consider a population of resistors with nominal value μ = 100 Ω and standard deviation σ = 5 Ω.
Take one resistor at random; what is the probability its resistance is greater than 106 Ω?
Answer: The resistors are normally distributed with μ = 100 and σ² = 25; we say they are N(100, 25). We have
tables for N(0, 1). To use the tables we need to calculate the value of

u = (x − μ)/σ = (106 − 100)/5 = 1.2

From Table 3 the area to the right of u = 1.2 is 0.1151. The probability of getting a resistance greater than 106 Ω
is thus 0.1151.
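In place of Table 3, the standard normal tail area can be computed from the complementary error function; a sketch for Example 8 (upper_tail is an illustrative name):

```python
from math import erfc, sqrt

def upper_tail(x, mu, sigma):
    """P(X > x) for X ~ N(mu, sigma^2), via the standardized variate u."""
    u = (x - mu) / sigma
    return 0.5 * erfc(u / sqrt(2))

# Example 8: resistors are N(100, 25), i.e. mu = 100, sigma = 5
print(round(upper_tail(106, 100, 5), 4))   # 0.1151
```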
6. Sampling distributions of mean values.
Suppose we have a population with mean μ and variance σ². Take one random sample of size n and calculate the
mean of the sample. Now replace this sample back into the population, select a second random sample, and find
its mean. As we continue to take new samples and find their means we are slowly building up a new population
comprising these sample means. It is a fundamental result in statistics that the population of sample means
itself has mean μ (the same as the original population), but the variance of the population of sample means is not
σ² but σ²/n. In other words the 'spread' of the sample means is less than the spread of the original single
observations.
Example 9
A population consists of the five numbers, 2,3,6,8,11. Consider the population of all possible samples of size two,
which can be drawn with replacement from this population. Find
(a) the mean of the original population
(b) the standard deviation of the original population
(c) the mean of the population of sample means and
(d) the standard deviation of the population of sample means.
Answer. (a) μ = 6.0 (b) σ = 3.29 (c) μ_x̄ = 6.0 (d) σ²_x̄ = 5.40 (note: σ²/n = 5.40)
Working for Example 9.
Use your calculator to show that (a) μ = 6.0 and (b) σ = 3.29. (Remember to use the σ key or the σ_n key,
whichever your calculator uses, for this population standard deviation.)
(c) There are 5 × 5 = 25 samples of size 2 which can be drawn with replacement (since any one of the five
numbers on the first draw can be associated with any one of the five numbers on the second draw). These are:
(2,2) (2,3) (2,6) (2,8) (2,11)
(3,2) (3,3) (3,6) (3,8) (3,11)
(6,2) (6,3) (6,6) (6,8) (6,11)
(8,2) (8,3) (8,6) (8,8) (8,11)
(11,2) (11,3) (11,6) (11,8) (11,11)
The corresponding sample means are

2.0   2.5   4.0   5.0   6.5
2.5   3.0   4.5   5.5   7.0
4.0   4.5   6.0   7.0   8.5     [A]
5.0   5.5   7.0   8.0   9.5
6.5   7.0   8.5   9.5   11.0

The mean of the sampling distribution of means is

μ_x̄ = (sum of all sample means in [A] above)/25 = 150/25 = 6.0

This result illustrates the fact that μ_x̄ = μ = 6.0.
(d) The variance σ²_x̄ of the sampling distribution of sample means is obtained by subtracting the mean 6.0 from
each number in [A], squaring the result, adding all 25 numbers thus obtained and dividing by 25. The final result
is

σ²_x̄ = 135/25 = 5.40  so that  σ_x̄ = √5.40 = 2.32.

This result illustrates the fact that for large (infinite) populations or for finite populations involving sampling with
replacement,

σ_x̄ = σ/√n    [B]

since the right hand side of [B] is 3.29/√2 = 2.32, agreeing with the result σ_x̄ = 2.32 in part (d).
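Example 9 can be verified by enumerating all 25 equally likely samples; a sketch:

```python
from itertools import product
from math import sqrt

pop = [2, 3, 6, 8, 11]
mu = sum(pop) / len(pop)
sigma = sqrt(sum((x - mu) ** 2 for x in pop) / len(pop))   # population sigma: divide by N

# all 25 samples of size 2 drawn with replacement, and their means
means = [(a + b) / 2 for a, b in product(pop, repeat=2)]
mu_means = sum(means) / len(means)
var_means = sum((m - mu_means) ** 2 for m in means) / len(means)

print(mu, round(sigma, 2))            # 6.0 3.29
print(mu_means, var_means)            # 6.0 5.4  (= sigma^2 / n with n = 2)
print(round(sqrt(var_means), 2))      # 2.32
```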
Example 10
A population is N(100,25). A sample of size 9 is taken at random. Find the probability that the sample mean is
greater than 106.
Answer. The population of sample means is N(100, 25/9).
The standard deviation of the population of sample means is σ/√n = 5/3 = 1.667.
Calculate the value of

u = (x̄ − μ)/(σ/√n) = (106 − 100)/1.667 = 3.6

From Table 3, the area to the right of u = 3.6 is 0.00016. The probability that the sample mean exceeds 106 is
thus very small, only 0.00016.
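Example 10 is the same tail calculation as Example 8, applied to the sampling distribution of the mean; a sketch, again using erfc in place of Table 3:

```python
from math import erfc, sqrt

mu, sigma, n = 100, 5, 9
se = sigma / sqrt(n)            # standard deviation of the sample mean, 5/3
u = (106 - mu) / se             # 3.6
p = 0.5 * erfc(u / sqrt(2))     # area to the right of u
print(round(u, 1), round(p, 5))   # 3.6 0.00016
```

Compare this with the single-resistor probability 0.1151 from Example 8: averaging 9 observations shrinks the spread by a factor of 3 and makes a mean above 106 far less likely.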
Here is another illustration of this important result about the sampling distribution of mean values. Mathcad has
been used to generate a large number of random samples from a normal population, calculate the sample mean of
each and draw a histogram.
Sampling Distribution of mean values

Here we take N random samples (N large) of size n = 25 from a normal population with population
mean μ = 100 and population standard deviation σ = 5. For each sample we calculate the sample
mean and, when all N sample means have been found, we calculate the standard deviation of these N
sample means. If N is large this should be close to σ/√n. Remember that the population of
sample means is less spread out than the individual values are.

n := 25                      sample size is n
N := 1000                    number of samples is N
σ := 5
μ := 100
j := 0 .. N − 1              the N samples are numbered 0 to N − 1 and labelled by j
sample_j := rnorm(n, μ, σ)   this generates n random numbers from the normal population
v_j := mean(sample_j)        the vector of N mean values is called v

Now we estimate the standard deviation of the population of sample means using the sample
standard deviation of the vector of sample means, v:

est := Stdev(v)      est = 1.04      (compare σ/√n = 1)

For comparison we also generate 1000 individual values, v1 := rnorm(1000, μ, σ), and sort both sets
into k = 10 histogram bins. Finally we draw two histograms; the wide one on the left is for the individual
x values and the narrow one on the right is for the sample means.

[Histogram of individual values | Histogram of sample means (less spread out); both plotted over the range 80 to 120]

Stdev(v1) = 4.939    Stdev(v) = 1.04
max(v1) = 114.613    max(v) = 103.024
min(v1) = 83.949     min(v) = 96.945
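The Mathcad experiment above can be reproduced with the Python standard library alone; a sketch (seeded so the run is repeatable; the exact figures will differ from the Mathcad ones):

```python
import random
import statistics
from math import sqrt

random.seed(42)
mu, sigma, n, N = 100, 5, 25, 1000

# N samples of size n; v holds the N sample means
v = [statistics.fmean(random.gauss(mu, sigma) for _ in range(n)) for _ in range(N)]

est = statistics.stdev(v)        # should be close to sigma / sqrt(n) = 1
print(round(est, 2), sigma / sqrt(n))
```

The estimated spread of the sample means comes out close to 1, a fifth of the population standard deviation, just as σ/√n predicts.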
7. Estimating the Population mean from a sample.
Setting Confidence Limits on μ.
Suppose we want to estimate the mean value, μ, of a population. To do this we take a random sample and find the
mean value, x̄, of the sample. Our best estimate for μ is then x̄.
Suppose, for example, we take a random sample of 60 resistors from a large batch (population) and we find the
sample mean x̄ = 97 Ω.
Our best estimate of μ is thus 97 Ω. Such an estimate is called a point estimate.
It is better to use an interval estimate, in which an indication of the precision or accuracy of the estimate can be
given, i.e. we would like to be able to say something like "I am 95% certain that μ lies somewhere in the range
97 ± 3 Ω".
We can do this if we use the fact that in a normal distribution of sample means 95% of the distribution lies
between μ − 1.96 σ/√n and μ + 1.96 σ/√n.
This means that if a sample mean, x̄, is taken at random there is a 95% probability that

μ − 1.96 σ/√n < x̄ < μ + 1.96 σ/√n.

Conversely if a value of x̄ is taken at random and we add ±1.96 σ/√n to it, we can be 95% confident
that the population mean μ lies somewhere between the limits x̄ ± 1.96 σ/√n.
The interval between x̄ − 1.96 σ/√n and x̄ + 1.96 σ/√n is called the 95% confidence interval for the population mean.    (17)
Example 11. Find the 95% confidence interval for the population mean if the sample of size 60 has mean 97 Ω
and s = 3 Ω.
Answer. In this example strictly speaking we do not know the value of σ, but the sample size is reasonably large
(n = 60). For large samples (n = 30 or more) it is considered acceptable to use the known value of s in place of the
unknown σ and proceed as if σ were known.
Thus the 95% confidence interval is from 97 − 1.96 × 3/√60 to 97 + 1.96 × 3/√60.
We are 95% certain that the population mean lies between 96.2 and 97.8 Ω.
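The arithmetic of Example 11, i.e. formula (17) with s standing in for σ, can be sketched as:

```python
from math import sqrt

xbar, s, n = 97, 3, 60
margin = 1.96 * s / sqrt(n)            # large sample: use s in place of sigma
low, high = xbar - margin, xbar + margin
print(round(low, 1), round(high, 1))   # 96.2 97.8
```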
Question: Find the 99% confidence interval for example 11.
8. Small Samples and Student's 't'.
In the previous example we used s in place of σ. We did this because we did not know the true value of σ. If we did
know the true value of σ then there would be no problem and the results obtained for the confidence interval would
be accurate, but by using s instead of σ we inevitably introduce a source of error into the procedure. It transpires
that the confidence limits obtained are only a good approximation if the sample size, n, is sufficiently large,
generally taken to mean n ≥ 30. In the previous example n was equal to 60 so the approximation was a good one.
If, however, n < 30 the sample is considered to be a small sample and the confidence interval obtained with normal
tables using s instead of σ will be inaccurate; the true limits should actually be wider apart than the normal
approximation suggests. A statistician named W.S. Gosset studied this problem in connection with his work in
quality control. Publishing his findings under the pen name 'Student' he produced an alternative set of statistical
tables, now called 't' tables, and showed that when we use s instead of σ it is more accurate to use these Student's
't' tables instead of the Normal tables. The 't' statistic is

t = (x̄ − μ)/(s/√n)

For large values of n, the t distribution is close to the Normal.
8.1 Degrees of Freedom.
When estimating the value of a population parameter, such as σ, we sometimes use the value of s, calculated from
the sample data. If k is the number of population parameters estimated from the data in a sample of size n, then
we say the number of degrees of freedom is ν = n - k. If we use s to estimate σ, we have n-1 degrees of freedom.
The number of degrees of freedom is needed to find the areas under the 't' distribution. (Table 7).
9. Confidence Intervals with small samples.
The formula for the confidence limits using 't' is similar to (17):

x̄ ± t_{α/2} s/√n    (18)

But t_{α/2} has to be found from tables according to the number of degrees of freedom ν and the significance level
α (if α = 5%, then the degree of confidence is 95%).
Example 12.
A sample of size n = 9 gives x̄ = 97 Ω and s = 3 Ω. Find the 95% confidence limits for μ.
Answer: This is a small sample, σ is not known, so we shall use the 't' tables.
Number of degrees of freedom ν = 9 − 1 = 8.
α = 0.05, so α/2 = 0.025. From Table 7 we find that t_{α/2} = 2.306.
Using (18) the confidence limits are 97 ± 2.306 × 3/√9 = 97 ± 2.306 Ω.
The confidence interval is from 94.7 Ω to 99.3 Ω.
Note: small sample ⇒ wider confidence interval.
If we used normal tables and calculated u using s we would have found the confidence limits to be
97 ± 1.96 × 3/√9 = 97 ± 1.96 Ω.
Notice that the true limits obtained using 't' tables, 97 ± 2.306 Ω, are thus wider apart than this, as expected.
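Example 12 as a sketch; the critical value 2.306 is read from Table 7 (ν = 8, α/2 = 0.025) rather than computed, since the standard library has no t quantile function:

```python
from math import sqrt

xbar, s, n = 97, 3, 9
t_crit = 2.306                       # from Table 7: nu = n - 1 = 8, alpha/2 = 0.025
margin = t_crit * s / sqrt(n)        # equation (18)
print(round(xbar - margin, 1), round(xbar + margin, 1))   # 94.7 99.3
```

Swapping t_crit for the normal value 1.96 reproduces the narrower (and for this small sample, over-optimistic) interval 97 ± 1.96 Ω.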
10. Significance Testing.
In statistics we are often called upon to make decisions about populations on the basis of information obtained
from a sample of the population.
For instance, we may wish to decide whether a coin is fair, or whether one type of microprocessor is more resistant
to high humidity than another, or whether a large batch of resistors has a population mean of 100  etc.
10.1 Statistical Hypotheses - The Null and Alternative Hypotheses.
In attempting to reach decisions about a population it is useful to make assumptions or guesses about it. Such
assumptions, which may or may not be true, are called Statistical Hypotheses.
In many instances we formulate a statistical hypothesis for the sole purpose of rejecting or nullifying it. For
example if we want to decide whether a given coin is loaded we might formulate the hypothesis that the coin is
fair. i.e. p = 0.5. Similarly if we want to decide whether one procedure is better than another we might formulate
the hypothesis that there is no difference between the procedures. Such hypotheses are called Null Hypotheses
and denoted by Ho .
Any hypothesis which differs from a given hypothesis is called an Alternative Hypothesis and denoted by H1 .
10.2 Tests of Hypotheses and Significance.
Suppose we take a particular hypothesis to be true; for example we might assume that  = 100 is true. Suppose we
then take a sample and find that the results of the sample are very different from what we would expect if  = 100.
i.e. we might have the sample 120,121,120,119; such a sample might have come from a population with mean 
=100 but since the sample values do not have much spread we think it unlikely. When this happens we say the
observed differences are significant and we would be inclined to reject the hypothesis that  = 100. (or at least we
would not accept it on the basis of the evidence obtained from the sample)
Another example: we hypothesize a coin is fair, we toss it 20 times and get 20 heads. We would be inclined to
reject the hypothesis that the coin is fair. Note that we may be wrong in rejecting the hypothesis that the coin is
fair, because even a fair coin might give 20 heads, but the probability of 20 heads from a fair coin is very
small: it is (1/2)^20, or roughly 1 in a million.
Procedures that help us to reject or accept a hypothesis are called hypothesis tests or significance tests.
Type I and Type II Errors.
If we reject a hypothesis when it is in fact true we say we make a Type I error.
If we accept a hypothesis when it is in fact false we say we make a Type II error.
In either case a wrong decision has been made.
10.3 Level of Significance ()
The maximum probability with which we are willing to risk a Type I error is called the significance level of the
test, denoted by . We choose the significance level before any samples are drawn so that the sample results will
not influence our choice of . In practice significance levels of  = 0.05 or  = 0.01 are customary.
For example if the significance level is 0.05, or 5% level of significance, then there are about 5 chances in 100 that
we would reject the hypothesis when we should have accepted it. We are thus 95% confident we have made the
right decision.
11. Testing the mean of a population (large sample)
To illustrate the ideas above we shall carry out a hypothesis test on a population mean using a large sample mean
as our test statistic. (Remember that for a large sample the sample mean x̄ is normally distributed with mean μ and
standard deviation σ/√n.)
Example 13.
A manufacturer claims that the mean lifetime of fluorescent light bulbs produced by the company is 1600 hours.
A sample of 100 bulbs is found to have a mean lifetime x̄ = 1570 hours and the sample standard deviation
s = 120 hours.
If  = the mean lifetime of all bulbs produced by the company, test the hypothesis that  = 1600 hours against the
alternative hypothesis that   1600 hours.
Answer
Before carrying out the test let us consider the sort of value we would expect for x̄ if μ really was equal to
1600 hours.
Assume for the moment that μ = 1600 hours and estimate σ to be 120 hours. This means that
σ/√n = 120/10 = 12 hours.
From normal tables, 95% of the time we would expect x̄ to lie between 1600 − 1.96 × 12 hours and 1600 + 1.96 × 12
hours, i.e. 95% of the times that we perform the experiment the sample mean will be between 1576 hours and
1624 hours if μ = 1600 hours. In only about 5% of the experiments will the sample mean lie outside these limits.
A sample mean smaller than 1576 hours or bigger than 1624 hours is fairly unlikely to occur if μ = 1600 hours,
although it would be expected to occur about 5 times in 100.
If our sample mean turns out to be either smaller than 1576 hours or larger than 1624 hours we would be inclined
to doubt that μ = 1600 hours. In this case the sample mean is actually 1570 hours, which is smaller than we would
expect on the assumption that μ = 1600 hours, and we thus doubt that μ = 1600 hours.
Now let us carry out the test using the standard procedure:
1. Ho: μ = 1600 hours; H1: μ ≠ 1600 hours.
2. Choose level of significance: α = 0.05.
3. Calculate the test statistic (for this test we use the standardized normal variate):

u = (x̄ − μ)/(σ/√n) = (1570 − 1600)/(120/√100) = −2.5

4. Determine the critical region (region of rejection): from normal tables the area in the two tails = 0.05, giving
critical values u = −1.96 and u = +1.96. The area beyond these values is called the critical region. If u falls inside
the critical region the test is significant and we reject Ho at the 5% level of significance.
5. Determine whether u is in the critical region: u = −2.5, which is in the critical region.
6. Make your decision: reject the hypothesis that μ = 1600 hours at the 5% level of significance.
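Steps 3 to 6 of this procedure can be sketched as:

```python
from math import sqrt

mu0, xbar, s, n = 1600, 1570, 120, 100
u = (xbar - mu0) / (s / sqrt(n))     # standardized test statistic
u_crit = 1.96                        # two-tailed critical value at alpha = 0.05
print(u)                             # -2.5
print("reject Ho" if abs(u) > u_crit else "do not reject Ho")
```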
Example 14
The currents at which fuses of a certain type blow have mean μ and standard deviation σ = 0.8 A. The fuses are
designed to blow at 14 A. 50 fuses are selected at random and tested, and the mean blowing current calculated for
the sample is found to be 14.5 A. Carry out a hypothesis test to determine if μ = 14 A.
Answer
1. Ho: μ = 14 A; H1: μ ≠ 14 A.
2. Choose level of significance: this time we choose α = 1%.
3. Calculate the test statistic:

u = (x̄ − μ)/(σ/√n) = (14.5 − 14)/(0.8/√50) = 4.42

4. Determine the critical region for α = 1% using normal tables (use Table 4): the critical values are u = −2.5758
and u = +2.5758.
5. Determine whether our value of u falls in the critical region: u = 4.42 lies in the right-hand tail and is in the
critical region.
6. Make your decision: reject Ho; μ is not equal to 14 A.
12. One-Tailed and Two-Tailed Tests.
In the above tests we were interested in the extreme values of u on both sides of the mean, i.e. we shaded in both
'tails' of the distribution. For this reason such tests are called 'Two-Tailed' tests.
Often, however, we may be interested only in extreme values of u to one side of the mean, i.e. in one 'tail' of the
distribution, as for example when we are testing the hypothesis that the mean is less than 14 A (which is different
from testing μ = 14 A). Such tests are called 'One-Tailed' tests. In a one-tailed test the critical region is a region
to one side of the mean, with area equal to α, the level of significance. (Remember in a two-tailed test the area in
each tail is α/2.)
The table below gives the critical values for one- and two-tailed tests using the Normal
distribution. There is no need to remember these values; you should obtain them from tables each time. Use your
tables to verify the results below:

Critical Values of u      α = .05             α = .01
one-tailed test           −1.645 or +1.645    −2.33 or +2.33
two-tailed test           −1.96 or +1.96      −2.58 or +2.58

13. Testing the mean of a population (Small sample).
When samples are small (n < 30) we proceed exactly as for large samples but we must use the test statistic 't'
instead of 'u' and find the areas from Table 7 ('t' tables) and not Table 3 (normal tables).
The 't' statistic is

t = (x̄ − μ)/(s/√n)    (19)

In (19) x̄ is the mean of the sample of size n and s is the standard deviation of the sample. The sample data has
been used once to estimate σ, so the number of degrees of freedom is n − 1. Instead of using Table 3 we now use
Table 7 for finding areas.
For large values of n the areas in the 't' tables are very close to the corresponding areas in the normal tables.
Example 15
In the past a machine has produced metal sheet with a mean thickness of 1.250 mm. To determine whether the
machine is in proper working order a sample of 10 sheets is taken for which the mean thickness is 1.325 mm and s
= 0.075 mm. Test the hypothesis that the machine is in proper working order.
Answer
This is a small sample so we use 't' from (19).
1. Ho: μ = 1.250 mm; H1: μ ≠ 1.250 mm.
2. Choose α = 5%.
3. Using (19), t = (1.325 − 1.250)/(0.075/√10) = 3.16.
4. From 't' tables with 9 degrees of freedom and a two-tailed test, the critical values of 't' are ±2.26.
5. 't' is in the critical region; the test is significant at the 5% level.
6. Reject Ho. The machine is not in proper working order.
For this example you should show that at the 1% level the test is not significant. When we can reject at the 5%
level but not at the 1% level we say that the sample result is probably significant, and it would be advisable to check
the machine or at least take another sample.
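Example 15 as a sketch; the two-tailed critical values (±2.26 at the 5% level and ±3.25 at the 1% level, for ν = 9) are read from Table 7:

```python
from math import sqrt

mu0, xbar, s, n = 1.250, 1.325, 0.075, 10
t = (xbar - mu0) / (s / sqrt(n))             # equation (19)
print(round(t, 2))                           # 3.16
print("significant at 5%:", abs(t) > 2.26)   # True
print("significant at 1%:", abs(t) > 3.25)   # False
```

The two comparisons show the "probably significant" situation described above: the result clears the 5% hurdle but not the 1% hurdle.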
14. The significance of the correlation coefficient.
In section 1.4 we calculated the correlation coefficient and promised to return to the question of its significance.
Now that we have studied Hypothesis tests we are in a position to do this.
To test the strength of the correlation between two variables we test the hypothesis that, , the coefficient of
correlation of the population, is zero. (i.e. that the true value of the correlation coefficient is zero)(note the greek
letter used for this population parameter)
The corresponding pairs of values of x and y can be thought of as being a random sample from a population which
contains all pairs of values. The correlation coefficient of this population is . The sample is used to calculate the
sample correlation coefficient, which we called r. Even if  = 0 it is still possible that, by chance, r  0.
Table 10 is used to give values of the correlation coefficient for different levels of significance. The probabilities
at the head of the columns refer to the two-tail test of significance and give the probability that |r| will be greater
than the tabulated value given that the true value of ρ is 0. When using Table 10 the number of degrees of freedom
is df = n-2.
If we return to section 1.4 we recall that r = 0.9944 with n = 8 experimental points. Let us now test the
significance of this correlation coefficient at the 0.01 level as follows:
(a) df = n-2 = 8-2 = 6
(b) From Table 10 at the 0.01 level of significance the critical value of r is rCR = 0.8343.
(c) The calculated value of r is 0.9944. Since the calculated value of r (= 0.9944) is greater than the critical
value of r (= 0.8343) we reject the hypothesis that ρ = 0 and conclude that linear correlation exists between the
two variables.
Example 16
With 14 experimental points, what value would r need to exceed to be significant at the 1% level?
Answer. df = 14-2 = 12. Use Table 10 to find the critical value rCR = 0.6614. Thus r would need to exceed
0.6614 to be significant at the 1% level.
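The critical values in Table 10 can be recovered from the 't' distribution via the standard relation t = r√df / √(1 - r²), which rearranges to rCR = tCR / √(tCR² + df). A sketch under that assumption, using the two-tailed 1% points of 't' (3.707 for 6 df, 3.055 for 12 df, read from ordinary 't' tables rather than from this text):

```python
import math

def critical_r(df, t_crit):
    """Critical correlation coefficient for the given degrees of freedom,
    derived from the corresponding critical value of 't'."""
    return t_crit / math.sqrt(t_crit ** 2 + df)

# Reproduces the two tabulated values used above (an assumption: the
# two-tailed 1% 't' points are taken from standard 't' tables).
print(round(critical_r(6, 3.707), 4))   # -> 0.8343 (section 1.4 data, n = 8)
print(round(critical_r(12, 3.055), 4))  # -> 0.6614 (Example 16, n = 14)
```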
15. Goodness of fit and the χ² test.
Suppose that an Engineer is intending to write a computer program to simulate white noise and he first writes a
subroutine to generate uniformly distributed random numbers. The purpose of the subroutine is to generate the
whole numbers 0,1,2,...9 at random with equal probability.
Before writing the remainder of the program he may want to test whether or not his random number generator is
working properly, i.e. is it really random with no preference for particular digits?
To do this he might now generate, say, 250 random digits and look at them to see how well they fit his expectation
that they are random. For example if his random number generator produces 250 sixes he would have grave doubts
about his program. Assuming as the null hypothesis
Ho : the numbers are random,
he would expect on average to have 25 zeros, 25 ones, 25 twos, etc.
It is most unlikely, however, that he would actually observe exactly 25 zeros, 25 ones, etc.; it is more likely that
there will be some random fluctuation, or sampling error, so some differences between the observed and expected
values are expected.
Suppose that the experiment of generating 250 digits is now carried out and the observed values are as follows:
Digit                     0   1   2   3   4   5   6   7   8   9
Observed frequency (O)   17  31  29  18  14  20  35  30  20  36
Expected frequency (E)   25  25  25  25  25  25  25  25  25  25
The question arises 'are these observed numbers random?' with the differences from 25 being explained as chance
variations, or is there a built in bias with some numbers more likely than others?
The Engineer would now like to carry out a statistical test to determine if the observed values do indeed 'fit' the
expected values on the assumption (hypothesis) that the numbers are random. This is called a 'goodness of fit test'
and in this case we shall use the χ² test. First, however, we must define what is meant by the statistic χ², as
explained below.
15.1 The χ² statistic.
If O is an observed frequency and E is an expected frequency then the statistic χ² is defined by:
χ² = Σ (O - E)² / E
When the observed values are close to the expected values the value of χ² will be close to zero. On the other
hand large values of χ² tell us that the observed values differ from the expected values, so a large value of χ²
suggests a bad fit.
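The definition above translates directly into code; a minimal sketch of the statistic as a helper function:

```python
def chi_square(observed, expected):
    """Chi-square statistic: the sum of (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# A perfect fit gives a statistic of zero; any discrepancy makes it positive.
print(chi_square([25, 25], [25, 25]))  # -> 0.0
print(chi_square([20, 30], [25, 25]))  # -> 2.0
```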
In the χ² test we generally calculate the value of χ² from the data on the basis of a null hypothesis. If the
calculated value is greater than some critical value (such as χ²0.05 or χ²0.01, which are the critical values of χ² at
the 5% and 1% significance levels), we would conclude that the observed frequencies differ significantly from the
expected frequencies and we would reject Ho at the corresponding level of significance. This procedure is called
the χ² test of hypothesis or significance.
The appropriate critical values of χ² are obtained from Table 8, which tabulates the percentage points of the χ²
distribution and the dependence on the number of degrees of freedom.
The number of degrees of freedom ν = k - 1, where k is the total number of possible events that can occur. In
this case k = 10 because there are 10 possible digits, so ν = 9.
(In more complicated problems, if the expected frequencies can be calculated only by estimating m population
parameters from sample statistics, then ν = k-1-m.)
If there is a good fit we expect that χ² will be small. A large value of χ² means that there is a big difference
between the observed and expected values and consequently a bad fit. We should also view with suspicion those
cases where the value of χ² is too close to zero, because it is rare that observed frequencies agree too well with
expected values. To examine such situations where the fit appears 'too good' we can determine whether the
calculated value of χ² is less than χ²0.95 or χ²0.99, in which case we would decide that the agreement is too
good at the 5% or 1% levels of significance respectively.
15.2 The χ² test.
Now carry out the test in detail for the random number data in section 15.1 at the 1% level of significance.
First obtain the calculated value of χ²:
χ² = (17 - 25)²/25 + (31 - 25)²/25 + (29 - 25)²/25 + ..... + (36 - 25)²/25
χ² = 23.3 (observed)
Next use Table 8 to find the tabulated (or critical) value of χ² (for ν = 10-1 = 9 degrees of freedom):
χ²0.01,9 = 21.666 (critical)
We see that the observed value of χ² is greater than the critical value. The test statistic falls in the critical region
and we reject the null hypothesis, i.e. suspicion is cast upon the randomness of the numbers generated.
15.3 Conditions for the application of the χ² test.
Before carrying out a χ² test, two conditions need to be satisfied in order to make the testing procedure valid:
1. The total frequency of the distribution must be reasonably large - at least 50 (in our example it was 250).
2. The theoretical frequency of each class should be at least 10. If frequencies fall below this number, two or more
classes should be combined. (In our example the smallest expected frequency was 25, so there was no need to
combine any classes.)
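The complete calculation of section 15.2 can be sketched in a few lines, with the critical value 21.666 taken from Table 8 as quoted above:

```python
# Goodness-of-fit test for the 250 generated digits (section 15.1 data).
observed = [17, 31, 29, 18, 14, 20, 35, 30, 20, 36]
expected = [25] * 10  # 250 digits, each of 0..9 equally likely

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi2, 1))  # -> 23.3

# Critical value at the 1% level with nu = 9 degrees of freedom (Table 8).
chi2_critical = 21.666
print(chi2 > chi2_critical)  # -> True: reject Ho, the randomness is suspect
```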
16. The F-Test.
16.1 The purpose of the F-Test.
In some problems we wish to decide whether two samples of size m and n respectively, whose measured variances
are s1² and s2² respectively, do or do not come from the same normal parent population. The F-Test is used to
make this decision. Usually we take a null hypothesis that there is no difference between the population
variances. In such cases we use the F-statistic defined by
F = s1² / s2²
We expect that if the population variances are equal then F will be a number close to 1. Values of F much bigger
than 1 or much smaller than 1 tend to suggest that the population variances are not the same. Critical values of F
are obtained from Table 9, where it is necessary to know ν1, the number of degrees of freedom in the numerator,
and ν2, the number of degrees of freedom in the denominator.
16.2 Example of the F-Test.
Suppose an instructor has two classes of students in a particular subject. Class A has 25 students and Class B has
15 students. On the same examination, although there was no significant difference in the mean mark, Class A
had a sample standard deviation of 12 marks while Class B had a sample standard deviation of 9. The question
arises 'is the variability of Class A greater than that of Class B?' (In other words, 'is the standard deviation of
Class A greater than that of Class B?')
Answer:
Let us choose a significance level of 1%
We have to decide at the 1% level between the Hypotheses
Ho : σ1 = σ2   i.e. any observed variability is due to chance
and
H1 : σ1 > σ2   i.e. the variability of Class A is greater than that of Class B.
The decision must be based on a one-tailed test of the F-distribution. For the samples in question
F = s1² / s2² = 144 / 81 = 1.778
The number of degrees of freedom for the numerator: ν1 = 25-1 = 24
The number of degrees of freedom for the denominator: ν2 = 15-1 = 14
From Table 9 the critical value of F is 3.43. Since the observed value of F (1.778) is less than the critical value of
F (3.43) we cannot reject Ho at the 1% level.
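The F calculation for this example can be checked in a few lines; a sketch, with the one-tailed critical value 3.43 taken from Table 9 as quoted in the text:

```python
# F-statistic for the two classes: ratio of sample variances (section 16.1).
s1, s2 = 12.0, 9.0  # sample standard deviations of Class A and Class B
F = s1 ** 2 / s2 ** 2
print(round(F, 3))  # -> 1.778

# One-tailed critical value from Table 9 (nu1 = 24, nu2 = 14, 1% level).
F_critical = 3.43
print(F < F_critical)  # -> True: cannot reject Ho at the 1% level
```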