Statistical inference
Distribution of the sample mean
In many cases we need to infer information about a population based on
the values obtained from random samples of the population.
Consider the following procedure:
• Take a random sample of n independent observations from a population.
• Calculate the mean of these n sample values. This is known as the sample mean.
• Repeat the procedure until you have taken all possible samples of size n, calculating the sample mean of each.
• Form a distribution of all the sample means.
The distribution that would be formed is called the sampling distribution of means.
Mean and variance of the sampling distribution of means
Consider a population $X$ in which $E(X) = \mu$ and $\operatorname{Var}(X) = \sigma^2$.
Suppose we take a total of $m$ samples, each of $n$ independent observations, on the random variable $X$. Each sample will have a first observation, a second observation, and so on. Denote by $X_i$ the random variable "the $i$th observation".
Then the sample mean is
$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} = \frac{1}{n}X_1 + \frac{1}{n}X_2 + \cdots + \frac{1}{n}X_n$$
Taking expectations,
$$E(\bar{X}) = E\left(\frac{1}{n}X_1 + \frac{1}{n}X_2 + \cdots + \frac{1}{n}X_n\right)$$
$$= \frac{1}{n}E(X_1) + \frac{1}{n}E(X_2) + \cdots + \frac{1}{n}E(X_n) \quad \text{(since } E(aX) = aE(X)\text{)}$$
$$= \frac{1}{n}\mu + \frac{1}{n}\mu + \cdots + \frac{1}{n}\mu = \mu$$
Similarly,
$$\operatorname{Var}(\bar{X}) = \operatorname{Var}\left(\frac{1}{n}X_1 + \frac{1}{n}X_2 + \cdots + \frac{1}{n}X_n\right)$$
$$= \frac{1}{n^2}\operatorname{Var}(X_1) + \frac{1}{n^2}\operatorname{Var}(X_2) + \cdots + \frac{1}{n^2}\operatorname{Var}(X_n) \quad \text{(since } \operatorname{Var}(aX) = a^2\operatorname{Var}(X)\text{)}$$
$$= \frac{1}{n^2}\sigma^2 + \frac{1}{n^2}\sigma^2 + \cdots + \frac{1}{n^2}\sigma^2 = \frac{\sigma^2}{n}$$
The standard deviation of the sampling distribution is $\sqrt{\dfrac{\sigma^2}{n}}$, sometimes written $\dfrac{\sigma}{\sqrt{n}}$. This is known as the standard error of the mean.
The distribution of X̄ for normal X
If $X \sim N(\mu, \sigma^2)$ then $\bar{X} \sim N\!\left(\mu, \dfrac{\sigma^2}{n}\right)$.
Read Example 9.13 & 9.14, pp.438-440
The distribution of X for non-normal X
Central limit theorem
For samples taken from a non-normal population with mean $\mu$ and variance $\sigma^2$, by the Central Limit Theorem, $\bar{X}$ is approximately normal and
$$\bar{X} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)$$
provided that the sample size, $n$, is large ($n \geq 30$, say).
This theorem holds whether the population of X is discrete or continuous!
Read Example 9.15, pp.442-443
Do Exercise 9C, pp.443-444
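The sampling procedure and the Central Limit Theorem can both be checked with a short simulation. This is a sketch in Python (standard library only); the exponential population is an arbitrary choice of a clearly non-normal distribution:

```python
import random
import statistics

def sample_means(draw, n, num_samples, seed=42):
    """Draw num_samples samples of size n and return the sample mean of each."""
    rng = random.Random(seed)
    return [statistics.mean(draw(rng) for _ in range(n))
            for _ in range(num_samples)]

# Exponential population with mean 2, so variance sigma^2 = 4 (non-normal).
draw_exponential = lambda rng: rng.expovariate(0.5)

means = sample_means(draw_exponential, n=30, num_samples=5000)

# By the CLT: E(X-bar) should be near mu = 2, Var(X-bar) near sigma^2/n = 4/30.
print(round(statistics.mean(means), 2))
print(round(statistics.variance(means), 3))
```

A histogram of `means` would look approximately normal even though the population itself is heavily skewed.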
Unbiased estimates of population
parameters
Suppose that you do not know the value of a particular population parameter of
a distribution. It seems sensible to take a random sample from the distribution
and use it in some way to make estimates of unknown parameters.
The estimate is unbiased if the average (or expectation) of a large number
of values taken in the same way is the true value of the parameter.
Point estimates
Read Example 9.17, p.448
If the random sample taken is of size n,
• the best unbiased estimate of $\mu$, the population mean, is $\hat{\mu}$ where
$$\hat{\mu} = \bar{x} = \frac{\sum x}{n} \quad (\bar{x} \text{ is the mean of the sample})$$
• the best unbiased estimate of $\sigma^2$, the population variance, is $\hat{\sigma}^2$ where
$$\hat{\sigma}^2 = \frac{n}{n-1}\,s^2 \quad (s^2 \text{ is the variance of the sample})$$
Alternatively, use
$$\hat{\sigma}^2 = \frac{\sum (x - \bar{x})^2}{n-1} \quad\text{or}\quad \hat{\sigma}^2 = \frac{1}{n-1}\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)$$
Interval estimates
Another way of using a sample value to estimate an unknown population
parameter is to construct an interval, known as a confidence interval.
This is an interval that has a specified probability of including the
parameter.
The interval is usually written (a, b), with a and b known as confidence limits.
The probabilities most often used in confidence intervals are 90%, 95% and 99%.
For example, to work out a 95% confidence interval for the unknown mean $\mu$ of a particular population, we construct an interval $(a, b)$ such that
$$P(a < \mu < b) = 0.95$$
The interval constructed uses the value of the mean, $\bar{x}$, of a random sample of size n taken from the population.
Three questions
• Is the distribution of the population normal or not?
• Is the variance of the population known?
• Is the sample size large or small?
Confidence interval for µ
(normal population, known population variance, any sample size)
The goal is to calculate the end-values of a 95% confidence interval. We can adapt this approach for other levels of confidence.
For random samples of size n, if $X \sim N(\mu, \sigma^2)$, then $\bar{X} \sim N\!\left(\mu, \dfrac{\sigma^2}{n}\right)$.
Standardizing, $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$, where $Z \sim N(0, 1)$.
For a 95% confidence interval, we find the z-values between which 95% of the distribution lies: $P(Z < z) = 0.975$, so $z = \Phi^{-1}(0.975) = 1.960$, and the values of z are $\pm 1.96$.
So
$$P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = 0.95$$
i.e.
$$P\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$$
Read Example 9.19, pp.451-452
If $\bar{x}$ is the mean of a random sample of any size n taken from a normal population with known variance $\sigma^2$, then a 95% confidence interval for $\mu$ is given by
$$\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}},\; \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right)$$
Confidence interval for µ
(normal population, known population variance, any sample size)
This computer simulation shows 100 confidence intervals constructed
at the 95% level. On average 5% do not include µ.
In other words, on average, 95% of the intervals constructed will
include the true population mean.
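The simulation described can be reproduced in a few lines. A sketch (Python standard library; the values of μ, σ and n are arbitrary choices):

```python
import random
import statistics

def ci_95(sample, sigma):
    """95% confidence interval for mu, population sigma known."""
    n = len(sample)
    xbar = statistics.mean(sample)
    half = 1.96 * sigma / n ** 0.5
    return xbar - half, xbar + half

rng = random.Random(1)
mu, sigma, n = 100.0, 15.0, 25
trials = 1000

covered = 0
for _ in range(trials):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    a, b = ci_95(sample, sigma)
    covered += a < mu < b

print(covered / trials)   # close to 0.95
```

On each run roughly 95% of the constructed intervals contain μ, matching the description of the figure.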
Critical z-values in confidence intervals
The z-value in the confidence interval is known as the critical value.
90% confidence interval: $\left(\bar{x} - 1.645\dfrac{\sigma}{\sqrt{n}},\; \bar{x} + 1.645\dfrac{\sigma}{\sqrt{n}}\right)$
95% confidence interval: $\left(\bar{x} - 1.96\dfrac{\sigma}{\sqrt{n}},\; \bar{x} + 1.96\dfrac{\sigma}{\sqrt{n}}\right)$
99% confidence interval: $\left(\bar{x} - 2.576\dfrac{\sigma}{\sqrt{n}},\; \bar{x} + 2.576\dfrac{\sigma}{\sqrt{n}}\right)$
Read Examples 9.20 – 9.22, pp.454-457
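The critical values quoted above can be recovered from the inverse standard normal cdf; Python 3.8+ provides this in the standard library as `statistics.NormalDist`:

```python
from statistics import NormalDist

z = NormalDist()   # standard normal N(0, 1)

# For a two-sided interval at a given level, look up the (1 + level)/2 quantile.
for level in (0.90, 0.95, 0.99):
    crit = z.inv_cdf((1 + level) / 2)
    print(f"{level:.0%} critical value: {crit:.3f}")
# 90% -> 1.645, 95% -> 1.960, 99% -> 2.576
```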
Confidence interval for µ
(non-normal population, known population variance, large sample size)
In this case, since the sample size is large (say, $n \geq 30$), the Central Limit Theorem may be used,
i.e. $\bar{X}$ is approximately normal and $\bar{X} \sim N\!\left(\mu, \dfrac{\sigma^2}{n}\right)$.
If $\bar{x}$ is the mean of a random sample of size n, where n is large ($n \geq 30$), taken from a non-normal population with known variance $\sigma^2$, then a 95% confidence interval for $\mu$ is given by
$$\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}},\; \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right)$$
Note
• for a given sample size, the greater the level of confidence, the wider the confidence interval;
• for a given confidence level, the smaller the interval width, the larger the sample size required;
• for a given interval width, the greater the level of confidence, the larger the sample size required.
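These trade-offs can be made concrete: for known σ the interval width is $2z\sigma/\sqrt{n}$, so the smallest sample size achieving a target width follows directly (a sketch; the numbers are illustrative):

```python
import math
from statistics import NormalDist

def min_sample_size(sigma, width, level=0.95):
    """Smallest n with interval width 2 * z * sigma / sqrt(n) <= width."""
    z = NormalDist().inv_cdf((1 + level) / 2)
    return math.ceil((2 * z * sigma / width) ** 2)

# Same target width, increasing confidence => larger n.
print(min_sample_size(sigma=3, width=1, level=0.90))   # 98
print(min_sample_size(sigma=3, width=1, level=0.95))   # 139
print(min_sample_size(sigma=3, width=1, level=0.99))   # 239
```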
Confidence interval for µ
(any population, unknown population variance, large sample size)
In this case, $\hat{\sigma}^2$ is used as an estimate for $\sigma^2$. Ideally, the distribution of X should be normal, but an approximate confidence interval may also be given when the distribution of X is not normal.
Provided that n is large ($n \geq 30$, say), a 95% confidence interval for $\mu$ is
$$\left(\bar{x} - 1.96\frac{\hat{\sigma}}{\sqrt{n}},\; \bar{x} + 1.96\frac{\hat{\sigma}}{\sqrt{n}}\right),$$
where
$$\hat{\sigma}^2 = \frac{n}{n-1}\,s^2 \;(s^2 \text{ is the sample variance}) \quad\text{or}\quad \hat{\sigma}^2 = \frac{1}{n-1}\left(\sum x^2 - \frac{(\sum x)^2}{n}\right).$$
Read Examples 9.23 & 9.24, pp.458-459
Do Exercise 9e, pp.460-461
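The two forms of $\hat{\sigma}^2$ above are algebraically identical and differ from the sample variance $s^2$ only by the factor $n/(n-1)$. A quick check in Python (the data are made up; note that `statistics.variance` already uses the $n-1$ divisor):

```python
import statistics

x = [5.2, 4.8, 6.1, 5.5, 4.9, 5.8]
n = len(x)
xbar = sum(x) / n

# Sample variance s^2 (divisor n).
s2 = sum((xi - xbar) ** 2 for xi in x) / n

# Unbiased estimate: sigma-hat^2 = n/(n-1) * s^2 ...
sigma2_hat = n / (n - 1) * s2

# ... equivalently 1/(n-1) * (sum(x^2) - (sum x)^2 / n).
alt = (sum(xi ** 2 for xi in x) - sum(x) ** 2 / n) / (n - 1)

print(sigma2_hat, alt)
```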
Confidence interval for µ
(normal population, unknown population variance, small sample size)
When calculating confidence intervals, we have already encountered the situation when large samples ($n \geq 30$) are taken from a normal population with unknown variance $\sigma^2$.
For large samples: $\dfrac{\bar{X} - \mu}{\hat{\sigma}/\sqrt{n}} \approx Z$, where $Z \sim N(0, 1)$.
But if the sample size is small ($n < 30$), $\dfrac{\bar{X} - \mu}{\hat{\sigma}/\sqrt{n}}$ no longer has a normal distribution.
For small samples: $\dfrac{\bar{X} - \mu}{\hat{\sigma}/\sqrt{n}} = T$, where T has a t-distribution.
Confidence interval for µ
(normal population, unknown population variance, small sample size)
The t-distribution
The distribution of T is a member of a family of t-distributions. All t-distributions are symmetric about zero and have a single parameter $\nu$, which is a positive integer. $\nu$ is known as the number of degrees of freedom of the distribution and we write $T \sim t(\nu)$ or $T \sim t_\nu$.
The diagram (not reproduced here) shows $t(2)$ and $t(10)$ along with $N(0, 1)$. Note that as $\nu$ increases, $t(\nu)$ gets closer and closer to $N(0, 1)$.
For samples of size n, it can be shown that
$$T = \frac{\bar{X} - \mu}{\hat{\sigma}/\sqrt{n}}$$
follows a t-distribution with $(n - 1)$ degrees of freedom.
Probability density function of the
t-distribution
$$f(t) = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\!\left(\frac{\nu}{2}\right)} \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}, \qquad \nu = n - 1,$$
where
$$\Gamma(z) = \int_0^\infty t^{z-1} e^{-t}\,dt, \qquad \Gamma(n) = (n-1)! \text{ for } n \in \mathbb{Z}^+.$$
Read Examples 9.26 & 9.27, pp.466-468
Do Exercise 9f, p.468
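The density formula can be checked numerically with `math.gamma`. As ν increases, the peak height f(0) approaches the standard normal value $1/\sqrt{2\pi} \approx 0.3989$, matching the remark that $t(\nu)$ tends to $N(0, 1)$:

```python
import math

def t_pdf(t, nu):
    """Density of the t-distribution with nu degrees of freedom."""
    coef = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return coef * (1 + t * t / nu) ** (-(nu + 1) / 2)

print(round(t_pdf(0, 2), 4))     # 0.3536: lower peak, heavier tails than N(0,1)
print(round(t_pdf(0, 200), 4))   # close to 1/sqrt(2*pi) ≈ 0.3989
```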
Confidence interval for µ
(normal population, unknown population variance, small sample size)
If $\bar{x}$ and $s^2$ are the mean and variance of a small sample ($n < 30$) from a normal population with unknown mean $\mu$ and unknown variance $\sigma^2$, then a 95% confidence interval for $\mu$ is given by
$$\left(\bar{x} - t\frac{\hat{\sigma}}{\sqrt{n}},\; \bar{x} + t\frac{\hat{\sigma}}{\sqrt{n}}\right), \quad\text{where } \hat{\sigma}^2 = \frac{n}{n-1}\,s^2,$$
and t is the value from a $t(n-1)$ distribution such that $P(-t < T < t) = 0.95$, i.e. $(-t, t)$ encloses 95% of the $t(n-1)$ distribution.
To find the required value of t, known as the critical value, the t-distribution tables are used.
For a 95% confidence interval, look under column 0.975.
For a 90% confidence interval, look under column 0.95.
For a 99% confidence interval, look under column 0.995.
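A sketch of the small-sample interval in Python. The standard library has no t-quantile function, so the critical values below are copied from standard t-tables (column 0.975) for a few degrees of freedom; this lookup table, and the sample data, are the assumptions here:

```python
import statistics

# t-table critical values with P(T < t) = 0.975, keyed by degrees of freedom.
T_CRIT_975 = {5: 2.571, 9: 2.262, 14: 2.145, 19: 2.093, 29: 2.045}

def t_interval_95(sample):
    """95% CI for mu from a small sample, population variance unknown."""
    n = len(sample)
    xbar = statistics.mean(sample)
    sigma_hat = statistics.stdev(sample)   # n-1 divisor, i.e. sigma-hat
    t = T_CRIT_975[n - 1]
    half = t * sigma_hat / n ** 0.5
    return xbar - half, xbar + half

sample = [12.1, 11.8, 12.4, 12.0, 11.6, 12.3, 12.2, 11.9, 12.5, 11.7]
a, b = t_interval_95(sample)
print(f"({a:.3f}, {b:.3f})")
```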
Confidence intervals for µ
Read Examples 9.30, pp. 474-475
Read Examples 9.32 & 9.33, pp. 476-478
Do Exercise 9h, Q1-Q4, Q7, Q10-Q12, Q15, Q17-Q21, pp.478-481
Hypothesis testing- example
A machine fills ice-packs with liquid. The volume of liquid follows a normal
distribution with mean 524 ml and standard deviation 3 ml. The machine breaks
down and is repaired. It is suspected the machine now overfills each pack, so a
sample of 50 packs is inspected. The sample mean is found to be 524.9 ml. Is
the machine over-dispensing?
Is the sample mean high enough to say that the mean volume of all packs has
increased? A hypothesis (or significance) test enables a decision to be made
on this question.
Let X be the volume of liquid now dispensed. Let the unknown mean of X be $\mu$. Assume the standard deviation has remained unchanged, i.e. $X \sim N(\mu, \sigma^2)$, where $\sigma = 3$.
The hypothesis is made that $\mu$ is 524 ml, i.e. the mean is unchanged. This is known as the null hypothesis, $H_0$, and is written
$$H_0: \mu = 524$$
Since it is suspected the mean has increased, the alternative hypothesis, $H_1$, is that the mean is greater than 524 ml, written
$$H_1: \mu > 524$$
The test is now carried out.
Hypothesis testing- example
To carry out the test, the focus shifts from X, the volume of liquid in a pack, to $\bar{X}$, the mean volume of a sample of 50 packs. In this test, $\bar{X}$ is known as the test statistic and its distribution is needed.
We know, for a sample of size n, $\bar{X} \sim N\!\left(\mu, \dfrac{\sigma^2}{n}\right)$. The test starts by assuming the null hypothesis is true, so $\mu = 524$. Hence $\bar{X} \sim N\!\left(524, \dfrac{3^2}{50}\right)$.
The result of the test depends on the location in the sampling distribution of the
test value 524.9 ml.
If it is close to 524 then it is likely to have come from a distribution
with mean 524 ml and there would not be enough evidence to say
the mean volume has increased.
If it is far away from 524 (i.e. in the upper tail of the distribution) then it is unlikely to have come from a distribution with mean 524 ml and the mean is likely to be higher than 524 ml.
A decision needs to be made about the cut off point, c, known
as the critical value, which indicates the boundary of the region
where values of x would be considered too far away from 524 ml
and therefore unlikely to occur . The region is known as the
critical region or rejection region.
Hypothesis testing- example
Here, if x lies in the critical region, we reject the null hypothesis, H 0 (that the mean
is 524 ml), in favour of the alternative hypothesis, H 1 (that the mean is greater than 524 ml).
If x does not lie in the critical region, there is not enough evidence to reject H 0, so
H 0 is accepted. (In this case, x  c is known as the acceptance region.)
For a significance level of  %, if the sample mean lies in the critical (or rejection)
region, the result is said to be significant at the  % level.
Since the distribution of $\bar{X}$ is normal, instead of finding c, the critical $\bar{x}$ value, it is possible to use the standardized $N(0, 1)$ distribution and find the z-value that gives 5% in the upper tail.
We require $P(Z > z) = 0.05$, and the standard normal table yields $z = 1.645$.
We can now state the rejection criterion: reject $H_0$ if $z > 1.645$, where
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{\bar{x} - 524}{3/\sqrt{50}}.$$
Note the rejection criterion should be established before the sample is taken.
For the sample taken $\bar{x} = 524.9$, so
$$z = \frac{524.9 - 524}{3/\sqrt{50}} = 2.12.$$
Since $z > 1.645$, $H_0$ is rejected in favour of $H_1$.
Conclusion: There is evidence, at the 5% level, that the mean volume of liquid being
dispensed by the machine has increased.
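The ice-pack calculation can be checked directly. A sketch in Python (3.8+, for `NormalDist`), using the figures from the example:

```python
from statistics import NormalDist

mu0, sigma, n = 524, 3, 50   # H0 mean, known sd, sample size
xbar = 524.9                 # observed sample mean
alpha = 0.05                 # one-tailed test at the 5% level

z = (xbar - mu0) / (sigma / n ** 0.5)
z_crit = NormalDist().inv_cdf(1 - alpha)   # 1.645

print(f"z = {z:.2f}, critical value = {z_crit:.3f}")
if z > z_crit:
    print("Reject H0: evidence the mean volume has increased.")
```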
One-tailed and two-tailed tests
Say that the null hypothesis is $\mu = \mu_0$.
In a one-tailed test, the alternative hypothesis $H_1$ looks for an increase or a decrease in $\mu$:
For an increase, $H_1$ is $\mu > \mu_0$ and the critical region is in the upper tail.
For a decrease, $H_1$ is $\mu < \mu_0$ and the critical region is in the lower tail.
In a two-tailed test, the alternative hypothesis $H_1$ looks for a change without specifying an increase or decrease, and $H_1$ is $\mu \neq \mu_0$. The critical region is in two parts: one in each tail.
Critical z-values and rejection criteria
Example: at the 1% level, a one-tailed test uses the critical value $z = 2.326$ (upper tail) or $z = -2.326$ (lower tail); a two-tailed test uses $z = \pm 2.576$.
Stages in the hypothesis test
Testing the mean, µ, of a population
(normal population, known variance, any sample size)
When testing the mean of a normal population X with known variance $\sigma^2$ for samples of size n, the test statistic is $\bar{X}$, where $\bar{X} \sim N\!\left(\mu_0, \dfrac{\sigma^2}{n}\right)$.
In standardized form, the test statistic is
$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}, \quad\text{where } Z \sim N(0, 1).$$
Read Examples 11.1 & 11.2, pp.514-517
Testing the mean, µ, of a population
(non-normal population, known variance, large sample size)
When testing the mean of a non-normal population X with known variance $\sigma^2$, provided that the sample size n is large, the test statistic is $\bar{X}$, where $\bar{X}$ is approximately normal, $\bar{X} \sim N\!\left(\mu_0, \dfrac{\sigma^2}{n}\right)$.
In standardized form, the test statistic is
$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}, \quad\text{where } Z \sim N(0, 1).$$
Read Examples 11.3, pp.517-518
Read Examples 11.4, pp.519-520
Testing the mean, µ, of a population
(any population, unknown variance, large sample size)
When testing the mean of a non-normal population X with unknown variance $\sigma^2$, provided that the sample size n is large, the test statistic is $\bar{X}$, where $\bar{X} \sim N\!\left(\mu_0, \dfrac{\hat{\sigma}^2}{n}\right)$ and
$$\hat{\sigma}^2 = \frac{n}{n-1}\,s^2 \;(s^2 \text{ is the sample variance}) \quad\text{or}\quad \hat{\sigma}^2 = \frac{1}{n-1}\left(\sum x^2 - \frac{(\sum x)^2}{n}\right).$$
In standardized form, the test statistic is
$$Z = \frac{\bar{X} - \mu_0}{\hat{\sigma}/\sqrt{n}}, \quad\text{where } Z \sim N(0, 1).$$
Do Exercise 11A, Q1-Q6, Q8-Q10, Q13, pp.522-523
Testing the mean, µ, of a population
(normal population, unknown variance, small sample size)
When testing the mean of a normal population X with unknown variance $\sigma^2$, when the sample size n is small, the test statistic is T, where
$$T = \frac{\bar{X} - \mu_0}{\hat{\sigma}/\sqrt{n}}, \quad T \sim t(n-1),$$
with
$$\hat{\sigma}^2 = \frac{n}{n-1}\,s^2 \;(s^2 \text{ is the sample variance}) \quad\text{or}\quad \hat{\sigma}^2 = \frac{1}{n-1}\left(\sum x^2 - \frac{(\sum x)^2}{n}\right).$$
Read Examples 11.6 &11.7, pp.524-526
Do Exercise 11b, pp.527-528
Testing the difference in means, $\mu_1 - \mu_2$, of two normal populations
Consider two normal populations $X_1$ and $X_2$ with unknown means, $\mu_1$ and $\mu_2$. So $X_1 \sim N(\mu_1, \sigma_1^2)$ and $X_2 \sim N(\mu_2, \sigma_2^2)$, and we want to test the difference between the means of these populations.
The hypotheses might be:
$$H_0: \mu_1 - \mu_2 = \delta$$
$$H_1: \mu_1 - \mu_2 \neq \delta \quad (\text{or } \mu_1 - \mu_2 > \delta \text{ or } \mu_1 - \mu_2 < \delta)$$
Often the test involves the null hypothesis that the means are the same, i.e. $H_0: \mu_1 = \mu_2$ or $H_0: \mu_1 - \mu_2 = 0$.
Take a random sample of size $n_1$ from $X_1$ and find its sample mean $\bar{x}_1$.
Take a random sample of size $n_2$ from $X_2$ and find its sample mean $\bar{x}_2$.
The test statistic is $\bar{X}_1 - \bar{X}_2$ and to consider this sampling distribution we use
$$E(\bar{X}_1 - \bar{X}_2) = E(\bar{X}_1) - E(\bar{X}_2) = \mu_1 - \mu_2$$
$$\operatorname{Var}(\bar{X}_1 - \bar{X}_2) = \operatorname{Var}(\bar{X}_1) + \operatorname{Var}(\bar{X}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$
Testing $\mu_1 - \mu_2$ of two normal populations (variances $\sigma_1^2$ and $\sigma_2^2$ known)
If the variances $\sigma_1^2$ and $\sigma_2^2$ are known, the test statistic is $\bar{X}_1 - \bar{X}_2$, where
$$\bar{X}_1 - \bar{X}_2 \sim N\!\left(\mu_1 - \mu_2,\; \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right).$$
In standardized form, the test statistic is
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}, \quad\text{where } Z \sim N(0, 1).$$
The 95% confidence limits for $\mu_1 - \mu_2$ are
$$(\bar{x}_1 - \bar{x}_2) \pm 1.96\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}.$$
Testing $\mu_1 - \mu_2$ of two normal populations (common known variance $\sigma^2$)
If there is a common population variance $\sigma^2$ ($\sigma^2 = \sigma_1^2 = \sigma_2^2$) and $\sigma^2$ is known, then the test statistic is $\bar{X}_1 - \bar{X}_2$, where
$$\bar{X}_1 - \bar{X}_2 \sim N\!\left(\mu_1 - \mu_2,\; \sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)\right).$$
In standardized form, the test statistic is
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}, \quad\text{where } Z \sim N(0, 1).$$
The 95% confidence limits for $\mu_1 - \mu_2$ are
$$(\bar{x}_1 - \bar{x}_2) \pm 1.96\,\sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}.$$
Pooled two-sample estimate of $\sigma^2$
If the common population variance, $\sigma^2$, is unknown, then an unbiased estimate, $\hat{\sigma}^2$, is used instead. This is known as a pooled two-sample estimate, where
$$\hat{\sigma}^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2} \quad (s_1^2 \text{ and } s_2^2 \text{ are the sample variances}).$$
An alternative format for $\hat{\sigma}^2$ is
$$\hat{\sigma}^2 = \frac{\sum (x_1 - \bar{x}_1)^2 + \sum (x_2 - \bar{x}_2)^2}{n_1 + n_2 - 2}
= \frac{\left(\sum x_1^2 - \frac{(\sum x_1)^2}{n_1}\right) + \left(\sum x_2^2 - \frac{(\sum x_2)^2}{n_2}\right)}{n_1 + n_2 - 2}.$$
The distribution of $\bar{X}_1 - \bar{X}_2$ depends on whether the samples taken are large or small.
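The pooled estimate itself is a one-line computation. A sketch (the sample summaries are hypothetical):

```python
def pooled_variance(n1, s1_sq, n2, s2_sq):
    """Pooled two-sample estimate of the common variance.

    s1_sq and s2_sq are the *sample* variances (divisor n), as in
    sigma-hat^2 = (n1*s1^2 + n2*s2^2) / (n1 + n2 - 2).
    """
    return (n1 * s1_sq + n2 * s2_sq) / (n1 + n2 - 2)

# Hypothetical summaries: two samples with similar spread.
print(pooled_variance(n1=12, s1_sq=4.1, n2=15, s2_sq=3.6))   # 4.128
```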
Testing $\mu_1 - \mu_2$ of two normal populations (common unknown variance $\sigma^2$, large samples)
For large samples the distribution of $\bar{X}_1 - \bar{X}_2$ is approximately normal. The test statistic is $\bar{X}_1 - \bar{X}_2$, where
$$\bar{X}_1 - \bar{X}_2 \sim N\!\left(\mu_1 - \mu_2,\; \hat{\sigma}^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)\right).$$
In standardized form, the test statistic is
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\hat{\sigma}\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}, \quad\text{where } Z \sim N(0, 1).$$
The 95% confidence limits for $\mu_1 - \mu_2$ are
$$(\bar{x}_1 - \bar{x}_2) \pm 1.96\,\hat{\sigma}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}.$$
Testing $\mu_1 - \mu_2$ of two normal populations (common unknown variance $\sigma^2$, small samples)
For small samples the standardized form of the distribution of $\bar{X}_1 - \bar{X}_2$ follows a t-distribution. The test statistic is
$$T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\hat{\sigma}\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}, \quad\text{where } T \sim t(n_1 + n_2 - 2).$$
The 95% confidence limits for $\mu_1 - \mu_2$ are
$$(\bar{x}_1 - \bar{x}_2) \pm t\,\hat{\sigma}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}},$$
where t is such that $P(T < t) = 0.975$ for $t(n_1 + n_2 - 2)$.
Read Examples 11.11 - 11.15, pp.536-542
Do Exercise 11d, Sections A & B, pp.543-546
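Putting the pieces together, the small-sample statistic can be sketched as below. The data are hypothetical, and the quoted critical value is copied from standard t-tables, so both are assumptions:

```python
import statistics

def two_sample_t(x1, x2):
    """T statistic for H0: mu1 = mu2, common unknown variance, small samples."""
    n1, n2 = len(x1), len(x2)
    # Pooled estimate from the within-sample sums of squares.
    ss1 = sum((v - statistics.mean(x1)) ** 2 for v in x1)
    ss2 = sum((v - statistics.mean(x2)) ** 2 for v in x2)
    sigma_hat = ((ss1 + ss2) / (n1 + n2 - 2)) ** 0.5
    t = (statistics.mean(x1) - statistics.mean(x2)) / (
        sigma_hat * (1 / n1 + 1 / n2) ** 0.5)
    return t, n1 + n2 - 2

x1 = [27.1, 24.8, 26.2, 25.9, 27.4, 25.3]   # hypothetical sample 1
x2 = [23.9, 25.1, 24.2, 23.5, 24.8, 24.0]   # hypothetical sample 2
t, df = two_sample_t(x1, x2)
print(f"t = {t:.2f} on {df} degrees of freedom")
# Two-tailed 5% critical value for t(10) is 2.228 (from tables).
```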
Paired samples
It is widely thought that people’s reaction times are shorter in the morning and increase as
the day goes on. A light is programmed to flash at random intervals and the experimental
subject has to press a buzzer as soon as possible and the delay is recorded.
Experiment 1: Two random samples of 40 students are selected from the school
register. One of these samples, chosen at random, uses the apparatus during the
first period of the day, while the second sample uses the apparatus during the last
period of the day. The means of the two samples are compared.
Experiment 2: A random sample of 40 students is selected from the school
register. Each student is tested in the first period of the day, and again in the last
period. The difference in reaction times between the two periods for each student is
calculated. The mean difference is compared with zero.
Experiment 1 requires a standard two-sample comparison of means, assuming a common variance. There is nothing wrong with this procedure, but we could be misled. Firstly, suppose that all the bookworms were in the first sample and all the athletes in the second: we might conclude that reaction times decrease over the course of the day! More subtly, the variations between the students may be much greater than any changes in individual students over the time of day; these changes may pass unnoticed.
In Experiment 2 the variability between the students plays no part. All that matters is
the variability of the changes within each student’s readings. The problems with
Experiment 1 have vanished! Experiment 2 is a paired-sample test.
The paired-sample comparison of
means
Let d denote the mean of the distribution of differences between the paired values.
Let H 0 : d  0
H1 : d  0 (or d  0 or d  0 as appropriate)
We have a single set of n pairs of values and are interested in the differences d1 , d 2 ,
which assuming H 0 , are a random sample from a population with mean 0.
An unbiased estimate of the unknown variance of the population is given by
2


d


1

i
2
2
  di 

ˆ d 

n 1 
n


We have created a single sample situation, so previous methods now apply!
For example, if the differences can be assumed to have a normal distribution,
or if n is sufficiently large that a normal approximation can be used, then a 95%
confidence interval for d is provided by:

ˆ d2
ˆ d2 
 di
, d  1.96
 d  1.96
 , where d 

n
n 
n

Alternatively, if the differences can be presumed to have a normal distribution,
but n is small, then a t  n  1 distribution can be used.
, dn,
Paired sample- an example
Suppose that Experiment 2 on the reaction times is carried out, with the following results (in units of 0.001 seconds; the data table is not reproduced here).
We analyse these data at the 1% significance level.
The summary statistics are
$$\sum d_i = 266, \qquad \sum d_i^2 = 9574,$$
so that $\bar{d} = 6.650$ and $\hat{\sigma}_d^2 = 200.1308$.
The hypotheses being compared are: $H_0: \mu_d = 0$ against $H_1: \mu_d > 0$.
The test statistic is z, given by:
$$z = \frac{\bar{d}}{\sqrt{\hat{\sigma}_d^2 / n}} = \frac{6.650}{2.237} = 2.97$$
Since $2.97 > 2.326$, $H_0$ is rejected. There is evidence, at the 1% level, that reaction times increase through the day.
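The arithmetic in this example can be verified from the summary statistics alone (a sketch; only Σdᵢ, Σdᵢ² and n = 40 are needed):

```python
n = 40
sum_d, sum_d2 = 266, 9574   # summary statistics from the sample

d_bar = sum_d / n                                 # 6.650
var_hat = (sum_d2 - sum_d ** 2 / n) / (n - 1)     # 200.1308
z = d_bar / (var_hat / n) ** 0.5                  # 2.97

print(round(d_bar, 3), round(var_hat, 4), round(z, 2))
```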
Distinguishing between the
paired-sample and two-sample cases
• If the two samples are of unequal size then they
are not paired.
• Two samples of equal size are paired only if we
can be certain that each observation from the
second sample is associated with a
corresponding observation from the first sample.
A significance test for the
product-moment correlation coefficient
$H_0: \rho = 0$ (no correlation); $H_1: \rho > 0$ (positive correlation)
$H_0: \rho = 0$ (no correlation); $H_1: \rho < 0$ (negative correlation)
$H_0: \rho = 0$ (no correlation); $H_1: \rho \neq 0$ (some correlation)
(The accompanying diagrams, for samples of size n = 6, n = 8 and n = 10, are not reproduced here.)
Read Examples 13.1, pp.603-604
Do Exercise 13a, p.604