Chapter 12
Inference About One Population

Introduction

We shall develop techniques to estimate and test three population parameters:
• Population mean μ
• Population variance σ²
• Population proportion p
Inference About a Population Mean When the Population Standard Deviation Is Unknown
Recall that when σ is known, we use the following statistic to estimate and test a population mean:
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
When σ is unknown, we use its point estimator s, and the z-statistic is replaced by the t-statistic.
The t-Statistic
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
The t distribution is mound-shaped and symmetrical around zero.
[Figure: two t distributions centered at 0, one with d.f. = ν1 and one with d.f. = ν2, where ν1 < ν2]
The "degrees of freedom" (a function of the sample size) determine how spread out the distribution is compared to the normal distribution.
How to calculate the sample variance
From the data we have $\sum x_i$ and $\sum x_i^2$, thus
$$s^2 = \frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}}{n - 1}$$
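As a minimal sketch, the shortcut formula can be checked against Python's standard-library `statistics.variance`. The function name and the five observations are hypothetical, chosen only for illustration; they do not come from the slides.

```python
import statistics


def sample_variance_from_sums(sum_x, sum_x_sq, n):
    """Shortcut formula: s^2 = (sum(x_i^2) - (sum(x_i))^2 / n) / (n - 1)."""
    return (sum_x_sq - sum_x ** 2 / n) / (n - 1)


# Hypothetical observations (an assumption, not data from the slides).
data = [2.0, 4.0, 4.0, 5.0, 7.0]

s2_shortcut = sample_variance_from_sums(
    sum(data), sum(x * x for x in data), len(data)
)
s2_library = statistics.variance(data)  # definitional formula, divisor n - 1

print(s2_shortcut, s2_library)
```

Both routes divide by n - 1, so they should agree up to floating-point rounding.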
Testing μ when σ is unknown

Example 1
• In order to determine the number of workers required to meet demand, the productivity of newly hired trainees is studied.
• It is believed that trainees can process and distribute more than 450 packages per hour within one week of hiring.
• Can we conclude that this belief is correct, based on productivity observations of 50 trainees?
Testing μ when σ is unknown

Example 1 – Solution
• The problem objective is to describe the population of the number of packages processed in one hour.
• The data are interval.
H0: μ = 450
H1: μ > 450
• The t statistic:
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}, \qquad \text{d.f.} = n - 1 = 49$$
Testing μ when σ is unknown

Solution continued (solving by hand)
• The rejection region is
$$t > t_{\alpha,\, n-1} = t_{.05,\, 49} \approx t_{.05,\, 50} = 1.676$$
• From the data we have $\sum x_i = 23{,}019$ and $\sum x_i^2 = 10{,}671{,}357$, thus
$$\bar{x} = \frac{23{,}019}{50} = 460.38$$
$$s^2 = \frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}}{n - 1} = 1507.55$$
$$s = \sqrt{1507.55} = 38.83$$
Testing μ when σ is unknown

The test statistic is
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{460.38 - 450}{38.83 / \sqrt{50}} = 1.89$$
[Figure: t distribution with the rejection region to the right of 1.676; the test statistic 1.89 falls inside it]

Since 1.89 > 1.676, we reject the null hypothesis in favor of the alternative.

There is sufficient evidence, at the .05 significance level, to infer that the mean productivity of trainees one week after being hired is greater than 450 packages per hour.
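The hand calculation for Example 1 can be reproduced from the summary sums alone. This is a sketch using only the figures quoted on the slides (n = 50, Σx = 23,019, Σx² = 10,671,357, critical value 1.676); the variable names are illustrative assumptions.

```python
import math

# Figures quoted on the slides for Example 1.
n = 50
sum_x = 23_019           # sum of the x_i
sum_x_sq = 10_671_357    # sum of the x_i squared
mu_0 = 450               # value of the mean under H0
t_critical = 1.676       # table value t_{.05,50}, used in place of t_{.05,49}

x_bar = sum_x / n
s2 = (sum_x_sq - sum_x ** 2 / n) / (n - 1)   # shortcut variance formula
s = math.sqrt(s2)
t_stat = (x_bar - mu_0) / (s / math.sqrt(n))

# Right-tailed test: reject H0 when t exceeds the critical value.
reject_h0 = t_stat > t_critical

print(f"x_bar={x_bar:.2f}  s2={s2:.2f}  s={s:.2f}  t={t_stat:.2f}  reject={reject_h0}")
```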
Estimating μ when σ is unknown

Confidence interval estimator of μ when σ is unknown:
$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}, \qquad \text{d.f.} = n - 1$$
Estimating μ when σ is unknown

Example 2
• An investor is trying to estimate the return on investment in companies that won quality awards last year.
• A random sample of 83 such companies is selected, and the return the investor would have earned by investing in each is calculated.
• Construct a 95% confidence interval for the mean return.
Estimating μ when σ is unknown

Solution (solving by hand)
• The problem objective is to describe the population of annual returns from buying shares of quality award-winners.
• The data are interval.
• From the data we determine
$$\bar{x} = 15.02, \qquad s^2 = 68.98, \qquad s = \sqrt{68.98} = 8.31$$
• With $t_{.025,\,82} \approx t_{.025,\,80} = 1.990$, the interval is
$$\bar{x} \pm t_{\alpha/2,\, n-1}\,\frac{s}{\sqrt{n}} \approx 15.02 \pm 1.990\,\frac{8.31}{\sqrt{83}} = 15.02 \pm 1.82 = [13.20,\ 16.84]$$
Checking the required conditions
• We need to check that the population is normally distributed, or at least not extremely nonnormal.
• There are statistical methods to test for normality (one to be introduced later in the book).
• From the sample histograms we see…

[Figure: A Histogram for Example 1; horizontal axis: Packages (bins from 400 to 575, then "More")]
[Figure: A Histogram for Example 2; horizontal axis: Returns (bins from -4 to 30, then "More")]
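Summary of Test Statistics to be Used in a Hypothesis Test about a Population Mean
• n > 30 and σ known: $z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
• n > 30 and σ unknown: use s to estimate σ; $t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$
• n ≤ 30, population approximately normal, σ known: $z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
• n ≤ 30, population approximately normal, σ unknown: use s to estimate σ; $t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$
• n ≤ 30, population not approximately normal: increase n to > 30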
Inference About a Population Variance
• Sometimes we are interested in making inferences about the variability of processes.
• Examples:
– The consistency of a production process for quality control purposes.
– Investors use variance as a measure of risk.
• To draw inferences about variability, the parameter of interest is σ².
Inference About a Population Variance

• The sample variance s² is an unbiased, consistent, and efficient point estimator for σ².
• If the population is normally distributed, the statistic
$$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}, \qquad \text{d.f.} = n - 1$$
has a distribution called chi-squared.
[Figure: chi-squared density curves for d.f. = 5 and d.f. = 10]
Testing the Population Variance

Example 3 (operations management application)
• A container-filling machine is believed to fill 1-liter containers so consistently that the variance of the fills will be less than 1 cc (.001 liter).
• To test this belief, a random sample of 25 1-liter fills was taken and the results recorded.
• Do these data support the belief that the variance is less than 1 cc at the 5% significance level?
Testing the Population Variance

Solution
• The problem objective is to describe the population of 1-liter fills from a filling machine.
• The data are interval, and we are interested in the variability of the fills.
• The complete test is:
H0: σ² = 1
H1: σ² < 1
The test statistic is
$$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$$
The rejection region is
$$\chi^2 < \chi^2_{1-\alpha,\, n-1}$$
Testing the Population Variance
• Solving by hand
– Note that $(n - 1)s^2 = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \left(\sum x_i\right)^2 / n$
– From the sample, we can calculate $\sum x_i = 24{,}996.4$ and $\sum x_i^2 = 24{,}992{,}821.3$
– Then $(n - 1)s^2 = 24{,}992{,}821.3 - (24{,}996.4)^2 / 25 = 20.78$
$$\chi^2 = \frac{(n - 1)s^2}{\sigma^2} = \frac{20.78}{1^2} = 20.78, \qquad \chi^2_{1-\alpha,\, n-1} = \chi^2_{.95,\, 25-1} = 13.8484$$
Since 20.78 is not less than 13.8484, do not reject the null hypothesis.
There is insufficient evidence to infer that the variance is less than 1.
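The chi-squared test for Example 3 can be sketched the same way. It uses the sums and the critical value 13.8484 exactly as quoted on the slides; the variable names are illustrative assumptions.

```python
# Figures quoted on the slides for Example 3.
n = 25
sum_x = 24_996.4
sum_x_sq = 24_992_821.3
sigma2_0 = 1.0              # variance under H0
chi2_critical = 13.8484     # table value chi^2_{.95,24}

# (n - 1) s^2 via the shortcut formula: sum(x^2) - (sum(x))^2 / n
sum_of_squares = sum_x_sq - sum_x ** 2 / n
chi2_stat = sum_of_squares / sigma2_0

# Left-tailed test: reject H0 only if the statistic falls below the critical value.
reject_h0 = chi2_stat < chi2_critical

print(f"(n-1)s^2={sum_of_squares:.2f}  chi2={chi2_stat:.2f}  reject={reject_h0}")
```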
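Testing the Population Variance
[Figure: chi-squared distribution with α = .05 in the left tail and 1 − α = .95 to its right; the rejection region is χ² < χ²_{.95, 25−1} = 13.8484, and the test statistic 20.8 lies outside it]
Do not reject the null hypothesis.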
Testing and Estimating a Population Variance

From the following probability statement
$$P\left(\chi^2_{1-\alpha/2} < \chi^2 < \chi^2_{\alpha/2}\right) = 1 - \alpha$$
we have (by substituting $\chi^2 = (n - 1)s^2 / \sigma^2$)
$$\frac{(n - 1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_{1-\alpha/2}}$$
Example 4
Solution
Example 5


During annual checkups, physicians routinely send their patients to medical laboratories to have various tests performed. One such test determines the cholesterol level in patients' blood. However, not all tests are conducted in the same way. To acquire more information, a man was sent to 10 laboratories and had his cholesterol level measured in each. The results are listed here. Estimate with 95% confidence the variance of these measurements.
4.70 4.83 4.65 4.60 4.75 4.88 4.68 4.75 4.80 4.90
Solution
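The worked solution is not reproduced in this transcript. As a hedged sketch, the interval estimator for σ² can be applied to the ten measurements as follows. The critical values χ²_{.025,9} = 19.0228 and χ²_{.975,9} = 2.70039 are standard chi-squared table values supplied here as assumptions, not figures from the slides, and the variable names are illustrative.

```python
import statistics

# The ten cholesterol measurements listed in Example 5.
measurements = [4.70, 4.83, 4.65, 4.60, 4.75, 4.88, 4.68, 4.75, 4.80, 4.90]

n = len(measurements)
s2 = statistics.variance(measurements)   # sample variance, divisor n - 1

# Assumed table values for 9 degrees of freedom and 95% confidence.
chi2_upper = 19.0228    # chi^2_{.025,9}
chi2_lower = 2.70039    # chi^2_{.975,9}

lcl = (n - 1) * s2 / chi2_upper   # lower confidence limit for sigma^2
ucl = (n - 1) * s2 / chi2_lower   # upper confidence limit for sigma^2

print(f"s2={s2:.6f}  LCL={lcl:.4f}  UCL={ucl:.4f}")
```

The larger critical value goes in the denominator of the lower limit, which is why the interval is not symmetric around s².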
Inference About a Population Proportion
• When the population consists of nominal data, the only inference we can make is about the proportion of occurrences of a certain value.
• The parameter p was used before to calculate these probabilities under the binomial distribution.
Inference About a Population Proportion

Statistic and sampling distribution
– The statistic used when making inferences about p is
$$\hat{p} = \frac{x}{n}$$
where x = the number of successes and n = the sample size.
– Under certain conditions [np > 5 and n(1 − p) > 5], p̂ is approximately normally distributed, with μ = p and σ² = p(1 − p)/n.
Testing and Estimating the Proportion

Test statistic for p
$$Z = \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}}$$
where $np \ge 5$ and $n(1 - p) \ge 5$

Interval estimator for p (1 − α confidence level)
$$\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1 - \hat{p})/n}$$
provided $n\hat{p} \ge 5$ and $n(1 - \hat{p}) \ge 5$
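A minimal sketch of both formulas, using only the Python standard library. The function names are hypothetical and the inputs (60 successes in 100 trials, tested against p = .5) are made-up illustration values, not data from the slides. The "at least 5" check is one reading of the stated conditions.

```python
import math
from statistics import NormalDist


def proportion_z_statistic(x, n, p0):
    """Z = (p_hat - p0) / sqrt(p0 (1 - p0) / n), with the sample-size check."""
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("normal approximation conditions not met")
    p_hat = x / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)


def proportion_confidence_interval(x, n, confidence=0.95):
    """p_hat +/- z_{alpha/2} * sqrt(p_hat (1 - p_hat) / n)."""
    p_hat = x / n
    if n * p_hat < 5 or n * (1 - p_hat) < 5:
        raise ValueError("normal approximation conditions not met")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical inputs (assumptions for illustration only).
z_stat = proportion_z_statistic(x=60, n=100, p0=0.5)
lcl, ucl = proportion_confidence_interval(x=60, n=100)

print(f"z={z_stat:.2f}  interval=[{lcl:.3f}, {ucl:.3f}]")
```

Note that the test statistic uses the hypothesized p in its standard error, while the interval uses p̂, matching the two formulas above.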
Example 6
Solution
Selecting the Sample Size to Estimate the Proportion

Recall: The confidence interval for the proportion is
$$\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1 - \hat{p})/n}$$

Thus, to estimate the proportion to within W, we can write
$$W = z_{\alpha/2}\sqrt{\hat{p}(1 - \hat{p})/n}$$
Selecting the Sample Size to Estimate the Proportion

The required sample size is
$$n = \left(\frac{z_{\alpha/2}\sqrt{\hat{p}(1 - \hat{p})}}{W}\right)^2$$
Selecting the Sample Size
Two methods – in each case we choose a value for p̂, then solve the equation for n.
Method 1: we have no knowledge of even a rough value of p̂. This is a 'worst case scenario', so we substitute p̂ = .50.
Method 2: we have some idea about the value of p̂. This is a better scenario, and we substitute our estimated value.
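The two methods can be compared with a short sketch. The function name, the bound W = .03, and the rough estimate p̂ = .20 are hypothetical values chosen for illustration. The result is rounded up because a sample size must be a whole number.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(W, p_hat=0.5, confidence=0.95):
    """n = (z_{alpha/2} * sqrt(p_hat (1 - p_hat)) / W)^2, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    return math.ceil((z * math.sqrt(p_hat * (1 - p_hat)) / W) ** 2)


# Method 1: no prior idea of p_hat, so use the worst case p_hat = .50.
n_worst_case = sample_size_for_proportion(W=0.03)

# Method 2: a rough prior estimate is available (hypothetical p_hat = .20).
n_with_estimate = sample_size_for_proportion(W=0.03, p_hat=0.20)

print(n_worst_case, n_with_estimate)
```

Because p̂(1 − p̂) is largest at p̂ = .50, Method 1 never gives a smaller sample size than Method 2 for the same W and confidence level.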