Set 8: Inference for the Mean of a Population

Business Statistics for Managerial
Decision
Inference for the Population Mean
Test for a Population Mean

There are four steps in carrying out a
significance test:

1. State the hypotheses.
2. Calculate the test statistic.
3. Find the P-value.
4. State your conclusion by comparing the P-value
   to the significance level $\alpha$.
Example: Blood pressures of executives

The medical director of a large company is concerned
about the effects of stress on the company’s younger
executives. According to the National Center for Health
Statistics, the mean systolic blood pressure for males 35 to
44 years of age is 128 and the standard deviation in this
population is 15. The medical director examines the
records of 72 executives in this age group and finds that
their mean systolic blood pressure is $\bar{x} = 129.93$. Is this
evidence that the mean blood pressure for all the
company’s young male executives is higher than the
national average? Use $\alpha = 5\%$.
Example: Blood pressures of executives

Hypotheses:
$H_0: \mu = 128$
$H_a: \mu > 128$

Test statistic:
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{129.93 - 128}{15/\sqrt{72}} = 1.09$$

P-value:
$$P = P(Z \ge 1.09) = 1 - 0.8621 = 0.1379$$
Example: Blood pressures of executives

Conclusion:

About 14% of the time, an
SRS of size 72 from the
general male population
would have a mean blood
pressure at least as high as
that of the executive sample.
The observed $\bar{x} = 129.93$ is not
significantly higher than the
national average.
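
The four steps for this example can be checked numerically. The following is a minimal Python sketch using only the standard library; the variable names are illustrative and not part of the slides. Because z is not rounded to 1.09 before the tail area is looked up, the computed P-value differs slightly from the table-based 0.1379.

```python
from math import sqrt
from statistics import NormalDist

# Values from the executives example
mu0 = 128        # national mean systolic blood pressure (H0 value)
sigma = 15       # known population standard deviation
n = 72           # number of executives sampled
xbar = 129.93    # observed sample mean
alpha = 0.05     # significance level

# Step 2: one-sample z statistic
z = (xbar - mu0) / (sigma / sqrt(n))

# Step 3: upper-tail P-value, because Ha is mu > 128
p_value = 1 - NormalDist().cdf(z)

# Step 4: compare the P-value with alpha
reject_h0 = p_value < alpha

print(f"z = {z:.2f}, P-value = {p_value:.4f}, reject H0: {reject_h0}")
```

The P-value is well above 0.05, so the sketch agrees with the conclusion that the executives' mean is not significantly higher than the national average.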
Example: A company-wide health
promotion campaign

The company medical director institutes a health
promotion campaign to encourage employees to
exercise more and eat a healthier diet. One
measure of the effectiveness of such a program is
a drop in blood pressure. Choose a random sample
of 50 employees, and compare their blood
pressures from the annual physical examinations given
before the campaign and again a year later. The
mean change in blood pressure for these n = 50
employees is $\bar{x} = -6$. We take the population
standard deviation to be $\sigma = 20$. Use $\alpha = 5\%$.
Example: A company-wide health
promotion campaign

Hypotheses:
$H_0: \mu = 0$
$H_a: \mu < 0$

Test statistic:
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{-6 - 0}{20/\sqrt{50}} = -2.12$$

P-value:
$$P = P(Z \le -2.12) = 0.0170$$
Example: A company-wide health
promotion campaign

Conclusion:

A mean change in blood
pressure of –6 or better
would occur only 17 times
in 1000 samples if the
campaign had no effect on
the blood pressures of the
employees. This is
convincing evidence that
the mean blood pressure in
the population of all
employees has decreased.
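
The same calculation can be wrapped in a small helper that handles all three forms of the alternative hypothesis. This is a sketch under stated assumptions: the function name `z_test` and the labels "greater", "less", and "two-sided" are choices made here for illustration, not notation from the slides.

```python
from math import sqrt
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alternative):
    """One-sample z test with known sigma.

    alternative is "greater", "less", or "two-sided" (labels chosen
    for this sketch). Returns (z, p_value).
    """
    z = (xbar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if alternative == "greater":
        p = 1 - std_normal.cdf(z)
    elif alternative == "less":
        p = std_normal.cdf(z)
    elif alternative == "two-sided":
        p = 2 * std_normal.cdf(-abs(z))
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return z, p

# Health promotion campaign: mean change -6, sigma = 20, n = 50, Ha: mu < 0
z, p_value = z_test(xbar=-6, mu0=0, sigma=20, n=50, alternative="less")
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

The lower tail is used here because a decrease in blood pressure corresponds to a negative mean change.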
Example: Testing Pharmaceutical
products

The Deely Laboratory analyzes pharmaceutical products to
verify the concentration of active ingredients. Such
chemical analyses are not perfectly precise. Repeated
measurements on the same specimen will give slightly
different results. The results of repeated measurements
follow a Normal distribution quite closely. The analysis
procedure has no bias, so the mean $\mu$ of the population
of all measurements is the true concentration in the
specimen. The standard deviation of this distribution is a
property of the analytical procedure and is known to be
$\sigma = 0.0068$ gram per liter. The laboratory analyzes each
specimen three times and reports the mean result.
Example: Testing Pharmaceutical
products

A client sends a specimen for which the
concentration of active ingredient is
supposed to be 0.86%. Deely’s three
analyses give concentrations

0.8403   0.8363   0.8447

Is there significant evidence at the 1% level
that the true concentration is not 0.86%?
Example: Testing Pharmaceutical
products

Hypotheses:
$H_0: \mu = 0.86$
$H_a: \mu \ne 0.86$

Test statistic: The mean of the three
analyses is $\bar{x} = 0.8404$. The one-sample z test
statistic is therefore
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{0.8404 - 0.86}{0.0068/\sqrt{3}} = -4.99$$
Example: Testing Pharmaceutical
products

We do not need to find the exact P-value to assess
significance at the $\alpha = 0.01$ level. Look in
Table A under tail area 0.005, because the
alternative is two-sided. The z-values that are
significant at the 1% level are z > 2.575 and
z < -2.575.
Our observed z = -4.99 is significant.
P-value versus fixed $\alpha$

In our example, we concluded that the test
statistic z = -4.99 is significant at the 1% level.
The observed z is far beyond the critical value for
$\alpha = 0.01$, and the evidence against $H_0$ is far
stronger than 1% significance suggests.
The P-value P = 0.0000006 (from statistical
software) gives a better sense of how strong the
evidence is.
The P-value is the smallest level $\alpha$ at which the
data are significant.
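
Both approaches, the fixed-level comparison with a critical value and the P-value, can be computed side by side for the Deely example. This is a minimal sketch with the Python standard library; the variable names are illustrative. The slides round the sample mean to 0.8404 before computing z, so the unrounded z printed here is very slightly different from -4.99.

```python
from math import sqrt
from statistics import NormalDist, mean

measurements = [0.8403, 0.8363, 0.8447]  # Deely's three analyses
mu0 = 0.86        # stated concentration (H0 value)
sigma = 0.0068    # known standard deviation of the procedure
alpha = 0.01

xbar = mean(measurements)
z = (xbar - mu0) / (sigma / sqrt(len(measurements)))

std_normal = NormalDist()

# Fixed-alpha approach: two-sided critical value with alpha/2 in each tail
z_star = std_normal.inv_cdf(1 - alpha / 2)
significant = abs(z) > z_star

# P-value approach: total area in both tails beyond |z|
p_value = 2 * std_normal.cdf(-abs(z))

print(f"xbar = {xbar:.4f}, z = {z:.2f}")
print(f"critical value z* = {z_star:.3f}, significant: {significant}")
print(f"two-sided P-value = {p_value:.1e}")
```

The fixed-level check only reports that the result is significant at 1%, while the P-value shows how far beyond that threshold the data fall.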
Inference for the Mean of a Population





Both confidence intervals and tests of significance
for the mean $\mu$ of a Normal population are based
on the sample mean $\bar{x}$, which estimates the
unknown $\mu$.
The sampling distribution of $\bar{x}$ depends on
$\sigma$.
There is no difficulty when $\sigma$ is known.
When $\sigma$ is unknown, we must estimate it.
The sample standard deviation s is used to
estimate the population standard deviation $\sigma$.
The t-distribution


Suppose we have a simple random sample of size
n from a Normally distributed population with
mean $\mu$ and standard deviation $\sigma$.
The standardized sample mean, or one-sample z
statistic,
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
has the standard Normal distribution N(0, 1).
When we substitute the standard error $s/\sqrt{n}$
(the estimated standard deviation of the mean) for
$\sigma/\sqrt{n}$, the statistic does not have a Normal
distribution. It has a distribution called the
t distribution.
The t-distribution

Suppose that an SRS of size n is drawn from an $N(\mu, \sigma)$
population. Then the one-sample t statistic
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
has the t distribution with n - 1 degrees of freedom.
There is a different t distribution for each sample size.
A particular t distribution is specified by giving the
degrees of freedom.
The t-distribution




We use t(k) to stand for the t
distribution with k degrees of
freedom.
The density curves of the t
distributions are symmetric
about 0 and are bell shaped.
The spread of the t distributions is a
bit greater than that of the standard
Normal distribution.
As the degrees of freedom k
increase, the t(k) density curve
approaches the N(0, 1) curve.
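
The greater spread of the t distributions, and their approach to N(0, 1), can be seen by comparing the area to the right of 2 for several degrees of freedom. The sketch below is an illustration, not the slides' method: it integrates the t density numerically with Simpson's rule so that only the standard library is needed, and the helper names `t_pdf` and `t_upper_tail` are assumptions introduced here. A statistics library would normally supply these tail areas.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0: Simpson's rule on [0, t], then symmetry about 0."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

normal_tail = 1 - NormalDist().cdf(2.0)
tails = {k: t_upper_tail(2.0, k) for k in (2, 5, 10, 30, 100)}

for k, tail in tails.items():
    print(f"t({k}): P(T >= 2) = {tail:.4f}")
print(f"N(0, 1): P(Z >= 2) = {normal_tail:.4f}")
```

Every t(k) tail area is larger than the Normal one, and the gap shrinks as k grows.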
The One-Sample t Confidence Interval

Suppose that an SRS of size n is drawn from a
population having unknown mean $\mu$. A level C
confidence interval for $\mu$ is
$$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$$
where $t^*$ is the value for the t(n - 1) density curve with
area C between $-t^*$ and $t^*$. The margin of error is
$$t^* \frac{s}{\sqrt{n}}$$
This interval is exact when the population distribution
is Normal and is approximately correct for large n in
other cases.
Example: Estimating the level of Vitamin C

The following data are the amounts of vitamin C,
measured in milligrams per 100 grams (mg/100 g)
of the corn soy blend (dry basis), for a random
sample of size 8 from a production run:
26 31 23 22 11 22 14 31
We want to find a 95% confidence interval for $\mu$,
the mean vitamin C content of the corn soy blend
(CSB) produced during this run.
Example: Estimating the level of Vitamin C

The sample mean is $\bar{x} = 22.50$ and the standard
deviation is s = 7.19, with degrees of freedom
n - 1 = 8 - 1 = 7. The standard error of $\bar{x}$ is
$$SE_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{7.19}{\sqrt{8}} = 2.54$$
From the table we find $t^* = 2.365$. The 95%
confidence interval is
$$\bar{x} \pm t^* \frac{s}{\sqrt{n}} = 22.5 \pm 2.365 \frac{7.19}{\sqrt{8}} = 22.5 \pm (2.365)(2.54) = 22.5 \pm 6 = (16.5, 28.5)$$
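
The interval can be reproduced directly from the eight observations. This is a short sketch with the Python standard library; the critical value 2.365 is taken from the table as in the example rather than computed, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

vitamin_c = [26, 31, 23, 22, 11, 22, 14, 31]  # mg/100 g, from the example
t_star = 2.365  # t(7) critical value for 95% confidence, taken from the table

n = len(vitamin_c)
xbar = mean(vitamin_c)
s = stdev(vitamin_c)          # sample standard deviation (divides by n - 1)
se = s / sqrt(n)              # standard error of the mean
margin = t_star * se

lower, upper = xbar - margin, xbar + margin
print(f"xbar = {xbar:.2f}, s = {s:.2f}, SE = {se:.2f}")
print(f"95% CI: ({lower:.1f}, {upper:.1f})")
```

Note that `stdev` uses the n - 1 divisor, which matches the n - 1 = 7 degrees of freedom in the example.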
The one-sample t test: Summary
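
In the notation of these slides, the one-sample t test can be stated as follows. This is the standard textbook formulation, offered as a reference sketch; the P-values are exact when the population is Normal and approximately correct for large n otherwise.

```latex
\[
t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}, \qquad
T \sim t(n-1) \text{ under } H_0: \mu = \mu_0
\]
\[
\begin{aligned}
H_a: \mu > \mu_0 &\quad\Rightarrow\quad P = P(T \ge t) \\
H_a: \mu < \mu_0 &\quad\Rightarrow\quad P = P(T \le t) \\
H_a: \mu \ne \mu_0 &\quad\Rightarrow\quad P = 2\,P(T \ge |t|)
\end{aligned}
\]
```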
Example: Is the Vitamin C level correct?

The specifications for the CSB state that the
mixture should contain 2 pounds of vitamin
premix for every 2000 pounds of product. These
specifications are designed to produce a mean ($\mu$)
vitamin C content in the final product of
40 mg/100 g. We test the null hypothesis that the
mean vitamin C content of the production run in
the previous example conforms to these
specifications. Use $\alpha = 5\%$.
Example: Is the Vitamin C level correct?

Hypotheses:
$H_0: \mu = 40$
$H_a: \mu \ne 40$

Test statistic:
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{22.5 - 40}{7.2/\sqrt{8}} = -6.88$$

P-value:
$$P = 2P(T \ge 6.88)$$

Because the degrees of freedom are n - 1 = 7, this t
statistic has the t(7) distribution.
Example: Is the Vitamin C level correct?

From the largest entry in
the df = 7 line of the table
we see that
$$P(T \ge 5.408) = 0.0005$$
We conclude that the P-value is less than
2 × 0.0005, or P < 0.001.
We reject $H_0$ and conclude
that the vitamin C content
for this run is below the
specification.
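
The table only bounds the P-value, but it can also be approximated numerically. The sketch below is an illustration under stated assumptions: it integrates the t(7) density with Simpson's rule so that only the Python standard library is needed, and the helper names `t_pdf` and `t_upper_tail` are introduced here, not taken from the slides.

```python
import math
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0: Simpson's rule on [0, t], then symmetry about 0."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

vitamin_c = [26, 31, 23, 22, 11, 22, 14, 31]  # mg/100 g, from the example
mu0 = 40       # specified mean vitamin C content (H0 value)
alpha = 0.05

n = len(vitamin_c)
t_stat = (mean(vitamin_c) - mu0) / (stdev(vitamin_c) / math.sqrt(n))

# Two-sided alternative: double the area beyond |t| in one tail
p_value = 2 * t_upper_tail(abs(t_stat), n - 1)

print(f"t = {t_stat:.2f}, df = {n - 1}, two-sided P-value = {p_value:.5f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

The computed P-value falls below 0.001, consistent with the bound read from the table.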
Matched Pairs t procedures



Comparative studies are usually preferred to
single-sample investigations because of the
protection they offer against confounding.
In a matched pairs study, subjects are matched in
pairs and the outcomes are compared within each
matched pair.
One situation calling for matched pairs is before-and-after observations on the same subjects.
Matched Pairs t procedures



A matched pair analysis is needed when there are
two measurements or observations on each
individual and we want to examine the change
from the first to the second.
For each individual, subtract the “before” measure
from the “after” measure.
Analyze the differences using the one-sample
confidence interval and significance testing
procedures.
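
The reduction from paired data to a single sample can be sketched in a few lines. The before and after scores below are hypothetical values made up for illustration; they are not data from these slides.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical before/after measurements on the same six subjects
# (illustrative only, not data from the slides)
before = [12, 15, 9, 20, 14, 17]
after = [14, 18, 9, 23, 13, 21]

# Subtract the "before" measure from the "after" measure for each subject
differences = [a - b for a, b in zip(after, before)]

# From here on, the differences are treated as one ordinary sample
n = len(differences)
dbar = mean(differences)
s = stdev(differences)
t_stat = (dbar - 0) / (s / sqrt(n))   # tests H0: mean difference = 0

print(f"differences = {differences}")
print(f"mean = {dbar:.3f}, s = {s:.3f}, t = {t_stat:.3f}, df = {n - 1}")
```

The t statistic is then referred to the t(n - 1) distribution, where n is the number of pairs, not the number of individual measurements.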
Example: The effect of language instruction

A company contracts with a language institute to provide
individualized instruction in foreign languages for its
executives who will be posted overseas. Is the instruction
effective? Last year 20 executives studied French. All had
some knowledge of French, so they were given the Modern
Language Association’s listening test of understanding of
spoken French before the instruction began. After several
weeks of immersion in French, the executives took the
listening test again. The following table gives the pretest
and posttest scores.
Example: The effect of language instruction

To analyze these data:

Subtract the pretest score from the posttest score.
These differences appear in the “gain” column in
the previous table.
These 20 differences form a single sample.
To assess whether the institute significantly improved
the executives’ comprehension of spoken French, we
test
$H_0: \mu = 0$
$H_a: \mu > 0$
Example: The effect of language instruction



Here $\mu$ is the mean improvement that would be
achieved if the entire population of executives
received similar instruction.
The null hypothesis says that no improvement
occurs, and the alternative hypothesis says that
posttest scores are higher on the average.
The 20 differences have
$\bar{x} = 2.5$ and $s = 2.893$.
Example: The effect of language instruction

The one-sample t statistic is therefore
$$t = \frac{\bar{x} - 0}{s/\sqrt{n}} = \frac{2.5 - 0}{2.893/\sqrt{20}} = 3.86$$
The P-value is found from the t(19) distribution:
$$P = P(T \ge 3.86)$$

The t table shows that 3.86 lies between the upper 0.001 and
0.0005 critical values of the t(19) distribution. The
P-value therefore lies between 0.0005 and 0.001.
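
The bracketing of the P-value can be checked from the summary statistics alone. The sketch below is an illustration, not the slides' method: it approximates the t(19) tail area with Simpson's rule using only the Python standard library, and the helper names `t_pdf` and `t_upper_tail` are assumptions introduced here.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0: Simpson's rule on [0, t], then symmetry about 0."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Summary statistics of the 20 gains, as reported in the example
n, xbar, s = 20, 2.5, 2.893

t_stat = (xbar - 0) / (s / math.sqrt(n))
p_value = t_upper_tail(t_stat, n - 1)   # one-sided, because Ha is mu > 0

print(f"t = {t_stat:.2f}, df = {n - 1}, P-value = {p_value:.5f}")
print(f"between 0.0005 and 0.001: {0.0005 < p_value < 0.001}")
```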
Example: The effect of language instruction

Conclusion:


The improvement in scores was significant. We
have strong evidence that the posttest scores are
systematically higher.
A statistically significant but very small
improvement in language ability would not
justify the expense of the individualized
instruction. A confidence interval allows us
to estimate the amount of improvement.
Example: The effect of language instruction

Find a 90% confidence interval for the
mean improvement in the entire population.


The critical value is $t^* = 1.729$ from the t table for
90% confidence.
The confidence interval is:
$$\bar{x} \pm t^* \frac{s}{\sqrt{n}} = 2.5 \pm 1.729 \frac{2.893}{\sqrt{20}} = 2.5 \pm 1.12 = (1.38, 3.62)$$
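
The critical value itself can also be recovered numerically instead of being read from the table. The following sketch is one possible approach under stated assumptions: it finds t* by bisection on a Simpson's-rule approximation of the t(19) tail area, using only the Python standard library. The helper names `t_pdf`, `t_upper_tail`, and `t_critical` are introduced here for illustration.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0: Simpson's rule on [0, t], then symmetry about 0."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(confidence, df):
    """t* with central area `confidence` under t(df), found by bisection."""
    target = (1 - confidence) / 2      # area left in each tail
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > target:
            lo = mid                   # tail still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2

# Summary statistics of the 20 gains, as reported in the example
n, xbar, s = 20, 2.5, 2.893

t_star = t_critical(0.90, n - 1)
margin = t_star * s / math.sqrt(n)
lower, upper = xbar - margin, xbar + margin

print(f"t* = {t_star:.3f}, margin of error = {margin:.2f}")
print(f"90% CI: ({lower:.2f}, {upper:.2f})")
```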