8.1 Estimating $\mu$ When $\sigma$ is Known
Assumptions about the random variable x
1. We have a simple random sample of size n drawn from the
population of x values.
2. The value of $\sigma$, the population standard deviation, is known.
3. If the x distribution is normal, then our methods work for any
sample size n.
4. If x has an unknown distribution, then we require the sample
size $n \ge 30$. However, if the x distribution is not mound-shaped, then a sample size of 50 or 100 may be needed.
Point Estimate
An estimate of a population parameter given by a single number is called a
point estimate of that population parameter.
For Example:
$\bar{x}$ is a point estimate for $\mu$.
s is a point estimate for $\sigma$.
Margin of Error
The margin of error in using $\bar{x}$ as a point estimate for $\mu$ is given
by $E = |\bar{x} - \mu|$.
A point estimate is not very useful unless we have some kind of
measure of how “good” it is. This “measure of goodness” is
expressed as a confidence interval.
Confidence Interval and Level of Confidence
Suppose 100 students at Palomar were randomly chosen and their
heights were measured yielding a [sample] mean of 5.72 ft with a
margin of error of 0.08 ft. Consider the following statements:
1. The population mean is approximately 5.72 feet.
2. There is a 95% probability that the population mean is
between 5.64 ft and 5.80 ft. $P(5.64\ \text{ft} < \mu < 5.80\ \text{ft}) = 0.95$
3. At a 95% level of confidence the population mean is between
5.64 ft and 5.80 ft.
4. The population mean $\mu = 5.72 \pm 0.08$ feet at a 95% level of
confidence.
Confidence levels and confidence intervals provide a measure of
how well a point estimate estimates a population parameter.
Confidence Interval for $\mu$
A c-percent confidence interval for the population mean $\mu$ is
an interval computed from sample data in such a way that c is the
probability of generating an interval containing the actual value of
$\mu$. That is,
$P(\bar{x} - E < \mu < \bar{x} + E) = c$,
[Figure: normal curve over the $\bar{x}$-axis with the area between $\bar{x} - E$ and $\bar{x} + E$ shaded. Shaded area = c, the probability that $\mu$ is on this interval.]
where E is the maximum margin of error when estimating $\mu$ with
$\bar{x}$.
In words, $P(\bar{x} - E < \mu < \bar{x} + E) = c$ means . . .
1. The probability that the population mean $\mu$ is between $\bar{x} - E$
and $\bar{x} + E$ is c.
2. The population mean $\mu$ is between $\bar{x} - E$ and $\bar{x} + E$ at a
confidence level of c.
3. The population mean is $\bar{x} \pm E$ at a c-percent level of
confidence.
5. If we repeat the experiment many times with the same sample
size, then c proportion of the intervals calculated will contain
the population mean $\mu$. Thus, $1 - c$ proportion of the intervals
will not contain $\mu$.
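The repeated-sampling reading in the last statement can be checked with a small simulation. The sketch below is only illustrative: the "true" population mean, the standard deviation, the sample size, and the number of trials are assumed values chosen for the demonstration, and the population is assumed to be normal.

```python
import math
import random
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, c, trials=2000, seed=1):
    """Fraction of simulated c-level intervals that contain mu (sigma known)."""
    rng = random.Random(seed)                 # fixed seed so runs are repeatable
    z_c = NormalDist().inv_cdf((1 + c) / 2)   # critical value for level c
    E = z_c * sigma / math.sqrt(n)            # margin of error
    hits = 0
    for _ in range(trials):
        # Draw one simple random sample of size n from a normal population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        if x_bar - E < mu < x_bar + E:
            hits += 1
    return hits / trials


# Assumed population: mu = 15.6, sigma = 1.8; samples of size 90; c = 0.95.
print(coverage(mu=15.6, sigma=1.8, n=90, c=0.95))
```

The printed proportion should land close to c, and one minus that proportion is the share of intervals that missed $\mu$.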
Example 1
Jackie has been jogging 2 miles a day for years and she records her
times. A sample of 90 of these times has a mean of 15.60 minutes
and a known standard deviation of 1.80 minutes.
a. Find a 95% confidence interval for $\mu$. Draw and label the normal
distribution illustrating the confidence interval. Solve without
using the ZInterval function (see below).
b. Find E, the maximum error in estimating $\mu$ with $\bar{x}$ at the
confidence level c.
c. Write the conclusion in probability notation,
i.e. $P(\bar{x} - E < \mu < \bar{x} + E) = c$
d. Summarize your conclusion in one sentence [relevant to the
application].
Using the TI-83/84 ZInterval Function
The ZInterval function (STAT / TESTS / 7: ZInterval) computes a
confidence interval for an unknown population mean $\mu$ when the
population standard deviation is known.
Input (STATS): $\sigma$, $\bar{x}$, n, and c-level
Output: The interval from $\bar{x} - E$ to $\bar{x} + E$, where
$E = \frac{1}{2}(\text{interval length})$
Example 2
a. Compute the 95% confidence interval in example 1. Use the
ZInterval function.
b. Summarize your results in a complete sentence relevant to this
application.
At a 95% level of confidence, the population mean $\mu$ of all 2-mile jogging times for Jackie is between 15.23 and 15.97
minutes.
Section 8.1 Homework Instructions
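Steps to find a c% confidence interval for $\mu$
1. Sketch a normal curve illustrating the c% confidence interval for $\mu$.
Label $\bar{x} - E$, $\bar{x} + E$, and $\bar{x}$, where E is the margin of error
when estimating $\mu$ with $\bar{x}$ at a confidence level of c.
[Figure: normal curve over the $\bar{x}$-axis with $\bar{x} - E$ and $\bar{x} + E$ marked.]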
2. Without using the ZInterval function, compute the c%
confidence interval for the population mean. That is,
$\bar{x} - E = \text{invNorm}(\text{area to the left of } \bar{x} - E,\ \mu_{\bar{x}},\ \sigma_{\bar{x}})$
$\bar{x} + E = \text{invNorm}(\text{area to the left of } \bar{x} + E,\ \mu_{\bar{x}},\ \sigma_{\bar{x}})$
Use the estimate $\mu_{\bar{x}} \approx \bar{x}$, and $\sigma_{\bar{x}} = \sigma / \sqrt{n}$.
3. Find E, the maximum error in estimating $\mu$ with $\bar{x}$ at a
confidence level of c. It is computed as follows:
E = “half the interval length” from step 2
$E = \frac{1}{2}\left[(\bar{x} + E) - (\bar{x} - E)\right]$
4. Write the confidence interval in probability notation.
i.e. $P(\bar{x} - E < \mu < \bar{x} + E) = c$
5. Summarize your results in a concise, complete sentence
relevant to the problem. That is,
At the c% confidence level the population mean $\mu$ of all
____________________ is between _____ and _____ [units].
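Steps 2 and 3 can be mirrored outside the calculator. The sketch below is one possible translation, not the course's prescribed method: it assumes Python's standard-library `statistics.NormalDist.inv_cdf` as a stand-in for invNorm, and the function name `z_interval` is made up for illustration. The numbers are Jackie's jogging data from Example 1.

```python
import math
from statistics import NormalDist


def z_interval(x_bar, sigma, n, c):
    """Return (lower, upper, E) for a c-level interval when sigma is known."""
    # Sampling distribution of x-bar: mean estimated by x_bar, sd = sigma / sqrt(n).
    sampling = NormalDist(mu=x_bar, sigma=sigma / math.sqrt(n))
    lower = sampling.inv_cdf((1 - c) / 2)   # area to the left of x_bar - E
    upper = sampling.inv_cdf((1 + c) / 2)   # area to the left of x_bar + E
    E = (upper - lower) / 2                 # half the interval length
    return lower, upper, E


lower, upper, E = z_interval(x_bar=15.60, sigma=1.80, n=90, c=0.95)
print(f"{lower:.2f} < mu < {upper:.2f}, E = {E:.2f}")
# prints: 15.23 < mu < 15.97, E = 0.37
```

The endpoints agree with the 15.23 and 15.97 minutes quoted for Example 2.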
Guided Exercise 2
Jason jogs 3 miles per day and records his times. A sample of 90
of these times has a mean of 21.50 minutes and a known standard
deviation of 2.11 minutes. Find the 99% confidence interval for
the population mean by completing steps 1-5 above.
a. Sketch a normal curve to illustrate the 99% confidence interval for
the mean in this application. Label the axis.
b. Without using the ZInterval function, find the 99% confidence
interval for the population mean.
c. Find E = “half the interval length”
d. Write the confidence interval in probability notation.
i.e. $P(\bar{x} - E < \mu < \bar{x} + E) = c$
e. Summarize your results in a concise, complete sentence
relevant to the problem.
At a ____% confidence level, the mean of _______________
_________________________________________________
is between ____________ and _____________.
Guided Exercise 3
An automobile loan company wants to estimate the amount of the
average car loan during the past year. A random sample of 200
loans had a mean of $8225 and a known standard deviation of
$762. Find the 95% confidence interval for the population mean by
completing steps 1-5 above.
a. Sketch a normal curve to illustrate the 95% confidence interval for
the mean in this application. Label the axis.
b. Without using the ZInterval function, find the 95% confidence
interval for the population mean.
c. Find E = “half the interval length”
d. Write the confidence interval in probability notation.
i.e. $P(\bar{x} - E < \mu < \bar{x} + E) = c$
e. Summarize your results in a concise, complete sentence
relevant to the problem.
At a ____% confidence level, the mean of _______________
_________________________________________________
is between ____________ and _____________.
8.1 (was 8.4) Estimating the Sample Size n
Critical Value
$z_c$ is called the critical value for a confidence level c if
$P(-z_c < z < z_c) = c$
That is, $z_c$ is the z-score such that the area under the standard
normal curve between $-z_c$ and $z_c$ is c.

In words we say . . .
a. “the probability that a randomly selected z-value is between
$-z_c$ and $z_c$ is c.” Or
b. “at a c-percent level of confidence we can say that a randomly
chosen z will be between $-z_c$ and $z_c$.”
[Figure: standard normal curve over the z-axis with the area between $-z_c$ and $z_c$ shaded. Shaded area = c.]
For Example
1. If $c = 0.90$, then $P(-z_{0.90} < z < z_{0.90}) = 0.90$. Compute $z_{0.90}$.
2. If $c = 0.95$, then $P(-z_{0.95} < z < z_{0.95}) = 0.95$. Compute $z_{0.95}$.
3. If $c = 0.99$, then $P(-z_{0.99} < z < z_{0.99}) = 0.99$. Compute $z_{0.99}$.
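Because the central area is c, the area to the left of $z_c$ is $(1 + c)/2$, so the critical value is an inverse-normal lookup at that area. A minimal sketch, assuming Python's standard-library `NormalDist` in place of the calculator's invNorm (the helper name `z_critical` is illustrative):

```python
from statistics import NormalDist


def z_critical(c):
    """Critical value z_c with central area c under the standard normal curve."""
    # Area to the left of z_c is c plus the lower tail (1 - c) / 2.
    return NormalDist().inv_cdf((1 + c) / 2)


for c in (0.90, 0.95, 0.99):
    print(c, round(z_critical(c), 3))
```

To the nearest thousandth these come out as 1.645, 1.960, and 2.576.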
Estimating Sample Size n for Estimating $\mu$
If, with a confidence level of c, we want our point estimate $\bar{x}$ to be
within E units of $\mu$, then we choose the sample size n to be
$n = \left(\dfrac{z_c\,\sigma}{E}\right)^2$,
where $z_c$ is the critical value for a confidence level of c.
Example 6

A sample of 50 salmon is caught and weighed. The sample
standard deviation of the 50 weights is 2.15 lb. How large of a
sample should be taken to be 97% confident that the sample mean
is within 0.20 lb of the mean weight of the population? Find $z_c$ (to
the nearest thousandth) and n. Then summarize your results in a
complete sentence relevant to this application.
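The sample-size formula can be applied to the salmon numbers as a check. This sketch assumes that the preliminary sample standard deviation (2.15 lb) is used in place of $\sigma$ and that n is rounded up to the next whole number; the function name is illustrative only.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, E, c):
    """Smallest n with margin of error at most E at confidence level c."""
    z_c = NormalDist().inv_cdf((1 + c) / 2)   # critical value
    # n = (z_c * sigma / E)^2, rounded up so the error bound still holds.
    return math.ceil((z_c * sigma / E) ** 2)


print(sample_size_for_mean(sigma=2.15, E=0.20, c=0.97))
# prints: 545
```

Rounding up rather than to the nearest integer keeps the margin of error at or below 0.20 lb.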
Example 7
An efficiency expert wants to determine the mean time it takes an
employee to assemble a switch on an assembly line. A preliminary
study of 45 observations found a sample standard deviation of 78
seconds. How many more observations are needed to be 92%
certain that the mean of the sample will vary from the true mean by
no more than 15 seconds? Find $z_c$ (to the nearest thousandth) and
n. Then summarize your results in a complete sentence relevant to
this application.
Guided Exercise 6
The dean wants to estimate the average teaching experience (in
years) of the faculty members. A preliminary random sample of
60 faculty yields a sample standard deviation of 3.4 years. How
many more faculty should be sampled to be 99% confident that the
sample mean does not differ from the true mean by more than 0.5
years? Find $z_c$ (to the nearest thousandth) and n. Then summarize
your results in a complete sentence relevant to this application.
8.2 Estimating $\mu$ When $\sigma$ is Unknown
When the population standard deviation $\sigma$ is unknown, it is
approximated by the sample standard deviation s. The TInterval
function works with what is called the Student’s t-distribution
where all statistical “fudge factors” necessary to accommodate
approximating $\sigma$ with s are built into the function.
The TInterval function (TI-83: STAT / TESTS / 8: TInterval)
Input (STATS): $\bar{x}$, s, n, and c-level
or
Input (DATA): data list and c-level
Output: The interval from $\bar{x} - E$ to $\bar{x} + E$, where
$E = \frac{1}{2}(\text{interval length})$
Homework Instructions for Section 8.2
1. Omit exercises #1-4
2. When asked to find a confidence interval, do the following:
a. Find the c% confidence interval for the mean $\mu$. Write it
in probability notation.
b. Summarize your results in a complete sentence relevant to
the application.
Example 4
An archeologist discovered a new, but extinct, species of miniature
horse. The only seven known samples show shoulder heights (in
cm) of 45.3, 47.1, 44.2, 46.8, 46.5, 45.5, and 47.6. Find the 99%
confidence interval for $\mu$ (the mean height of the entire population
of ancient horses) and the error E. Then summarize your results in
a complete sentence relevant to this application.
a. Find the 99% confidence interval for the mean $\mu$. Write it in
probability notation.
b. Summarize your results in a complete sentence relevant to the
application.
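For readers without a TI-83, the DATA mode of TInterval can be approximated in plain Python. The sketch below is an assumption-laden illustration, not the calculator's algorithm: it computes $E = t_c \cdot s / \sqrt{n}$ with $n - 1$ degrees of freedom, and it finds $t_c$ by numerically integrating the Student's t density (Simpson's rule) and bisecting, since the standard library has no inverse t function. All function names are illustrative.

```python
import math
from statistics import mean, stdev


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """P(T <= t) for t >= 0, using Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(c, df):
    """Value t_c with central area c, found by bisection."""
    target = (1 + c) / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:   # widen the bracket until it contains t_c
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_interval(data, c):
    """Return (lower, upper, E) for the mean of a small sample, sigma unknown."""
    n = len(data)
    x_bar, s = mean(data), stdev(data)          # stdev uses the n - 1 divisor
    E = t_critical(c, n - 1) * s / math.sqrt(n)
    return x_bar - E, x_bar + E, E


heights = [45.3, 47.1, 44.2, 46.8, 46.5, 45.5, 47.6]   # shoulder heights in cm
lower, upper, E = t_interval(heights, c=0.99)
print(f"{lower:.2f} < mu < {upper:.2f}, E = {E:.2f}")
```

The same interval should come out of TInterval in DATA mode with the seven heights entered as a list.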
Guided Exercise 3
A company produced a trial production run of 37 artificial
sapphires. The mean weight is 6.75 carats and the standard
deviation is 0.33 carats. Find the 95% confidence interval for the
mean weight $\mu$ of all artificial sapphires and the error E. Then
summarize your results in a complete sentence relevant to this
application.
8.3 Estimating p in a Binomial Experiment
Large Sample Size Assumption
If np > 5 and nq > 5, then the sample size n is large enough so that
the binomial distribution can be approximated by a normal
distribution, and a c% confidence interval for p is expressed as
$P(\hat{p} - E < p < \hat{p} + E) = c$
where $\hat{p}$ is the point estimate for p.

TI-83 1-PropZInt function: STAT / TESTS / A: 1-PropZInt
Input:
x = r = number of successes
n = number of trials
c-level = confidence level
Output:
$(\hat{p} - E,\ \hat{p} + E)$, $\hat{p}$, n
where E (the maximum error in using $\hat{p}$ as a point estimate for p
for the given confidence level) is one-half the interval length.
Example 5
Suppose 800 students were given flu shots and 600 did not get the
flu. Assuming all 800 were exposed to the flu:
a. What are S, n, and r (note: r is input as variable x on the TI-83)?
S =
r =
n =
b. What are the point estimates for p and q (i.e. $\hat{p}$ and $\hat{q}$)?
$\hat{p}$ =
$\hat{q}$ =
c. Is n large enough to approximate the binomial distribution with a
normal distribution? Why?
$n\hat{p}$ =
$n\hat{q}$ =
d. Find the 99% confidence interval for p.
$P(\hat{p} - E < p < \hat{p} + E) = 0.99$
e. Summarize your results in a complete sentence relevant to this
application.
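The flu-shot example can be reproduced without the calculator. The sketch below assumes the usual large-sample formula $E = z_c \sqrt{\hat{p}\hat{q}/n}$, which is what a one-proportion z interval computes, and it assumes that a "success" is a student who did not get the flu (r = 600, n = 800). The function name is illustrative.

```python
import math
from statistics import NormalDist


def one_prop_z_interval(r, n, c):
    """Return (lower, upper, E) for a proportion from r successes in n trials."""
    p_hat = r / n
    q_hat = 1 - p_hat
    # Large-sample check from the notes: n*p_hat > 5 and n*q_hat > 5.
    if n * p_hat <= 5 or n * q_hat <= 5:
        raise ValueError("sample too small for the normal approximation")
    z_c = NormalDist().inv_cdf((1 + c) / 2)
    E = z_c * math.sqrt(p_hat * q_hat / n)
    return p_hat - E, p_hat + E, E


lower, upper, E = one_prop_z_interval(r=600, n=800, c=0.99)
print(f"{lower:.3f} < p < {upper:.3f}, E = {E:.3f}")
```

Here $n\hat{p} = 600$ and $n\hat{q} = 200$, so the large-sample condition is comfortably met.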
Guided Exercise 4
A random sample of 195 books at a bookstore showed that 68 of
the books were nonfiction.
a. Find S and $\hat{p}$.
b. Is the sample size large enough to approximate the binomial
distribution with a normal distribution? Why?
c. Find the 90% confidence interval for p to the nearest
thousandth (3 decimal places).
d. Summarize your results in a complete sentence relevant to this
application.
Homework Instructions for Section 8.3 Problems
When asked to find the c% confidence interval for p, do the
following four steps.
1. Find S and $\hat{p}$.
2. Determine whether the sample size is large enough to approximate the
binomial distribution with a normal distribution.
3. Find the c% confidence interval for p to the nearest thousandth
(3 decimal places).
4. Summarize your results in a complete sentence relevant to this
application.
A Margin of Error, E, is the maximum error when using a point
estimate for a population parameter at a given confidence level.
General Interpretation of Poll Results
1. When a poll states the results of a survey, the proportion
reported is $\hat{p}$ (the sample estimate of the population
proportion).
2. The margin of error is the maximal error E of a [95%, usually]
confidence interval for p.
3. If $\hat{p}$ is obtained from a poll, then a 95% confidence interval
for the population proportion p is $\hat{p} - E < p < \hat{p} + E$.
Guided Exercise 5
A random sample of 315 households was surveyed. Chances are
19 of 20 that if all adults had been surveyed, the findings would
differ from the poll results by no more than 2.6% in either
direction. One question was asked: “Which party would do a
better job handling education?” The possible responses were
Democrats, Republicans, neither, or both. The poll reported that
32% responded Democrat.
a. What confidence level corresponds to the phrase “chances are
19 of 20 that if . . . .”
b. What are S, n, and the sample statistic $\hat{p}$ for the proportion
responding Democrat?
c. Find E. Find the 95% confidence interval for p, the proportion of those who
would respond Democrat.
d. Summarize your results in a complete sentence relevant to this
application.
8.3 Estimating Sample Size n for Estimating p
(a) If, with a confidence level of c, we want our point estimate $\hat{p}$
to be within E units of p, then we choose the sample size n to
be
$n = \hat{p}\,\hat{q}\left(\dfrac{z_c}{E}\right)^2$
where $z_c$ is the z-score corresponding to a confidence level of
c.
(b) If no estimate for p is available, we can say with a confidence
level of at least c that the point estimate $\hat{p}$ will be within E
units of p by choosing
$n = 0.25\left(\dfrac{z_c}{E}\right)^2$
Example 8
A buyer for a popcorn company wants to estimate the probability p
that a kernel purchased from a particular farm will pop. Suppose a
random sample of n kernels is taken and r of these kernels pop.
The buyer wants to be 95% certain that the point estimate $\hat{p}$ will
be within 0.01 units of p.
a. Find $z_c$ and E.
b. If no estimate for p is available, how large a sample should the
buyer use? (i.e. how large should n be)?
c. A preliminary study showed that p was approximately 0.86.
Now, how large a sample should be used?
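Both sample-size formulas for p can be wrapped in one helper. This is a sketch under stated assumptions: the optional argument `p_estimate` is a made-up name for a preliminary estimate of p, the no-estimate case uses the factor 0.25 from formula (b), and n is rounded up to a whole number. The values are the popcorn numbers from Example 8.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(E, c, p_estimate=None):
    """Sample size so that p-hat is within E of p at confidence level c."""
    z_c = NormalDist().inv_cdf((1 + c) / 2)
    # With no preliminary estimate, p*q is replaced by its largest value, 0.25.
    pq = 0.25 if p_estimate is None else p_estimate * (1 - p_estimate)
    return math.ceil(pq * (z_c / E) ** 2)


print(sample_size_for_proportion(E=0.01, c=0.95))                   # prints: 9604
print(sample_size_for_proportion(E=0.01, c=0.95, p_estimate=0.86))  # prints: 4626
```

A preliminary estimate far from 0.5 shrinks $\hat{p}\hat{q}$, which is why the second sample size is less than half of the first.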
Guided Exercise 7
The health department wants to estimate the proportion of children
who require corrective lenses for their vision. They want to be
99% sure that the point estimate for p will have a maximum error
of 0.03.
a. If no other information is known, find E and $z_c$. Estimate the
sample size required.
b. Suppose a preliminary random sample of 100 children
indicates that 23 require corrective lenses. How large should n
be?
8.4 Estimating $\mu_1 - \mu_2$ and $p_1 - p_2$
Independent and Dependent Samples
In order to make a statistical estimate about the difference between
two populations, we need to have a sample from each population.
Two samples are independent if the sample from one population
is unrelated to the sample from the other. However, if each
measurement in one sample can be naturally paired with
measurements of another sample, the two samples are said to be
dependent (such as before and after samples).
Guided Exercise 8
Classify the pairs of samples as dependent or independent.
a. In a medical experiment, one group is given a treatment and
another group is given a placebo. After a period of time both
groups are measured for the same condition.
b. A group of Math students is given a test at the beginning of a
course and the same group is given the same test at the end of
the course.
Theorem 8.1
Let $x_1$ and $x_2$ have normal distributions. If we take independent
random samples of size $n_1$ from $x_1$ and $n_2$ from $x_2$, then the
variable $\bar{x}_1 - \bar{x}_2$ has
1. a normal distribution
2. a mean of $\mu_1 - \mu_2$
3. a standard deviation of $\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
Estimating $\mu_1 - \mu_2$ When $\sigma_1$ and $\sigma_2$ are Known
A c% confidence interval for $\mu_1 - \mu_2$ is expressed as
$(\bar{x}_1 - \bar{x}_2) - E < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + E$
This interval is the output of the TI-83 function 2-SampZInt.
TI-83 function 2-SampZInt (STAT / TESTS / 9: 2-SampZInt)
Input: $\sigma_1$, $\sigma_2$, $\bar{x}_1$, $n_1$, $\bar{x}_2$, $n_2$, c-level
Output: Interval from $(\bar{x}_1 - \bar{x}_2) - E$ to $(\bar{x}_1 - \bar{x}_2) + E$,
where E is one half the interval length output by the 2-SampZInt
function.

Example 9
Suppose a biologist is studying data from Yellowstone streams
before and after a 1988 fire. A random sample of 167 fishing
reports in the years before the fire showed the average catch per
day of 5.2 trout with $\sigma = 1.9$ trout. After the fire a sample of 125
fishing reports showed the average catch per day of 6.8 trout with
$\sigma = 2.3$ trout.
a. Are the samples independent?
b. Compute a 95% C.I. for $\mu_1 - \mu_2$.
At a 95% level of confidence ________ < $\mu_1 - \mu_2$ < _______.
c. Explain the meaning of part b.
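Theorem 8.1 gives the standard deviation of $\bar{x}_1 - \bar{x}_2$, so the interval can also be built directly as $(\bar{x}_1 - \bar{x}_2) \pm z_c \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$. The sketch below applies that to the Yellowstone numbers, assuming population 1 is "before the fire" and population 2 is "after the fire"; the function name is illustrative.

```python
import math
from statistics import NormalDist


def two_sample_z_interval(x1, sigma1, n1, x2, sigma2, n2, c):
    """Return (lower, upper, E) for mu1 - mu2 when both sigmas are known."""
    z_c = NormalDist().inv_cdf((1 + c) / 2)
    # Standard deviation of x1_bar - x2_bar from Theorem 8.1.
    sd_diff = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    E = z_c * sd_diff
    diff = x1 - x2
    return diff - E, diff + E, E


lower, upper, E = two_sample_z_interval(
    x1=5.2, sigma1=1.9, n1=167,   # before the fire
    x2=6.8, sigma2=2.3, n2=125,   # after the fire
    c=0.95,
)
print(f"{lower:.2f} < mu1 - mu2 < {upper:.2f}")
```

Both endpoints are negative, which is the feature part c asks the reader to interpret.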
Estimating $\mu_1 - \mu_2$ When $\sigma_1$ and $\sigma_2$ are Unknown
A c% confidence interval for $\mu_1 - \mu_2$ is expressed as
$(\bar{x}_1 - \bar{x}_2) - E < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + E$
This interval is the output of the TI-83 function 2-SampTInt.
TI-83 function 2-SampTInt (STAT / TESTS / 0: 2-SampTInt)
Input: $\bar{x}_1$, $s_1$, $n_1$, $\bar{x}_2$, $s_2$, $n_2$, c-level, pooled: yes
Output: Interval from $(\bar{x}_1 - \bar{x}_2) - E$ to $(\bar{x}_1 - \bar{x}_2) + E$,
where E is one half the interval length output by the 2-SampTInt
function.

Example 10
Suppose that a random sample of 29 college students was divided
into two groups. The first group had 15 people and was given 1/2
liter of red wine before going to sleep. The second group of 14
people was not given alcohol before going to sleep. Both groups
went to sleep at 11 p.m. The average brain wave activity (in hertz)
between 4 and 6 a.m. was measured for each participant. The
results follow:
Group 1
16.0 19.6 19.9 20.9 20.3 20.1 16.4 20.6
20.1 22.3 18.8 19.1 17.4 21.1 22.1
$\bar{x}_1 = 19.65$ Hz, $s_1 = 1.86$ Hz
Group 2
8.2 7.6 5.4 6.8 10.2 6.4 6.5 8.8
4.7 5.4 5.9 8.3 2.9 5.1
$\bar{x}_2 = 6.59$ Hz, $s_2 = 1.91$ Hz
a. Are the groups independent?
b. Compute the 90% C.I. for $\mu_1 - \mu_2$ and write it in probability
notation.
c. Summarize the results of part b in a single sentence relevant to
this application.
Guided Exercise 9
a. A study reported a 90% confidence interval for the difference
of the means to be $10 < \mu_1 - \mu_2 < 20$. What can you conclude
about the values of $\mu_1$ and $\mu_2$?
b. A study reported a 95% confidence interval for the difference
of proportions to be $-0.32 < p_1 - p_2 < 0.16$. What can you
conclude about the values of $p_1$ and $p_2$?
Confidence Interval for $p_1 - p_2$ (Large Samples)
If $n_1\hat{p}_1 > 5$, $n_1\hat{q}_1 > 5$, $n_2\hat{p}_2 > 5$ and $n_2\hat{q}_2 > 5$, then the c%
confidence interval for $p_1 - p_2$ is expressed as
$(\hat{p}_1 - \hat{p}_2) - E < p_1 - p_2 < (\hat{p}_1 - \hat{p}_2) + E$
where E is the maximum error in using $\hat{p}_1 - \hat{p}_2$ as an estimate for
$p_1 - p_2$ at a c% confidence level.
TI-83 function 2-PropZInt (STAT / TESTS / B: 2-PropZInt)
Input: $r_1 = x_1$, $n_1$, $r_2 = x_2$, $n_2$, c-level
Output: Interval from $(\hat{p}_1 - \hat{p}_2) - E$ to $(\hat{p}_1 - \hat{p}_2) + E$,
where E is one half the interval length.
Exercise 14
The burn center at Community hospital is experimenting with a
new plasma compress treatment. A random sample of 316 patients
with minor burns received the plasma compress treatment. Of these
patients, 259 had no visible scars after treatment. Another random
sample of 419 patients with minor burns received no plasma
compress treatment. Of this group, 94 had no visible scars. Let p1
be the proportion of patients who received the plasma compress
treatment and had no visible scars after treatment. Let p2 be the
proportion of patients who did not receive the plasma compress
treatment but still had no visible scars.
a. Find the 95% confidence interval for $p_1 - p_2$.
b. Summarize the results in a single sentence relevant to this
application.
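The burn-center data can be run through the same kind of calculation. The sketch below assumes the standard large-sample margin of error $E = z_c \sqrt{\hat{p}_1\hat{q}_1/n_1 + \hat{p}_2\hat{q}_2/n_2}$ for a difference of proportions; the notes themselves only describe the calculator output, so treat the formula and the function name as assumptions.

```python
import math
from statistics import NormalDist


def two_prop_z_interval(r1, n1, r2, n2, c):
    """Return (lower, upper, E) for p1 - p2 from two independent samples."""
    p1, p2 = r1 / n1, r2 / n2
    q1, q2 = 1 - p1, 1 - p2
    # Large-sample conditions from the notes: each n*p-hat and n*q-hat exceeds 5.
    if min(n1 * p1, n1 * q1, n2 * p2, n2 * q2) <= 5:
        raise ValueError("samples too small for the normal approximation")
    z_c = NormalDist().inv_cdf((1 + c) / 2)
    E = z_c * math.sqrt(p1 * q1 / n1 + p2 * q2 / n2)
    diff = p1 - p2
    return diff - E, diff + E, E


lower, upper, E = two_prop_z_interval(r1=259, n1=316, r2=94, n2=419, c=0.95)
print(f"{lower:.3f} < p1 - p2 < {upper:.3f}")
```

An interval that lies entirely above zero would indicate, at the chosen confidence level, that $p_1$ exceeds $p_2$.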