Estimation and Confidence Intervals
Chapter 9
McGraw-Hill/Irwin
©The McGraw-Hill Companies, Inc. 2008
GOALS
1. Define a point estimate.
2. Define level of confidence.
3. Construct a confidence interval for the population mean when the population standard deviation is known.
4. Construct a confidence interval for a population mean when the population standard deviation is unknown.
5. Construct a confidence interval for a population proportion.
6. Determine the sample size for attribute and variable sampling.
Point and Interval Estimates
A point estimate is the statistic, computed from sample information, which is used to estimate the population parameter.
– Example: the sample mean is a point estimate of the population mean.
A confidence interval estimate is a range of values constructed from sample data so that the population parameter is likely to occur within that range at a specified probability.
– The specified probability is called the level of confidence.
Confidence Interval for a Mean (σ Known)
We can construct a 95% (or 99%) confidence interval for a population mean, based on the central limit theorem (CLT).
– The sampling distribution of the sample mean is approximately normal, with mean μ and variance σ²/n.
– 95% of the sample means selected from a population will be within 1.96 standard deviations of the population mean μ.
  – Here, the standard deviation is that of the sample means, i.e. the standard error, σ/√n.
– From this, we can construct the 95% confidence interval for the population mean (μ):
\[ \bar{X} \pm 1.96 \, \frac{\sigma}{\sqrt{n}} \]
Factors Affecting Confidence Interval Estimates
The factors that determine the width of a confidence interval are:
1. The sample size, n.
2. The variability in the population, σ, usually estimated by s.
3. The desired level of confidence.
Confidence Interval for a Mean (σ Known)
In general, we can construct a confidence interval for any confidence level (between 0 and 100%) by finding the corresponding value of z.
– z = 1.96 for the 95% confidence level.
– z = 2.58 for the 99% confidence level.
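The lookup of z for a chosen confidence level can be sketched in Python with the standard library. The function names and the inputs in the last lines (a sample mean of 50, σ = 8, n = 64) are hypothetical, chosen only to illustrate the formula; they are not taken from the chapter.

```python
from math import sqrt
from statistics import NormalDist


def z_value(confidence):
    """Two-sided critical z for a confidence level such as 0.95."""
    # Put half of the leftover probability in each tail.
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def mean_ci_known_sigma(xbar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    margin = z_value(confidence) * sigma / sqrt(n)
    return xbar - margin, xbar + margin


print(round(z_value(0.95), 2))  # 1.96
print(round(z_value(0.99), 2))  # 2.58

# Hypothetical inputs, for illustration only.
low, high = mean_ci_known_sigma(xbar=50.0, sigma=8.0, n=64)
print(f"({low:.2f}, {high:.2f})")  # (48.04, 51.96)
```

Any other confidence level, for example 0.90, can be passed to the same function.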
Interval Estimates - Example and Interpretation (pp. 298-299)
Example: a random sample of the incomes of 256 managers.
A 95% confidence interval implies that about 95% of similarly constructed intervals will contain the parameter being estimated. That is, if 100 samples were drawn and an interval computed from each, about 95 of those intervals would contain the true population mean.
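This interpretation can be checked with a small simulation. The sketch below assumes a normal population with a hypothetical mean of 100 and standard deviation of 15 (values invented for illustration), draws many samples, and counts how often the 95% interval built from each sample contains the true mean.

```python
import random
from math import sqrt


def coverage(mu=100.0, sigma=15.0, n=30, trials=2000, z=1.96, seed=1):
    """Fraction of simulated 95% intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    half_width = z * sigma / sqrt(n)  # sigma is treated as known
    hits = 0
    for _ in range(trials):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials


print(coverage())  # a value close to 0.95
```

The printed fraction varies slightly with the seed and the number of trials, but it stays close to the stated confidence level.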
Confidence Interval for a Mean (σ unknown)
In most cases, the standard deviation of the population (σ) is unknown.
– We cannot construct the confidence interval using the previous method (based on z, the standard normal distribution).
– In this case, we may use the sample standard deviation (s) to estimate the population standard deviation (σ).
– Then we can show that the following statistic follows the so-called “t distribution”, which is similar to the z (standard normal) distribution:
\[ t = \frac{\bar{X} - \mu}{s / \sqrt{n}} \]
Characteristics of the t-distribution
1. It is, like the z distribution, a continuous distribution.
2. It is, like the z distribution, bell-shaped and
symmetrical.
3. There is not one t distribution, but rather a family of t
distributions. All t distributions have a mean of 0, but
their standard deviations differ according to the
sample size, n.
4. The t distribution is more spread out and flatter at the center than the standard normal distribution.
* As the sample size increases, the t distribution approaches the standard normal distribution.
[Figure: Comparing the z and t distributions when n is small]
Confidence Interval Estimates for the Mean
Use the z-distribution if the population standard deviation is known.
Use the t-distribution if the population standard deviation is unknown.
[Figure: When to use the z or t distribution for confidence interval computation]
Confidence Interval for the Mean – Example using the t-distribution
A tire company investigates the tread life of its tires. A sample of 10 tires driven 50,000 miles revealed a sample mean of 0.32 inch of tread remaining with a standard deviation of 0.09 inch. Construct a 95 percent confidence interval for the population mean. Would it be reasonable for the company to conclude that after 50,000 miles the population mean of tread remaining is 0.30 inches?
Given in the problem:
\[ n = 10, \quad \bar{x} = 0.32, \quad s = 0.09 \]
Compute the C.I. using the t-distribution (since σ is unknown):
\[ \bar{X} \pm t_{\alpha/2,\, n-1} \, \frac{s}{\sqrt{n}} \]
[Table: Student's t-distribution table]
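A minimal sketch of the tire calculation in Python. The critical value 2.262 is an assumption read from a standard t table (df = n − 1 = 9, 95% two-sided), because the Python standard library has no t-distribution quantile function.

```python
from math import sqrt

n, xbar, s = 10, 0.32, 0.09  # values given in the problem
t_crit = 2.262               # assumed: standard t table, df = 9, 95% two-sided

margin = t_crit * s / sqrt(n)
low, high = xbar - margin, xbar + margin

print(f"({low:.3f}, {high:.3f})")  # (0.256, 0.384)
print(low <= 0.30 <= high)         # True
```

Because 0.30 falls inside the interval, a population mean of 0.30 inch of tread remaining is a reasonable conclusion under these assumptions.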
Degrees of freedom (of t distribution)
“df” (degrees of freedom): the number of observations in the sample (n) minus the number of samples.
– In the above case, df = n − 1 = 9.
– Why is it called “degrees of freedom”?
  – When sample statistics are used, it is necessary to determine the number of values that are free to vary.
  – (e.g.) The mean of four numbers (7, 4, 1, 8) is 5. Here, the “df” is 3, because the deviations from the mean must sum to 0: once three of the values are chosen, the fourth is determined.
  – That is, 1 degree of freedom is lost in the process of getting the sample mean.
So refer to the row of the t-distribution table with df = 9.
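The four-number example can be verified in a few lines of Python. This is only an illustration of why one degree of freedom is lost; the variable names are invented for the sketch.

```python
values = [7, 4, 1, 8]
mean = sum(values) / len(values)         # 5.0

# Deviations from the mean always sum to zero.
deviations = [v - mean for v in values]  # [2.0, -1.0, -4.0, 3.0]
print(sum(deviations))                   # 0.0

# With the mean fixed, any three values determine the fourth.
free_values = values[:3]
forced_last = mean * len(values) - sum(free_values)
print(forced_last)                       # 8.0
```

Only three of the four values are free to vary once the mean is known, so df = 4 − 1 = 3.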
[Figure: Confidence interval estimates for the mean, using Excel]
A Confidence Interval for a Proportion
Proportion: the fraction, ratio, or percent indicating the part of the sample or the population having a particular trait of interest.
Sample proportion: p = X/n
– The population proportion is identified by π (i.e. the proportion of successes in a binomial distribution).
– If certain conditions are met, then p approximately follows a normal distribution with mean π and standard deviation √(p(1−p)/n).
Using the Normal Distribution to Approximate the Binomial Distribution
For a confidence interval for a proportion to be constructed, the following assumptions need to be met.
1. The binomial conditions (Chapter 6) have been met. Briefly, these conditions are:
a. The sample data are the result of counts.
b. There are only two possible outcomes (success or failure).
c. The probability of a success remains the same from one trial to the next.
d. The trials are independent, meaning the outcome on one trial does not affect the outcome on another.
2. The values nπ and n(1−π) should both be greater than or equal to 5. This condition allows us to employ the standard normal distribution, z, to construct a confidence interval for a proportion.
Confidence Interval for a Population Proportion
The confidence interval for a population proportion (π) is estimated by:
\[ p \pm z_{\alpha/2} \sqrt{\frac{p(1-p)}{n}} \]
Confidence Interval for a Population Proportion - Example
The trade union representing company A is considering a merger with another union, B. According to union A’s bylaws, at least three-fourths of the union membership must approve any merger. A random sample of 2,000 current members reveals 1,600 plan to vote for the merger proposal. What is the estimate of the population proportion? Develop a 95 percent confidence interval for the population proportion. Basing your decision on this sample information, can you conclude that a sufficient proportion of members favor the merger? Why?
First, compute the sample proportion:
\[ p = \frac{x}{n} = \frac{1{,}600}{2{,}000} = 0.80 \]
Compute the 95% C.I.:
\[ \text{C.I.} = p \pm z_{\alpha/2} \sqrt{\frac{p(1-p)}{n}} = 0.80 \pm 1.96 \sqrt{\frac{0.80(1-0.80)}{2{,}000}} = 0.80 \pm 0.018 = (0.782, 0.818) \]
Conclude: The merger proposal will likely pass because every value in the interval estimate is greater than 75 percent of the union membership.
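The union calculation can be reproduced with a short Python sketch. The function name is hypothetical, and the check of the "at least 5" condition uses the sample proportion p in place of the unknown π, which is an assumption of this sketch.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(x, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p = x / n
    # Condition for the normal approximation, checked with p in place of pi.
    if n * p < 5 or n * (1 - p) < 5:
        raise ValueError("normal approximation conditions are not met")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin


low, high = proportion_ci(1600, 2000)
print(f"({low:.3f}, {high:.3f})")  # (0.782, 0.818)
print(low > 0.75)                  # True
```

The lower limit exceeds 0.75, so the whole interval sits above the three-fourths threshold required by the bylaws.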
Finite-Population Correction Factor
The previous discussion dealt with populations of very large (or infinite) size.
A population that has a fixed upper bound is said to be finite.
For a finite population, where the total number of objects is N and the size of the sample is n, the following adjustment is made to the standard errors of the sample mean and the sample proportion:
Standard Error of the Sample Mean
\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} \]
Standard Error of the Sample Proportion
\[ \sigma_{p} = \sqrt{\frac{p(1-p)}{n}} \sqrt{\frac{N-n}{N-1}} \]
However, if n/N < .05, the finite-population correction factor may be ignored.
Effects on FPC when n/N Changes
Observe that the FPC approaches 1 when n/N becomes smaller.
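The behavior of the correction factor can be tabulated with a few lines of Python. The population size N = 1000 and the sample sizes below are hypothetical values chosen for illustration.

```python
from math import sqrt


def fpc(N, n):
    """Finite-population correction factor sqrt((N - n) / (N - 1))."""
    return sqrt((N - n) / (N - 1))


N = 1000  # hypothetical population size
for n in (10, 25, 50, 100, 200, 500):
    print(f"n = {n:3d}   n/N = {n / N:.3f}   FPC = {fpc(N, n):.4f}")
```

The smallest samples give factors very close to 1, which is why the correction may be ignored when n/N is below .05.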
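Confidence Interval Formulas for Estimating Means and Proportions with Finite Population Correction
C.I. for the Mean (μ), σ known
\[ \bar{X} \pm z \, \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} \]
C.I. for the Mean (μ), σ unknown
\[ \bar{X} \pm t \, \frac{s}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} \]
C.I. for the Proportion (π)
\[ p \pm z \sqrt{\frac{p(1-p)}{n}} \sqrt{\frac{N-n}{N-1}} \]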
CI For Mean with FPC - Example
There are 250 families in a town. A random sample of 40 of these families revealed the mean annual church contribution was $450 and the standard deviation of this was $75. Develop a 90 percent confidence interval for the population mean of church contribution. Interpret the confidence interval.
Given in Problem:
N = 250
n = 40
s = $75
Since n/N = 40/250 = 0.16, the finite population correction factor must be used.
The population standard deviation is not known, therefore use the t-distribution.
Use the formula below to compute the confidence interval:
\[ \bar{X} \pm t \, \frac{s}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} \]
CI For Mean with FPC - Example
\[
\begin{aligned}
\bar{X} \pm t \, \frac{s}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}
&= \$450 \pm t_{.10,\,40-1} \, \frac{\$75}{\sqrt{40}} \sqrt{\frac{250-40}{250-1}} \\
&= \$450 \pm 1.685 \, \frac{\$75}{\sqrt{40}} \sqrt{\frac{250-40}{250-1}} \\
&= \$450 \pm \$19.98 \sqrt{.8434} \\
&= \$450 \pm \$18.35 \\
&= (\$431.65, \$468.35)
\end{aligned}
\]
It is likely that the population mean is more than $431.65 but less than $468.35. To put it another way, could the population mean be $445? Yes, but it is not likely that it is $425, because the value $445 is within the confidence interval and $425 is not within the confidence interval.
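The arithmetic of this example can be checked with a short Python sketch. The critical value 1.685 is the one used on the slide (df = 39, 90% confidence); the variable names are illustrative.

```python
from math import sqrt

N, n = 250, 40          # population and sample sizes
xbar, s = 450.0, 75.0   # sample mean and standard deviation, in dollars
t_crit = 1.685          # from the slide: df = 39, 90% confidence

fpc = sqrt((N - n) / (N - 1))          # finite-population correction
margin = t_crit * s / sqrt(n) * fpc
low, high = xbar - margin, xbar + margin

print(f"margin = {margin:.2f}")        # margin = 18.35
print(f"({low:.2f}, {high:.2f})")      # (431.65, 468.35)
print(low <= 445 <= high, low <= 425 <= high)  # True False
```

The last line confirms the interpretation: $445 lies inside the interval and $425 does not.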
Choosing an Appropriate Sample Size
How many items should be included in a sample?
There are 3 factors that determine the appropriate size of a sample, none of which has a direct relationship to the population size.
1. The level of confidence desired.
2. The maximum allowable error.
3. The variation in the population.
– Measured by the population standard deviation.
Sample Size Determination for a Variable
To find the sample size for a variable:
– Use the formula for the margin of error, E = z(σ/√n), and solve for n:
\[ n = \left( \frac{z s}{E} \right)^{2} \]
where:
E - the allowable error
z - the z-value corresponding to the selected level of confidence
s - the sample standard deviation (from a pilot sample)
Sample Size Determination for a Variable - Example
A researcher wants to estimate the mean amount of remuneration for members of city councils in large cities. The error in estimating the mean is to be less than $100 with a 95 percent level of confidence. The researcher found a report by the Department of Labor that estimated the standard deviation to be $1,000. What is the required sample size?
Given in the problem:
– E, the maximum allowable error, is $100.
– The value of z for a 95 percent level of confidence is 1.96.
– The estimate of the standard deviation is $1,000.
\[ n = \left( \frac{z s}{E} \right)^{2} = \left( \frac{1.96 \times \$1{,}000}{\$100} \right)^{2} = (19.6)^{2} = 384.16 \]
Rounding up, the required sample size is n = 385.
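A minimal Python sketch of this sample-size calculation. The function name is hypothetical; the result is always rounded up, because rounding down would leave the error above the allowed maximum.

```python
from math import ceil


def sample_size_mean(z, s, E):
    """Smallest n such that z * s / sqrt(n) does not exceed E."""
    return ceil((z * s / E) ** 2)


print(sample_size_mean(z=1.96, s=1000, E=100))  # 385
```

Halving the allowable error to $50 with the same inputs roughly quadruples the required sample size, since n grows with the square of z·s/E.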
Sample Size for Proportions
The formula for determining the sample size in the case of a proportion is based on
– the desired confidence level, the margin of error (in the proportion), and an estimate of the proportion.
\[ n = p(1-p) \left( \frac{z}{E} \right)^{2} \]
where:
p - the estimate from a pilot study or some other source; otherwise, 0.50 is used
z - the z-value for the desired confidence level
E - the maximum allowable error
Example for proportion
A study needs to estimate the proportion of cities that have private refuse collectors. The investigator wants the margin of error to be within .10 of the population proportion, the desired level of confidence is 90 percent, and no estimate is available for the population proportion. What is the required sample size?
\[ n = (.5)(1-.5) \left( \frac{1.65}{.10} \right)^{2} = 68.0625 \]
n = 69 cities
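The same calculation as a Python sketch. The function name is hypothetical, and p defaults to 0.5 to mirror the rule that 0.50 is used when no estimate is available.

```python
from math import ceil


def sample_size_proportion(z, E, p=0.5):
    """Required sample size for estimating a proportion, rounded up."""
    return ceil(p * (1 - p) * (z / E) ** 2)


print(sample_size_proportion(z=1.65, E=0.10))  # 69
```

The slide rounds the 90 percent z-value to 1.65. With the more precise value 1.645, the same formula gives 68, so the result is sensitive to how z is rounded.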
End of Chapter 9