Chapter 08
Interval Estimation
Chapter 8 - Learning Objectives
• Explain the difference between a point and an
interval estimate.
• Construct and interpret confidence intervals:
   - with a z for the population mean or proportion.
   - with a t for the population mean.
• Determine appropriate sample size to achieve
specified levels of accuracy and confidence.
8.1 Introduction
Statistical inference is the process by which we
acquire information about populations from
samples.
There are two ways in which we can make
inferences about the population parameter:
1. By providing estimates of a parameter
2. By testing hypotheses about a parameter
8.2 Concept of Estimation
The main objective of estimation is to determine
the value of a population parameter on the
basis of a sample statistic.
There are two types of estimation:
1. Point Estimation
2. Interval Estimation
Point Estimator
A point estimator draws inferences about a
population parameter (say the mean or the
proportion) by computing a single statistic from
a sample.
That is, the sample statistic provides us with an
estimate of the value of the parameter at a
single point (value), hence the name point
estimate.
Interval Estimator
An interval estimator draws inferences about
a population parameter by providing a range
(interval) of values within which the unknown
population parameter is expected to lie.
[Figure: the population distribution and its parameter, the sample distribution and the interval estimator]
Example: You want to know the average weekly summer
income of UMD students.
Take a sample (say, 600) of UMD students and compute the
average weekly summer income of the students in your sample.
Point estimate: x̄ = $400, so µ is estimated to be $400.
Interval estimate: µ lies between $380 and $420.
An interval estimator is used more frequently than a point estimator for two
reasons: (1) a point estimator is more prone to faulty inference,
and (2) an interval estimator allows us to specify how confident we are in our
estimate.
Characteristics of Estimators
In estimation (whether point or interval), we
always want to select the right sample and sample
statistic that enable us to estimate a parameter
with as small an error as possible.
The selection of the right statistic depends on
some important characteristics.
Desirable Characteristics of Estimators
1. Unbiasedness:
An unbiased estimator is one whose expected value is equal
to the parameter it estimates.
2. Consistency:
We say an unbiased estimator is consistent if the difference
between the estimator and the parameter grows smaller as
the sample size increases.
3. Relative efficiency:
We say an estimator is relatively efficient if, among two
or more unbiased estimators, it has the smallest variance.
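The three properties can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: a normal population with mean 50 and standard deviation 10 (chosen arbitrarily, not taken from the slides), and a hypothetical helper `simulate`. It compares the sample mean and the sample median as estimators of the population mean.

```python
import random
import statistics

def simulate(estimator, n, trials=2000, mu=50.0, sigma=10.0, seed=1):
    """Return (average, variance) of an estimator over repeated samples of size n."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        estimates.append(estimator(sample))
    return statistics.fmean(estimates), statistics.pvariance(estimates)

# Unbiasedness: the average of the sample means is close to mu = 50.
# Consistency: the variance of the sample mean shrinks as n grows.
mean_avg_small, mean_var_small = simulate(statistics.fmean, n=10)
mean_avg_large, mean_var_large = simulate(statistics.fmean, n=100)

# Relative efficiency: for normal data the mean varies less than the median.
median_avg, median_var = simulate(statistics.median, n=100)

print("mean,   n=10 :", round(mean_avg_small, 2), round(mean_var_small, 2))
print("mean,   n=100:", round(mean_avg_large, 2), round(mean_var_large, 2))
print("median, n=100:", round(median_avg, 2), round(median_var, 2))
```

With these assumptions the sample mean shows a smaller variance than the sample median at the same sample size, which is what relative efficiency describes.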
8.3 Interval Estimation of the Population Mean
and Proportion
8.3.1 When the Population Variance is Known
8.3.2 When the Population Variance is Unknown
8.3.1 Estimating the Population Mean
when the Population Variance is Known
We are able to provide an interval estimate of a population mean or proportion
based on the following characteristics of a sampling distribution.
1. Given the sampling distribution, we can draw a sample of size n from the
   population and calculate the sample mean or proportion.
2. Given the central limit theorem, we consider the sampling distribution of
   the sample means or proportions to be normal (or approximately normal) and
   can thus make probability statements about the sample mean or proportion that
   we estimate.
3. Given the formula for standardizing a random variable, we can
   relate the standardized value obtained from a normal distribution to the
   sample mean/proportion we are estimating:
   $$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
Margin of Error and the Interval Estimate
The general form of an interval estimate of a
population mean is
$$\bar{x} \pm \text{Margin of Error}$$
8.3 Estimating the Population Mean when
the Population Variance is Known
The margin of error is computed using the following
formula:
$$\text{Margin of Error} = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
8.3 Estimating the Population Mean when
the Population Variance is Known
Thus, the range (interval) that contains the true value of the
unknown population parameter (say the mean) is
$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
where z is the standardized value of the random variable, α is the
significance level, and 1 − α is the confidence coefficient at which we
want to provide the interval estimate.
8.3.1 Estimating the Population Mean when the
Population Variance is Known
$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
where:
$\bar{x}$ is the sample mean
$z_{\alpha/2}$ is the standardized value of the random
variable that leaves an area of α/2 in one tail
of the standard normal probability distribution
σ is the population standard deviation
n is the sample size
1 − α is the confidence coefficient
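The Confidence Interval for µ (σ is known)
In its expanded form, the interval can be stated as
follows:
$$P\left(\bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$
The range inside the parentheses is the confidence interval.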
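The interval follows from the standardization formula $Z = (\bar{x} - \mu)/(\sigma/\sqrt{n})$ in a few steps. A short derivation, using the same symbols as the slides, starts from the central probability of the standard normal distribution and solves the inequality for µ:

```latex
\begin{align*}
P\left(-z_{\alpha/2} \le Z \le z_{\alpha/2}\right) &= 1 - \alpha \\
P\left(-z_{\alpha/2} \le \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) &= 1 - \alpha \\
P\left(\bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right) &= 1 - \alpha
\end{align*}
```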
Interpreting the Confidence Interval for µ
Based on the estimate, we can say with 100(1 − α) percent
confidence that the interval
$$\left[\bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right]$$
contains the true value of the unknown population parameter.
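One way to read the confidence statement is as a long-run frequency: if the sampling were repeated many times, about 100(1 − α) percent of the resulting intervals would contain µ. The sketch below simulates this under assumed values. The "true" mean 68, standard deviation 27 and sample size 81 are borrowed from the credit card example purely for illustration, and the function name `coverage` is hypothetical.

```python
import random
from statistics import NormalDist, fmean

def coverage(mu=68.0, sigma=27.0, n=81, conf=0.95, trials=2000, seed=42):
    """Fraction of simulated z intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # critical value z_{alpha/2}
    margin = z * sigma / n ** 0.5                  # same margin for every sample
    hits = 0
    for _ in range(trials):
        x_bar = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials

print(coverage())  # a value close to 0.95 is expected
```

Any single interval either contains µ or does not; the confidence level describes the procedure, not one particular interval.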
Interval Estimation
of a Population Proportion
$$p \pm z_{\alpha/2}\,\sqrt{\frac{p(1 - p)}{n}}$$
where:
1 − α is the confidence coefficient
$z_{\alpha/2}$ is the z value providing an area of
α/2 in the upper tail of the standard
normal probability distribution
p is the sample proportion
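As a minimal sketch of the proportion formula, the function below computes the interval for a sample proportion. The data (a sample proportion of 0.5 from 100 observations) are hypothetical and not from the slides, and `proportion_interval` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(p, n, conf=0.95):
    """Return (low, high) for p ± z_{alpha/2} * sqrt(p(1 - p) / n)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Hypothetical data: 50 of 100 sampled items have the attribute of interest.
low, high = proportion_interval(0.5, 100)
print(round(low, 3), round(high, 3))  # 0.402 0.598
```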
The Confidence Interval for µ (σ is known)
Commonly used confidence levels and their
corresponding z scores

Confidence level (1 − α)    α      z (for α/2)
90%                         10%    z_0.05  = 1.645
95%                         5%     z_0.025 = 1.960
99%                         1%     z_0.005 = 2.575
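The table values can be checked against the standard normal quantile function. The sketch below uses `statistics.NormalDist` from the Python standard library; `z_critical` is an illustrative helper name. Note that the exact 99% value is about 2.5758, so printed tables give it as 2.575 or 2.576 depending on how they round; the slides use 2.575.

```python
from statistics import NormalDist

def z_critical(conf):
    """Critical value z_{alpha/2} for a two-sided interval at confidence level conf."""
    alpha = 1 - conf
    return NormalDist().inv_cdf(1 - alpha / 2)

for conf in (0.90, 0.95, 0.99):
    print(f"{conf:.0%}  z = {z_critical(conf):.3f}")
```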
Interval Estimate of Population Mean:
σ Known: Example
Step-1: Identify the significance level (α) and the confidence coefficient (1 − α)
at which the margin of error is to be computed (α = 1%, 5% or 10%).
Step-2: Compute the corresponding margin of error for the selected
confidence coefficient.
Step-3: Establish the interval estimate of µ by adding the margin of error to,
and subtracting it from, the sample mean.
The Confidence Interval for µ (σ is known)
Hands-On-Practice Problems
Interval Estimate of Population Mean:
σ Known: Example
A random sample of 81 credit card sales in a department store shows that,
on average, the store sells about $68 per credit card it issued. From past
data, it is known that the standard deviation of sales on the store's credit
card is $27.
8.1) Provide the 90% confidence interval estimate of the store's sales on
credit.
8.2) Provide the 95% confidence interval estimate of the store's sales on
credit.
8.3) Provide the 99% confidence interval estimate of the store's sales on
credit.
Interval Estimate of Population Mean:
σ Known: Example
Solution: n = 81; x̄ = $68; σ = $27.
Step-1: Identify the significance level (α) and the confidence coefficient (1 − α) at which the margin of error is to be
computed:
Confidence coefficient: (1 − α) = 90%
Significance level: α = 10%
Step-2: Compute the corresponding margin of error for the selected confidence coefficient:
Critical value: $z_{\alpha/2} = z_{0.05} = 1.645$
Standard error: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{27}{\sqrt{81}} = \frac{27}{9} = 3$
Margin of error = 1.645 × 3 = 4.935
Step-3: Establish the interval estimate of µ by adding the margin of error to, and subtracting it from, the sample
mean:
68 − 4.935 = 63.065;
68 + 4.935 = 72.935;
[63.065, 72.935]
We are 90 percent confident that the average credit sales of the store lie
in the interval from $63 to $73.
Interval Estimate of Population Mean:
σ Known: Example
A random sample of 81 credit card sales in a department store showed
an average sale of $68. From past data, it is known that the standard
deviation of the credit card sales is $27.
8.1) The 90% confidence interval estimate of the sales on credit cards:
[63.065, 72.935]
8.2) Determine the 95% confidence interval estimate of the store's sales on
credit cards.
Interval Estimate of Population Mean:
σ Known: Example
Solution: n = 81; x̄ = $68; σ = $27.
Step-1: Identify the significance level (α) and the confidence coefficient (1 − α) at which the margin of error is to be
computed:
Confidence coefficient: (1 − α) = 95%
Significance level: α = 5%
Step-2: Compute the corresponding margin of error for the selected confidence coefficient:
Critical value: $z_{\alpha/2} = z_{0.025} = 1.96$
Standard error: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{27}{\sqrt{81}} = \frac{27}{9} = 3$
Margin of error = 1.96 × 3 = 5.88
Step-3: Establish the interval estimate of µ by adding the margin of error to, and subtracting it from, the sample
mean:
68 − 5.88 = 62.12;
68 + 5.88 = 73.88;
[62.12, 73.88]
We are 95 percent confident that the average sales on a credit card lie
in the interval from $62 to $74.
Interval Estimate of Population Mean:
σ Known: Example
A random sample of 81 credit card sales in a department store showed
an average sale of $68. From past data, it is known that the standard
deviation of the credit card sales is $27.
8.1) The 90% confidence interval estimate of the sales on credit cards:
[63.065, 72.935]
8.2) The 95% confidence interval estimate of the sales on credit cards:
[62.12, 73.88]
8.3) Determine the 99% confidence interval estimate of the store's sales on
credit cards.
Interval Estimate of Population Mean:
σ Known: Example
Solution: n = 81; x̄ = $68; σ = $27.
Step-1: Identify the significance level (α) and the confidence coefficient (1 − α) at which the margin of error is to be
computed:
Confidence coefficient: (1 − α) = 99%
Significance level: α = 1%
Step-2: Compute the corresponding margin of error for the selected confidence coefficient:
Critical value: $z_{\alpha/2} = z_{0.005} = 2.575$
Standard error: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{27}{\sqrt{81}} = \frac{27}{9} = 3$
Margin of error = 2.575 × 3 = 7.725
Step-3: Establish the interval estimate of µ by adding the margin of error to, and subtracting it from, the sample
mean:
68 − 7.725 = 60.275;
68 + 7.725 = 75.725;
[60.275, 75.725]
We are 99 percent confident that the average credit sales of the store lie
in the interval from $60 to $76.
Interval Estimate of Population Mean:
σ Known: Example
A random sample of 81 credit card sales in a department store showed
an average sale of $68. From past data, it is known that the standard
deviation of the credit card sales is $27.
8.1) The 90% confidence interval estimate of sales on credit cards:
[63.065, 72.935]
8.2) The 95% confidence interval estimate of sales on credit cards:
[62.12, 73.88]
8.3) The 99% confidence interval estimate of sales on credit cards:
[60.275, 75.725]
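The three answers can be reproduced with a few lines of code. This is a minimal sketch that uses the z values from the slides' table (so 2.575 for 99%); the names `Z_TABLE` and `z_interval` are illustrative, not part of the original material.

```python
from math import sqrt

# z values as given in the slides' table of common confidence levels
Z_TABLE = {0.90: 1.645, 0.95: 1.960, 0.99: 2.575}

def z_interval(x_bar, sigma, n, conf):
    """Return (low, high) for x_bar ± z_{alpha/2} * sigma / sqrt(n)."""
    margin = Z_TABLE[conf] * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# Credit card example: n = 81, sample mean $68, population sd $27
for conf in (0.90, 0.95, 0.99):
    low, high = z_interval(68, 27, 81, conf)
    print(f"{conf:.0%}: [{low:.3f}, {high:.3f}]  width = {high - low:.2f}")
```

Printing the width next to each interval makes the pattern visible: the interval gets wider as the confidence level rises.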
Implications…
As we increase the confidence coefficient (say
from 90% to 95% or to 99%), the interval that
contains the mean of the population widens.
There is a trade-off between the width of the
interval and the confidence with which we can
make the estimation.
Interval Estimation of µ When σ, the
Population Standard Deviation, Is Unknown
The Confidence Interval for µ (When σ,
the Population Standard Deviation, Is Unknown)
Recall that when the population variance is known we use the
following statistic to provide an interval estimate of a population
mean:
$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
The t - Statistic
However, information about the population variance may not be
available all the time.
When the population variance is unknown, provided that the
sampled population is normally distributed, we use the
variance estimated from the sample and a t statistic (Student
t distribution) to make inferences about the population mean.
That is, σ is replaced by the sample standard deviation s:
$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \quad\text{becomes}\quad t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
The t - Statistic
The t distribution is mound-shaped and symmetrical
around zero.
The variance of a t distribution depends on the
sample size. Generally it has a higher variance than the
standard normal distribution.
t Distribution
[Figure: the standard normal distribution compared with t distributions with 20 and 10 degrees of freedom; horizontal axis: z, t]
The variance (spread) of a t distribution, compared to that of the normal distribution, is largely determined
by the "degrees of freedom" (the sample size minus one, n − 1).
When the degrees of freedom are more than 100, the standard normal z value
provides a good approximation to the t value.
The t - Statistic
The interval estimate of the population mean is thus computed
as:
$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$
where $t_{\alpha/2}$ is taken at n − 1 degrees of freedom.
The Confidence Interval for µ (σ is unknown)
Example 8.2.1
In a random sample of 100 oil changes, it was found
that it takes an average of 22 minutes to change oil for
a given car with a standard deviation of 5 minutes.
Assuming that oil change time is normally distributed,
provide the 99% confidence interval estimate of the
average amount of time it takes to change oil on a
typical car.
[20.687, 23.313]
The Confidence Interval for µ (σ is unknown)
Example 8.2.2
Using the same information (n = 100;
mean = 22), but assuming a standard deviation
of 25 minutes, provide the 99% confidence
interval estimate of the population mean (the
average amount of time it takes to change oil
on a car).
[15.435, 28.565]
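The two oil change answers can be reproduced with a short sketch. The Python standard library has no t quantile function, so the critical value is passed in by hand: 2.626 is the table value of t at α/2 = 0.005 with 99 degrees of freedom, which is also the value implied by the answers above. If SciPy is installed, `scipy.stats.t.ppf(0.995, df=99)` should give the same number to three decimals. The name `t_interval` is illustrative.

```python
from math import sqrt

def t_interval(x_bar, s, n, t_crit):
    """Return (low, high) for x_bar ± t_crit * s / sqrt(n).

    t_crit must be the t value for n - 1 degrees of freedom.
    """
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin

T_005_DF99 = 2.626  # t_{0.005} at 99 degrees of freedom, read from a t table

low1, high1 = t_interval(22, 5, 100, T_005_DF99)    # Example 8.2.1 (s = 5)
low2, high2 = t_interval(22, 25, 100, T_005_DF99)   # Example 8.2.2 (s = 25)
print(f"[{low1:.3f}, {high1:.3f}]")
print(f"[{low2:.3f}, {high2:.3f}]")
```

Only the standard deviation differs between the two calls, so the wider second interval comes entirely from the larger spread.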
Implications for the Width of the
Confidence Interval
The width of the confidence interval is
affected by
1. The confidence level (1 − α):
The higher the confidence level, the wider the
interval estimate.
2. The population standard deviation (σ):
The higher the variance, the wider the interval
estimate.
The Confidence Interval for µ (σ is unknown)
Example 8.2.3
Using a standard deviation of 5 and a sample
mean of 22 minutes, but assuming a sample
size of 400 oil changes, provide the 99%
confidence interval estimate of the population
mean (that is, the average amount of time it
takes to change oil on a typical car).
[21.353, 22.647]
Implications for the Width of the
Confidence Interval
The width of the confidence interval is
affected by
1. The confidence level (1 − α):
The higher the confidence level, the wider the
interval estimate.
2. The population standard deviation (σ):
The higher the variance, the wider the interval
estimate.
3. The sample size (n):
The larger the sample size, the narrower the
interval estimate.
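The effect of the sample size can be seen directly from the margin of error formula: because n enters through its square root, quadrupling the sample size halves the margin. The sketch below uses hypothetical values (a critical value of 2.575 and a standard deviation of 10, chosen only for illustration), and `margin_of_error` is an illustrative name.

```python
from math import sqrt

def margin_of_error(crit, sd, n):
    """Margin of error: critical value times the standard error sd / sqrt(n)."""
    return crit * sd / sqrt(n)

# Hypothetical inputs: critical value 2.575, standard deviation 10
for n in (100, 400, 1600):
    print(n, margin_of_error(2.575, 10, n))
```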
The Width of the Confidence Interval
The width of the confidence interval is affected by the confidence level,
the variance of the population, and the sample size.
1. Although we want a high confidence level and a narrow interval estimate, there is a
   trade-off between the confidence level and the width of the interval we can establish.
2. Although lower variance gives a narrower interval estimate, the variance of
   the population or sample is often beyond our control.
3. Therefore, the only way we can establish a narrow (more informative) interval while
   maintaining a high confidence level is by adjusting (increasing) our sample size.
The Sample Size
[Figure: interval estimate at the 90% confidence level]
Determining the proper sample size is thus a critical component of
establishing a narrow interval estimate.
8.3 Selecting the Sample size
From the formula that we used to establish the
interval estimate of the population parameter, we
can derive a formula that allows us to determine
the appropriate sample size.
Two important requirements:
1. At what confidence level do we want to provide the
   interval estimate?
2. What interval width (w) do we need?
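8.3 Selecting the Sample size
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{w}\right)^{2} = \frac{(z_{\alpha/2})^{2}\,\sigma^{2}}{w^{2}}$$
where w is the interval width we want to maintain, measured as the distance
from the estimate to either end of the interval (x̄ ± w).
Hence, to compute the sample size, we first need
to determine the interval width.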
Selecting the Sample size
Example 10.2
In order to estimate the amount of lumber that can be harvested from
a tract of land with 99% confidence, it was indicated that the mean
diameter of trees in the tract must be estimated to within one inch.
Assuming that diameters are normally distributed with a standard
deviation of 6 inches, how many trees should be sampled to
provide the interval estimate for the mean diameter of the
trees in the tract at the specified confidence level?
Selecting the Sample size
Solution
The required accuracy is ±1 inch. That is, w = 1.
The confidence level of 99% leads to α = .01, thus $z_{\alpha/2} = z_{.005} = 2.575$.
The standard deviation was given as σ = 6.
Thus, we can compute the required sample size as follows:
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{w}\right)^{2} = \left(\frac{2.575(6)}{1}\right)^{2} = 238.7 \approx 239$$
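The sample size calculation is easy to script. The sketch below rounds up, since a fractional observation is not possible and rounding down would give a margin slightly larger than requested; `required_sample_size` is an illustrative name.

```python
from math import ceil

def required_sample_size(z_crit, sigma, w):
    """Smallest n for which z_crit * sigma / sqrt(n) is at most w."""
    return ceil((z_crit * sigma / w) ** 2)

# Lumber example: 99% confidence (z = 2.575), sigma = 6 inches, w = 1 inch
print(required_sample_size(2.575, 6, 1))  # 239
```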
Computing Interval Estimates: Summary
1. Determine the sample size and the values of the
   variables of interest (width, spread of the
   population or sample).
2. Select the confidence level for the interval
   estimate.
3. Compute the sample mean (the population
   variance may be known or unknown).
4. Determine the critical value (z from the
   standard normal table, or t from the t table).
5. Compute the confidence interval.
Summary of Interval Estimation Procedures
for a Population Mean
Is the population standard deviation σ known?
Yes (σ known case): use
$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
No (σ unknown case): use the sample standard deviation s to estimate σ, and use
$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$