Download Class 5 Handout

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Central limit theorem wikipedia , lookup

Transcript
Class 5
Estimating 
Confidence Intervals
Estimation of 
• Imagine that we do not know what  is, so we
would like to estimate it.
• In order to get a point estimate of , we would
take a sample and compute x . Is there a better
way? Should we use our sample to compute
something else that would yield better guesses of
?
• If you take a sample and (1) multiple the sample
values by any amount that you like, and (2) add
the results together, you can not do better than x .
Estimation (cont.)
• For example, take a sample of size 4 from a
normal population and compute
W

X1  X 2  X 2  X 3
,
4
• where Xi is the ith sample value. Then W
has a normal distribution, E(W) = , but
Var(W )  Var( X ).
• What does this mean about W?
Estimation (cont.)
• An estimator, Y, of  is said to be unbiased
if E(Y) = . Thus, W and X are unbiased.
• In fact, it can be shown that X is the best
linear unbiased estimator (BLUE) of  in
the sense that, among all linear unbiased
estimators, it has the smallest variance.
Building Interval Estimates:
The Confidence Interval
• We do not know , but if we did (and if we
had a large enough sample), we would
know exactly how X was distributed.
• This tells us where X will probably fall.
But we have a different problem: we see X.
Where does  probably fall?
• For a given probability, this is called a
confidence interval.
The Confidence Interval
• Let z be the point on the standard normal
distribution that cuts off % in the upper
What information comes
tail.
from the sample?
• A 100(1-)% confidence interval for 
• when the normality of X is justifiable, and
•  is known:
x  z / 2 X  ( x  z / 2 X , x  z / 2 X )
 ( x  z / 2 / n , x  z / 2 / n )
Probability Statements
About the Sampling Error
There is a 1 -  probability that the value of a sample mean will
provide a sampling error of z / 2 x or less.
Sampling distribution
of X
/2
1x -  of all
values

/2
Example
• Incomes in a community are known to be
normally distributed with  = $2000. In order to
compute a 90% confidence interval for , you take
a sample of 400 incomes and determine that x =
$24,000.
• What is z/2?
Then
Example (cont.)
• What is true about this interval?
• How would it change if we contructed a
95% confidence interval?
Example (cont.)
• Now assume that you wish to construct a
95% confidence interval using a sample of
1600.
• If x = $24,000, then our interval is (23,902, 24,098).
• We would like our interval to be small.
What are the two things we can change?
Confidence Intervals for  - Unknown
• If you did not know , what would you use
to estimate ?
• To construct a 100(1-)% Confidence
Interval for  when  is unknown, compute:
x  t / 2,n 1 s
n
Where t/2,n-1 is the value on the t distribution with
n-1 degrees of freedom that cuts off /2 of the
distribution in the upper tail.
t distributions
• The t distribution is a family of similar
probability distributions.
• A specific t distribution depends on a
parameter known as the degrees of freedom.
• As the number of degrees of freedom
increases, the difference between the t
distribution and the standard normal
probability distribution becomes smaller
and smaller.
• A t distribution with more degrees of
freedom has less dispersion.
t distributions
t Value: Assume a sample of size 10.
At 95% confidence, 1 -  = .95,  = .05, and /2 = .025.
t.025,9 is based on n - 1 = 10 - 1 = 9 degrees of freedom.
In the t distribution table we see that t.025,9 = 2.262.
Degrees
of Freedom
.
7
8
9
10
.
.10
.
1.415
1.397
1.383
1.372
.
Area in Upper Tail
.05
.025
.01
.
1.895
.
2.365
.
2.998
1.860
1.833
2.306
2.262
2.896
2.821
1.812
.
2.228
.
2.764
.
.005
.
3.499
3.355
3.250
3.169
.
t distributions (Once again)
• A t distribution is mound shaped and
symmetric about 0.
• There is a different t distribution for every
different degree of freedom.
• The t distribution approaches the z
(standard normal) distribution as n.
• t.1,20 =
t.05,10 =
• t.025, =
t.025,30 =
Example
• Incomes in a community are normally
distributed. A sample of size 4 is taken, and
the following incomes are found:
21,000
24,500
22,500
22,000
• Estimate the mean income of the
community with a 95% confidence interval.
x  t / 2,n 1 s
Let’s begin by doing this in EXCEL.
n
Our example in EXCEL
• Input the data into an EXCEL column.
• Select tools/data analysis/descriptive
statistics.
• Input data range and output range.
• Select summary statistics and confidence level for
mean.
• How was the standard error computed?
• What is the ratio of the confidence level and
the standard error?
Summary
Large n
Small n
x  z / 2 
x  z / 2 
 known
n
n
( population normal )
n
x  t / 2,n1 s
x  t / 2,n 1 s
 unknown

x  z / 2 s
n
n
( population normal )
Note: The expression
s
n
is frequently referred to as the standard error.