Confidence Intervals – Introduction
• A point estimate provides no information about the precision and
reliability of estimation.
• For example, the sample mean $\bar{X}$ is a point estimate of the
population mean μ, but because of sampling variability, it is virtually
never the case that $\bar{x} = \mu$.
• A point estimate says nothing about how close it might be to μ.
• An alternative to reporting a single sensible value for the parameter
being estimated is to calculate and report an entire interval of
plausible values – a confidence interval (CI).
week 5
1
Confidence level
• A confidence level is a measure of the degree of reliability of a
confidence interval. It is denoted as 100(1-α)%.
• The most frequently used confidence levels are 90%, 95% and 99%.
• A confidence level of 100(1-α)% means that, over repeated sampling,
100(1-α)% of all samples would give an interval that includes the true
value of the parameter being estimated.
• The higher the confidence level, the more strongly we believe that
the true value of the parameter being estimated lies within the
interval.
Large Sample CI for μ
• Recall: a point estimate of the population mean μ is the sample
mean. If the sample size is large, then the CLT applies and we have
$$\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \xrightarrow{d} Z \sim N(0,1).$$
• A 100(1-α)% confidence interval for μ, from a large iid sample is

x  z 
n
2
• Once computed from the observed sample, this interval is not random;
it either does or does not contain μ.
• If we construct CIs from many repeated samples, then about 100(1-α)%
of them will contain μ and about 100∙α% will not.
• If $\sigma^2$ is not known, we estimate it with $s^2$.
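The repeated-sampling interpretation can be checked with a small simulation. The following is only a sketch: the population (normal with mean 10 and standard deviation 2), the sample size, the number of repetitions and the seed are all illustrative assumptions, not values from the slides.

```python
import random
from statistics import NormalDist, fmean

def coverage(mu=10.0, sigma=2.0, n=50, reps=2000, alpha=0.05, seed=0):
    """Fraction of simulated known-sigma CIs that contain the true mean mu."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    z = NormalDist().inv_cdf(1 - alpha / 2)        # z_{alpha/2}
    half_width = z * sigma / n ** 0.5              # z_{alpha/2} * sigma / sqrt(n)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / reps

print(coverage())  # should land close to 1 - alpha = 0.95
```

Each individual interval either contains μ or it does not; only the long-run fraction is tied to the confidence level.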
Example
• The National Student Loan Survey collected data about the amount
of money that borrowers owe. The survey selected a random sample
of 1280 borrowers who began repayment of their loans four to six
months prior to the study. The mean debt for the selected
borrowers was $18,900 and the standard deviation was $49,000.
Find a 95% CI for the mean debt for all borrowers.
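One way to work this example numerically is sketched below. It applies the large-sample interval with the sample standard deviation s in place of σ; the function name `large_sample_ci` is a hypothetical helper introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist

def large_sample_ci(xbar, s, n, conf=0.95):
    """Large-sample CI for a mean: xbar +/- z_{alpha/2} * s / sqrt(n)."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for conf = 0.95
    margin = z * s / sqrt(n)
    return xbar - margin, xbar + margin

# Values taken from the example: mean $18,900, sd $49,000, n = 1280.
lower, upper = large_sample_ci(18900, 49000, 1280)
print(f"95% CI for mean debt: (${lower:,.0f}, ${upper:,.0f})")
```

The margin of error comes out at roughly $2,700, so the interval runs from about $16,200 to about $21,600.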
Width and Precision of CI
• The precision of an interval is conveyed by the width of the interval.
• If the confidence level is high and the resulting interval is quite
narrow, then our knowledge of the value of the parameter is
reasonably precise.
• A very wide CI implies that there is a great deal of uncertainty
concerning the value of the parameter we are estimating.
• The width of the CI for μ is ….
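To see how the width responds to the sample size and the confidence level, the sketch below takes the width as the upper endpoint minus the lower endpoint of the large-sample interval for μ. The value σ = 10 and the sample sizes are assumptions chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(sigma, n, conf):
    """Upper endpoint minus lower endpoint of xbar +/- z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = z * sigma / sqrt(n)
    return 2 * margin

# Hypothetical sigma = 10; compare confidence levels and sample sizes.
for conf in (0.90, 0.95, 0.99):
    for n in (100, 400):
        print(f"conf={conf:.2f}, n={n}: width={ci_width(10.0, n, conf):.3f}")
```

In this sketch, quadrupling n halves the width, while raising the confidence level widens the interval.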
Important Comment
• Confidence intervals do not need to be central; any a and b that
solve
$$P\left(a < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < b\right) = 1 - \alpha$$
define a 100(1-α)% CI for the population mean μ.
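As an illustration, the sketch below splits α = 0.05 unevenly between the two tails of the standard normal distribution and checks that each pair (a, b) still captures probability 0.95. The particular splits are arbitrary choices, and `bounds` is a hypothetical helper. Rearranging the inequality, each pair gives the interval from $\bar{x} - b\,\sigma/\sqrt{n}$ to $\bar{x} - a\,\sigma/\sqrt{n}$ for μ.

```python
from statistics import NormalDist

Z = NormalDist()      # standard normal
alpha = 0.05

def bounds(lower_tail):
    """Return (a, b) with P(Z < a) = lower_tail and P(Z > b) = alpha - lower_tail."""
    a = Z.inv_cdf(lower_tail)
    b = Z.inv_cdf(1 - (alpha - lower_tail))
    return a, b

for lower_tail in (0.025, 0.01, 0.04):
    a, b = bounds(lower_tail)
    prob = Z.cdf(b) - Z.cdf(a)
    print(f"a={a:.3f}, b={b:.3f}, P(a<Z<b)={prob:.3f}, b-a={b - a:.3f}")
```

All three choices have the same coverage, but the central choice (equal tails) has the smallest b − a among them, which is one reason it is the usual default.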
One Sided CI
• A CI gives both lower and upper bounds for the parameter being
estimated.
• In some circumstances, an investigator will want only one of these
two types of bound.
• A large sample upper confidence bound for μ is
$$\mu < \bar{x} + z_{\alpha}\,\frac{\sigma}{\sqrt{n}}$$
• A large sample lower confidence bound for μ is
$$\mu > \bar{x} - z_{\alpha}\,\frac{\sigma}{\sqrt{n}}$$
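A short sketch of the one-sided bounds follows. Note that a one-sided bound uses $z_{\alpha}$ rather than $z_{\alpha/2}$, since all of α sits in one tail. The numbers (x̄ = 50, σ = 8, n = 64) are made up for illustration, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def upper_bound(xbar, sigma, n, conf=0.95):
    """Large-sample upper confidence bound: xbar + z_alpha * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(conf)          # z_alpha with alpha = 1 - conf
    return xbar + z * sigma / sqrt(n)

def lower_bound(xbar, sigma, n, conf=0.95):
    """Large-sample lower confidence bound: xbar - z_alpha * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(conf)
    return xbar - z * sigma / sqrt(n)

# Hypothetical data: xbar = 50, sigma = 8, n = 64, so sigma / sqrt(n) = 1.
print(f"95% upper bound: {upper_bound(50, 8, 64):.3f}")
print(f"95% lower bound: {lower_bound(50, 8, 64):.3f}")
```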
Choice of Sample Size
• Sample size can be determined if we know
(i) the width (W=2B) of the desired CI
(ii) an estimate of σ and
(iii) the confidence level
• The sample size for a 100(1-α)% CI for μ with a desired width 2B is
$$n = \left(\frac{z_{\alpha/2}\,\hat{\sigma}}{B}\right)^2$$
Example
• You want to rent an unfurnished one-bedroom apartment for next
semester. How large a sample of one-bedroom apartments would be
needed to estimate the mean µ within ±$20 with 99% confidence?
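The example as stated gives no estimate of σ, so the sketch below assumes a hypothetical value of $100 purely to show the mechanics of the sample size formula for a mean; a real answer needs an estimate of σ from prior data or a pilot sample. The result is rounded up, since n must be a whole number at least as large as the formula's value.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma_hat, B, conf):
    """Smallest n with z_{alpha/2} * sigma_hat / sqrt(n) <= B."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil((z * sigma_hat / B) ** 2)

# ASSUMPTION: sigma_hat = 100 dollars is hypothetical, not given in the example.
n = sample_size_for_mean(sigma_hat=100.0, B=20.0, conf=0.99)
print(n)
```

Under that assumed σ, about 166 apartments would be needed; halving B would roughly quadruple the requirement.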
Confidence interval for Population Proportion
• A large sample confidence interval for population proportion, p, is
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\,\hat{q}}{n}}$$
• The sample size for a 100(1-α)% CI for p with a desired width 2B is
$$n = \left(\frac{z_{\alpha/2}}{B}\right)^2 p^*\,(1 - p^*)$$
where p* is a guessed value for the proportion of successes in a
future sample.
• We can use the sample proportion from a given sample as the value of
p*, or any other value in which the investigator strongly believes.
• The most conservative approach is to choose p* = 0.5. Why?
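One way to explore this question is to tabulate the required sample size over several guessed values p*. The sketch below assumes B = 0.05 and a 95% level as illustrative settings; `sample_size_for_proportion` is a hypothetical helper name.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(p_star, B, conf):
    """Smallest n with z_{alpha/2} * sqrt(p*(1 - p*) / n) <= B."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil((z / B) ** 2 * p_star * (1 - p_star))

# Illustrative settings: B = 0.05, 95% confidence.
sizes = {p: sample_size_for_proportion(p, B=0.05, conf=0.95)
         for p in (0.1, 0.3, 0.5, 0.7, 0.9)}
for p, n in sizes.items():
    print(f"p* = {p:.1f}: n = {n}")
```

The product p*(1 − p*) is largest at p* = 0.5, so that guess yields the largest n and therefore guarantees the desired width whatever the true proportion turns out to be.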
Example
• In a sample of 400 computer memory chips made at Digital Devices,
Inc., 40 were found to be defective. Give a 95% confidence interval
for the proportion of defective chips in the population from which
the sample was taken.
• What sample size is necessary if the 90% CI for the proportion of
defective chips, p, is to have width of at most 0.1?
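Both parts of this example can be sketched numerically as follows. For the second part, a width of at most 0.1 means B = 0.05; the code shows two choices of p*: the sample proportion from the first part and the conservative value 0.5. The variable names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Part 1: 95% CI for the proportion of defective chips.
n, defective = 400, 40
p_hat = defective / n                                  # 0.10
z95 = NormalDist().inv_cdf(0.975)                      # z_{0.025}
margin = z95 * sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - margin, p_hat + margin)
print(f"95% CI: ({ci[0]:.4f}, {ci[1]:.4f})")

# Part 2: sample size for a 90% CI of width at most 0.1, i.e. B = 0.05.
B = 0.1 / 2
z90 = NormalDist().inv_cdf(0.95)                       # z_{0.05}
n_pilot = ceil((z90 / B) ** 2 * p_hat * (1 - p_hat))   # p* = sample proportion
n_conservative = ceil((z90 / B) ** 2 * 0.5 * 0.5)      # p* = 0.5
print(n_pilot, n_conservative)
```

The interval is roughly 0.07 to 0.13. Using p* = 0.1 gives a requirement of 98 chips, while the conservative choice p* = 0.5 gives 271.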