STA 291 Summer 2010

Lecture 14
Dustin Lueker
$\bar{x} \pm Z_{\alpha/2} \frac{s}{\sqrt{n}}$
This interval will contain μ with 100(1-α)%
confidence
◦ If we are estimating µ, then why is it unreasonable
for us to know σ?
▪ Thus we replace σ by s (sample standard deviation)
▪ This formula is used for a large sample size (n≥30)
▪ If we have a sample size less than 30, a different
distribution is used, the t-distribution; we will get to this
later
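A minimal Python sketch of the large-sample interval above. The function name `z_interval` and the data are hypothetical, made up only for illustration; the critical value comes from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def z_interval(data, confidence=0.95):
    """Large-sample CI for the mean: x_bar +/- Z_{alpha/2} * s / sqrt(n)."""
    n = len(data)
    if n < 30:
        raise ValueError("this formula is intended for n >= 30")
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}
    me = z * s / sqrt(n)  # margin of error
    return x_bar - me, x_bar + me

# Hypothetical data: 36 made-up measurements, for illustration only.
sample = [4.1, 5.0, 3.8, 4.6, 5.3, 4.9] * 6
low, high = z_interval(sample, 0.95)
print(f"95% CI for the mean: ({low:.3f}, {high:.3f})")
```

The interval is centered at the sample mean, and asking for a higher confidence level with the same data gives a longer interval.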
STA 291 Summer 2010 Lecture 14

Incorrect statement
◦ With 95% probability, the population mean will fall
in the interval from 3.5 to 5.2

To avoid the misleading word “probability” we
say
◦ We are 95% confident that the true population mean
will fall between 3.5 and 5.2

Changing our confidence level will change
our confidence interval
◦ Increasing our confidence level will increase the
length of the confidence interval
▪ A confidence level of 100% would require a confidence
interval of infinite length
▪ Not informative

There is a tradeoff between length and
accuracy
◦ Ideally we would like a short interval with high
accuracy (high confidence level)

Start with the confidence interval formula
assuming that the population standard
deviation is known
$\bar{x} \pm Z_{\alpha/2} \frac{s}{\sqrt{n}} = \bar{x} \pm ME$
Mathematically we need to solve the above
equation for n
$n = s^2 \left( \frac{Z_{\alpha/2}}{ME} \right)^2$
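The sample-size formula can be sketched in Python as follows. The helper name `sample_size_for_mean` and the inputs (s = 15, ME = 2) are hypothetical values chosen for illustration; the result is rounded up because n must be a whole number at least as large as the formula's value.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(s, me, confidence=0.95):
    """Smallest n with Z_{alpha/2} * s / sqrt(n) <= ME, i.e. n >= s^2 (Z/ME)^2."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}
    return ceil(s ** 2 * (z / me) ** 2)

# Hypothetical inputs: standard deviation 15, desired margin of error 2.
n_needed = sample_size_for_mean(s=15, me=2, confidence=0.95)
print(n_needed)
```

Halving the margin of error roughly quadruples the required sample size, since ME appears squared in the denominator.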

The width of a confidence interval
◦ ____ as the confidence level increases
◦ ____ as the error probability increases
◦ ____ as the standard error increases
◦ ____ as the sample size n increases
▪ Why?
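One way to explore these four questions numerically is to compute the width $2 Z_{\alpha/2} s / \sqrt{n}$ while changing one quantity at a time. This is only a sketch: the helper `ci_width` and the baseline values (s = 10, n = 100, 95% confidence) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(s, n, confidence):
    """Full width of the interval x_bar +/- Z_{alpha/2} * s / sqrt(n)."""
    alpha = 1 - confidence  # error probability
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 * z * s / sqrt(n)

# Hypothetical baseline: s = 10, n = 100, 95% confidence.
base = ci_width(10, 100, 0.95)
higher_confidence = ci_width(10, 100, 0.99)  # confidence level up
higher_error_prob = ci_width(10, 100, 0.90)  # error probability up
larger_std_error = ci_width(20, 100, 0.95)   # s doubles, so s / sqrt(n) doubles
larger_sample = ci_width(10, 400, 0.95)      # n is four times larger

for label, width in [
    ("baseline", base),
    ("higher confidence level", higher_confidence),
    ("higher error probability", higher_error_prob),
    ("larger standard error", larger_std_error),
    ("larger sample size", larger_sample),
]:
    print(f"{label:26s} width = {width:.3f}")
```

Comparing each printed width with the baseline shows the direction of the change in each case.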

To account for the extra variability of using a
sample size of less than 30, the Student's t-distribution is used instead of the normal
distribution
$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$




t-distributions are bell-shaped and symmetric around zero
The smaller the degrees of freedom, the more spread out the distribution is
t-distributions look much like normal distributions
◦ In fact, the limit of the t-distribution is a normal distribution as n gets larger

Need to know α and degrees of freedom (df)
◦ df = n-1

α=.05, n=23
◦ tα/2=

α=.01, n=17
◦ tα/2=

α=.1, n=20
◦ tα/2=
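In class these values are read from a t-table. As a cross-check, the sketch below computes them using only the Python standard library, by numerically integrating the t density (Simpson's rule) and searching for the critical value by bisection. The function names and the search bracket [0, 100] are assumptions made for this illustration, not part of the lecture.

```python
from math import exp, lgamma, pi, sqrt

def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    const = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return const * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3  # 0.5 is the area to the left of zero

def t_critical(alpha, n):
    """t_{alpha/2} with df = n - 1, found by bisection on the CDF."""
    df = n - 1
    target = 1 - alpha / 2
    lo, hi = 0.0, 100.0  # assumed bracket, wide enough for these examples
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for alpha, n in [(0.05, 23), (0.01, 17), (0.10, 20)]:
    print(f"alpha={alpha}, n={n}, df={n - 1}: t = {t_critical(alpha, n):.3f}")
```

The results should agree with a printed t-table to three decimals (about 2.074, 2.921, and 1.729 for the three cases above).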

A sample of 12 individuals yields a mean of
5.4 and a variance of 16. Estimate the
population mean with 98% confidence.
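One possible worked solution, taking the t-table value $t_{0.01} \approx 2.718$ for df = 11 (98% confidence means α = 0.02, so α/2 = 0.01):

```latex
\begin{align*}
s &= \sqrt{16} = 4, \qquad df = n - 1 = 11, \qquad t_{\alpha/2} = t_{0.01} \approx 2.718 \\
\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}
  &= 5.4 \pm 2.718 \cdot \frac{4}{\sqrt{12}} \approx 5.4 \pm 3.14 \\
  &\Rightarrow (2.26,\ 8.54)
\end{align*}
```

Note that the problem gives the variance, so the square root must be taken before the standard deviation is used in the formula.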

The sample proportion is an unbiased and
efficient estimator of the population
proportion
◦ The proportion is a special case of the mean
$\hat{p} \pm Z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$

ABC/Washington Post poll (December 2006)
◦ Sample size of 1005
◦ Question
▪ Do you approve or disapprove of the way George W.
Bush is handling his job as president?
▪ 362 people approved
▪ Construct a 95% confidence interval for p
▪ What is the margin of error?
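A short Python sketch of this calculation, using the counts from the poll and the interval $\hat{p} \pm Z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})/n}$. The variable names are chosen only for this illustration.

```python
from math import sqrt
from statistics import NormalDist

x, n = 362, 1005                  # approvals and sample size from the poll
p_hat = x / n                     # sample proportion
z = NormalDist().inv_cdf(0.975)   # Z_{alpha/2} for 95% confidence

margin = z * sqrt(p_hat * (1 - p_hat) / n)  # margin of error
ci = (p_hat - margin, p_hat + margin)

print(f"p_hat = {p_hat:.4f}")
print(f"margin of error = {margin:.4f}")
print(f"95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```

The margin of error comes out to roughly three percentage points, giving an interval from about 0.33 to 0.39.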

As with a confidence interval for the population
mean, a desired sample size for a given
margin of error (ME) and confidence level can
be computed for a confidence interval for
the population proportion
$n = \hat{p}(1 - \hat{p}) \left( \frac{Z_{\alpha/2}}{ME} \right)^2$
◦ This formula requires guessing p̂ before taking the
sample, or taking the safe but conservative
approach of letting p̂ = .5
▪ Why is this the worst case scenario? (conservative
approach)
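One way to see why p̂ = .5 is the worst case is to complete the square in the factor p̂(1 − p̂), which is the only part of the sample-size formula that depends on p̂:

```latex
\begin{align*}
\hat{p}(1 - \hat{p}) &= \frac{1}{4} - \left( \hat{p} - \frac{1}{2} \right)^2 \le \frac{1}{4},
  \quad \text{with equality only when } \hat{p} = \frac{1}{2} \\
\Rightarrow \quad n &= \hat{p}(1 - \hat{p}) \left( \frac{Z_{\alpha/2}}{ME} \right)^2
  \le \frac{1}{4} \left( \frac{Z_{\alpha/2}}{ME} \right)^2
\end{align*}
```

So the sample size computed with p̂ = .5 is at least as large as the one needed for any other value of p̂, which is why it is safe but conservative.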

If we wanted ME = 2% at 95% confidence, using the sample
proportion from the ABC/Washington Post poll
(recall that the sample proportion was .36)
$n = 0.36 \times (1 - 0.36) \times \left( \frac{1.96}{0.02} \right)^2$
◦ n ≈ 2212.8, so we need a sample of 2213
▪ What do we get if we use the conservative approach?
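Both sample sizes can be checked with a few lines of Python. The helper name `sample_size_for_proportion` is hypothetical, and the sketch uses the unrounded critical value from `statistics.NormalDist` (about 1.95996) instead of 1.96, which does not change the rounded-up answers here.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(p_guess, me, confidence=0.95):
    """n = p(1 - p) * (Z_{alpha/2} / ME)^2, rounded up to a whole number."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil(p_guess * (1 - p_guess) * (z / me) ** 2)

n_poll = sample_size_for_proportion(0.36, 0.02)         # guess taken from the poll
n_conservative = sample_size_for_proportion(0.5, 0.02)  # conservative p = .5

print(n_poll, n_conservative)
```

The conservative choice asks for a somewhat larger sample than the guess of .36, as expected from the worst-case argument.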

To calculate the confidence interval, we use
the Central Limit Theorem (np and nq ≥ 5)
◦ What if this isn’t satisfied?

Instead of the typical p̂ estimator, we will
use
$\tilde{p} = \frac{x + 2}{n + 4}$

Then the formula for confidence interval
becomes
$\tilde{p} \pm Z_{\alpha/2} \sqrt{\frac{\tilde{p}(1 - \tilde{p})}{n + 4}}$

Suppose a student in an advertising class is
studying the impact of ads placed during the
Super Bowl, and wants to know what
proportion of students on campus watched it.
She takes a random sample of 25 students
and finds that all 25 watched the Super Bowl.
◦ Find a 95% confidence interval for p using the first
method learned
◦ Find a 95% confidence interval using the new
method for when the np, nq condition fails
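A sketch comparing the two methods on this sample, with variable names chosen only for illustration. The adjusted estimator (x + 2)/(n + 4) is often called the "plus four" adjustment.

```python
from math import sqrt
from statistics import NormalDist

x, n = 25, 25                     # all 25 sampled students watched
z = NormalDist().inv_cdf(0.975)   # Z_{alpha/2} for 95% confidence

# First method: p_hat +/- Z * sqrt(p_hat (1 - p_hat) / n)
p_hat = x / n
me_first = z * sqrt(p_hat * (1 - p_hat) / n)
first = (p_hat - me_first, p_hat + me_first)

# New method: p_tilde = (x + 2) / (n + 4), with n + 4 in the standard error
p_tilde = (x + 2) / (n + 4)
me_new = z * sqrt(p_tilde * (1 - p_tilde) / (n + 4))
new = (p_tilde - me_new, p_tilde + me_new)

print(f"first method: ({first[0]:.4f}, {first[1]:.4f})")
print(f"new method:   ({new[0]:.4f}, {new[1]:.4f})")
```

With p̂ = 1 the first method collapses to a single point, because the estimated standard error is zero. The adjusted interval has positive width; its upper limit exceeds 1, so in practice it would be reported as ending at 1.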