Topic 12
Sampling Distributions
Sample Proportions
• $\hat{p}$ is determined by: $\hat{p}$ = successes / size of sample = X/n
• If you take an SRS of size n from a population with proportion p, then the
mean of the sampling distribution of $\hat{p}$ is exactly p.
o This means that $\hat{p}$ is an unbiased estimator of p.
• The standard deviation of the sampling distribution is $\sqrt{p(1-p)/n}$
o Only use this if the population is at least ten times the sample size.
• To determine if the sampling distribution of $\hat{p}$ is approximately normal, check:
o np > 10
and
o n(1-p) > 10
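The checks above can be sketched in a few lines of Python. This is only an illustration: the function name `sample_proportion_distribution` and the example numbers (p = 0.4, n = 100, population of 5000) are made up for this sketch and are not from the notes.

```python
import math

def sample_proportion_distribution(p, n, population_size=None):
    """Summarize the sampling distribution of p-hat for an SRS of size n."""
    mean = p                              # p-hat is unbiased, so its mean is p
    std_dev = math.sqrt(p * (1 - p) / n)  # sqrt(p(1-p)/n)
    # 10% condition: population at least ten times the sample size
    ten_percent_ok = population_size is None or population_size >= 10 * n
    # Normality condition from the notes: np > 10 and n(1-p) > 10
    approx_normal = n * p > 10 and n * (1 - p) > 10
    return {
        "mean": mean,
        "std_dev": std_dev,
        "ten_percent_ok": ten_percent_ok,
        "approx_normal": approx_normal,
    }

# Hypothetical values, chosen only for illustration
summary = sample_proportion_distribution(p=0.4, n=100, population_size=5000)
print(summary)
```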
Sample Means
• If $\bar{x}$ is the mean of an SRS of size n from a population with mean µ
and standard deviation σ, then:
o The mean of $\bar{x}$ is µ
o The standard deviation of $\bar{x}$ is $\sigma/\sqrt{n}$
o Sample mean is an unbiased estimator of the population mean
• Larger samples = less spread
o Standard deviation decreases at a rate of $\sqrt{n}$, so you must
take a sample 4 times as large to cut the standard deviation in
half
• Only use $\sigma/\sqrt{n}$ for the standard deviation of $\bar{x}$ if the
population is at least 10 times the size of the sample
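A quick way to see the "4 times as large" rule is to compute the standard deviation of the sample mean for two sample sizes. The helper name and the population standard deviation of 12 below are assumptions made up for this sketch.

```python
import math

def sample_mean_std_dev(sigma, n):
    """Standard deviation of the sample mean for an SRS of size n."""
    return sigma / math.sqrt(n)

sigma = 12.0                               # hypothetical population standard deviation
sd_25 = sample_mean_std_dev(sigma, 25)     # sample of 25
sd_100 = sample_mean_std_dev(sigma, 100)   # sample 4 times as large
print(sd_25, sd_100)                       # the second value is half the first
```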
Central Limit Theorem
• The sampling distribution of the mean of an SRS that is
large enough (~ >30) can be considered approximately
normally distributed
• If the population distribution is very skewed, it
takes a very large SRS to use the central limit
theorem
• The CLT allows us to use normal probability
calculations even when the population
distribution is not considered to be normal
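The CLT can be checked with a small simulation. The sketch below is an assumption-laden illustration, not part of the notes: it uses an exponential population (right-skewed, mean 1, standard deviation 1), a sample size of 40, 2000 repetitions, and a fixed random seed.

```python
import random
import statistics

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 40                   # sample size, above the ~30 rule of thumb
reps = 2000              # number of simulated samples

# Draw many samples from a skewed population and record each sample mean
sample_means = []
for _ in range(reps):
    sample = [rng.expovariate(1.0) for _ in range(n)]
    sample_means.append(statistics.fmean(sample))

center = statistics.fmean(sample_means)   # should be near the population mean, 1
spread = statistics.stdev(sample_means)   # should be near sigma / sqrt(n)
print(f"mean of sample means: {center:.3f}")
print(f"sd of sample means:   {spread:.3f} (theory: {1 / n ** 0.5:.3f})")
```

A histogram of `sample_means` would look far more symmetric than the population itself, which is the point of the theorem.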
Topic 13
Confidence Intervals
Confidence Intervals
Confidence intervals estimate the true value
of the parameter where the parameter is the
true mean µ, true proportion p, or true slope β.
estimate ± (critical value)(standard error of the estimate)
Confidence Intervals
• 1-sample t-interval for µ
• 2-sample t-interval for µ1 - µ2
• Matched-pairs t-interval
• 1-proportion z-interval for p
• 2-proportion z-interval for p1 - p2
• t-interval for slope β
Interpret the confidence level:
C% of all intervals produced using this method
will capture the true mean (difference in means), or
proportion (difference in proportions), or slope.
(Describe the parameter in context!)
Interpret the confidence interval:
I am C% confident that the true parameter (insert
context) is between ___ and ___ (insert units), based on
this sample.
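As one concrete instance of estimate ± (critical value)(standard error), here is a sketch of a 1-proportion z-interval followed by the interpretation sentence. The function name and the data (120 successes out of 200) are hypothetical, and the conditions are assumed to have been checked already.

```python
import math
from statistics import NormalDist

def one_proportion_z_interval(successes, n, confidence=0.95):
    """Return (lower, upper) for a 1-proportion z-interval."""
    p_hat = successes / n                                    # the estimate
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)      # standard error of p-hat
    margin = z_star * standard_error
    return p_hat - margin, p_hat + margin

# Hypothetical data, for illustration only
low, high = one_proportion_z_interval(successes=120, n=200, confidence=0.95)
print(f"I am 95% confident that the true proportion is between "
      f"{low:.3f} and {high:.3f}, based on this sample.")
```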
What does it mean to be 95%
confident?
• Incorrect: 95% chance that µ is contained in the
confidence interval
• Incorrect: The probability that the interval contains
µ is 95%
• Correct: The method used to construct the
interval will produce intervals that
contain µ 95% of the time.
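The "method" reading of confidence can be illustrated by simulation: build many intervals from fresh samples and count how often they capture the true mean. Everything in this sketch is assumed for illustration (a normal population with mean 50 and known standard deviation 10, samples of 25, 4000 repetitions, a fixed seed).

```python
import math
import random
from statistics import NormalDist, fmean

rng = random.Random(1)           # fixed seed so the run is repeatable
mu, sigma, n = 50.0, 10.0, 25    # hypothetical population and sample size
z_star = NormalDist().inv_cdf(0.975)   # critical value for 95% confidence
reps = 4000

hits = 0
for _ in range(reps):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    x_bar = fmean(sample)
    margin = z_star * sigma / math.sqrt(n)
    # Does this particular interval capture the true mean?
    if x_bar - margin <= mu <= x_bar + margin:
        hits += 1

coverage = hits / reps
print(f"fraction of intervals containing mu: {coverage:.3f}")
```

Any single interval either contains the true mean or it does not; the 95% describes the long-run success rate of the procedure.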
Margin of error
• Shows how accurate we believe our estimate
is
• The smaller the margin of error, the more
precise our estimate of the true parameter
• Formula: me = (critical value) × (standard deviation of the statistic)

How can you make the margin of error
smaller?
• z* smaller
(lower confidence level)
• σ smaller
(less variation in the population)
Really cannot change!
• n larger
(to cut the margin of error in half, n must
be 4 times as big)
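The effect of each choice on the margin of error can be sketched numerically. The function name and the inputs (standard deviation 2.5, sample sizes 50 and 200) are assumptions for this illustration, and the population standard deviation is treated as known.

```python
import math
from statistics import NormalDist

def margin_of_error(confidence, sigma, n):
    """Margin of error z* times sigma / sqrt(n) for a z-interval."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * sigma / math.sqrt(n)

base = margin_of_error(0.95, sigma=2.5, n=50)
lower_confidence = margin_of_error(0.90, sigma=2.5, n=50)   # smaller z*
larger_sample = margin_of_error(0.95, sigma=2.5, n=200)     # n is 4 times as big

print(f"95%, n=50:  {base:.3f}")
print(f"90%, n=50:  {lower_confidence:.3f}")
print(f"95%, n=200: {larger_sample:.3f}")
```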
Find a sample size:
• If a certain margin of error is wanted, then to
find the sample size necessary for that
margin of error use:
$\text{me} = z^* \left( \dfrac{\sigma}{\sqrt{n}} \right)$
Always round up to the nearest person!
The heights of MRHS male students
are normally distributed with σ = 2.5
inches. How large a sample is
necessary to be accurate within ± 0.75
inches with a 95% confidence
interval?
$0.75 = 1.96 \left( \dfrac{2.5}{\sqrt{n}} \right)$
n = 42.68, so round up to 43 students
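The same calculation can be checked in code by solving the margin of error formula for n and rounding up. The function name is made up for this sketch; the inputs (z* = 1.96, standard deviation 2.5, margin 0.75) are the ones from the example.

```python
import math

def required_sample_size(z_star, sigma, margin):
    """Solve margin = z* times sigma / sqrt(n) for n, then round up."""
    n_exact = (z_star * sigma / margin) ** 2
    return n_exact, math.ceil(n_exact)   # always round up to a whole person

n_exact, n_needed = required_sample_size(z_star=1.96, sigma=2.5, margin=0.75)
print(round(n_exact, 2), n_needed)
```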
The 4-Step Process
(from the Inference Toolbox)
Step 1 (Population and parameter)
Define the population and parameter you are investigating
Step 2 (Conditions)
Do we have biased data? Random?
• If SRS, we’re good. Otherwise PWC (proceed with
caution)
Do we have independent sampling?
• If pop>10n, we’re good. Otherwise PWC.
Do we have a normal distribution?
• If pop is normal, or np>10 and nq>10 (proportions), or n>30 (CLT), we’re
good. Otherwise, graph it (histogram!).
Step 3 (Calculations)
• Find z* or t* based on your confidence level (and df). If
you are not given a confidence level, use 95%.
• Calculate CI.
Step 4 (Interpretation)
• “We are ______% confident that the true mean (or
proportion or slope) is captured in the interval (lower,
upper)” and don’t forget CONTEXT!!!!!
How does t compare to normal?
• Shorter & more spread out
• More area under the tails
• As n increases, t-distributions
become more like a standard
normal distribution
How to find t*
• Use Table B for t distributions
• Look up confidence level at bottom & df on the sides
• df = n - 1
Can also use invT on the calculator!
• invT(p,df)
• Need upper t* value with 5% above, so 95% is below (e.g., for 90% confidence)
Find these t*
• 90% confidence when n = 5: t* = 2.132
• 95% confidence when n = 15: t* = 2.145
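These table values can be double-checked without a table or calculator. The sketch below is one possible approach using only the Python standard library: it integrates the t density with Simpson's rule and then searches for the t* whose area below matches the target, which plays the same role as invT(p, df). The function names `t_pdf`, `t_cdf`, and `t_star` are made up for this illustration.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """Area below x (for x >= 0): 0.5 plus Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_star(confidence, n):
    """Critical value t* for a given confidence level and sample size n."""
    df = n - 1
    target = 1 - (1 - confidence) / 2   # area below t*, as in invT(p, df)
    low, high = 0.0, 100.0
    for _ in range(60):                 # bisection search for the quantile
        mid = (low + high) / 2
        if t_cdf(mid, df) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2

print(round(t_star(0.90, 5), 3))    # 90% confidence, n = 5, df = 4
print(round(t_star(0.95, 15), 3))   # 95% confidence, n = 15, df = 14
```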