Chapter 9
Sampling Distributions
AP Statistics
St. Francis High School
Fr. Chris, 2001
Two Key Ideas

A Statistic is a Random Variable

As such, its mean and standard deviation
can be found by combining the basic
random variables that make up the statistic
Pick Pennies from a Hat
Recall how we did this
 Try it again:

– Pick at random
– Note the year
– Compute the mean and standard deviation
of your sample
– NEW: Estimate what you think the mean
and standard deviation of the entire hat are!
Formulas
$\mu_{\bar{x}} = \mu$
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
$\mu_{\hat{p}} = p$
$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$
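The two standard-deviation formulas can be written as small helper functions. This is only a sketch: the function names and the example numbers are made up for illustration and do not come from the slides.

```python
import math

def sd_of_sample_mean(sigma, n):
    """Standard deviation of x-bar for samples of size n: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def sd_of_sample_proportion(p, n):
    """Standard deviation of p-hat for samples of size n: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical numbers, chosen only to illustrate the formulas
sd_mean = sd_of_sample_mean(sigma=10, n=25)        # 10 / 5
sd_prop = sd_of_sample_proportion(p=0.5, n=100)    # sqrt(0.25 / 100)
print(round(sd_mean, 4), round(sd_prop, 4))
```

Note that quadrupling n only halves either standard deviation, because n sits under a square root.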
Statistic vs. Parameter

A Statistic describes a sample and is
used to estimate a parameter

A Parameter describes a population
Which is a statistic, which is a
parameter?
42% of today’s 15 year-old girls will get pregnant
in their teens
42%: parameter
37% said they would vote for Joan Smith; on
election day 41% actually did.
37%: statistic, 41%: parameter
The NIH reports that the mean systolic blood
pressure for males 35-44 years of age is 128 and
the standard deviation is 15. 72 male
stockbrokers in this age group have a mean blood
pressure of 126.07.
128, 15: parameters; 126.07: statistic
Bias vs. Variability
Bias: Is your statistic centered on
the population’s parameter?
Variability: Is your sampling distribution
scattered or focused?
Identify the bias and variability
of each:
[Figure: four panels, each labeled "Population Parameter"]
What about your sample?
Is it variable?
Is it biased? How can you tell?
http://www.mathorama.com/stat/penny97hist.html
http://www.mathorama.com/stat/penny99hist.html
Confidence Intervals
Use your sample statistics and what you know of the
Central Limit Theorem to make an assertion about the
population parameter.
$\bar{x} \pm z \cdot (\text{std. error})$
where z is the z-score
for the desired %
[Figure: standard normal curve $y = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$, for x from -3 to 3]
By hand:
http://www.mathorama.com/stat/Confidence.html
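The "x-bar plus or minus z times the standard error" recipe can be sketched in a few lines. This version assumes the population standard deviation sigma is known; the function name and the example numbers are hypothetical.

```python
import math
from statistics import NormalDist

def confidence_interval(x_bar, sigma, n, level=0.95):
    """Return (low, high) for x_bar +/- z * sigma / sqrt(n)."""
    # z is the z-score that leaves (1 - level) / 2 in each tail
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical sample: mean 50 from n = 25 draws, sigma = 10
low, high = confidence_interval(x_bar=50, sigma=10, n=25, level=0.95)
print(round(low, 2), round(high, 2))
```

A higher confidence level gives a larger z and therefore a wider interval; a larger n shrinks the standard error and narrows it.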
Computer Simulation
http://www.mathorama.com/stat/RandomSamp.html
What about a proportion?
The Gallup poll asked a probability sample of 1785
adults whether they attended church or synagogue
during the past week. Suppose 40% did attend.
How likely is it that a SRS of 1785 would be
within 3% of this actual value?
$\mu_{\hat{p}} = p$
$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$
Two rules of thumb:
The population must be at least 10 times
more than your sample size to use this
formula for standard deviation.
np > 10 and n(1-p) > 10 in order to use
the normal curve for approximating the
distribution of p-hat.
Compute the standard
deviation
Since the population is more than 10 times
1785,
$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{(.4)(.6)}{1785}} = 0.0116$
The Probability that
p-hat is between 37%-43%
Since (.4)(1785) >10, and (.6)(1785)>10 then
we can convert to z-scores and use the normal
curve.
$z = \frac{x - \mu}{\sigma}$
$\frac{.37 - .4}{0.0116} = -2.586$
$\frac{.43 - .4}{0.0116} = 2.586$
Using the Normal
Distribution…
P(-2.586 < Z < 2.586)=
P(Z<2.586)-P(Z<-2.586)=
normalcdf(-2.586,2.586)=
normalcdf(.37, .43, .4, 0.0116)=
.9903!
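The same Gallup calculation can be checked without a calculator. This sketch uses Python's standard-library `NormalDist` in place of `normalcdf`; the variable names are chosen here for illustration.

```python
import math
from statistics import NormalDist

p, n = 0.40, 1785

# Rule of thumb for the normal approximation
assert n * p > 10 and n * (1 - p) > 10

# Standard deviation of p-hat
sd = math.sqrt(p * (1 - p) / n)

# p-hat is approximately N(p, sd); find P(0.37 < p-hat < 0.43)
p_hat_dist = NormalDist(mu=p, sigma=sd)
prob = p_hat_dist.cdf(0.43) - p_hat_dist.cdf(0.37)

print(round(sd, 4), round(prob, 3))  # sd is about 0.0116, prob is about 0.99
```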
Okay, what if you flip a coin 20
times and it’s heads 14 times?
Is it a fair coin? How can you justify your answer?
Did you mention sample variability? Bias?
Do the rules of thumb apply to find a sigma? To
use the normal distribution?
If you suspect that 70% is this coin’s true
proportion, how many times should we flip it so
we can use the normal curve?
$(.3)(n) \ge 10$, so $n \ge \frac{10}{.3}$, so $n \ge 34$
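The flip count can be found by solving the rule of thumb for n. This is a minimal sketch that treats the rule as "np and n(1-p) are both at least 10"; the function name is hypothetical.

```python
import math

def min_n_for_normal(p, threshold=10):
    """Smallest n with n*p >= threshold and n*(1-p) >= threshold."""
    smaller = min(p, 1 - p)  # the smaller of p and 1-p is the binding one
    # round() guards against floating-point noise such as 100.00000000000003
    return math.ceil(round(threshold / smaller, 9))

n_needed = min_n_for_normal(0.7)
print(n_needed)  # 34

# With only 20 flips the rule fails: 20 * 0.3 is about 6, which is less than 10
print(20 * (1 - 0.7) >= 10)  # False
```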
Dishonest Cola?
DC Cola is suspected of underfilling its cans
of cola. They say each can has 12 ounces,
with a standard deviation of 0.4 oz.
If this is true, how likely is it to get an
average of 11.9 oz. or less, by taking a
random sample of 50 cans?
Work it out...
Z score? $z = \frac{11.9 - 12}{0.4/\sqrt{50}} = -1.77$
Look up -1.77 in Table A, or
normalcdf(-1E99, -1.77)
Or normalcdf(-1E99, 11.9, 12, .4 / √50)
=.0384
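The DC Cola calculation can be reproduced as follows. This sketch uses the standard library's `NormalDist` as a stand-in for `normalcdf`; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 12.0, 0.4, 50      # company's claim and sample size
observed_mean = 11.9

# Standard deviation of x-bar, then the z-score of the observed mean
std_error = sigma / math.sqrt(n)
z = (observed_mean - mu) / std_error

# P(x-bar <= 11.9) if the company's claim is true
prob = NormalDist(mu, std_error).cdf(observed_mean)

print(round(z, 2), round(prob, 3))
```

The unrounded z gives a probability a hair above the Table A value of .0384, which comes from rounding z to -1.77 first.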
This leads to inference...
If the company’s claim is true (12 oz.),
there is still about a 4% chance that
sample variation alone would lead you to a
sample mean of 11.9 oz. or less.
At what point do you reject the company’s
claim? At 5%? 1%? 0.1%?
Inferential Statistics
We choose a level of rejection (alpha)
We assume that our results are no
different, and any variation is from
chance (Null Hypothesis).
If our result is unlikely under that assumption
(probability less than our chosen alpha), we
reject the “Null Hypothesis”
Then we claim our results are SIGNIFICANTLY
different.
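The decision rule can be sketched as a comparison between a probability and alpha. The function name is hypothetical, and the probability .0384 is the DC Cola result from earlier in the chapter.

```python
def decide(p_value, alpha):
    """Reject the null hypothesis when the result is rarer than alpha."""
    return "reject" if p_value < alpha else "fail to reject"

p_value = 0.0384  # DC Cola: chance of a sample mean of 11.9 oz. or less

decisions = {alpha: decide(p_value, alpha) for alpha in (0.05, 0.01, 0.001)}
for alpha, decision in decisions.items():
    print(alpha, decision)
```

The same evidence leads to rejection at 5% but not at 1% or 0.1%, which is why alpha should be chosen before looking at the data.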

Central Limit Theorem
Draw an SRS of size n from any population
whatsoever with mean µ and a finite
standard deviation σ. When n is large, the
sampling distribution of the sample mean
x-bar is close to the normal distribution
N[µ, σ/√n] (page 488).
Law of Large Numbers
Draw observations at random from any
population with finite mean µ. As the
number of observations drawn increases,
the mean x-bar of the observed values
gets closer and closer to µ.
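Both theorems can be seen in a short simulation. This is a sketch under assumptions of my own choosing: the population is exponential with µ = 1 and σ = 1 (strongly skewed, so clearly not normal), and the seed, sample sizes, and function name are arbitrary.

```python
import math
import random
import statistics

def sample_mean(rng, n):
    """Mean of n draws from a skewed population (exponential, mu = 1, sigma = 1)."""
    return statistics.fmean(rng.expovariate(1.0) for _ in range(n))

rng = random.Random(2001)  # fixed seed so the run is repeatable

# Law of Large Numbers: a single sample mean settles toward mu = 1 as n grows
for n in (10, 1_000, 100_000):
    print(n, round(sample_mean(rng, n), 3))

# Central Limit Theorem: look at many sample means of size n = 50
n, reps = 50, 2_000
means = [sample_mean(rng, n) for _ in range(reps)]
center = statistics.fmean(means)   # should be near mu = 1
spread = statistics.stdev(means)   # should be near sigma / sqrt(n)
print(round(center, 3), round(spread, 3), round(1 / math.sqrt(n), 3))
```

A histogram of `means` would look roughly bell-shaped even though the population itself is far from normal.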
Homework 9.1-9.4 (489)
Parameter or a statistic?
$\mu = 2.5003$: parameter; $\bar{x} = 2.5009$: statistic
$\hat{p} = 7.2\%$: statistic
$\hat{p} = 48\%$: statistic; $p = 52\%$: parameter
$\bar{x}_1 = 335$: statistic; $\bar{x}_2 = 289$: statistic

9.5 (492) Tumbling Toast
Toss coin 20 times. P-hat=
10 more times... make a histogram of
your p-hats... Is the center close to .5?
Pool your work... Is the center near .5?
Is it normal?

9.9 (500) Dead Guinea Pigs
9.10 (510)
A) Large Bias, Large Variability
B) Small Bias, Small Variability
C) Small Bias, Large Variability
D) Large Bias, Small Variability

9.17 (503) School Vouchers

Assuming the poll’s sample size is less
than 780,000 (10% of the population of
NJ), the variability would be about the
same
9.19 (511) Got Milk?
n = 1012
$\mu_{\hat{p}} = p = .7$; $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{(.7)(.3)}{1012}} = .0144$
US population $\ge 10(1012) = 10120$
$np = (1012)(.7) = 708.4 \ge 10$
$n(1-p) = (1012)(.3) = 303.6 \ge 10$
$P(\hat{p} \le .67) = P(Z \le -2.08) = .0186$
$4 \times 1012 = 4048$
9.33 (519) Juan’s results
$\sigma = 10$
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{10}{\sqrt{3}} = 5.7735$ mg
$\frac{10}{\sqrt{n}} \le 3$; $n \ge 12$
9.35 (524) Bad Rug
Mean=1.6 sd=1.2
normalcdf(2, 9999999, 1.6, 1.2/√200) ≈ 0
9.39 (525) Cheap Cola
µ=298, σ=3 P(X<295)? P(x-bar<295, n=6)?
$P(X < 295) = P\left(Z < \frac{295 - 298}{3} = -1\right) = .1587$
$P(\bar{x} < 295) = P\left(Z < \frac{295 - 298}{3/\sqrt{6}} = -2.4495\right) = .0072$
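The contrast between one bottle and the mean of six can be checked numerically. This sketch again uses `NormalDist` in place of `normalcdf`; note that P(Z < -1) is about 0.1587, the lower tail, while 0.8413 is the upper tail.

```python
import math
from statistics import NormalDist

mu, sigma = 298, 3

# One bottle: X ~ N(298, 3)
p_single = NormalDist(mu, sigma).cdf(295)

# Mean of n = 6 bottles: x-bar ~ N(298, 3 / sqrt(6))
n = 6
p_mean = NormalDist(mu, sigma / math.sqrt(n)).cdf(295)

print(round(p_single, 4), round(p_mean, 4))
```

Averaging six bottles shrinks the spread by a factor of √6, so a mean below 295 is far rarer than a single bottle below 295.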
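9.41 (526) What a Wreck!
µ=2.2, σ=1.4
Not normal but dist of x-bar is!
$N\left(2.2, \frac{1.4}{\sqrt{52}} = .1941\right)$
$P(\bar{x} < 2) = P\left(Z < \frac{2 - 2.2}{1.4/\sqrt{52}} = -1.0302\right) = .1515$
$P\left(\bar{x} < \frac{100}{52}\right) = P\left(Z < \frac{\frac{100}{52} - 2.2}{1.4/\sqrt{52}} = -1.4267\right) = .0768$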