Download Sampling Distribution Models - Judson Independent School District

Sampling Distribution Models
February 2012
Drawing Normal Models
 For cars on I-10 between Kerrville and Junction, it is
estimated that 80% are speeding. What proportion of
speeders would we expect to see if we counted 50 cars?
 Think: Does this fit the normal model?
 10% condition: 50 cars is less than 10% of the cars on the road.
 Success/Failure: At least 10 failures and 10 successes?
 np = .8(50) = 40 ✓
 nq = .2(50) = 10 ✓
 It fits the normal model.
Drawing Normal Models
 The sampling model for proportion of speeders is normal
with mean of 0.8 and standard deviation of
$\sqrt{pq/n} = \sqrt{(0.8)(0.2)/50} \approx 0.057$
 The model for p̂, the sample proportion, is N(0.8, 0.057)
 Per the 68-95-99.7 rule, we would expect 68% of the
proportions to be within 1σ of the mean, the interval (.743, .857)
 We would expect 95% of the proportions to be within 2σ,
the interval (.686, .914)
 And 99.7% within 3σ: (.629, .971).
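The speeding example can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `success_failure_ok` and `proportion_model` are made up for illustration and do not come from the slides.

```python
from math import sqrt

def success_failure_ok(p, n, minimum=10):
    """Check that expected successes (np) and failures (nq) are both
    at least `minimum`. A tiny tolerance absorbs floating-point error,
    since 50 * (1 - 0.8) evaluates to slightly less than 10."""
    tol = 1e-9
    return n * p >= minimum - tol and n * (1 - p) >= minimum - tol

def proportion_model(p, n):
    """Return (mean, standard deviation) of the normal model for p-hat."""
    return p, sqrt(p * (1 - p) / n)

# Values from the speeding example: p = 0.8, n = 50 cars.
mean, sd = proportion_model(0.8, 50)

# Intervals within 1, 2 and 3 standard deviations of the mean.
intervals = {k: (mean - k * sd, mean + k * sd) for k in (1, 2, 3)}

print("conditions met:", success_failure_ok(0.8, 50))
print(f"model: N({mean}, {sd:.3f})")
for k, (low, high) in intervals.items():
    print(f"within {k} sd: ({low:.3f}, {high:.3f})")
```

The endpoints may differ from the slide values in the third decimal place, because the slides round the standard deviation to 0.057 before building the intervals.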
Drawing Normal Models
 We don’t know it yet, but 51% of voters are planning to vote
“Yes” on Proposition 2. We poll a random sample of 100 voters.
What is the probability our sample will be opposed or
deadlocked?
 Does it meet normal criteria?
 10% rule? Check. There are more than 1000 voters.
 Success/Failure: Check. 51/49.
 What is the distribution for our proportion?
 μ = 0.51
 σ = √((.51)(.49)/100) ≈ 0.050.
 Therefore, our distribution is N(0.51, 0.05).
Drawing Normal Models
 Restating our original question, we want to know the probability
of deadlock or voting against.
 Since p-hat is N(0.51, 0.05), our critical value is .5. The z-score
for .5 is $z = (0.50 - 0.51)/0.05 = -0.20$.
 Therefore P(z < -.20) = .4207.
 There is a 42% chance that our sample will not show a majority in
favor of the proposition.
 That is not terrific. If the true parameter p is close to .5, then we
would need a larger sample to predict the outcome more
accurately.
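The poll calculation, and the remark about needing a larger sample, can be sketched in Python with the standard library's `statistics.NormalDist`. The function name `prob_not_majority` and the extra sample sizes 1000 and 10000 are assumptions added for illustration; only p = 0.51 and n = 100 come from the example.

```python
from math import sqrt
from statistics import NormalDist

def prob_not_majority(p, n):
    """P(p-hat <= 0.5) under the normal model N(p, sqrt(pq/n))."""
    sd = sqrt(p * (1 - p) / n)
    return NormalDist(mu=p, sigma=sd).cdf(0.5)

# Values from the example: true p = 0.51, sample of n = 100 voters.
p, n = 0.51, 100
sd = sqrt(p * (1 - p) / n)
z = (0.50 - p) / sd           # z-score of the critical value 0.5
prob = prob_not_majority(p, n)
print(f"z = {z:.2f}, P(p-hat <= 0.5) = {prob:.4f}")

# Hypothetical larger samples: the chance of a misleading poll shrinks.
for size in (100, 1000, 10000):
    print(size, round(prob_not_majority(p, size), 4))
```

With p this close to 0.5, the probability falls only slowly as n grows, which is why a much larger sample is needed to call the outcome reliably.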
Sampling Distribution for means
 If we have a population whose distribution has a defined mean and
variance, then the distribution of sample means drawn from it
approaches a normal model as the sample size increases. This is a
conclusion of the Central Limit Theorem (check the book for the
details).
 Conditions: These conditions must be met if we are going to use the
Central Limit Theorem.
 Randomness: The sample must be selected randomly.
 Independence: The individuals must be mutually independent.
 10% condition: Sample size < 10% of the total population.
 Large Enough Sample: For symmetric and uni-modal distributions, the
sample does not need to be that big. For highly skewed distributions, a
large sample is more likely to give good results. At this stage in the game, if
n ≥ 30, you can be pretty confident the sample is large enough.
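A small simulation can illustrate the Central Limit Theorem for a skewed population. This is only a sketch under stated assumptions: the population is taken to be exponential with mean 1 and standard deviation 1 (a choice made here, not in the slides), and the function name, seed, and repetition count are arbitrary.

```python
import random
from math import sqrt
from statistics import mean, stdev

def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` random samples of size n from an Exponential(1)
    population (skewed, mean 1, sd 1) and return the sample means."""
    rng = random.Random(seed)
    return [mean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(reps)]

n = 30
means = simulate_sample_means(n=n, reps=2000)

# The CLT predicts the sample means center on the population mean (1)
# with standard deviation sigma / sqrt(n) = 1 / sqrt(30).
print(f"mean of sample means: {mean(means):.3f} (predicted 1.000)")
print(f"sd of sample means:   {stdev(means):.3f} "
      f"(predicted {1 / sqrt(n):.3f})")
```

Rerunning with a smaller n, such as 5, and plotting a histogram of `means` would show the remaining skew that the large-enough-sample condition guards against.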
Sampling Distribution for means
 If all conditions are met, then we can state that our sampling
distribution has a normal model.
 If Y is a random variable with known mean (μY) and std. dev. (σY), then
the sample mean Ȳ is approximately a normal random variable:
Ȳ ~ N(μY, σY/√n)
 The mean of a sample has less variability than individual values, so the
standard deviation is divided by the square root of the sample size.
 How would we apply this to a real-life situation?
 It is known that the mean SAT verbal score is 500 with σ = 100. A
sample of 100 AP Statistics students is taken. What would the expected
distribution of the mean score be?
Sampling Distribution for means
 Known: μ = 500 and σ = 100. Sample size is 100 AP students.
 Does it meet the randomness, independence and large enough
conditions?
 Randomness? NO. These are AP students, not a random sample of all
test takers. We cannot go forward.
 Let’s change our sample to just 100 students chosen at random. That
would take care of randomness. How about independence?
 We’d have to assume that they were drawn from the entire population.
Every student should have an equal chance of being selected. This seems
reasonable. And this is far less than 10% of the total population.
 Large enough sample? Sure. n = 100 is well above 30.
 Since all three conditions are met, our sample mean Ȳ ~ N(500, 10),
because σ/√n = 100/√100 = 10.
68%: (490, 510); 95%: (480, 520); 99.7%: (470, 530).
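The SAT model and its 68-95-99.7 intervals can be reproduced with a short sketch. The helper name `mean_model` is hypothetical; the inputs μ = 500, σ = 100 and n = 100 are the ones given in the example.

```python
from math import sqrt
from statistics import NormalDist

def mean_model(mu, sigma, n):
    """Normal model for the sample mean: N(mu, sigma / sqrt(n))."""
    return NormalDist(mu=mu, sigma=sigma / sqrt(n))

model = mean_model(500, 100, 100)

# Intervals within 1, 2 and 3 standard deviations of the mean.
intervals = {k: (model.mean - k * model.stdev,
                 model.mean + k * model.stdev) for k in (1, 2, 3)}

# Share of the model that each interval actually covers.
coverage = {k: model.cdf(high) - model.cdf(low)
            for k, (low, high) in intervals.items()}

for k in (1, 2, 3):
    low, high = intervals[k]
    print(f"{k} sd: ({low:.0f}, {high:.0f}) covers {coverage[k]:.1%}")
```

The computed coverage values are where the rounded figures 68%, 95% and 99.7% come from.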
Sampling Distribution for means
 Looking to the future: Let’s suppose that we did take a sample of 100
AP students and their average SAT score was 513.
 Is that an unusual value? No. The z-score is (513 − 500)/10 = 1.3. That
is well within the expected range of values. What this tells us is that it
is possible that AP students’ test scores are no different from those of all
students. It would require additional tests to distinguish between AP
students and everyone else.
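As a rough check on how unusual 513 is, the z-score and a tail probability can be computed as follows. The two-sided tail probability is an addition for illustration; the slides report only the z-score.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 500, 100, 100   # population values from the example
observed = 513                 # sample mean of the 100 AP students

# z-score of the observed mean under the model N(500, 10).
z = (observed - mu) / (sigma / sqrt(n))

# Chance that a sample mean lands at least this far from 500
# in either direction (illustrative addition).
two_sided = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.1f}, two-sided tail probability = {two_sided:.3f}")
```

A sample mean this far from 500 would turn up in a sizable share of random samples, which is consistent with calling the value not unusual.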
Caveats For Sample Distributions
 Don’t confuse the distribution of the sample with the sampling
distribution.
 Beware of dependent observations. The normal model assumes that
the observations are independent, and non-random samples often
produce dependent observations.
 Beware small samples from skewed populations. Do not assume
that your sample is from a unimodal, symmetric population. With
a small “n” from a skewed population, the sampling distribution
of the mean will likely not be close to normal, so the Central
Limit Theorem (CLT) approximation cannot be trusted.
 Homework/Classwork: Problems 2, 4, 5, 9 on page 428.
Application Problems
 Suppose weight for adults is normally distributed with μ = 175 lb
and σ = 25 lb. An elevator has a weight limit of 10 people or 2000
lb. What is the probability that 10 people getting on the elevator
would overload it?
 Response to an AP prompt:
 Think: Check all my conditions.
 Random? Independent? 10%? Large enough? (Assume yes on all)
 Show: Tell me that you checked the conditions and that they are OK
and then answer the problem.
 What is the sampling distribution? How did you calculate it?
 Draw a picture of the distribution. Identify the critical value of the
sample statistic (in this case the sample mean) and determine the
probability of exceeding.
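One way to work the elevator problem numerically is sketched below, assuming all conditions hold as the prompt allows. Because the population itself is stated to be normal, the sample mean is normal even with n = 10. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 175, 25     # adult weight in lb, from the problem
n, limit = 10, 2000     # 10 riders, 2000 lb limit

# Sampling distribution of the mean weight of 10 riders.
mean_model = NormalDist(mu=mu, sigma=sigma / sqrt(n))

# The elevator is overloaded when the mean weight exceeds 2000 / 10.
critical_mean = limit / n
z = (critical_mean - mu) / mean_model.stdev
p_overload = 1 - mean_model.cdf(critical_mean)

# Equivalent view: the total weight is N(n * mu, sigma * sqrt(n)).
total_model = NormalDist(mu=n * mu, sigma=sigma * sqrt(n))
p_overload_total = 1 - total_model.cdf(limit)

print(f"critical mean = {critical_mean:.0f} lb, z = {z:.2f}")
print(f"P(overload) = {p_overload:.5f}")
```

Both views give the same probability, since a total above 2000 lb is the same event as a mean above 200 lb.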
Standard Error
 We do not usually know the parameters of a distribution. We
estimate the parameters with statistics, so sample mean estimates
population mean and sample standard deviation estimates
population standard deviation.
 When we have to estimate the standard deviation of a sampling
distribution, we calculate the standard error.
 Standard error for proportion: $SE(\hat{p}) = \sqrt{\hat{p}\hat{q}/n}$
 Standard error for mean: $SE(\bar{y}) = s/\sqrt{n}$
 We will use the standard error quite frequently. It will come into
play when we don’t know p or σ.
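Both standard errors can be computed directly from sample data. In this sketch the function names are hypothetical, the list of scores is made up purely for illustration, and `statistics.stdev` supplies the sample standard deviation s (with n − 1 in the denominator).

```python
from math import sqrt
from statistics import stdev

def se_proportion(p_hat, n):
    """Standard error of a sample proportion: sqrt(p-hat * q-hat / n)."""
    return sqrt(p_hat * (1 - p_hat) / n)

def se_mean(sample):
    """Standard error of a sample mean: s / sqrt(n)."""
    return stdev(sample) / sqrt(len(sample))

# Example: 40 speeders observed among 50 cars gives p-hat = 0.8.
print(f"SE(p-hat) = {se_proportion(40 / 50, 50):.3f}")

# Made-up sample of scores, for illustration only.
scores = [510, 480, 530, 495, 505, 470, 520, 490]
print(f"SE(y-bar) = {se_mean(scores):.2f}")
```

The formulas match the standard deviations of the sampling distributions, with the sample statistics p̂ and s standing in for the unknown parameters.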