Download 1 - Department of Statistics | OSU: Statistics

Problem Solving
11/06/2007
Copyright belongs to Yonggang Yao
The CEO claimed that 90% of the 3000 company employees supported his new policy. To check
his claim, a statistician conducted a survey by taking an SRS (simple random sample) of 300
employees with replacement. Assume that every selected employee has to check either “support”
or “not support” but not both.
1. Identify the population. All the 3000 company employees
2. T/F If what the CEO said is true, the sample proportion approximately follows a normal
distribution.
True, because 300*90%=270 and 300*10%=30 are both greater than 10.
3. T/F If what the CEO said is true, and the sample is an SRS without replacement, the sample
proportion approximately follows a normal distribution.
False, because the ratio of the sample size to the population size, 300/3000=10%, is larger
than 5%.
4. If the sample is an SRS with replacement and what the CEO said is true, please identify the
normal approximation of the sample proportion distribution.
The normal approximation of the sample proportion distribution is:
Normal(mean = 90%, sd = $\sqrt{90\%(1 - 90\%)/300} \approx 1.732\%$)

5. T/F If the statistician tripled the sample size of the SRS, the new mean of the normal
approximation for the sample proportion will be three times the old one.
False. The new sample proportion mean is still 90%, but the new standard deviation of the
sample proportion will be reduced to 1%.
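The standard deviations quoted in questions 4 and 5 can be checked numerically. The following is a minimal sketch, assuming sampling with replacement so that the usual formula sqrt(p(1-p)/n) applies; the helper name `proportion_sd` is introduced here only for illustration.

```python
import math

def proportion_sd(p: float, n: int) -> float:
    """Standard deviation of a sample proportion for an SRS with replacement."""
    return math.sqrt(p * (1 - p) / n)

p = 0.90                         # the CEO's claimed support rate
sd_300 = proportion_sd(p, 300)   # original sample size
sd_900 = proportion_sd(p, 900)   # tripled sample size

print(f"n=300: sd = {sd_300:.3%}")
print(f"n=900: sd = {sd_900:.3%}")
# Tripling n divides the sd by sqrt(3); the mean (90%) does not depend on n.
print(f"sd ratio = {sd_300 / sd_900:.4f}, sqrt(3) = {math.sqrt(3):.4f}")
```

The mean of the approximation does not appear in the function at all, which is another way to see that changing the sample size leaves it at 90%.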
6. Based on the approximated normal distribution obtained in 4, please compute the 10% and
90% percentiles of the sample proportion.
Since z.1=-1.28, the 10% percentile of the sample proportion is
mean+z.1*sd=90%+(-1.28)*1.732%=87.78%.
Similarly z.9=1.28 gives the 90% percentile of the sample proportion:
mean+z.9*sd=90%+1.28*1.732%=92.22%.
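As a cross-check on the table lookup, the same percentiles can be obtained from Python's standard library. This is a sketch that takes the rounded sd of 1.732% from question 4 as given; the variable names are illustrative.

```python
from statistics import NormalDist

# Normal approximation of the sample proportion under the CEO's claim
approx = NormalDist(mu=0.90, sigma=0.01732)

p10 = approx.inv_cdf(0.10)  # 10% percentile
p90 = approx.inv_cdf(0.90)  # 90% percentile

print(f"10% percentile: {p10:.2%}")
print(f"90% percentile: {p90:.2%}")
```

`inv_cdf` uses the exact normal quantile rather than the rounded table value 1.28, so the results can differ from the hand calculation in later decimal places.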
7. T/F There is an 80% chance that the sample proportion of the SRS is between the two
percentiles obtained in 6, if what the CEO said is true.
True. By definition of percentile, the chance for the sample proportion of the SRS to be
between the two percentiles is 90%-10%=80%. In other words, the two percentiles bound a
central 80% probability interval for the sample proportion.
8. Do you believe that the statistician can independently take 500 SRSs, each containing 300
employees? Y/N Why?
Yes, he or she can. All the SRSs share the same population, although the same individual
may be contained in different SRSs.
9. T/F Suppose what the CEO said is true. If the statistician takes 500 SRSs with sample size 300
for each, on average 450 of the 500 sample proportions would be larger than the 10%
percentile obtained in 6.
True. By definition of percentile, roughly 10% of the 500 sample proportions would be
smaller than the 10% percentile. In other words, about 90% of the 500 sample proportions
(90%*500=450) would be larger than the 10% percentile.
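The repeated-sampling argument can be illustrated with a small simulation. This is only a sketch under stated assumptions: the CEO's claim is true, sampling is with replacement (so each draw independently supports with probability 0.90), and the 10% percentile comes from the normal approximation with sd 1.732%. The seed and all names are arbitrary choices made here.

```python
import random
from statistics import NormalDist

random.seed(2007)  # arbitrary seed, only to make the run repeatable

P_SUPPORT = 0.90    # support rate claimed by the CEO
SAMPLE_SIZE = 300   # employees per SRS
N_SAMPLES = 500     # number of independent SRSs

# 10% percentile of the normal approximation of the sample proportion
p10 = NormalDist(mu=P_SUPPORT, sigma=0.01732).inv_cdf(0.10)

def sample_proportion() -> float:
    """Draw one SRS with replacement and return its proportion of supporters."""
    supporters = sum(random.random() < P_SUPPORT for _ in range(SAMPLE_SIZE))
    return supporters / SAMPLE_SIZE

proportions = [sample_proportion() for _ in range(N_SAMPLES)]
count_above = sum(p_hat > p10 for p_hat in proportions)

print(f"{count_above} of {N_SAMPLES} sample proportions exceed {p10:.2%}")
```

Because the sample proportion is discrete and the normal curve is only an approximation, the count in any single run will typically land near 450 rather than exactly on it.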
10. Based on the approximated normal distribution in 4, what is the probability for the sample
proportion to be smaller than 240/300?
$z_s = \dfrac{\text{obs} - \text{mean}}{\text{sd}} = \dfrac{240/300 - 90\%}{1.732\%} \approx -5.77$ gives Pr[Z<zs]≈0, so the probability is about zero.
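The z-score and the tail probability can also be computed directly rather than read from a table. This is a minimal sketch that reuses the mean of 90% and the rounded sd of 1.732% from question 4; the variable names are illustrative.

```python
from statistics import NormalDist

mean, sd = 0.90, 0.01732   # normal approximation from question 4
observed = 240 / 300       # sample proportion in question

z = (observed - mean) / sd
prob = NormalDist().cdf(z)  # Pr[Z < z] for a standard normal Z

print(f"z = {z:.2f}")
print(f"Pr[Z < z] = {prob:.2e}")
```

The probability is not literally zero, but it is far smaller than anything a standard normal table resolves, which is why the answer treats it as zero.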
11. It turns out that only 240 employees in the SRS support the new policy. Is this result
surprising if the CEO is correct? Do you believe the sample or the CEO or both?
Since the probability of obtaining such an extreme sample proportion is almost zero if the
CEO is correct, the sample does not support the CEO’s claim. The data are more trustworthy in
this case, so we should trust the sample rather than the CEO. This kind of judgment involves a
new statistical concept: Hypothesis Testing.
Remark:
If $\operatorname{var} X = \sigma_X^2$ and $\operatorname{var} Y = \sigma_Y^2$, and X is INDEPENDENT from Y, then

$E(aX + b) = a\,EX + b$ and $\operatorname{var}(aX + b) = a^2 \operatorname{var} X$

$E(X \pm Y) = EX \pm EY$

$\operatorname{var}(X \pm Y) = \sigma_X^2 + \sigma_Y^2$

Example: If $X \sim \text{Normal}(\mu_X, \sigma_X^2)$, $Y \sim \text{Normal}(\mu_Y, \sigma_Y^2)$, and X is independent
from Y, then:
$X + Y \sim \text{Normal}(\mu_X + \mu_Y, \sigma_X^2 + \sigma_Y^2)$ and $X - Y \sim \text{Normal}(\mu_X - \mu_Y, \sigma_X^2 + \sigma_Y^2)$.
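One step worth spelling out is why the variances add even when the variables are subtracted. A short derivation, using only the rules in this remark (the scaling rule with a = -1, b = 0, and the additivity of variance for independent variables):

```latex
\begin{aligned}
\operatorname{var}(X - Y)
  &= \operatorname{var}\bigl(X + (-1)Y\bigr) \\
  &= \operatorname{var} X + \operatorname{var}\bigl((-1)Y\bigr)
     && \text{(independence of } X \text{ and } Y\text{)} \\
  &= \operatorname{var} X + (-1)^2 \operatorname{var} Y \\
  &= \sigma_X^2 + \sigma_Y^2 .
\end{aligned}
```

The sign of Y is squared away, so a difference of independent variables is exactly as variable as their sum.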
The electric bulbs of brand A have a mean lifetime of 1500 hours with a standard
deviation of 200 hours, while those of brand B have a mean lifetime of 1300 hours with
a standard deviation of 100 hours. A random sample of 125 bulbs of each brand
is tested.
(a) Identify the distributions of the two sample means.
Brand A: Normal(mean=1500, sd=$200/\sqrt{125}$); Brand B: Normal(mean=1300, sd=$100/\sqrt{125}$),
such that the difference of the two sample means follows a normal distribution with
mean=1500-1300=200 and sd=$\sqrt{200^2/125 + 100^2/125} = 20$.
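The arithmetic behind the standard deviation of the difference can be sketched as follows. The variable names are illustrative, and the two samples are assumed to be independent, as the variance rule in the remark requires.

```python
import math

n = 125                  # bulbs tested per brand
sd_a, sd_b = 200, 100    # population standard deviations (hours)

se_a = sd_a / math.sqrt(n)   # sd of the brand A sample mean
se_b = sd_b / math.sqrt(n)   # sd of the brand B sample mean

# Variances add for independent sample means, even for a difference
sd_diff = math.sqrt(se_a**2 + se_b**2)

print(f"sd of brand A sample mean: {se_a:.2f}")
print(f"sd of brand B sample mean: {se_b:.2f}")
print(f"sd of the difference:      {sd_diff:.2f}")
```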
(b) What is the probability that the brand A bulb sample will have a mean lifetime which
is at least (i) 160 hours, (ii) 250 hours more than the sample mean for the brand B bulb
sample?
T/F (i) 0.9772 True. $z_s = \dfrac{160 - 200}{20} = -2$, Pr[Z>-2]=.9772
T/F (ii) 0.0062 True. $z_s = \dfrac{250 - 200}{20} = 2.5$, Pr[Z>2.5]=.0062
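The two tail probabilities can be checked against a standard normal table with a few lines of code. This sketch takes the mean of 200 hours and sd of 20 hours for the difference of sample means from part (a); the dictionary `tails` is just a convenient container introduced here.

```python
from statistics import NormalDist

# Distribution of (brand A sample mean) - (brand B sample mean), from part (a)
diff = NormalDist(mu=200, sigma=20)

tails = {}
for threshold in (160, 250):
    z = (threshold - diff.mean) / diff.stdev
    tails[threshold] = 1 - diff.cdf(threshold)  # Pr[difference > threshold]
    print(f"threshold={threshold}: z = {z:+.1f}, upper tail = {tails[threshold]:.4f}")
```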
The variance of the sum obtained in tossing a pair of fair dice is: 35 / 6
The variance of the average of the two numbers obtained in tossing a pair of fair dice is: 35 / 24
(a) 35 / 12
(b) 35 / 12
(c) 35 / 6
(d) 9 / 12
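Both dice variances can be confirmed by brute force over the 36 equally likely outcomes. The sketch below uses exact fractions so that the results can be compared with 35/6 and 35/24 without rounding; the helper `variance` is a hypothetical name and computes the population variance of a list of equally likely values.

```python
from fractions import Fraction
from itertools import product

def variance(values):
    """Population variance of equally likely values, in exact arithmetic."""
    values = [Fraction(v) for v in values]
    n = len(values)
    mean = sum(values, Fraction(0)) / n
    return sum((v - mean) ** 2 for v in values) / n

# All 36 equally likely outcomes of tossing two fair dice
rolls = list(product(range(1, 7), repeat=2))

var_single = variance(range(1, 7))                        # one die
var_sum = variance(a + b for a, b in rolls)               # sum of the pair
var_avg = variance(Fraction(a + b, 2) for a, b in rolls)  # average of the pair

print("one die:", var_single)
print("sum:    ", var_sum)
print("average:", var_avg)
```

The results also follow from the remark: the sum doubles the single-die variance of 35/12, and the average divides the sum's variance by 2² = 4.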