#29 a) Skewed right means there are a few instances where groups leave big tips. The z-score is (20-9.60)/5.4 = 1.93, and for a normal distribution the probability of a z-score greater than 2.0 is small (i.e. about .025). But since our distribution is not normal, the chance could be considerably higher than that.
b) Since averages for the waiter do tend to be normal (CLT) even when a single tip is not, we can try to do this. CLT says the average of 4 tips will still be 9.60 but the stdev is 5.4/(square root of 4) = 2.70. The z-score is (15-9.60)/2.7 = 2.0. So, no.
c) The z-score will get even larger for a bigger sample size n, so no.
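The arithmetic in #29 can be checked with a short Python sketch. It assumes only the mean (9.60) and stdev (5.4) quoted above; the variable names are made up for illustration, and the Normal tail area is only a rough guide because the tips are skewed and n = 4 is small.

```python
from math import sqrt
from statistics import NormalDist

mean_tip, sd_tip = 9.60, 5.40  # values quoted in the solution

# a) a single party tipping at least $20
z_single = (20 - mean_tip) / sd_tip

# b) the average of n = 4 tips being at least $15;
#    the stdev of an average shrinks by sqrt(n)
sd_mean_4 = sd_tip / sqrt(4)
z_mean_4 = (15 - mean_tip) / sd_mean_4

# Upper-tail area under a Normal model (an approximation here)
p_mean_4 = 1 - NormalDist().cdf(z_mean_4)

print(round(z_single, 2), round(sd_mean_4, 2), round(z_mean_4, 2), round(p_mean_4, 4))
```

The exact Normal tail area beyond z = 2.0 comes out a little under the .025 that the 68-95-99.7 rule gives.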
#31
a) The average of 40 tips is still 9.60 and its stdev is 5.4/(square root of 40) = .854. To make $500 he must average 12.50 per party.
P(average tip > 12.50) = P(z > (12.50-9.60)/.854) = P(z > 3.396)... way out in the normal tail, virtually impossible.
b) What average is out in the right tail, in the upper 10%? Search table A-51 for .9000 (if the left tail is this then the right tail is .10) and see a z-score of 1.28 (with .8997). So (average-9.60)/.854 = 1.28, so average = 9.60+(1.28*.854) = 10.69, and 10.69*40 = about $428.
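As a check on #31, here is a minimal Python sketch that models the average of 40 tips as Normal (the CLT assumption used above). The names are illustrative only, and `inv_cdf` stands in for the table lookup of the 90th percentile.

```python
from math import sqrt
from statistics import NormalDist

mean_tip, sd_tip, n = 9.60, 5.40, 40
sd_mean = sd_tip / sqrt(n)  # stdev of the average of 40 tips

# Sampling distribution of the average tip, assumed Normal by the CLT
sampling = NormalDist(mu=mean_tip, sigma=sd_mean)

# a) chance the average tip exceeds $500 / 40 = $12.50
needed_avg = 500 / n
z_a = (needed_avg - mean_tip) / sd_mean
p_a = 1 - sampling.cdf(needed_avg)

# b) average tip that cuts off the upper 10%, and the matching total
cutoff_avg = sampling.inv_cdf(0.90)
cutoff_total = cutoff_avg * n

print(round(sd_mean, 3), round(z_a, 3), round(cutoff_avg, 2), round(cutoff_total))
```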
#11 a) One-tailed, because I would like to prove that the population proportion is too low, rather than too low or too high.
b) Alpha is the probability of rejecting a true null. The null hypothesis is that the proportion is .27 and the alternative is that it is lower (smaller) than .27. So this would be the probability of deciding, based on (unusual/misleading) data, that the proportion of minorities is too low (i.e. below .27) when really it is truly .27.
c) Beta is the probability of accepting a false null. So this would be the probability of deciding that .27 (or 27%) are minorities when something else is true, like perhaps only 22% are minorities, or 18%, or ... there are many beta errors.
d) Power is 1-beta. So it is doing the right thing (the opposite of a beta error). It is the probability of rejecting a false null. So it is the probability of stating that less than .27 (or 27%) are minorities when some smaller proportion (in the population) is the truth.
e) When the p-value is less than alpha we reject the null hypothesis, so if alpha goes from .01 to .05 it is easier to reject Ho. If you can reject it more often you are more likely to reject a false null hypothesis, which is good, so power must increase.
f) Since power is a good thing and more data gives more power, using only 37 rather than 87 employees should result in a loss of power for the test.
g) n=37 and p-hat=.19, and for 95% confidence using 2 standard errors: .19 +- 2*(sqrt((.19*.81)/37)) = .19 +- .13 = (.06, .32). To test, use a z-score of (.19-.27)/(sqrt((.27*.73)/37)) = -1.09. Using the A-51 normal tables we get a p-value of .1379. Comparing this to alpha=.05, we fail to reject Ho; the data are consistent with p=.27.
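Parts e) through g) can be illustrated numerically. The sketch below redoes the interval and z-test from g), then approximates power with a Normal model. The `power` helper and the "true" proportion of .19 plugged into it are assumptions made purely for illustration; they are not part of the original problem.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()
p0, p_hat, n = 0.27, 0.19, 37

# g) rough 95% interval using 2 standard errors, as in the solution
se_hat = sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - 2 * se_hat, p_hat + 2 * se_hat)

# g) one-tailed z-test; the standard error uses the null value p0
se_null = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se_null
p_value = std_normal.cdf(z)  # lower-tail area

def power(true_p, n, alpha, p0=0.27):
    """Approximate power of the lower-tailed test if the truth is true_p.

    Hypothetical helper: true_p is an assumed population proportion.
    """
    # Reject Ho when p-hat falls below this critical value
    crit = p0 + std_normal.inv_cdf(alpha) * sqrt(p0 * (1 - p0) / n)
    return std_normal.cdf((crit - true_p) / sqrt(true_p * (1 - true_p) / n))

print(ci, round(z, 2), round(p_value, 4))
# e) larger alpha gives more power; f) smaller n gives less power
print(power(0.19, 87, 0.01), power(0.19, 87, 0.05), power(0.19, 37, 0.05))
```

Carrying full precision, z comes out nearer -1.10 than -1.09, so the p-value differs slightly from the table lookup; the conclusion is the same.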
a) Conditions: original distribution normal by histogram, independence, n/N < .10.
b) 98.28 +- t(df=51)*(.68/sqrt(52)) = 98.28 +- 2.403*.094 = 98.28 +- .226 = (98.05, 98.51) approximately.
c) We are 98% sure that the mean body temperature for all folks is within that interval.
d) If we picked 52 people again and again, say 100 times, then we would expect about 98 of the intervals we make to have the true mean inside them.
e) A test of "the mean is 98.6" against "it is not 98.6" would produce a t-score of (98.28-98.6)/.094 = -3.4. Look for df=52-1=51 (closest is 50) and find a number near 3.4 in A-53. That is something larger than 2.678, so we know the p-value (looking up to the top of A-53 under two tail) is less than .01. We reject the idea that the mean is 98.6 since the p-value is less than .01, which is less than .05 (the alpha I suggested). So, this data makes us conclude that the normal temperature is not 98.6.
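The interval in b) and the t-score in e) can be reproduced with the sketch below. Python's standard library has no t distribution, so the critical value 2.403 is simply copied from the table lookup above (df = 50, the closest row to 51) rather than computed; the variable names are illustrative.

```python
from math import sqrt

x_bar, s, n = 98.28, 0.68, 52
t_star_98 = 2.403  # table value quoted in the solution, not computed here

se = s / sqrt(n)              # standard error of the mean
moe = t_star_98 * se          # margin of error for the 98% interval
ci_98 = (x_bar - moe, x_bar + moe)

# e) t-score for testing a hypothesized mean of 98.6
t_stat = (x_bar - 98.6) / se

print(round(se, 3), round(moe, 3), tuple(round(v, 2) for v in ci_98), round(t_stat, 2))
```

The t-score is negative because the sample mean sits below 98.6; its size is what gets compared with the table value 2.678.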
#11 a) The less sure we need to be, the narrower we can make the interval. 90% is less sure, so I could build a narrower interval; it would be 98.28 +- 1.676*.094 = (98.1, 98.4).
b) The more sure interval makes it more likely that we have the true (all people's) mean body temperature identified, but at the sacrifice of having to allow more 'wiggle room' about where that value really lies.
c) More data means more information, meaning we can be more sure and/or narrow our interval. If we compare this new 98% interval made from 500 people, it will be narrower than the last one based on only 52 people.
d) Using the approximate formula I derived in class of (2s/MOE)*(2s/MOE) we get ((2*.68)/.1)*((2*.68)/.1) = 184.96, which you should round up to 185 people. Deveaux uses a more exact formula to get 252.
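The sample-size calculation in d) is easy to sketch in Python. The `sample_size` helper is hypothetical; the multiplier 2 is the class approximation, and 2.33 (a common table z-value for 98% confidence) is my guess at what the "more exact" formula uses, offered only because it happens to reproduce 252.

```python
from math import ceil

def sample_size(s, moe, multiplier=2.0):
    """n needed so that multiplier * s / sqrt(n) is no more than moe."""
    return ceil((multiplier * s / moe) ** 2)

s, moe = 0.68, 0.1

n_class = sample_size(s, moe)                    # class approximation with 2
n_z98 = sample_size(s, moe, multiplier=2.33)     # assumed 98% z-value

print(n_class, n_z98)
```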
Ch. 24 #7 from the authors' website (via our class website or directly http://media.pearsoncmg.com/aw/aw_deveaux_introstats_1/data/data_index.html)
I downloaded the cereal data and used XL, via Tools-Data Analysis-tTest 2 sample unequal variance, and entered an alpha of .05. The sample mean for the children's is 46.85 and for the adult's is 10.367. The stdev of the difference of 46.85 and 10.367 (or 36.483) is gotten by using
sqrt of (((7.67*7.67)/27)+((6.6*6.6)/18)) =
sqrt of ((58.857/27)+(43.569/18)) =
sqrt(2.18+2.42) = 2.14.
The df=40 are shown by XL and come from the formula in the footnote on p. 452. So t with df=40 and confidence of 95% is 2.021 (check this in A-53). So 36.483 +- 2.021*2.14 = (32.15, 40.82).
Although not shown, the histogram for the adult cereal looks somewhat triangular rather than bell-shaped and may pose a concern about the truth of the assumptions.
To test the null hypothesis that there is no difference in the mean sugar content of children's and adult cereals, versus the alternative that the mean for the children's is larger (greater, higher), we will need a t-score = (36.483-0)/2.14 = 17.05 (see XL's t-stat of 17.01). Looking at the t-table in the book with df=40 we see that the closest number is 3.551, so the p-value is less than .0005. With .0005 < .05 = alpha we definitely reject the idea that the mean sugar contents are the same and accept/decide that the children's mean content is higher than the adult's.
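The two-sample calculation can be retraced from the summary numbers alone. In the sketch below I am assuming the first term under the square root (7.67, n = 27) belongs to the children's cereals and the second (6.6, n = 18) to the adult cereals; that pairing is not stated above but does not change the results. The df line uses the standard Welch-Satterthwaite approximation, which I am assuming is the footnote formula, and 2.021 is the table value rather than a computed one.

```python
from math import sqrt

# Summary statistics quoted in the solution (group pairing is an assumption)
mean_child, s_child, n_child = 46.85, 7.67, 27
mean_adult, s_adult, n_adult = 10.367, 6.6, 18
t_star = 2.021  # table value for 95% confidence, df = 40

diff = mean_child - mean_adult

# Variance of each sample mean, then the standard error of the difference
v_child = s_child ** 2 / n_child
v_adult = s_adult ** 2 / n_adult
se_diff = sqrt(v_child + v_adult)

# Welch-Satterthwaite degrees of freedom (unequal variances)
df = (v_child + v_adult) ** 2 / (
    v_child ** 2 / (n_child - 1) + v_adult ** 2 / (n_adult - 1)
)

ci = (diff - t_star * se_diff, diff + t_star * se_diff)
t_stat = diff / se_diff  # test of "no difference in means"

print(round(se_diff, 3), round(df, 1), tuple(round(v, 2) for v in ci), round(t_stat, 2))
```

Without rounding the standard error to 2.14 first, the t-score lands on XL's 17.01 rather than the hand-computed 17.05.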