Lab#__________
Section________
Name___________________
Instructor________________
Homework #5 Solutions
These 16 problems are from your textbook. Only the highlighted ones (***) are part of your homework.
Section 5.1 (p.384)
1. Exercise 5.2
(a) No, since the number of “trials” is not fixed. The number of cars per each hour of production will not
be constant. A fixed number of trials is a key assumption that must be satisfied for the binomial model to be reasonable.
(b) Yes, since a large percentage of the population is being sampled and we ASSUME that the
probability of being against the death penalty is the same for all members.
(c) Yes, since the probability of winning is the same each week and there is a fixed number of trials.
2. $***Exercise 5.13
(a) 62/100 = 62%.
(b) The standard deviation for the proportion is √[p(1−p)/n] = √[(.67 x .33)/100] = .0470.
For 62%, z = -1.063 and the probability is .1438 that the proportion from the SRS is less than or
equal to the proportion determined in the administration’s sample (of 15,000 students).
(c) Dear Editor-Person: When the proportion of underage drinking is the same as the national value of
0.67, the probability of 62% or less is about 14% (.1438). This is not large AND is not so small that it
would lead you to believe this campus is different from the national percentage.
Signed, Student-Person.
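The arithmetic in 5.13 (b) can be checked with a short script. This is only a sketch: it assumes Python 3.8 or later (for statistics.NormalDist), and the variable names are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

p = 0.67      # national proportion used in part (b)
n = 100       # size of the SRS
p_hat = 0.62  # sample proportion from part (a)

sd = sqrt(p * (1 - p) / n)      # s.d. of the sample proportion
z = (p_hat - p) / sd            # standardized value of 62%
prob = NormalDist().cdf(z)      # area to the left of z

print(f"sd = {sd:.4f}, z = {z:.3f}, probability = {prob:.4f}")
```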
3. ***Exercise 5.14
(a) The shape is normal because nπ = 1000(0.29) = 290 and n(1−π) = 1000*0.71 = 710, the center is
π = 0.29, and the spread is √[π(1−π)/n] = √(0.29*0.71/1000) = 0.014.
P(0.26 < p̂ < 0.32) = P((0.26 − 0.29)/0.014 < Z < (0.32 − 0.29)/0.014) = P(−2.14 < Z < 2.14)
= 0.9838 − 0.0162 = 0.9676
(b) For n = 250, Z = ±1.03 and the probability (area) within those limits is 0.6970
For n = 4000, Z = ±4.29 and the probability is essentially equal to 1.
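To see how the probability in 5.14 grows with n, here is a rough sketch (Python 3.8+ assumed; prob_within is a made-up helper name). Because it does not round the s.d. to 0.014 or Z to two decimals, its output differs from the table-based answers above in the second or third decimal place.

```python
from math import sqrt
from statistics import NormalDist

def prob_within(p, n, margin):
    """P(|p_hat - p| < margin) under the normal approximation."""
    sd = sqrt(p * (1 - p) / n)   # s.d. of the sample proportion
    z = margin / sd              # margin expressed in s.d. units
    std_normal = NormalDist()
    return std_normal.cdf(z) - std_normal.cdf(-z)

for n in (250, 1000, 4000):
    print(n, round(prob_within(0.29, n, 0.03), 4))
```

The larger the sample, the more of the sampling distribution falls within 0.03 of 0.29.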
3. Exercise 5.21
(a) The mean of a binomial count is n x p. In this case, n = 1000 and p = 1/5. So, the mean = 1000 x .2 =
200. The standard deviation of a binomial count is √[n x p x (1−p)] = √(1000 x .2 x .8) = 12.649.
(b) The mean and standard deviation of the proportion of binomial successes are p = 1/5 = .2 and √[p(1−p)/n] = .01265
(c) 24% is 3.16 standard deviations above the mean. This is seen by computing a z-score (standardized
value) of (24% - 20%) / .01265 = (.24 - .20) / .01265 = .04 / .01265 = 3.1621. The probability of
exceeding this z-value is 0.00078.
(d) Referring to example 1.28, page 77, and using table A, page T-3, we see that Z should be 2.33
standard deviations above the mean. Following page 77, x = μ + Z * σ = 0.2 + 2.33 * 0.01265 =
0.2295, which is the proportion of successes a subject must have to meet the standard.
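The four parts of 5.21 can be reproduced in a few lines. This is a sketch only (Python 3.8+ assumed). The 0.99 passed to inv_cdf is an assumption inferred from the Z = 2.33 used in part (d); it is not quoted from the exercise.

```python
from math import sqrt
from statistics import NormalDist

n, p = 1000, 0.2

count_mean = n * p                     # (a) mean of the binomial count
count_sd = sqrt(n * p * (1 - p))       # (a) s.d. of the binomial count
prop_sd = sqrt(p * (1 - p) / n)        # (b) s.d. of the sample proportion

z = (0.24 - p) / prop_sd               # (c) z-score for 24%
tail = 1 - NormalDist().cdf(z)         # (c) area to the right of z

z_cut = NormalDist().inv_cdf(0.99)     # (d) assumed: top 1% cutoff, about 2.33
cutoff = p + z_cut * prop_sd           # (d) proportion needed to meet the standard

print(count_mean, round(count_sd, 3), round(prop_sd, 5))
print(round(z, 4), round(tail, 5), round(cutoff, 4))
```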
4. Exercise 5.23
(a) The proportion of blacks among American adults = # of Black or African-American / # of adults in
U. S. = 23,772,494 / 209,128,094 = 0.11367
(b) The binomial model for proportions of success is appropriate, so the mean = n x p = 0.11367 x 1500
= 170.5
(c) Since the mean is very near 170 and we are using the normal approximation to the binomial, the
probability of 170 or fewer is nearly 0.5 (see page 383 to confirm the mean to use is about 170). If
we used the continuity correction, we would get exactly ½ as the probability. The rationale for using
the normal approximation is that the mean (n x p) is much greater than 10 AND the number of non-blacks [n x (1-p)] is also much greater than 10.
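The effect of the continuity correction in 5.23 (c) can be sketched as follows (Python 3.8+ assumed; variable names are illustrative). With the unrounded mean of 170.505, the corrected probability comes out very close to ½ rather than exactly ½.

```python
from math import sqrt
from statistics import NormalDist

n, p = 1500, 0.11367                 # sample size and proportion from parts (a) and (b)

mean = n * p                         # mean of the binomial count
sd = sqrt(n * p * (1 - p))           # s.d. of the binomial count
approx = NormalDist(mean, sd)        # normal approximation to the count

p_plain = approx.cdf(170)            # P(X <= 170) without continuity correction
p_corrected = approx.cdf(170.5)      # P(X <= 170) with continuity correction

print(round(mean, 3), round(sd, 3), round(p_plain, 4), round(p_corrected, 4))
```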
5. $***Exercise 5.25
(a) Using the normal approximation since nπ = 100(0.75) = 75 and n(1−π) = 100*0.25 = 25, the s.d. =
√[π(1−π)/n] = √[0.75 x 0.25/100] = 0.0433. For 70% = 0.70, Z = (0.70 − 0.75)/0.043301 =
−1.1547, and the probability of scoring 70% or lower is about 0.1251.
(b) For n = 250, standard deviation (s.d.) = 0.0274 using same method as in (a), and for 0.7, Z = −1.826
and the probability of scoring lower than 70% is 0.034 using same method as in (a).
(c) To reduce the standard deviation by a factor of 2, we must have 4 times as many questions, namely 400.
The rationale for this comes from recognizing that the standard deviation is computed using √n in the
denominator and that √4 = 2, so a four-fold increase in the sample size cuts the s.d. in half.
(d) Yes, since cutting the s.d. in half always requires increasing n by a factor of 4.
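The "4 times the questions, half the s.d." rule from 5.25 (c) and (d) is easy to confirm numerically. A minimal sketch, with sd_of_proportion as a made-up helper name:

```python
from math import sqrt

def sd_of_proportion(p, n):
    """s.d. of a sample proportion based on n questions."""
    return sqrt(p * (1 - p) / n)

p = 0.75
for n in (100, 250, 400):
    print(n, round(sd_of_proportion(p, n), 4))

# Quadrupling n from 100 to 400 should divide the s.d. by sqrt(4) = 2.
ratio = sd_of_proportion(p, 100) / sd_of_proportion(p, 400)
print(ratio)
```

The ratio does not depend on p, which is why the answer to (d) is yes.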
Section 5.2 (p.402)
6. ***Exercise 5.28
(a) The s.d. of the sample mean is σ/√n = 10/√3 = 5.77
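(b) Recalling 5.25 (c), we must have 4 times as much data to reduce the s.d. by a factor of 2. To reduce
the s.d. of the average from 10 (a single measurement) to 5, Antonio therefore needs n = 4 measurements
in all, since σ/√n = 10/√4 = 5. (Taking 12 measurements, 4 times the 3 in part (a), would instead cut
the s.d. from 10/√3 = 5.77 to 10/√12 = 2.89.) NOTE WELL: The average of several observations is
LESS VARIABLE than a single observation!!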
7. Exercise 5.29
(a) Explanation: Even though the average, x-bar, may be higher or lower than the mean, μ, the amounts
higher will tend to balance the amounts lower.
(b) Explanation: Since the variability of the average is less for larger sample sizes, the average, x-bar,
will tend to be closer to the true mean, μ, when the sample size is larger.
8. $***Exercise 5.32
(a) P(X > 23) = P(Z > (23 − 21)/4.7) = P(Z > 0.425), and the probability of exceeding 0.425 is about 0.3352.
(b) The mean is the same as the mean of the individual observations, namely 21. The s.d. is σ/√n = 4.7/√50
= 0.6647
(c) P(x̄ > 23) = P(Z > (23 − 21)/0.6647) = P(Z > 3.009), and we get the area TO THE RIGHT OF 3.009,
which is 1 − 0.9987 = 0.0013
(d) The answer in (c) is more accurate since it is based on an average (representing repeat measurements)
vs. the answer in (a) representing a single measurement.
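Parts (a) and (c) of 5.32 differ only in the s.d. that is used, which the sketch below makes explicit (Python 3.8+ assumed; names are illustrative).

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 21, 4.7, 50

single = NormalDist(mu, sigma)             # one observation, part (a)
mean_of_n = NormalDist(mu, sigma / sqrt(n))  # average of n = 50, parts (b) and (c)

p_single = 1 - single.cdf(23)              # P(X > 23)
p_mean = 1 - mean_of_n.cdf(23)             # P(x-bar > 23)

print(round(mean_of_n.stdev, 4), round(p_single, 4), round(p_mean, 4))
```

An average of 50 observations exceeding 23 is far less likely than a single observation doing so.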
9. Exercise 5.34
Mean = 40.125 mm and s.d. = 0.002/√4 = 0.002/2 = 0.001 mm. (See page 394)
10. ***Exercise 5.38
(a) The average x̄ has a normal distribution with mean 55,000 miles and s.d. = 4500/√8 = 1590.99
(b) Find the number of s.d.’s that 51,800 is below 55,000 and compute the probability that the average
could be as low as 51,800 or lower. Z = (51,800 − 55,000)/1590.99 = −2.011, which indicates that
51,800 is 2.011 s.d.’s below the mean, and the probability of this occurring is 0.0221.
11. Exercise 5.39
The s.d. of the average, x̄, is σ/√n = 1.2/√200 = 0.0848 flaws per square yard, and Z = (2 −
1.6)/0.0848 = 4.717. The probability of exceeding this value is virtually zero.
12. $***Exercise 5.45
(a) The mean of the difference is the difference of the means, 385 – 360 = 25g. The s.d. of the difference
is given by √[(55²/20) + (50²/20)] = 16.62
(b) The distribution of ȳ is normal, mean 385, and s.d. 50/√20 = 11.18.
The distribution of x̄ is normal, mean 360, and s.d. 55/√20 = 12.298
The distribution of ȳ − x̄ is also normal with mean and s.d. from part (a). Since the mean
difference is 25, the probability of exceeding 25 is ½.
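The rule used in 5.45 (means subtract, variances add) is built into Python's statistics.NormalDist, which allows two independent normal variables to be subtracted directly. A sketch, assuming Python 3.8+ and independence of the two sample means:

```python
from math import sqrt
from statistics import NormalDist

n = 20
y_bar = NormalDist(385, 50 / sqrt(n))   # sampling distribution of y-bar
x_bar = NormalDist(360, 55 / sqrt(n))   # sampling distribution of x-bar

# Difference of two independent normals: means subtract, variances add.
diff = y_bar - x_bar

p_exceeds_25 = 1 - diff.cdf(25)         # P(y-bar - x-bar > 25)

print(diff.mean, round(diff.stdev, 2), p_exceeds_25)
```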
13. ***Exercise 5.49
The s.d. of the difference of the two scores is √[2² + 2²] = √8 = 2.828. The mean of the difference is 0.
Therefore being 5 points away from 0 (in either direction) corresponds to Z = (5 − 0)/2.828 = 1.768
or Z = (−5 − 0)/2.828 = −1.768. So, we add the areas (probabilities) to the left of −1.768 and to the
right of +1.768 to get 0.0385 + 0.0385 = 0.0771, which is the probability that the scores differ by
more than 5 points in either direction.
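The two-tailed calculation in 5.49 can be sketched like this (Python 3.8+ assumed; names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

sd_diff = sqrt(2**2 + 2**2)        # s.d. of the difference of the two scores
diff = NormalDist(0, sd_diff)      # the difference has mean 0

left_tail = diff.cdf(-5)           # P(difference < -5)
right_tail = 1 - diff.cdf(5)       # P(difference > +5)
p_two_sided = left_tail + right_tail

print(round(sd_diff, 3), round(left_tail, 4), round(p_two_sided, 4))
```

By symmetry the two tails are equal, so the answer is simply twice one tail.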
15. ***Exercise 5.52 use SPSS
[Figure: SPSS histogram (vertical axis: Frequency) and boxplot of the original data. Std. Dev = 109.21, Mean = 141.8, N = 72.]
This is a histogram and boxplot of the original data. Below are the means and boxplot of 25 samples.
Descriptive Statistics
Sample    N    Minimum   Maximum   Mean       Std. Deviation
SAMP1     12   43.00     214.00    113.7500   58.61760
SAMP2     12   56.00     147.00     95.0833   29.93769
SAMP3     12   82.00     598.00    186.9167   152.83887
SAMP4     12   73.00     198.00    120.9167   36.09950
SAMP5     12   53.00     243.00    114.3333   50.57548
SAMP6     12   67.00     214.00    126.0000   47.47440
SAMP7     12   79.00     191.00    117.7500   32.93554
SAMP8     12   45.00     598.00    187.0000   176.81012
SAMO9     12   45.00     511.00    130.4167   123.17796
SAMP10    12   45.00     156.00    104.5000   28.63405
SAMP11    12   57.00     522.00    147.9167   127.17022
SAMP12    12   57.00     403.00    183.5833   124.47523
SAMP13    12   53.00     174.00    107.7500   32.45451
SAMP14    12   43.00     249.00    101.9167   54.04116
SAMP15    12   53.00     156.00     99.7500   30.13643
SAMP16    12   56.00     211.00    116.8333   44.54382
SAMP17    12   53.00     598.00    203.4167   190.56064
SAMP18    12   45.00     249.00    107.6667   58.63808
SAMP19    12   80.00     598.00    204.1667   171.55801
SAMP20    12   45.00     511.00    121.5833   126.21443
SAMP21    12   56.00     329.00    118.5833   71.18153
SAMP22    12   57.00     511.00    144.0000   129.83766
SAMP23    12   53.00     511.00    155.6667   133.84410
SAMP24    12   79.00     214.00    113.6667   42.09153
SAMP25    12   80.00     211.00    118.9167   40.15981
Valid N (listwise)   12
[Figure: SPSS side-by-side boxplots of the 25 samples, SAMP1 through SAMP25 (N = 12 each), followed by a boxplot of the 25 sample means, XBARS (N = 25).]
Notice that although the samples can have large standard deviations, the means of the samples don’t vary
much (look at how little the medians in the boxplots fluctuate). The last boxplot is of the x̄’s. The mean of
the x̄’s is 133.7 and the standard deviation is 33.7 vs. the mean of the original data, 141.8 with sd = 109.2.
NOTE: The distribution of the sample means is NOT normal since we started with a right-skewed
distribution.
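The summary figures for the x̄’s can be recomputed from the 25 sample means in the SPSS output. The sketch below also compares their s.d. with σ/√n, using the original data's s.d. of 109.21 as a stand-in for σ. That comparison is an added assumption, not part of the SPSS output, and it ignores how the samples were drawn from the 72 values.

```python
from math import sqrt
from statistics import mean, stdev

# The 25 sample means from the Descriptive Statistics output (SAMP1 ... SAMP25).
sample_means = [
    113.7500, 95.0833, 186.9167, 120.9167, 114.3333,
    126.0000, 117.7500, 187.0000, 130.4167, 104.5000,
    147.9167, 183.5833, 107.7500, 101.9167, 99.7500,
    116.8333, 203.4167, 107.6667, 204.1667, 121.5833,
    118.5833, 144.0000, 155.6667, 113.6667, 118.9167,
]

xbar_mean = mean(sample_means)     # mean of the x-bars
xbar_sd = stdev(sample_means)      # sample s.d. of the x-bars

# Assumed comparison: sigma / sqrt(n) with sigma taken as 109.21 and n = 12.
theory_sd = 109.21 / sqrt(12)

print(round(xbar_mean, 1), round(xbar_sd, 1), round(theory_sd, 1))
```

The observed spread of the means is close to σ/√n and far below the spread of the individual values.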
16. $***Exercise 5.64
(a) The assumption of independence is reasonable because we have 2 physically distinct machines, one of
which makes the caps and the other of which screws the caps on.
(b) The mean of the difference (Torque − Strength) is 7 − 10 = −3 and the s.d. of the difference is
√[(0.9)² + (1.2)²] = 1.5. The Pr[Torque > Strength] = Pr[Torque − Strength > 0] =
Pr[Difference > 0] = Pr[Z > (0 − (−3))/1.5] = Pr(Z > 2) = .0228, which is the probability that a cap will
break while being fastened (See Table T-3)
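As a rough check on 5.64 (b), the same probability can be estimated by simulation instead of Table A. This is a sketch only: the seed and number of trials are arbitrary, and assigning s.d. 0.9 to torque and 1.2 to cap strength is an assumption (the answer depends only on the sum of their squares).

```python
import random

random.seed(12345)   # fixed seed so the run is repeatable

TRIALS = 200_000
breaks = 0
for _ in range(TRIALS):
    torque = random.gauss(7, 0.9)       # assumed: torque has mean 7, s.d. 0.9
    strength = random.gauss(10, 1.2)    # assumed: strength has mean 10, s.d. 1.2
    if torque > strength:               # the cap breaks when torque exceeds strength
        breaks += 1

estimate = breaks / TRIALS
print(f"simulated Pr[Torque > Strength] = {estimate:.4f}")
```

The estimate should land near .0228, varying a little with the seed and the number of trials.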
17. $***When can we assume that the sampling distribution of the sample proportion, p̂, is approximately
normal? Give an explanation for why this is so. Why do we need BOTH parts of the rule? (Think about what
the distribution would look like if π = 0.01 and π = 0.99.)
(1) When 10 ≤ nπ AND 10 ≤ n(1−π)
(2) In order for the normal distribution to approximate the sampling distribution of the sample proportion
well, we must have both the number of successes, nπ, AND the number of failures, n(1−π), to be at least 10.
When π is too small or too large (so 1−π is small), the sampling distribution of the sample proportion will be
skewed: π too small → skewed to the right, π too large → skewed to the left. By increasing the sample size, n, the
distribution appears 'smoother' and the tails become negligible.
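The need for BOTH parts of the rule can be illustrated with the skewness of a binomial count, (1 − 2π)/√[nπ(1−π)]. That formula is a standard result that the answer above does not use. The sketch below is only an illustration; the function names and the choice of n = 100 are made up.

```python
from math import sqrt

def normal_approx_ok(n, p, threshold=10):
    """Rule of thumb: expected successes AND failures both at least 10."""
    return n * p >= threshold and n * (1 - p) >= threshold

def binomial_skewness(n, p):
    """Skewness of a binomial count: positive = right-skewed, negative = left-skewed."""
    return (1 - 2 * p) / sqrt(n * p * (1 - p))

for p in (0.01, 0.5, 0.99):
    print(p, normal_approx_ok(100, p), round(binomial_skewness(100, p), 3))
```

With n = 100, a proportion of 0.01 fails the first half of the rule and is right-skewed, while 0.99 fails the second half and is left-skewed.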
18. ***When can we assume that the sampling distribution of the sample mean, x̄, is approximately
normal? There are two different assumptions for this answer.
(1) When the original population (of individual values) is normal.
Or
(2) When the sample size, n , is “large”.