Answers to Practice Exam #2
Case 1 - We need a hypothesis test for this case, since we are trying to establish whether women athletes have a higher mean red blood cell count than the population of young women in general, whose mean is 4.5 million.
Since $\sigma$ is not known, we use the t test to calculate the probability.
The sample mean is $\bar{x} = 4.9$ and the sample standard deviation is $s = 0.42$.
$H_0: \mu = 4.5$ million
$H_a: \mu > 4.5$ million
$t = \frac{4.9 - 4.5}{0.42/\sqrt{12}} \approx 3.30$
So we want to know the probability of observing a sample mean as large as 4.9 for twelve women athletes if the true population mean red blood cell count is 4.5 million. The t value of 3.30 lets us answer this question.
$P(\bar{x} > 4.9) = P(t > 3.30)$ with 11 degrees of freedom.
Using Excel I get P(t > 3.30) = 0.0035. This is less than 1%, so it provides strong evidence against the null hypothesis.
If the true mean blood cell count of women athletes were 4.5 million, we would observe a sample mean as high as 4.9 from averaging twelve women less than 1% of the time. So we can say that this result is significant at the 1% level, for example.
Using the table of values I see that 0.0025 < P-value < 0.005.
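As a rough cross-check of the Case 1 numbers without Excel, here is a minimal Python sketch. The helper names t_pdf and t_upper_tail are illustrative assumptions, not part of the original answers, and the tail area is approximated with Simpson's rule rather than an exact library routine.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3            # P(0 < T < |t|)
    tail = 0.5 - area               # P(T > |t|) by symmetry
    return tail if t >= 0 else 1.0 - tail

# Case 1 summary statistics from the text.
x_bar, mu0, s, n = 4.9, 4.5, 0.42, 12
t_stat = (x_bar - mu0) / (s / math.sqrt(n))
p_value = t_upper_tail(t_stat, n - 1)
print(f"t = {t_stat:.2f}, one-sided p-value = {p_value:.4f}")
```

The same helper can be reused for any of the one-sided t probabilities in these answers by changing the statistic and the degrees of freedom.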
Case 2 – In this case we use a hypothesis test; however, since $\sigma$ is known we use a z score to calculate the probability.
$H_0: \mu = 100$
$H_a: \mu < 100$
$z = \frac{93 - 100}{15/\sqrt{40}} \approx -2.95$
If the true mean IQ is 100, what is the probability of observing a sample average of 93 or lower for 40 inmates?
$P(\bar{x} < 93) = P(z < -2.95) \approx 0.0016$ (p-value) using Excel.
This means we have less than a 1% chance of observing such a low sample mean (an average IQ of 93) if the true mean is 100. So our result is significant at the 1% level ($\alpha = 1\%$).
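The z calculation in Case 2 can be sketched with Python's standard library. This is only an illustration under the values stated in the text (sample mean 93, $\sigma$ = 15, n = 40); the variable names are my own.

```python
from math import sqrt
from statistics import NormalDist

# Case 2 values from the text: sample mean, hypothesized mean, known sigma, sample size.
x_bar, mu0, sigma, n = 93, 100, 15, 40

z = (x_bar - mu0) / (sigma / sqrt(n))
# Lower-tail probability under the standard normal, since Ha is mu < 100.
p_value = NormalDist().cdf(z)
print(f"z = {z:.2f}, one-sided p-value = {p_value:.4f}")
```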
Case 3 - In this case we need a hypothesis test that compares two population means. Since the population standard deviations are not known, we use the t statistic (with a pooled variance estimate) instead of the z statistic.
$H_0: \mu_w = \mu_{nw}$
$H_a: \mu_w < \mu_{nw}$
Here $\mu_w$ is the mean sentence for white convicted criminals and $\mu_{nw}$ is the mean sentence for non-white criminals.
$s_p^2 = \frac{(30 - 1)26 + (38 - 1)30}{30 + 38 - 2} \approx 28$
$t = \frac{34 - 32}{\sqrt{28\left(\frac{1}{30} + \frac{1}{38}\right)}} \approx 1.55$
So if there is no difference between the two population means, the probability of observing a difference as large as 2 years between the two sample means is given by P(t > 1.55) ≈ 0.063 with 30 + 38 - 2 = 66 degrees of freedom. My p-value is about 0.063.
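The pooled variance and t statistic for Case 3 can be reproduced with a few lines of Python. This is a sketch under the numbers that appear in the formula; the variable names are hypothetical, and, as in the text, the pooled variance is rounded to 28 before computing t.

```python
from math import sqrt

# Sample sizes and sample variances as they appear in the Case 3 formula.
n1, var1 = 30, 26
n2, var2 = 38, 30
mean_diff = 34 - 32   # difference between the two sample means (years)

pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
df = n1 + n2 - 2

# The worked answer rounds the pooled variance to 28; mirror that here.
pooled_var_rounded = round(pooled_var)
t_stat = mean_diff / sqrt(pooled_var_rounded * (1 / n1 + 1 / n2))
print(f"pooled variance = {pooled_var:.2f}, df = {df}, t = {t_stat:.2f}")
```

Using the unrounded pooled variance instead gives a slightly smaller t, so small differences in the last digit are expected.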
Case 4 – The 95% confidence interval is
$(1400 - 1200) \pm 1.976\sqrt{\frac{80^2}{200} + \frac{120^2}{150}}$
using 149 degrees of freedom.
I used Excel and the TINV command: =tinv(0.05,149) = 1.976 = t*
My confidence interval is (178, 222). So my procedure has a 95% chance of capturing the true difference between the mean lifetimes of the two bulb brands. Since the interval lies entirely above zero, it is safe to say that this study shows that brand A bulbs last longer on average than brand B bulbs.
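A short Python sketch of the Case 4 interval follows. It takes t* = 1.976 from the text rather than computing it, and it assumes the first standard deviation and sample size in the formula (80, 200) belong to the brand with mean 1400; that pairing is inferred from the order of the terms and is not stated explicitly.

```python
from math import sqrt

# Summary statistics in the order they appear in the Case 4 formula (pairing assumed).
mean_a, sd_a, n_a = 1400, 80, 200
mean_b, sd_b, n_b = 1200, 120, 150
t_star = 1.976        # from =tinv(0.05,149) in the text

std_error = sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
margin = t_star * std_error
lower = (mean_a - mean_b) - margin
upper = (mean_a - mean_b) + margin
print(f"95% CI for the difference: ({lower:.0f}, {upper:.0f})")
```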
Case 5 – We are dealing with proportions here and we want to estimate the true proportion, so a confidence interval is in order.
If I use the large-sample C.I. formula I get
$0.55 \pm 1.96\sqrt{\frac{0.55(1 - 0.55)}{100}}$
which yields (0.452, 0.648). If I use the Wilson estimate to create the confidence interval I get
$\frac{55 + 2}{100 + 4} \pm 1.96\sqrt{\frac{\frac{57}{104}\left(1 - \frac{57}{104}\right)}{104}}$
which yields the interval (0.452, 0.644), nearly the same as above. There are two reasons for this: the adjusted n is 104, which is considered a big sample, but the factor that plays the bigger role is that our sample proportion is near 0.5. The Wilson estimate was created to address small-sample problems and values near 0 or 1. Some argue, however, that you need a large sample anyway to create a confidence interval for proportions using a z statistic.
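To compare the two intervals in Case 5 side by side, here is a minimal Python sketch. The function names large_sample_ci and plus_four_ci are illustrative assumptions; the second one implements what the text calls the Wilson estimate, that is, adding two successes and two failures before applying the large-sample formula.

```python
from math import sqrt

def large_sample_ci(successes, n, z=1.96):
    """Large-sample z interval for a proportion."""
    p_hat = successes / n
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def plus_four_ci(successes, n, z=1.96):
    """Same formula after adding 2 successes and 2 failures (n grows by 4)."""
    return large_sample_ci(successes + 2, n + 4, z)

# Case 5: 55 successes out of 100.
standard = large_sample_ci(55, 100)
adjusted = plus_four_ci(55, 100)
print("large-sample:", tuple(round(v, 3) for v in standard))
print("plus-four:   ", tuple(round(v, 3) for v in adjusted))
```

Trying a small sample with a proportion near 0 or 1, for example plus_four_ci(1, 10), shows where the two methods start to disagree noticeably.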
Case 6 - In this case we are comparing the means of two populations. The standard deviations are not known, so we need to use a t statistic to calculate the probability.
$\bar{x}_{op} = 150.2$, $s_{op} = 10.1$
$\bar{x}_{np} = 158.2$, $s_{np} = 9.2$
$H_0: \mu_{np} = \mu_{op}$
$H_a: \mu_{np} > \mu_{op}$
$t = \frac{158.2 - 150.2}{\sqrt{\frac{10.1^2}{10} + \frac{9.2^2}{10}}} \approx 1.85$
The degrees of freedom is 9. Now calculate P(t > 1.85) = 0.0487 (p-value found using Excel: =tdist(1.85, 9, 1)). Thus we would observe such a difference or larger about 4.87% of the time even if the mean compression test results for the two processes were equal. Thus at the 10% significance level our result is significant, but at the 1% level this difference is not significant.
Case 7 – In this case a treatment is given to each subject, and the same subject is measured before and after so the results can be compared. Since each before and after measurement belongs to a particular subject, we can determine the change due to the treatment on that subject. So a matched pairs t test is the appropriate test of hypothesis for this situation.
Subject  Pretreatment  Post treatment  Difference
1        3.4           4.5             -1.1
2        3.6           3.4              0.2
3        4.8           6.5             -1.7
4        3.4           3.7             -0.3
5        4.8           7.4             -2.6
6        5.8           6.0              0.2
7        4.2           8.4             -4.2
8        5.7           8.5             -2.8
9        4.1           7.5             -3.4
10       4.3           4.3              0
Thus our population of interest is the differences between the before- and after-treatment measurements.
$H_0: \mu_D = 0$, $H_a: \mu_D < 0$. Notice that my alternative says that the mean is less than zero. This is due to the way I subtracted the numbers (pretreatment minus post treatment).
$\bar{x}_D = -1.57$, $s_D = 1.61$
$t = \frac{-1.57 - 0}{1.61/\sqrt{10}} \approx -3.08$ with 9 degrees of freedom. So now we calculate $P(t < -3.08) \approx 0.0066$ (p-value).
Clearly the observed difference is not very common if we assume that the mean pre- and post-treatment measurements are equal. So our result is clearly significant at the 1% significance level.
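The summary statistics and t value for Case 7 can be checked with a short Python sketch. It works directly from the Difference column as listed in the table; the variable names are my own.

```python
from math import sqrt
from statistics import mean, stdev

# Difference column exactly as listed in the Case 7 table.
differences = [-1.1, 0.2, -1.7, -0.3, -2.6, 0.2, -4.2, -2.8, -3.4, 0.0]

n = len(differences)
d_bar = mean(differences)
s_d = stdev(differences)              # sample standard deviation (n - 1 divisor)
t_stat = (d_bar - 0) / (s_d / sqrt(n))
print(f"mean = {d_bar:.2f}, s = {s_d:.2f}, t = {t_stat:.2f}, df = {n - 1}")
```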
Case 8 – Our proportions give the proportion of people who were divorced after 5 years.
$\hat{p}_c = \frac{145}{500} = 0.29$, $\hat{p}_{nc} = \frac{110}{500} = 0.22$
$\hat{p} = \frac{145 + 110}{500 + 500} = \frac{255}{1000} = 0.255$
$H_0: p_c = p_{nc}$, $H_a: p_c \neq p_{nc}$
$z = \frac{0.29 - 0.22}{\sqrt{0.255(0.745)\left(\frac{1}{500} + \frac{1}{500}\right)}} \approx 2.54$
P(Z > 2.54) = 0.0055. The P-value is 2(0.0055) = 0.011.
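For Case 8, the pooled two-proportion z test can be sketched as follows. The counts are the ones in the text; the pooled proportion is computed as all divorces over all couples, (145 + 110)/(500 + 500), and the variable names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

# Divorce counts and group sizes from the text.
x_c, n_c = 145, 500
x_nc, n_nc = 110, 500

p_c = x_c / n_c
p_nc = x_nc / n_nc
p_pool = (x_c + x_nc) / (n_c + n_nc)          # pooled estimate under H0

std_error = sqrt(p_pool * (1 - p_pool) * (1 / n_c + 1 / n_nc))
z = (p_c - p_nc) / std_error
# Two-sided p-value, since Ha is "not equal".
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(f"pooled p = {p_pool:.3f}, z = {z:.2f}, two-sided p-value = {p_value:.4f}")
```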
Case 9 – Since all we want to do is estimate the mean diameter, a confidence interval is in order. Since the population standard deviation is not known, I will use a t statistic. The degrees of freedom is 199, and the associated t value for 95% confidence is 1.972, which is very close to the associated z value of 1.96. If I round to the nearest thousandth of a unit there is no difference between the answers.
$0.824 \pm 1.972\frac{0.042}{\sqrt{200}}$, which gives (0.818, 0.830).
Case 10 –
Veteran  TCDD in plasma  TCDD in fat tissue  Difference
1        2.5             4.9                 -2.4
2        3.1             5.9                 -2.8
3        1.8             4.2                 -2.4
4        6.0             10                  -4.0
5        6.9             7.0                 -0.1
6        7.2             7.7                 -0.5
7        2.5             2.3                  0.2
8        4.1             2.5                  1.6
9        4.7             4.4                  0.3
10       3.3             2.9                  0.4
11       1.6             1.4                  0.2
12       1.8             1.1                  0.7
13       3.5             6.9                 -3.4
14       1.8             4.2                 -2.4
$\bar{x}_D = -1.0$, $s_D = 1.8$
$H_0: \mu_D = 0$, $H_a: \mu_D \neq 0$
$P(\bar{x}_D < -1.0) = P\left(t < \frac{-1.0 - 0}{1.8/\sqrt{14}}\right) = P(t < -2.08) = 0.0289$ using Excel: =tdist(2.08, 13, 1), since 14 pairs give 13 degrees of freedom.
P-value = 2(0.0289) = 0.0578.
From the table of values: 0.025 < P(t < -2.08) < 0.05, so 0.05 < P-value < 0.1.
Case 11 - Suppose the distribution of the differences is severely skewed (it actually is not), so that a matched pairs t test is not appropriate. We should then employ the sign test.
Car  MPG without Additive  MPG with Additive  Difference  Sign
1    35.8                  36.2                0.4        +
2    37.7                  39.8                2.1        +
3    39.4                  40.1                0.7        +
4    36.8                  39.3                2.5        +
5    36.6                  36.6                0
6    42.1                  40.9                0.8        +
7    38.4                  38.6                0.2        +
8    37.3                  39.1                1.8        +
9    45.2                  42.1               -3.1        -
10   24.1                  29.5                5.4        +
Car number five showed no change, so it does not get counted; thus I really have just nine observations. Of those, eight showed improved gas mileage and one did not. How likely is this to happen if the fuel additive did nothing to improve gas mileage?
Binomial Probability Distribution, n = 9, p = 0.5
# of plus signs  Probability
0                0.001953
1                0.017578
2                0.070313
3                0.164063
4                0.246094
5                0.246094
6                0.164063
7                0.070313
8                0.017578
9                0.001953
Ho: The median mpg is the same with or without the additive.
Ha: The median mpg with the additive is larger than without.
If the additive made no difference I would expect 0.5(9) = 4.5 plus signs.
The binomial table above gives the probability of observing x plus signs. Since I observed eight plus signs, the question is: what is the likelihood of observing that many plus signs or more if we were expecting 4.5 plus signs?
Let X count the number of plus signs.
P(X ≥ 8) = 0.017578 + 0.001953 = 0.019531. The sum is my p-value, so my p-value is just under 2%.
This shows that the result is not very likely to occur just by chance, but as always it is up to the reader to decide whether this result is rare enough to call it statistically significant.
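The sign test in Case 11 reduces to a binomial tail probability, which can be sketched in Python as follows. The differences are taken from the Difference column as listed, and the helper name binom_pmf is an illustrative assumption.

```python
from math import comb

# Difference column as listed in the Case 11 table.
differences = [0.4, 2.1, 0.7, 2.5, 0.0, 0.8, 0.2, 1.8, -3.1, 5.4]

# Zero differences carry no sign, so they are dropped before counting.
nonzero = [d for d in differences if d != 0]
n = len(nonzero)
plus_signs = sum(d > 0 for d in nonzero)

def binom_pmf(k, n, p=0.5):
    """P(X = k) for a binomial(n, p) count."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# One-sided p-value: probability of at least this many plus signs under H0.
p_value = sum(binom_pmf(k, n) for k in range(plus_signs, n + 1))
print(f"n = {n}, plus signs = {plus_signs}, p-value = {p_value:.6f}")
```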