STAT 200 FINAL PRACTICE TEST (SOLUTIONS)
1.
A) $\frac{15}{56} + \frac{6}{56} = .375$
B) $\frac{3}{8} = .375$
C) $\frac{15}{56} = .2679$
D) $P(G_1 \mid B_2) = \frac{P(G_1 \cap B_2)}{P(B_2)} = \frac{15/56}{35/56} = \frac{15}{35} = .4286$ (note $\cap$ means “and”)
2.
A) $\frac{15}{64} + \frac{9}{64} = .375$
B) $\frac{3}{8} = .375$
C) $\frac{15}{64} = .2344$
D) $P(G_1 \mid B_2) = \frac{P(G_1 \cap B_2)}{P(B_2)} = \frac{15/64}{40/64} = \frac{15}{40} = .375$
E) $1 - \binom{10}{0}(.375)^0(.625)^{10} - \binom{10}{1}(.375)^1(.625)^9 = .9363$
F) $\binom{10}{2}(.375)^2(.625)^8 = .1473$
G) $\mu = np = 100(.625) = 62.5$, $\sigma = \sqrt{npq} = \sqrt{100(.625)(.375)} = 4.841$
H) $z = \frac{59.5 - 62.5}{4.841} = -.62$
$A = .5000 + .2324 = .7324$
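The binomial and normal-approximation arithmetic in problem 2 can be checked with a short script. This is a minimal sketch using only the numbers shown in parts E through H; the names `binom_pmf`, `part_e`, and `part_f` are illustrative and not from the original solutions.

```python
from math import comb, sqrt

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

p = 0.375
# Part E: 1 minus the k = 0 and k = 1 terms, i.e. P(X >= 2) with n = 10
part_e = 1 - binom_pmf(0, 10, p) - binom_pmf(1, 10, p)
# Part F: the k = 2 term with n = 10
part_f = binom_pmf(2, 10, p)

# Parts G and H: n = 100, success probability .625, continuity-corrected cutoff 59.5
mu = 100 * 0.625
sigma = sqrt(100 * 0.625 * 0.375)
z = (59.5 - mu) / sigma

print(round(part_e, 4), round(part_f, 4))
print(round(mu, 1), round(sigma, 3), round(z, 2))
```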
3A.
$\sum x = 3780$, $\sum x^2 = 3584600$
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n-1} = \frac{3584600 - \frac{3780^2}{4}}{3} = 4166.67$
CI for variance or standard deviation use:
Confidence interval for population variance: Use $\chi^2 = \frac{df(s^2)}{\sigma^2}$, get two values from the $\chi^2$ table and solve for $\sigma^2$ twice. (Take the square root if you want $\sigma$.)
df = 3 and the two $\chi^2$ table numbers are .216 and 9.348.
$.216 = \frac{3(4166.67)}{\sigma^2}$ so $\sigma^2 = \frac{3(4166.67)}{.216} = 57870.4$
and $9.348 = \frac{3(4166.67)}{\sigma^2}$ so $\sigma^2 = \frac{3(4166.67)}{9.348} = 1337.18$
1337.18 to 57870.4
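The variance and interval in 3A can be reproduced as follows. This is a sketch under two assumptions: the four lifetimes are the Company 1 values listed in 3D (their sums match the 3780 and 3584600 used here), and the chi-square values are passed in from the table rather than computed. The function names are illustrative.

```python
def sample_variance(data):
    """Shortcut formula: (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(data)
    sum_x = sum(data)
    sum_x2 = sum(x * x for x in data)
    return (sum_x2 - sum_x**2 / n) / (n - 1)

def variance_ci(s2, df, chi2_lower, chi2_upper):
    """Solve chi2 = df * s2 / sigma^2 for sigma^2 at both table values."""
    return df * s2 / chi2_upper, df * s2 / chi2_lower

lifetimes = [990, 1010, 900, 880]  # assumed: Company 1 data from 3D
s2 = sample_variance(lifetimes)
low, high = variance_ci(s2, df=3, chi2_lower=0.216, chi2_upper=9.348)
print(round(s2, 2), round(low, 2), round(high, 1))
```

Taking square roots of `low` and `high` would give the corresponding interval for the standard deviation.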
3B. Sample size needed for CI for mean use: $n = \left(\frac{z\sigma}{E}\right)^2$
$n = \left(\frac{z\sigma}{E}\right)^2 = \left(\frac{1.960(80)}{10}\right)^2 = 246$
3C. Sample size needed for CI for proportion use: $n = \frac{z^2 pq}{E^2}$ (use $p = q = .5$ to guarantee the sample size is large enough; use $p'$ in place of $p$ to get a reasonable estimate)
$n = \frac{z^2 pq}{E^2} = \frac{1.960^2(.5)(.5)}{.015^2} = 4269$
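Both sample-size formulas (3B and 3C) round up to the next whole observation, which is easy to get wrong by hand. A minimal sketch, with illustrative function names; the third call uses the $p = .3$ estimate that appears in part G.

```python
from math import ceil

def n_for_mean(z, sigma, E):
    """Sample size so a CI for the mean has margin of error at most E."""
    return ceil((z * sigma / E) ** 2)

def n_for_proportion(z, E, p=0.5):
    """Sample size for a proportion CI; p = .5 is the worst (largest) case."""
    return ceil(z**2 * p * (1 - p) / E**2)

print(n_for_mean(1.960, 80, 10))            # 3B
print(n_for_proportion(1.960, 0.015))       # 3C
print(n_for_proportion(1.960, 0.015, 0.3))  # part G
```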
3D. Difference of three means use ANOVA:
ANOVA: Collect data from SRS’s from S different groups (sources). Assume the populations are normal and the data are collected independently. Then the F statistic for rejecting $H_0$: All population means are equal, in favor of $H_a$: There is some difference in the population means, is found as follows (the sums are over each of the S sources):
$F_{data} = \frac{s^2_{factor}}{s^2_{error}}$
where $s^2_{factor} = \frac{\sum n_i(\bar{x}_i - \bar{x})^2}{df_{factor}}$ and $s^2_{error} = \frac{\sum (n_i - 1)s_i^2}{df_{error}}$,
$df_{factor} = S - 1$ and $df_{error} = \sum n_i - S$.
$\bar{x} = \frac{\sum n_i \bar{x}_i}{n}$ is the mean of all the data, $n = \sum n_i$, $n_i$ is how many pieces of data come from source i, $\bar{x}_i$ is the sample mean of all the data from source i, and $s_i^2$ is the sample variance of all the data from source i. The ANOVA test is always a one-tail test to the right.

                          Company 1             Company 2         Company 3
Number of bulbs studied   4                     3                 3
Lifetimes                 990, 1010, 900, 880   1000, 900, 1200   850, 800, 1000
$\sum x$                  3780                  3100              2650
$\sum x^2$                3584600               3250000           2362500
Sample mean               945                   1033.33           883.33
Sample variance           4166.67               23333.33          10833.33

$H_0$: All population means are equal
$H_a$: There is some difference in the population means
$S = 3$, $df_{factor} = S - 1 = 2$, $df_{error} = \sum n_i - S = 10 - 3 = 7$
$\bar{x} = \frac{\sum n_i \bar{x}_i}{n} = \frac{3780 + 3100 + 2650}{10} = 953$
$s^2_{factor} = \frac{\sum n_i(\bar{x}_i - \bar{x})^2}{df_{factor}} = \frac{4(945 - 953)^2 + 3(1033.33 - 953)^2 + 3(883.33 - 953)^2}{2} = 17088.2267$
$s^2_{error} = \frac{\sum (n_i - 1)s_i^2}{df_{error}} = \frac{3(4166.67) + 2(23333.33) + 2(10833.33)}{7} = 11547.619$
$F_{data} = \frac{s^2_{factor}}{s^2_{error}} = \frac{17088.2267}{11547.619} = 1.480$
$F_{2,7} = 4.7374$
NO
If there is no difference, the chance we would find such strong or stronger evidence than we got that there is a difference is over 5%. This is assuming all conditions were met and the data were obtained in a proper fashion.
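The ANOVA computation in 3D can be run directly on the raw lifetimes instead of the rounded summary statistics. The sketch below follows the $s^2_{factor}$ and $s^2_{error}$ definitions given above; the function name and return layout are illustrative. Because it avoids intermediate rounding, the mean squares can differ from the hand-computed values in the last few digits.

```python
def one_way_anova_f(groups):
    """Return (F, df_factor, df_error) for a list of samples."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    df_factor = len(groups) - 1
    df_error = n_total - len(groups)
    ss_factor = 0.0
    ss_error = 0.0
    for g in groups:
        mean = sum(g) / len(g)
        # n_i * (group mean - grand mean)^2
        ss_factor += len(g) * (mean - grand_mean) ** 2
        # sum of squared deviations, equal to (n_i - 1) * s_i^2
        ss_error += sum((x - mean) ** 2 for x in g)
    f_stat = (ss_factor / df_factor) / (ss_error / df_error)
    return f_stat, df_factor, df_error

groups = [
    [990, 1010, 900, 880],  # Company 1
    [1000, 900, 1200],      # Company 2
    [850, 800, 1000],       # Company 3
]
f_data, df_factor, df_error = one_way_anova_f(groups)
f_table = 4.7374  # critical value from the F table, as quoted above
print(round(f_data, 3), df_factor, df_error, f_data > f_table)
```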
3E1) Difference of two means from independent samples use:
Difference of means from 2 independent samples: t, df = min of the sample sizes - 1
$t = \frac{(\text{difference of the two sample means, subtracted in appropriate order}) - (\text{difference of the population means in } H_0 \text{, often 0})}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
$H_0: \mu_{other} \le \mu$ or $\mu_{other} - \mu \le 0$
$H_a: \mu_{other} > \mu$ or $\mu_{other} - \mu > 0$
Picture of how $\bar{x}_{other} - \bar{x}$ would be distributed if $\mu_{other} - \mu = 0$ (using t with df = 9)
$t_{data} = \frac{(1040 - 1000) - 0}{\sqrt{\frac{80^2}{20} + \frac{50^2}{10}}} = \frac{40}{23.8747} = 1.675$
NO
If the mean of the other company is not higher, the chance we would find such strong or stronger evidence than we got that it is higher is between 5% and 10%. This is assuming all conditions were met and the data were obtained in a proper fashion.
3E2) $H_0: \mu_{other} = \mu$ or $\mu_{other} - \mu = 0$
$H_a: \mu_{other} \ne \mu$ or $\mu_{other} - \mu \ne 0$
Picture of how $\bar{x}_{other} - \bar{x}$ would be distributed if $\mu_{other} - \mu = 0$ (using t with df = 9)
$t_{data} = \frac{(1040 - 1000) - 0}{\sqrt{\frac{80^2}{20} + \frac{50^2}{10}}} = \frac{40}{23.8747} = 1.675$
NO
If the mean of the other company is the same, the chance we would find such strong or stronger evidence than we got that it is different is between 10% and 20%. This is assuming all conditions were met and the data were obtained in a proper fashion.
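The test statistic shared by 3E1 and 3E2 can be sketched as below. The function name is illustrative, and which company each standard deviation and sample size belongs to is an assumption here; the statistic does not depend on that pairing.

```python
from math import sqrt

def two_sample_t(mean_diff, s1, n1, s2, n2, null_diff=0.0):
    """t statistic for independent samples, with the conservative
    df = (smaller sample size) - 1 used in this document."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    df = min(n1, n2) - 1
    return (mean_diff - null_diff) / se, se, df

# Sample means 1040 and 1000; (s, n) pairs are (80, 20) and (50, 10)
t_data, se, df = two_sample_t(1040 - 1000, 80, 20, 50, 10)
print(round(t_data, 3), round(se, 4), df)
```

Only the tail area changes between the one-tail test in 3E1 and the two-tail test in 3E2; the statistic itself is the same.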
3F) Comparing two variances use: Hypothesis test for ratio of two variances: Use $F_{data} = \frac{s_1^2}{s_2^2}$ and make sure the top is bigger than the bottom. Keep track of the two different df’s. Use the F-table for the critical value(s). Note that because this makes F(data) > 1, even if you have two critical values as in a two-tail test, only the right hand one matters.
Data summary:

                                                          Before = B   After = A
$\sum x$                                                  240          233
$\sum x^2$                                                14402        13583
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n-1}$       .6667        3.5833

$H_0: \sigma_A \le \sigma_B$ or $\frac{\sigma_A}{\sigma_B} \le 1$ (note the A went on the top since A’s sample standard deviation is bigger)
$H_a: \sigma_A > \sigma_B$ or $\frac{\sigma_A}{\sigma_B} > 1$
Picture of how $\frac{s_A^2}{s_B^2}$ would be distributed assuming $\sigma_A = \sigma_B$. Note the tail is .05. The df for the top is 3 and for the bottom is 3.
$F_{3,3} = 9.2766$
$F_{data} = \frac{3.5833}{.6667} = 5.375$
NO
If the insert does not raise the variance, the chance we would find such strong or stronger evidence than we got that it does is over 5%. This is assuming all conditions were met and the data were obtained in a proper fashion.
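The variance ratio in 3F follows from the summary sums alone. In this sketch the sample size n = 4 per group is an assumption inferred from the stated df of 3, and the names are illustrative.

```python
def variance_from_sums(sum_x, sum_x2, n):
    """Sample variance from the sum of x, the sum of x^2, and n."""
    return (sum_x2 - sum_x**2 / n) / (n - 1)

n = 4  # assumed: df = 3 for both top and bottom
s2_before = variance_from_sums(240, 14402, n)
s2_after = variance_from_sums(233, 13583, n)

# Put the larger variance on top so that F >= 1
f_data = max(s2_before, s2_after) / min(s2_before, s2_after)
f_table = 9.2766  # critical value from the F table, as quoted above
print(round(s2_before, 4), round(s2_after, 4), round(f_data, 3), f_data > f_table)
```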
G) Sample size needed for CI for proportion use: $n = \frac{z^2 pq}{E^2}$ (use $p = q = .5$ to guarantee the sample size is large enough; use $p'$ in place of $p$ to get a reasonable estimate)
$n = \frac{z^2 pq}{E^2} = \frac{1.960^2(.3)(.7)}{.015^2} = 3586$
H1) Comparing two means from matched pairs use:
Difference of means from 2 dependent samples (a.k.a. matched pairs): t, df = n-1
$t = \frac{(\text{sample mean of the differences, subtracted in appropriate order}) - (\text{population difference mean in } H_0 \text{, often 0})}{\sqrt{\frac{s^2}{n}}}$ where s is the s.d. of the differences.

                           Bulb 1   Bulb 2   Bulb 3   Bulb 4
Power consumption before   60       61       59       60
Power consumption after    57       58       57       61
before - after             3        3        2        -1

$\sum x = 7$, $\sum x^2 = 23$, $\bar{x} = \frac{\sum x}{n} = \frac{7}{4} = 1.75$
$s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n-1}} = \sqrt{\frac{23 - \frac{7^2}{4}}{3}} = 1.893$
$H_0: \mu_B = \mu_A$ or $\mu_B - \mu_A = 0$
$H_a: \mu_B \ne \mu_A$ or $\mu_B - \mu_A \ne 0$
Picture of how $\bar{x}_B - \bar{x}_A$ would be distributed if $\mu_B - \mu_A = 0$ (using t with df = 3).
$t_{data} = \frac{1.75 - 0}{\frac{1.893}{\sqrt{4}}} = 1.849$
NO
If the insert does not change power consumption, the chance we would find such strong or stronger evidence than we got that it does is between 10% and 20%. This is assuming all conditions were met and the data were obtained in a proper fashion.
H2)
$H_0: \mu_B \le \mu_A$ or $\mu_B - \mu_A \le 0$
$H_a: \mu_B > \mu_A$ or $\mu_B - \mu_A > 0$
Picture of how $\bar{x}_B - \bar{x}_A$ would be distributed if $\mu_B - \mu_A = 0$ (using t with df = 3).
$t_{data} = \frac{1.75 - 0}{\frac{1.893}{\sqrt{4}}} = 1.849$
NO
If the insert does not save power, the chance we would find such strong or stronger evidence than we got that it does is between 5% and 10%. This is assuming all conditions were met and the data were obtained in a proper fashion.
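The matched-pairs statistic used in H1 and H2 can be recomputed from the four bulbs. This sketch takes each difference as before minus after, which is the order that reproduces the listed differences 3, 3, 2, -1; the variable names are illustrative.

```python
from math import sqrt

before = [60, 61, 59, 60]
after = [57, 58, 57, 61]

# Differences taken as before - after
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)
mean_d = sum(diffs) / n
s_d = sqrt((sum(d * d for d in diffs) - sum(diffs) ** 2 / n) / (n - 1))

# t statistic against a hypothesized mean difference of 0, df = n - 1
t_data = (mean_d - 0) / (s_d / sqrt(n))
print(diffs, mean_d, round(s_d, 3), round(t_data, 3))
```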
I1) HT for standard deviation use:
Hypothesis test for population variance: Get critical value(s) from the $\chi^2$ table, and use $\chi^2_{data} = \frac{df(s^2)}{\sigma^2}$ where $\sigma^2$ is from $H_0$.
$H_0: \sigma = 50$
$H_a: \sigma \ne 50$
Picture of how $\frac{(df)s^2}{\sigma^2}$ would be distributed assuming $\sigma = 50$. Note the tails are each .025. df = 3.
$\chi^2 = .216$ and $\chi^2 = 9.348$
From A: $s^2 = 4166.67$
$\chi^2_{data} = \frac{(df)(s^2)}{\sigma^2} = \frac{(3)(4166.67)}{50^2} = 5$
NO
If the standard deviation is 50, the chance we would find such strong or stronger evidence than we got that it is not 50 is between 20% and 100%. This is assuming all conditions were met and the data were obtained in a proper fashion.
I2)
$H_0: \sigma \le 50$
$H_a: \sigma > 50$
Picture of how $\frac{(df)s^2}{\sigma^2}$ would be distributed assuming $\sigma = 50$. Note the tail is .05. df = 3.
$\chi^2 = 7.815$
From A: $s^2 = 4166.67$
$\chi^2_{data} = \frac{(df)(s^2)}{\sigma^2} = \frac{(3)(4166.67)}{50^2} = 5$
NO
If the standard deviation is not over 50, the chance we would find such strong or stronger evidence than we got that it is over 50 is between 10% and 90%. This is assuming all conditions were met and the data were obtained in a proper fashion.
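J) This is not a HT or a CI; it is a probability question, so we need to find the area under the curve.
$z = \frac{1055 - 1000}{80} = .69$
$A = .5 - .2549 = .2451$
K) CI for proportion use:
1 sample proportion: z
$z = \frac{(\text{sample proportion, } p' = \text{number of successes}/n) - (\text{population proportion, } p \text{, in } H_0)}{\text{standard deviation}}$
Standard deviation for HT: $\sqrt{\frac{pq}{n}}$; for CI: $\sqrt{\frac{p'q'}{n}}$
$p' = \frac{22}{60} = .3667$
$.3667 \pm 1.960\sqrt{\frac{(.3667)(.6333)}{60}}$ or $36.67\% \pm 12.19\%$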
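The interval in K can be checked with a few lines. A minimal sketch; the function name is illustrative and the critical value 1.960 is passed in rather than derived.

```python
from math import sqrt

def proportion_ci(successes, n, z=1.960):
    """Return (p', margin) for the interval p' +/- z * sqrt(p'q'/n)."""
    p_hat = successes / n
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin

p_hat, margin = proportion_ci(22, 60)
print(f"{p_hat:.2%} +/- {margin:.2%}")
```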
L) This is not a HT or a CI; it is a probability question, so we need to find the area under the curve.
$z = \frac{900 - 1000}{\frac{80}{\sqrt{4}}} = -2.50$
$A = .5 + .4938 = .9938$
M1) Difference of two proportions, use:
Difference of proportions (percentages, or probabilities of success) from 2 samples: z
$z = \frac{(\text{difference of the two sample proportions, subtracted in appropriate order}) - (\text{difference in population proportions in } H_0 \text{, often 0})}{\text{standard deviation}}$
(0 and non 0 differences have different standard deviations)
HT (0 case): $\sqrt{\frac{p'_{pool}q'_{pool}}{n_1} + \frac{p'_{pool}q'_{pool}}{n_2}}$ where $p'_{pool} = \frac{x_1 + x_2}{n_1 + n_2}$
CI & HT (non 0 case): $\sqrt{\frac{p'_1 q'_1}{n_1} + \frac{p'_2 q'_2}{n_2}}$
$H_0: p_{Other} = p$ or $p_{Other} - p = 0$
$H_a: p_{Other} \ne p$ or $p_{Other} - p \ne 0$
Picture of how all $p'_{Other} - p'$ would be distributed if $p_{Other} - p = 0$. The best evidence that $H_a$ is true is in the shaded part that is in both tails. The total shaded area is .05.
$p'_{pool} = \frac{22 + 20}{60 + 50} = .3818$
$z_{data} = \frac{\left(\frac{20}{50} - \frac{22}{60}\right) - 0}{\sqrt{\frac{(.3818)(.6182)}{50} + \frac{(.3818)(.6182)}{60}}} = \frac{.0333333333}{.0930289625} = .358$
NO
If the percentages are the same, the chance we would find such strong or stronger evidence than we got that they differ is .7188. This is assuming all conditions were met and the data were obtained in a proper fashion.
M2) $H_0: p_{Other} \le p$ or $p_{Other} - p \le 0$
$H_a: p_{Other} > p$ or $p_{Other} - p > 0$
Picture of how all $p'_{Other} - p'$ would be distributed if $p_{Other} - p = 0$. The best evidence that $H_a$ is true is in the right tail of .05.
$p'_{pool} = \frac{22 + 20}{60 + 50} = .3818$
$z_{data} = \frac{\left(\frac{20}{50} - \frac{22}{60}\right) - 0}{\sqrt{\frac{(.3818)(.6182)}{50} + \frac{(.3818)(.6182)}{60}}} = \frac{.0333333333}{.0930289625} = .358$
NO
If the percentage is not greater for the other company, the chance we would find such strong or stronger evidence than we got that it is higher is .3594. This is assuming all conditions were met and the data were obtained in a proper fashion.
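The pooled z statistic and both tail areas from M1 and M2 can be sketched with the standard library, using `math.erf` for the normal CDF. The function names are illustrative. Because the table lookup rounds z to two decimals, the areas computed here agree with the quoted .7188 and .3594 only to about two decimal places.

```python
from math import sqrt, erf

def normal_cdf(z):
    """Standard normal cumulative probability."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def pooled_two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 - p2 = 0 using the pooled proportion."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se, p_pool

# "Other" company: 20 of 50; original company: 22 of 60
z_data, p_pool = pooled_two_proportion_z(20, 50, 22, 60)
right_tail = 1 - normal_cdf(z_data)  # one-tail area, as in M2
two_tail = 2 * right_tail            # two-tail area, as in M1
print(round(p_pool, 4), round(z_data, 3), round(right_tail, 2), round(two_tail, 2))
```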
N) Matched pairs, use:
Difference of means from 2 dependent samples (a.k.a. matched pairs): t, df = n-1
$t = \frac{(\text{sample mean of the differences, subtracted in appropriate order}) - (\text{population difference mean in } H_0 \text{, often 0})}{\sqrt{\frac{s^2}{n}}}$ where s is the s.d. of the differences.
From earlier: $\bar{x} = \frac{\sum x}{n} = \frac{7}{4} = 1.75$ and $s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n-1}} = \sqrt{\frac{23 - \frac{7^2}{4}}{3}} = 1.893$
$1.75 \pm 3.182\frac{1.893}{\sqrt{4}}$ or $1.75 \pm 3.012$
O) Difference of two proportions, use:
Difference of proportions (percentages, or probabilities of success) from 2 samples: z
$z = \frac{(\text{difference of the two sample proportions, subtracted in appropriate order}) - (\text{difference in population proportions in } H_0 \text{, often 0})}{\text{standard deviation}}$
(0 and non 0 differences have different standard deviations)
HT (0 case): $\sqrt{\frac{p'_{pool}q'_{pool}}{n_1} + \frac{p'_{pool}q'_{pool}}{n_2}}$ where $p'_{pool} = \frac{x_1 + x_2}{n_1 + n_2}$
CI & HT (non 0 case): $\sqrt{\frac{p'_1 q'_1}{n_1} + \frac{p'_2 q'_2}{n_2}}$
“other - original”
$p'_{other} = \frac{20}{50} = .40$, $p'_{original} = \frac{22}{60} = .3667$
$(.40 - .3667) \pm 1.960\sqrt{\frac{(.4)(.6)}{50} + \frac{(.3667)(.6333)}{60}}$ or $3.33\% \pm 18.25\%$
P) One mean, use:
1 sample mean: t, df = n-1
$t = \frac{(\text{sample mean}) - (\text{population mean in } H_0)}{\sqrt{\frac{s^2}{n}}}$
From A) $\sum x = 3780$, $\sum x^2 = 3584600$, $\bar{x} = \frac{3780}{4} = 945$
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n-1} = \frac{3584600 - \frac{3780^2}{4}}{3} = 4166.67$
$945 \pm 3.182\sqrt{\frac{4166.67}{4}}$ or $945 \pm 102.70$
Q1) HT for proportion use:
1 sample proportion: z
$z = \frac{(\text{sample proportion, } p' = \text{number of successes}/n) - (\text{population proportion, } p \text{, in } H_0)}{\text{standard deviation}}$
Standard deviation for HT: $\sqrt{\frac{pq}{n}}$; for CI: $\sqrt{\frac{p'q'}{n}}$
$H_0: p = \frac{1}{3}$
$H_a: p \ne \frac{1}{3}$
Picture of how all p’s would be distributed if $p = \frac{1}{3}$. The best evidence for $p \ne \frac{1}{3}$ is in both tails. The total area of both tails is .05.
$z_{data} = \frac{\frac{22}{60} - \frac{1}{3}}{\sqrt{\frac{(.3333)(.6667)}{60}}} = \frac{.03333}{.06086} = .548$
NO
If the percentage is 1/3, the chance we would find such strong or stronger evidence than we got that it is not 1/3 is .5824. This is assuming all conditions were met and the data were obtained in a proper fashion.
Q2) $H_0: p \le \frac{1}{3}$
$H_a: p > \frac{1}{3}$
Picture of how all p’s would be distributed if $p = \frac{1}{3}$. The best evidence for $p > \frac{1}{3}$ is in the right tail of .05.
$z_{data} = \frac{\frac{22}{60} - \frac{1}{3}}{\sqrt{\frac{(.3333)(.6667)}{60}}} = \frac{.03333}{.06086} = .548$
NO
If the percentage is not over 1/3, the chance we would find such strong or stronger evidence than we got that it is over 1/3 is .2912. This is assuming all conditions were met and the data were obtained in a proper fashion.
R1) One mean, use:
1 sample mean: t, df = n-1
$t = \frac{(\text{sample mean}) - (\text{population mean in } H_0)}{\sqrt{\frac{s^2}{n}}}$
$H_0: \mu = 1000$
$H_a: \mu \ne 1000$
Picture of how all $\bar{x}$’s would be distributed if $\mu = 1000$. The best evidence that $\mu \ne 1000$ is in both tails. Each tail is .025.
From earlier: $\sum x = 3780$, $\sum x^2 = 3584600$, $\bar{x} = \frac{3780}{4} = 945$
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n-1} = \frac{3584600 - \frac{3780^2}{4}}{3} = 4166.67$
$t_{data} = \frac{945 - 1000}{\sqrt{\frac{4166.67}{4}}} = \frac{-55}{32.27487} = -1.704$
NO
If the mean was 1000, the chance we would find such strong or stronger evidence than we got that it is not 1000 is between 10% and 20%. This is assuming all conditions were met and the data were obtained in a proper fashion.
R2) $H_0: \mu \ge 1000$
$H_a: \mu < 1000$
Picture of how all $\bar{x}$’s would be distributed if $\mu = 1000$. The best evidence that $\mu < 1000$ is in the left tail of .05.
From earlier: $\sum x = 3780$, $\sum x^2 = 3584600$, $\bar{x} = \frac{3780}{4} = 945$
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n-1} = \frac{3584600 - \frac{3780^2}{4}}{3} = 4166.67$
$t_{data} = \frac{945 - 1000}{\sqrt{\frac{4166.67}{4}}} = \frac{-55}{32.27487} = -1.704$
NO
If the mean was at least 1000, the chance we would find such strong or stronger evidence than we got that it is less than 1000 is between 5% and 10%. This is assuming all conditions were met and the data were obtained in a proper fashion.
S) Difference of two means from independent samples use:
Difference of means from 2 independent samples: t, df = min of the sample sizes - 1
$t = \frac{(\text{difference of the two sample means, subtracted in appropriate order}) - (\text{difference of the population means in } H_0 \text{, often 0})}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
“Other - Original”: $(1040 - 1000) \pm 2.262\sqrt{\frac{80^2}{20} + \frac{50^2}{10}}$ or $40 \pm 54.005$
T) z
4. Show two characteristics are related, use: O and E stuff: Right tails only.
$\chi^2_{data} = \sum\frac{(O - E)^2}{E}$
$H_0$: two characteristics are independent, $H_a$: they are related
df = (r-1)(c-1)
E’s are found by (row total)(column total)/(grand total)
$H_0$: color and year independent
$H_a$: color and year related
O’s

         black   white   blue   tan   Totals
2000     25      35      10     10    80
2001     60      60      16     24    160
2002     32      30      11     7     80
totals   117     125     37     41    320

E’s

       black                  white                  blue                  tan
2000   80(117)/320 = 29.25    80(125)/320 = 31.25    80(37)/320 = 9.25     80(41)/320 = 10.25
2001   160(117)/320 = 58.5    160(125)/320 = 62.5    160(37)/320 = 18.5    160(41)/320 = 20.5
2002   80(117)/320 = 29.25    80(125)/320 = 31.25    80(37)/320 = 9.25     80(41)/320 = 10.25

Picture of how all $\sum\frac{(O - E)^2}{E}$’s would be distributed if color and year were independent. df = (4-1)(3-1) = 6
$\chi^2_{table} = 12.592$
$\chi^2_{data} = \sum\frac{(O - E)^2}{E} = \frac{4.25^2}{29.25} + \frac{3.75^2}{31.25} + \frac{.75^2}{9.25} + \frac{.25^2}{10.25} + \frac{1.5^2}{58.5} + \frac{2.5^2}{62.5} + \frac{2.5^2}{18.5} + \frac{3.5^2}{20.5} + \frac{2.75^2}{29.25} + \frac{1.25^2}{31.25} + \frac{1.75^2}{9.25} + \frac{3.25^2}{10.25} = 3.878$
NO
If there was no relationship, the chance we would find such strong or stronger evidence than we got that there is a relationship is between 10% and 90%. This is assuming all conditions were met and the data were obtained in a proper fashion.
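With twelve cells, the expected counts and the chi-square sum in problem 4 are easy to slip on by hand, so recomputing them from the observed table is a useful check. A minimal sketch, with an illustrative function name; every expected count comes from (row total)(column total)/(grand total) as stated above.

```python
def chi_square_independence(observed):
    """Return (statistic, df, expected) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand = sum(row_totals)
    expected = [[r * c / grand for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )
    df = (len(observed) - 1) * (len(col_totals) - 1)
    return stat, df, expected

# Rows: 2000, 2001, 2002; columns: black, white, blue, tan
observed = [
    [25, 35, 10, 10],
    [60, 60, 16, 24],
    [32, 30, 11, 7],
]
stat, df, expected = chi_square_independence(observed)
for row in expected:
    print(row)
print(round(stat, 3), df, stat > 12.592)
```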
5. Showing data is not distributed a certain way, use: O and E stuff: Right tails only. Use $\chi^2_{data} = \sum\frac{(O - E)^2}{E}$
$H_0$: data distributed a certain way, $H_a$: it’s not distributed that way
df = number of categories - 1
E’s are found using np where p is the probability of being in a category in $H_0$
$H_0: p_{black} = p_{white} = .35$, $p_{blue} = p_{tan} = .15$
$H_a$: not as above

       black          white          blue           tan            Total
O’s    25             35             10             10             80
E’s    80(.35) = 28   80(.35) = 28   80(.15) = 12   80(.15) = 12   80

Picture of how all $\sum\frac{(O - E)^2}{E}$’s would be distributed if the colors were distributed as in $H_0$. df = (4-1) = 3
$\chi^2_{table} = 7.815$
$\chi^2_{data} = \sum\frac{(O - E)^2}{E} = \frac{9}{28} + \frac{49}{28} + \frac{4}{12} + \frac{4}{12} = 2.738$
NO
If the colors were distributed 35-35-15-15, the chance we would find such strong or stronger evidence than we got that the data is not distributed that way is between 10% and 90%. This is assuming all conditions were met and the data were obtained in a proper fashion.
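The goodness-of-fit statistic in problem 5 uses the same O and E sum, with expected counts taken from np. A short sketch with illustrative variable names:

```python
observed = [25, 35, 10, 10]        # black, white, blue, tan
probs = [0.35, 0.35, 0.15, 0.15]   # category probabilities under H0

n = sum(observed)
expected = [n * p for p in probs]
chi2_data = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

chi2_table = 7.815  # right-tail critical value for df = 3, as quoted above
print(round(chi2_data, 3), df, chi2_data > chi2_table)
```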
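6. A)
[Scatter plot: production level (vertical axis, 0 to 80) against dexterity score (horizontal axis, 0 to 20)]
B) $\sum x = 70$, $\sum y = 306$, $\sum x^2 = 1006$, $\sum y^2 = 18984$, $\sum xy = 4362$
$\sum x^2 - \frac{(\sum x)^2}{n} = 26$, $\sum y^2 - \frac{(\sum y)^2}{n} = 256.8$, $\sum xy - \frac{\sum x \sum y}{n} = 78$
$r = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)\left(\sum y^2 - \frac{(\sum y)^2}{n}\right)}} = \frac{78}{\sqrt{(26)(256.8)}} = .955$
C) $H_0: \rho = 0$
$H_a: \rho > 0$
$t_{table} = 2.353$
$t_{data} = r\sqrt{\frac{n-2}{1-r^2}} = .955\sqrt{\frac{3}{1 - .955^2}} = 5.58$
YES
D) $m = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}} = \frac{78}{26} = 3$
$b = \frac{\sum y - m\sum x}{n} = 19.2$
$y = 3x + 19.2$ or productivity = 3(dexterity) + 19.2. Plugging in x = 0 we get y = 19.2 and plugging in x = 14 we get y = 61.2. Next we plot the points and draw the line in the graph.
E) $\sqrt{\frac{\sum y^2 - b\sum y - m\sum xy}{n-2}} = 2.757$
$\sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum x^2 - \frac{(\sum x)^2}{n}}} = \sqrt{\frac{1}{5} + \frac{(14 - 14)^2}{26}} = .447$
$\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum x^2 - \frac{(\sum x)^2}{n}}} = \sqrt{1 + \frac{1}{5} + \frac{(14 - 14)^2}{26}} = 1.095$
$61.2 \pm 3.182(2.757)(1.095)$ or $61.2 \pm 9.61$
F) $61.2 \pm 3.182(2.757)(.447)$ or $61.2 \pm 3.92$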
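The regression quantities in problem 6 all follow from the five sums and n = 5, so they can be recomputed in one pass. A sketch with illustrative variable names: `mean_margin` uses the $\sqrt{1/n + \dots}$ factor (the usual interval for the mean response at $x_0$) and `pred_margin` uses the $\sqrt{1 + 1/n + \dots}$ factor (the usual interval for a single new observation). The critical value 3.182 is taken from the t table, df = 3.

```python
from math import sqrt

n, sum_x, sum_y = 5, 70, 306
sum_x2, sum_y2, sum_xy = 1006, 18984, 4362

# Corrected sums of squares and cross products
sxx = sum_x2 - sum_x**2 / n
syy = sum_y2 - sum_y**2 / n
sxy = sum_xy - sum_x * sum_y / n

r = sxy / sqrt(sxx * syy)          # correlation coefficient
m = sxy / sxx                      # slope
b = (sum_y - m * sum_x) / n        # intercept
s_e = sqrt((sum_y2 - b * sum_y - m * sum_xy) / (n - 2))

x0, t_crit = 14, 3.182
y_hat = m * x0 + b
x_bar = sum_x / n
mean_margin = t_crit * s_e * sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)
pred_margin = t_crit * s_e * sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)

print(round(r, 3), round(m, 3), round(b, 3), round(s_e, 3))
print(round(y_hat, 1), round(mean_margin, 2), round(pred_margin, 2))
```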