MATH 1131 MIDTERM
Wednesday, October 27; 50 minutes
NAME: SOLUTIONS (WHITE)
STUDENT NUMBER: ______
There are 5 questions on this test, worth a total of 50 points. The questions are in no particular
order.
Make sure that you show your work (when this is possible). When asked for comments in a question,
do so succinctly, and you may use point form if you wish. We will be looking for a correct
answer, not a lengthy answer.
The last page contains the probability distributions for discrete RVs and you may detach it from
the rest of the test. You may also use this page as scrap paper. You may write on the back of the
pages if you run out of space.
Good Luck!
1. (2 marks each) Determine whether each of the following statements is true or false (write
T or F beside each statement). For these questions you do not need to show your work.
(a) For a data set with a strongly right-skewed histogram, the sample mean will be
smaller than the sample median.
F
(b) Suppose that we observe the outcome of three independent coin tosses. This is an
example of a random variable.
(c) The current median income in Toronto is 23,000 annually. Suppose that the top 5%
of Toronto's population doubles its income, the bottom 5% of the population halves
its income, and the income of the remaining population stays the same. Then the
median income is still 23,000.
(d) Two scatterplots for two different data sets are shown below. You know that the
correlation coefficients for the two data sets are 0.823 and -0.518, but you can't
remember which is which for sure. From looking at the plots, we can see that the
correlation for the one on the right must be 0.823, and for the one on the left must be
-0.518.
[Figure: two scatterplots, shown side by side.]
(e) Consider two events, A and B. The probability of event A is 0.6 and the probability of
event B is 0.7. It follows that the two events cannot be mutually exclusive.
2. A large cell phone company has three manufacturing plants: plants A, B, and C. They
are concerned about the number of phones which are manufactured but are defective.
Plants A and B are well-established with good standards of practice, and only 3% of the
products they manufacture are defective. Plant C is relatively new, and 10% of its products
are defective. Suppose that 20% of the company's cell phones are manufactured in plant
A, 40% in plant B, and 40% in plant C.
(a) (4 marks) A cell phone is selected at random from the company's main warehouse.
What is the probability that it is defective?
P(D) = 0.2 · 0.03 + 0.4 · 0.03 + 0.4 · 0.1 = 0.058
(b) (4 marks) Suppose that the cell phone sampled was not defective. What is the
probability that it came from plant C?
P(C | not D) = (0.4 · 0.9) / 0.942 ≈ 0.382
(c) (2 marks) You go to the company's main warehouse and start selecting cell phones
at random. How many do you expect to select before you find your first defective
phone?
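The three parts of question 2 can be checked numerically. The sketch below uses only the plant shares and defect rates stated in the question. The variable names are illustrative, and part (c) assumes that draws are independent with a constant defect probability (a geometric waiting time). Both readings of "before" are shown, since the wording leaves open whether the defective phone itself is counted.

```python
# Plant shares and defect rates as stated in question 2 of this version.
share = {"A": 0.20, "B": 0.40, "C": 0.40}
defect_rate = {"A": 0.03, "B": 0.03, "C": 0.10}

# (a) Law of total probability: P(D) = sum of P(plant) * P(D | plant).
p_defective = sum(share[k] * defect_rate[k] for k in share)

# (b) Bayes' rule: P(C | not D) = P(C) * P(not D | C) / P(not D).
p_ok = 1.0 - p_defective
p_c_given_ok = share["C"] * (1.0 - defect_rate["C"]) / p_ok

# (c) Geometric waiting time (assumption: independent draws, constant p).
expected_draws = 1.0 / p_defective                       # counts the defective phone
expected_good_first = (1.0 - p_defective) / p_defective  # good phones seen first

print(round(p_defective, 3))    # 0.058
print(round(p_c_given_ok, 3))   # 0.382
print(round(expected_draws, 2), round(expected_good_first, 2))  # 17.24 16.24
```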
3. In an environmental impact study, data was gathered which examined the relationship
between the depth of a stream and the rate of its flow. The mean stream depth was
0.438 with a standard deviation of 0.1733205, and the mean rate of flow was 2.077 with
a standard deviation of 2.464233. The correlation between a stream's depth and its rate
of flow was 0.9729808.
[Figure: scatterplot of flow rate against stream depth; horizontal axis "depth", from 0.3 to 0.7.]
(a) (2 marks) Comment on the scatterplot.
strong, positive, non-linear [remainder illegible]
(b) (3 marks) Find the least squares regression line of flow rate on stream depth. Again:
The mean stream depth was 0.438 with a standard deviation of 0.1733205, and the
mean rate of flow was 2.077 with a standard deviation of 2.464233. The correlation
between a stream's depth and its rate of flow was 0.9729808.
b = r (s_y / s_x) = 0.9729808 · (2.464233 / 0.1733205) = 13.83
a = 2.077 - 13.83 · 0.438 = -3.98
ŷ = -3.98 + 13.83 x
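The slope and intercept follow directly from the summary statistics given in the question. A minimal sketch, assuming the usual formulas b = r · s_y / s_x and a = mean_y - b · mean_x; the function name is illustrative only.

```python
def least_squares_from_summary(mean_x, sd_x, mean_y, sd_y, r):
    """Return (intercept, slope) of the least squares line of y on x."""
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return intercept, slope


# x = stream depth, y = flow rate, values as stated in the question.
intercept, slope = least_squares_from_summary(
    mean_x=0.438, sd_x=0.1733205, mean_y=2.077, sd_y=2.464233, r=0.9729808
)
print(round(slope, 2), round(intercept, 2))  # 13.83 -3.98
```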
(c) (3 marks) Predict the flow rate for a stream with depth 0.613, and comment on your
result.
ŷ = -3.98 + 13.83 · (0.613) ≈ 4.498
unreliable [remainder illegible]
(d) (2 marks) You decide to re-do the regression line using only the points with stream
depth less than 0.5. Without doing any calculations, how would this change the least
squares line you obtained in part (b)?
4. A statistician collected 50 pennies and recorded their ages, by calculating age as the
difference between the current year and the year on the penny. The ages she collected are
given below. The data has been sorted from smallest to largest.
 0  0  0  0  0  0  0  0  0  0
 0  0  1  1  1  1  1  1  2  2
 2  3  3  3  4  4  5  5  5  5
 6  8  9  9 10 16 17 17 19 19
19 20 20 21 22 23 25 25 28 36
(a) (2 marks) Comment on the histogram for this data set.
[Figure: histogram of penny ages; horizontal axis "age", from 0 to 40; vertical axis "Frequency".]
skewed; outliers: prob. not [remainder illegible]
(b) (3 marks) Find the five-number summary for this data set.
min = 0, max = 36
median = 4
Q1 = 0 + 0.75 · (1 - 0) = 0.75
Q3 = 17 + 0.25 · (19 - 17) = 17.5
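The quartiles in a five-number summary depend on the interpolation convention. The sketch below assumes the "position p · (n + 1)" rule with linear interpolation between neighbouring order statistics, which is one common textbook convention. The `ages` list is the data as read from this transcript, and the helper name is illustrative.

```python
# Penny ages as read from the transcript (50 sorted values).
ages = ([0] * 12 + [1] * 6 + [2] * 3 + [3] * 3 + [4] * 2 + [5] * 4
        + [6, 8, 9, 9, 10, 16, 17, 17, 19, 19,
           19, 20, 20, 21, 22, 23, 25, 25, 28, 36])


def quantile_n_plus_1(sorted_values, p):
    """Quantile at 1-based position p * (n + 1), interpolating linearly."""
    n = len(sorted_values)
    pos = p * (n + 1)
    lower = min(max(int(pos), 1), n)   # 1-based index at or below pos
    upper = min(lower + 1, n)
    frac = pos - int(pos)
    lo_val = sorted_values[lower - 1]
    hi_val = sorted_values[upper - 1]
    return lo_val + frac * (hi_val - lo_val)


five_number = (
    ages[0],
    quantile_n_plus_1(ages, 0.25),
    quantile_n_plus_1(ages, 0.50),
    quantile_n_plus_1(ages, 0.75),
    ages[-1],
)
print(five_number)  # (0, 0.75, 4.0, 17.5, 36)
```

Other conventions (for example, the median-of-halves rule) can give slightly different quartiles for the same data.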
(c) (2 marks) The mean penny age observed was 8.38 with a standard deviation of 9.67.
What proportion of the data lies within two standard deviations of the mean?
8.38 ± 2 · (9.67) gives the interval (-10.96, 27.72)
48 / 50 = 0.96
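The count can be verified by checking each observation against the interval. This sketch takes the mean and standard deviation from the question as given, rather than recomputing them, and uses the ages as read from this transcript.

```python
# Penny ages as read from the transcript (50 sorted values).
ages = ([0] * 12 + [1] * 6 + [2] * 3 + [3] * 3 + [4] * 2 + [5] * 4
        + [6, 8, 9, 9, 10, 16, 17, 17, 19, 19,
           19, 20, 20, 21, 22, 23, 25, 25, 28, 36])

mean, sd = 8.38, 9.67                 # values stated in the question
low, high = mean - 2 * sd, mean + 2 * sd

# Count observations inside [low, high]; only 28 and 36 fall outside.
inside = sum(low <= a <= high for a in ages)
proportion = inside / len(ages)
print(inside, proportion)  # 48 0.96
```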
(d) (3 marks) Using the empirical rule, what proportion of the data should lie within two
standard deviations of the mean? Compare this result to your answer in part (c)
above, and explain why the two are (or are not) different.
Empirical rule: about 95%. It is surprising that the two answers are so similar, given that
the empirical distribution is so skewed.
5. Let X denote the number of cars owned by a randomly chosen family in Canada, and
suppose that X has the following distribution.
x      0    1    2    3
p(x)   ?    0.2  0.4  0.3
(a) (2 marks) Find the probability that X is equal to zero.
0.1
(b) (2 marks) Find the mean of X.
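Parts (a) and (b) amount to completing the probability table and taking a weighted sum. A minimal sketch, assuming the table reads p(1) = 0.2, p(2) = 0.4, p(3) = 0.3 as in this version; the variable names are illustrative.

```python
# Known probabilities from the table; p(0) is the missing entry.
known = {1: 0.2, 2: 0.4, 3: 0.3}

# (a) Probabilities must sum to 1, so p(0) is the remainder.
p0 = 1.0 - sum(known.values())
pmf = {0: p0, **known}

# (b) Mean of a discrete random variable: sum of x * p(x).
mean_x = sum(x * p for x, p in pmf.items())

print(round(p0, 2), round(mean_x, 2))  # 0.1 1.9
```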
(c) (6 marks) You randomly select 10 Canadian families. What is the probability that at
least two have two or more cars?
1 - P(X = 0) - P(X = 1) = 1 - (0.3)^10 - C(10, 1) · (0.7) · (0.3)^9
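The binomial tail in part (c) can be evaluated directly. The sketch assumes the 10 families are independent, so the number of families with two or more cars is Binomial(10, 0.7), where 0.7 = p(2) + p(3) from the table in this version. The helper `binom_pmf` is illustrative.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(Y = k) for Y ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n = 10
p_two_or_more_cars = 0.4 + 0.3   # P(X >= 2) from the table

# P(at least two families) = 1 - P(none) - P(exactly one).
p_at_least_two = (
    1.0 - binom_pmf(0, n, p_two_or_more_cars) - binom_pmf(1, n, p_two_or_more_cars)
)
print(round(p_at_least_two, 4))  # 0.9999
```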
MATH 1131 MIDTERM
Wednesday, October 27; 50 minutes
NAME: SOLUTIONS
STUDENT NUMBER: ______
There are 5 questions on this test, worth a total of 50 points. The questions are in no particular
order.
Make sure that you show your work (when this is possible). When asked for comments in a question,
do so succinctly, and you may use point form if you wish. We will be looking for a correct
answer, not a lengthy answer.
The last page contains the probability distributions for discrete RVs and you may detach it from
the rest of the test. You may also use this page as scrap paper. You may write on the back of the
pages if you run out of space.
Good Luck!
1. (2 marks each) Determine whether each of the following statements is true or false (write
T or F beside each statement). For these questions you do not need to show your work.
(a) For a data set with a strongly left-skewed histogram, the sample mean will be smaller
than the sample median.
(b) Suppose that we observe the outcome of three independent coin tosses. This is an
example of a random variable.
F
(c) The current mean income in Toronto is 23,000 annually. Suppose that the top 5% of
Toronto's population doubles its income, the bottom 5% of the population halves its
income, and the income of the remaining population stays the same. Then the mean
income is still 23,000.
(d) Two scatterplots for two different data sets are shown below. You know that the
correlation coefficients for the two data sets are 0.823 and -0.518, but you can't
remember which is which for sure. From looking at the plots, we can see that the
correlation for the one on the right must be 0.823, and for the one on the left must be
-0.518.
[Figure: two scatterplots, shown side by side.]
(e) Consider two events, A and B. The probability of event A is 0.8 and the probability of
event B is 0.55. It follows that the two events cannot be mutually exclusive.
T
2. In an environmental impact study, data was gathered which examined the relationship
between the depth of a stream and the rate of its flow. The mean stream depth was
0.438 with a standard deviation of 0.1733205, and the mean rate of flow was 2.077 with
a standard deviation of 2.464233. The correlation between a stream's depth and its rate
of flow was 0.9729808.
[Figure: scatterplot of flow rate against stream depth; horizontal axis "depth", from 0.3 to 0.7.]
(a) (2 marks) Comment on the scatterplot.
SAME AS WHITE
(b) (3 marks) Find the least squares regression line of flow rate on stream depth. Again:
The mean stream depth was 0.438 with a standard deviation of 0.1733205, and the
mean rate of flow was 2.077 with a standard deviation of 2.464233. The correlation
between a stream's depth and its rate of flow was 0.9729808.
(c) (3 marks) Predict the flow rate for a stream with depth 0.600, and comment on your
result.
ŷ = -3.98 + 13.83 · (0.600) = 4.318
(d) (2 marks) You decide to re-do the regression line using only the points with stream
depth less than 0.5. Without doing any calculations, how would this change the least
squares line you obtained in part (b)?
3. A large cell phone company has three manufacturing plants: plants A, B, and C. They
are concerned about the number of phones which are manufactured but are defective.
Plants A and B are well-established with good standards of practice, and only 2% of the
products they manufacture are defective. Plant C is relatively new, and 10% of its products
are defective. Suppose that 30% of the company's cell phones are manufactured in plant
A, 20% in plant B, and 50% in plant C.
(a) (4 marks) A cell phone is selected at random from the company's main warehouse.
What is the probability that it is defective?
0.3 · 0.02 + 0.2 · 0.02 + 0.5 · 0.1 = 0.06
(b) (4 marks) Suppose that the cell phone sampled was not defective. What is the
probability that it came from plant C?
(0.5 · 0.9) / 0.94 = 0.45 / 0.94
(c) (2 marks) You go to the company's main warehouse and start selecting cell phones
at random. How many do you expect to select before you find your first defective
phone?
1 / 0.06 ≈ 16.7
4. A statistician collected 50 pennies and recorded their ages, by calculating age as the
difference between the current year and the year on the penny. The ages she collected are
given below. The data has been sorted from smallest to largest.
 0  0  0  0  0  0  0  0  0  0
 0  0  1  1  1  1  1  1  2  2
 2  3  3  3  4  4  5  5  5  5
 6  8  9  9 10 16 17 17 19 19
19 20 20 21 22 23 25 25 28 36
(a) (2 marks) Comment on the histogram for this data set.
[Figure: histogram of penny ages; horizontal axis "age", from 0 to 40; vertical axis "Frequency".]
(b) (3 marks) Find the five-number summary for this data set.
(c) (2 marks) The mean penny age observed was 8.38 with a standard deviation of 9.67.
What proportion of the data lies within two standard deviations of the mean?
(d) (3 marks) Using the empirical rule, what proportion of the data should lie within two
standard deviations of the mean? Compare this result to your answer in part (c)
above, and explain why the two are (or are not) different.
5. Let X denote the number of cars owned by a randomly chosen family in Canada, and
suppose that X has the following distribution.
x      0    1    2    3
p(x)   ?    0.1  0.5  0.2
(a) (2 marks) Find the probability that X is equal to zero.
0.2
(b) (2 marks) Find the mean of X.
1.7
(c) (6 marks) You randomly select 10 Canadian families. What is the probability that at
least two have two or more cars?
SAME