Name:
uNID:
First Midterm Exam (MATH1070 Spring 2012)
Instructions: This is a one hour exam. You can use a notecard. Calculators
are allowed, but other electronics are prohibited.
1. [40pts] Multiple Choice Problems
In a statistics class with 136 students, the professor records how much
money each student has in his or her possession during the first class of
the semester. The following histogram is of the data collected. Based on
this histogram, answer questions 1) – 3).
1) The number of students with under USD 10 in their possession is
closest to
C
A. 50. B. 70. C. 60. D. 40.
2) The percent of students with over USD 20 in their possession is about
B
A. 10%. B. 20%. C. 30%. D. 40%.
3) From the histogram, which of the following is true?
A
A. The mean is much larger than the median.
B. The mean is much smaller than the median.
C. It is impossible to compare the mean and median for these data.
D. The mean and median are approximately equal.
A sample was taken of the verbal SAT scores of applicants to a California
State College. The following is a boxplot of the scores. Based on this
boxplot, answer questions 4) and 5).
4) Based on this boxplot, the interquartile range is closest to
A. 500. B. 200. C. 600. D. 400.
B
5) If 25 points were added to each score, then the interquartile range of the
new scores would
A
A. remain unchanged.
B. be increased by 5.
C. be increased by 25.
D. be increased by 625.
6) A Normal density curve has which of the following properties?
D
A. It has a peak centered above its mean.
B. It is symmetric.
C. The spread of the curve is proportional to the standard deviation.
D. All of the above.
Refer to the following scatterplot. For each menu item at a fast food
restaurant, fat content (in grams) and number of calories were recorded. A
scatterplot of these data is given below.
7) A plausible value for the correlation between calories and fat is
A. +0.9. B. -0.9. C. -1.2. D. +0.2.
A
8) Which of the following is not true of the correlation coefficient r?
B
A. −1 ≤ r ≤ 1.
B. If r = 0, then there is no relationship between x and y.
C. If r is the correlation between x and y, then r is also the correlation between y and x.
D. Multiplying all data values (x’s and y’s) by 10 will have no impact
on r.
2. [12pts] A company produces packets of soap powder labeled Giant
Size 32 Ounces. The actual weight of soap powder in such a box has
a Normal distribution with a mean of 33 oz and a standard deviation
of 0.7 oz. To avoid having dissatisfied customers, the company says a
box of soap is considered underweight if it weighs less than 32 oz. To
avoid losing money, it labels the top 5% (the heaviest 5%) overweight.
1). What proportion of boxes is underweight (i.e., weigh less than
32 oz)?
2). How heavy does a box have to be for it to be labeled overweight?
1. Let X denote the weight of a box. Then we want to know the
proportion of boxes such that X < 32. The corresponding z-score is
$$Z = \frac{X - 33}{0.7} = \frac{32 - 33}{0.7} = -1.43.$$
From the table of the standard normal cumulative proportions,
we find that the proportion for X < 32 is 0.0764.
2. Let x0 be the threshold of overweight. Then the proportion
corresponding to X ≥ x0 is 5%, or equivalently the proportion
corresponding to X < x0 is 95%. From the table of the standard
normal cumulative proportions, we find that the z-score
corresponding to 0.95 is 1.645 (both 1.64 and 1.65 are O.K.).
Therefore
$$x_0 = 0.7(1.645) + 33 = 34.1515.$$
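Both answers can be cross-checked numerically. The following is only a sketch and not part of the exam solution: it assumes Python 3.8 or later and uses the standard library's `statistics.NormalDist` in place of the printed table. Because it does not round z to two decimals, its values differ from the table-based answers in the fourth decimal place.

```python
from statistics import NormalDist

# Box weight model from the problem: Normal with mean 33 oz and sd 0.7 oz.
weight = NormalDist(mu=33, sigma=0.7)

# Part 1: proportion of boxes weighing less than 32 oz.
p_underweight = weight.cdf(32)

# Part 2: the weight exceeded by only the heaviest 5% of boxes,
# i.e. the 95th percentile of the weight distribution.
x0 = weight.inv_cdf(0.95)

print(f"P(X < 32) = {p_underweight:.4f}")
print(f"x0 = {x0:.4f} oz")
```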
3. [10pts] The following are the heights (in inches) of 25 students in a
given class. Draw the histogram.
51 53 55 55 57 59 60 60 62
62 63 63 64 66 66 67 68 68
68 69 70 70 72 74 78
Since there are 25 observations, it is suggested to use $\sqrt{25} = 5$ bins for
our histogram. (It’s O.K. to use a different number of bins as long as that
number is neither too big nor too small.) The range is 78 − 51 = 27. Thus
the bin size should be around 6. In fact, it is more natural to use 6 bins
and use bin size 5 here. The following is the frequency table:
bins           frequency
50 ≤ x < 55    2
55 ≤ x < 60    4
60 ≤ x < 65    7
65 ≤ x < 70    7
70 ≤ x < 75    4
75 ≤ x < 80    1
Here is the histogram:
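The counts behind this histogram can be tallied mechanically. The sketch below assumes the six half-open bins of width 5 starting at 50 that the solution chose, and prints a rough text histogram. The variable names are illustrative only.

```python
# Heights (in inches) of the 25 students, as listed in the problem.
heights = [51, 53, 55, 55, 57, 59, 60, 60, 62,
           62, 63, 63, 64, 66, 66, 67, 68, 68,
           68, 69, 70, 70, 72, 74, 78]

bin_start = 50   # left edge of the first bin
bin_width = 5
n_bins = 6

# Count how many observations fall in each half-open bin [lo, lo + 5).
counts = [0] * n_bins
for h in heights:
    counts[(h - bin_start) // bin_width] += 1

# Print one row per bin, with a bar of asterisks as a text histogram.
for i, c in enumerate(counts):
    lo = bin_start + i * bin_width
    print(f"{lo} <= x < {lo + bin_width}: {c:2d} {'*' * c}")
```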
4. The following are the grades of 18 students in a given exam.
(a) [4pts] Make a stemplot.
Here we draw a stemplot with split stems, i.e., the stem 6− represents
60 ∼ 64 and the stem 6+ represents 65 ∼ 69. The stemplot is given
as follows:
6− | 03
6+ | 9
7− | 4
7+ | 66789
8− | 23
8+ | 568
9− | 02
9+ | 79
(b) [10pts] Find the five-number summary (min, Q1, median, Q3, max).
Since there are 18 observations, the median is the average of the 9th
and 10th observations, i.e. (79 + 82)/2 = 80.5. Since the median is
not an observation in the data set, the lower half consists of the 9
observations from 60 to 79. Then the first quartile, which is the median
of the lower half, is the 5th observation, which is 76. Similarly, the
third quartile is the 5th observation of the upper half, which is 88.
Therefore the five-number summary is
min    Q1    median    Q3    max
60     76    80.5      88    99
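The same procedure can be written out as a short sketch. It takes the 18 grades as read off the stemplot in part (a) and follows the rule used above, where each quartile is the median of one half of the sorted data. The helper names `median` and `five_number_summary` are illustrative. Library quantile functions may use other conventions and give slightly different quartiles.

```python
# Grades read off the stemplot in part (a), in increasing order.
grades = [60, 63, 69, 74, 76, 76, 77, 78, 79,
          82, 83, 85, 86, 88, 90, 92, 97, 99]

def median(sorted_values):
    """Median of an already sorted list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), quartiles as medians of the halves."""
    data = sorted(values)
    n = len(data)
    # For odd n, the median itself is excluded from both halves.
    lower = data[: n // 2]
    upper = data[(n + 1) // 2:]
    return data[0], median(lower), median(data), median(upper), data[-1]

summary = five_number_summary(grades)
print(summary)
```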
(c) [6pts] Are there any potential outlier(s) according to the 1.5×IQR
rule?
We have
IQR = Q3 − Q1 = 88 − 76 = 12,
and
1.5 × IQR = 12(1.5) = 18.
Since Q1 − 18 = 58 < 60 and Q3 + 18 = 106 > 99, there is no outlier
according to the 1.5 × IQR rule.
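As a check on the arithmetic, the 1.5 × IQR rule can be applied to the grades directly. This sketch reuses the grades read off the stemplot and the quartiles found in part (b). The names `lower_fence` and `upper_fence` are illustrative.

```python
# Grades read off the stemplot in part (a).
grades = [60, 63, 69, 74, 76, 76, 77, 78, 79,
          82, 83, 85, 86, 88, 90, 92, 97, 99]

# Quartiles from part (b).
q1, q3 = 76, 88

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# An observation is a potential outlier if it lies outside the fences.
outliers = [g for g in grades if g < lower_fence or g > upper_fence]

print(f"IQR = {iqr}, fences = ({lower_fence}, {upper_fence})")
print(f"potential outliers: {outliers}")
```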
5. A student wonders if people of similar heights tend to date each other.
She measures herself, her dormitory roommate, and the women in the
adjoining rooms; then she measures the next man each woman dates.
Here are the data (heights in inches).
Women x    66    64    66
Men y      72    68    70
(a) [4pts] What is the mean of the heights of these three women? What
about men?
We have
$$\bar{x} = \frac{66 + 64 + 66}{3} = 65.333$$
and
$$\bar{y} = \frac{72 + 68 + 70}{3} = 70.$$
(b) [8pts] Compute the standard deviation of the height for these 3 men
by completing the following table. Use your calculator only to add,
subtract, multiply, divide, square or take the square root of numbers.
yi    yi − ȳ    (yi − ȳ)²
72     2        4
68    −2        4
70     0        0
Therefore the standard deviation of y is
$$s_y = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})^2} = \sqrt{\frac{1}{3-1}(4 + 4 + 0)} = 2.$$
Now find the standard deviation of the height for these 3 women by
the same procedure.
xi    xi − x̄    (xi − x̄)²
66     0.667    0.444
64    −1.333    1.778
66     0.667    0.444
Therefore the standard deviation of x is
$$s_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2} = \sqrt{\frac{1}{3-1}(0.444 + 1.778 + 0.444)} = 1.155.$$
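The table-based computation for both groups can be reproduced with a short sketch. The helper `sample_sd` is illustrative and follows the same steps as the tables: deviations from the mean, squared deviations, division by n − 1, then the square root.

```python
from math import sqrt

women = [66, 64, 66]  # x, heights in inches
men = [72, 68, 70]    # y, heights in inches

def sample_sd(values):
    """Sample standard deviation, dividing by n - 1 as in the tables above."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(v - mean) ** 2 for v in values]
    return sqrt(sum(squared_deviations) / (n - 1))

s_x = sample_sd(women)
s_y = sample_sd(men)

print(f"s_x = {s_x:.3f}")
print(f"s_y = {s_y:.3f}")
```

The standard library function `statistics.stdev` uses the same n − 1 divisor and should agree with these values.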
(c) [6pts] Find the correlation coefficient r between the height of men
and women.
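$$r = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)$$
$$= \frac{1}{3-1}\left(\frac{0.667}{1.155}\cdot\frac{2}{2} + \frac{-1.333}{1.155}\cdot\frac{-2}{2} + \frac{0.667}{1.155}\cdot\frac{0}{2}\right) = 0.866$$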
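The value of r can be checked with a sketch that follows the same formula: standardize each x and y, multiply the standardized pairs, sum, and divide by n − 1. The helper names `sample_sd` and `correlation` are illustrative.

```python
from math import sqrt

women = [66, 64, 66]  # x, heights in inches
men = [72, 68, 70]    # y, heights in inches

def sample_sd(values):
    """Sample standard deviation with the n - 1 divisor."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

def correlation(xs, ys):
    """Correlation r as the averaged product of standardized values."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_x, s_y = sample_sd(xs), sample_sd(ys)
    products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)

r = correlation(women, men)
print(f"r = {r:.3f}")
```

Swapping the arguments, as in `correlation(men, women)`, gives the same value, since r does not depend on which variable is called x.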