Cheryl O’Connor
Grade on hardcopy: 30/30 points
Midterm Examination: 1
EDRS 811 Fall 13: Brigham
EDRS 811
Midterm Examination
Fall 2013 Brigham
Answers are highlighted in yellow (explanations are in red italics)
1. A sample of employees of a large pharmaceutical company has been obtained. The length of
time (in months) they have worked for the company was recorded for each employee. A
stemplot of these data is shown below. In the stemplot 6|2 represents 62 months.
6 | 223345789
7 | 000234445678889
8 | 00112344457999
9 | 0001112358
What would be a better way to represent this data set?
a. Display the data in a time plot.
b. Display the data in a boxplot.
c. Split the stems. (to show the data has ups and downs, not just one central peak)
d. Use a histogram with class width equal to 10.
Use the following to answer questions 2–4:
In a statistics class with 136 students,
the professor records how much money
each student has in their possession
during the first class of the semester.
The histogram shown below represents
the data he collected:
2. What is approximately the percentage
of students with under $10.00 in their
possession?
a. 35%
b. 40%
c. 44% (61/136 = 0.449)
d. 50%
3. Which of the following description(s)
is/are correct regarding the shape of
the histogram? (more than one applies)
a. Skewed right
b. Skewed left
c. Symmetric
d. An outlier is present.
e. Unimodal
f. Bimodal
4. What is approximately the number of students with $30.00 or more in their possession?
a. Less than 5
b. About 10
c. About 30
d. More than 100
Use the following to answer questions 5–8:
During the early part of the 1994 baseball season, many sports fans and baseball players
noticed that the number of home runs being hit seemed to be unusually large. Below are
separate stemplots for the number of home runs by American League and National League
teams based on the team-by-team statistics on home runs hit through Friday, June 3, 1994
(from the Columbus Dispatch sports section, Sunday, June 5, 1994).
American League
2 |
3 | 5
4 | 039
5 | 14788
6 | 488
7 | 57
National League
2 | 9
3 | 1
4 | 26788
5 | 3555
6 | 337
7 |
Legend: 2|9 represents 29.
5. What is the median for the number of home runs for the American League teams?
a. 45
b. 50
c. 50.5
d. 57.5
6. Determine whether each of the following statements is true or false.
a. The American League plot is reasonably symmetric. True
b. The National League plot is bimodal. False (unimodal, mode = 55)
c. The median number of home runs hit by National League teams for this time period was higher than the median for the American League teams. False (50.5 < 57.5)
d. The lowest number of home runs hit by any team for this time period is 29. True
7. What is the mean for the number of home runs for the National League teams?
a. 45
b. 50
c. 50.1
d. 57.5
8. What is the maximum number of home runs from a National League team?
a. 7
b. 70
c. 67
d. 48
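The stemplot questions 5 to 8 can be checked with a short sketch. The two lists below are read off the stemplots by hand (2|9 represents 29), so the variable names and the transcription itself are assumptions rather than part of the exam.

```python
from statistics import mean, median

# Values transcribed from the stemplots (assumed transcription; 2|9 means 29).
american = [35, 40, 43, 49, 51, 54, 57, 58, 58, 64, 68, 68, 75, 77]
national = [29, 31, 42, 46, 47, 48, 48, 53, 55, 55, 55, 63, 63, 67]

# With 14 teams per league, the median is the average of the 7th and 8th values.
al_median = median(american)
nl_median = median(national)

nl_mean = mean(national)   # sum of the 14 values divided by 14
nl_max = max(national)     # largest stem-and-leaf pair

print(al_median, nl_median, round(nl_mean, 1), nl_max)
```

Under this transcription the American League median is 57.5, the National League median is 50.5, the National League mean rounds to 50.1, and the National League maximum is 67.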
9. The ages (to the nearest year) of the 667 people participating in a large workshop are
summarized as follows:
Age                  18   19   20   21   22   23   24   25   32
Number of students   14  120  200  200   90   30   10    2    1
What is true about the median age?
a. It could be any number between 19 and 20.
b. It must be 20.
c. It must be 21.
d. It must be over 21.
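One way to locate the median in a frequency table like the one in question 9 is to accumulate counts until the middle position is reached. The sketch below does that; the dictionary name `counts` is illustrative, and the values are copied from the table.

```python
# Age -> number of participants, copied from the table in question 9.
counts = {18: 14, 19: 120, 20: 200, 21: 200, 22: 90, 23: 30, 24: 10, 25: 2, 32: 1}

n = sum(counts.values())   # total number of participants
middle = (n + 1) // 2      # position of the median when n is odd

running = 0
median_age = None
for age in sorted(counts):
    running += counts[age]          # cumulative count up to this age
    if running >= middle:           # the middle observation falls in this age group
        median_age = age
        break

print(n, middle, median_age)
```

With 667 participants the median is the 334th ordered value, and the cumulative count reaches exactly 334 at age 20.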
Use the following to answer questions 10–14:
The Insurance Institute for Highway Safety publishes data on the total damage suffered by
compact automobiles in a series of controlled, low-speed collisions. The cost for a sample of
9 cars, in hundreds of dollars, is provided below:
10   6   8   10   4   3.5   7.5   8   9
10. What is the median cost of the total damage suffered for this sample of cars?
a. $400
b. $800
c. $730
d. $1000
11. What is the first quartile for the above data?
a. $350
b. $600
c. $500 (median of values below the median, excluding median because odd # cars)
d. $800
12. What is the interquartile range of the above data?
a. $300
b. $400
c. $350
d. $450 (Q3-Q1 = 950-500 = 450)
13. What is the mean of the total damage suffered for this sample of cars?
a. $239
b. $733
c. $800
d. $950
14. Using the correct units, what is the value of the variance?
a. 224.85 dollars
b. 238.48 dollars²
c. 50,555.54 dollars²
d. 56,875 dollars²
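The summary statistics in questions 10 to 14 can be reproduced with a small sketch. The helper `quartiles_by_halves` is a hypothetical name; it implements the rule quoted in question 11 (take the median of each half, leaving out the overall median when the count is odd), which is only one of several quartile conventions.

```python
from statistics import mean, median, variance

# Damage costs in hundreds of dollars, converted to dollars and sorted.
costs = [10, 6, 8, 10, 4, 3.5, 7.5, 8, 9]
dollars = sorted(c * 100 for c in costs)

def quartiles_by_halves(sorted_values):
    """Q1 and Q3 as medians of the lower and upper halves.

    When the count is odd, the overall median is excluded from both halves.
    """
    n = len(sorted_values)
    half = n // 2
    lower = sorted_values[:half]
    upper = sorted_values[half + 1:] if n % 2 else sorted_values[half:]
    return median(lower), median(upper)

med = median(dollars)
q1, q3 = quartiles_by_halves(dollars)
iqr = q3 - q1
avg = mean(dollars)
sample_var = variance(dollars)   # divides by n - 1; units are dollars squared

print(med, q1, q3, iqr, round(avg), round(sample_var))
```

Note that `statistics.variance` is the sample variance (divisor n − 1); the population version, `statistics.pvariance`, divides by n and gives a smaller value.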
Use the following to answer questions 15 and 16:
The Michigan Department of Transportation (M-DOT) is working on a major project: 80% of
the highways in Michigan need to be repaved. To speed completion of this project, many
contractors will be working for M-DOT. Contractors are currently bidding on the next part of
the project. To help make a decision about which contractor to hire, M-DOT collects many
variables besides just the estimated cost. One of those variables is the contractor’s estimate
of the number of workdays required to finish the job. Twenty contractors have bid on the
next job. The boxplot represents their estimates of the number of work days required:
15. What is (approximately) the interquartile range, based on the boxplot?
a. 140 days
b. 270 days (360-90=270)
c. 360 days
d. 760 days
16. Determine whether each of the following statements is true or false.
a. The median number of days is approximately 180. False (median≈160)
b. The minimum number of days is approximately 40. True
c. The maximum number of days is approximately 750. False (max=900, outlier)
d. Twenty-five percent of contractors estimated the number of days to be more than 100. False (about 75%)
Use the following to answer questions 17 and 18:
The asking prices (in thousands of dollars) for a sample of 13 houses currently on the
market in Neighborville are listed below. For convenience, the data have been ordered.
175 199 205 234 259 275 299 304 317 345 355 384 549
17. What is the five-number summary? (in thousands of dollars)
175   219.5   299   350   549
18. Use the 1.5 × IQR rule to determine if there are any outliers present. What is/are the value(s) of the outlier(s)?
a. No outliers present
b. One outlier: 175
c. One outlier: 549 (1.5*(350-219.5) = 195.75; 219.5-195.75 = 23.75, no values below; 350+195.75 = 545.75, one value above = 549)
d. Two outliers: 175 and 549
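The five-number summary and the 1.5 × IQR check for questions 17 and 18 can be sketched as follows. The list assumes the 13 ordered asking prices are 175 through 549 as reconstructed from the exam text, and `five_number_summary` is a hypothetical helper using the same "median of each half" quartile rule as the earlier explanations.

```python
from statistics import median

# Assumption: the 13 ordered asking prices, in thousands of dollars.
prices = [175, 199, 205, 234, 259, 275, 299, 304, 317, 345, 355, 384, 549]

def five_number_summary(sorted_values):
    """Return (min, Q1, median, Q3, max); quartiles are medians of the halves."""
    n = len(sorted_values)
    half = n // 2
    lower = sorted_values[:half]
    upper = sorted_values[half + 1:] if n % 2 else sorted_values[half:]
    return (sorted_values[0], median(lower), median(sorted_values),
            median(upper), sorted_values[-1])

low, q1, med, q3, high = five_number_summary(prices)

# 1.5 x IQR rule: anything beyond the fences is flagged as a suspected outlier.
step = 1.5 * (q3 - q1)
lower_fence, upper_fence = q1 - step, q3 + step
outliers = [p for p in prices if p < lower_fence or p > upper_fence]

print((low, q1, med, q3, high), lower_fence, upper_fence, outliers)
```

With these values the fences are 23.75 and 545.75, so only 549 is flagged.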
Use the following density curve to answer questions 19–22:
[Figure: density curve; vertical axis Density (tick at 0.5), horizontal axis X (ticks at 0, 0.5, 1.0, 1.5, 2.0)]
19. Determine whether each of the following statements regarding this density curve is true or
false.
a. It is symmetric. True
b. The total area under the curve is 1. True
c. The median is 1. True
d. The mean is 1. True
20. For this density curve, what percent of the observations lie above 1.5?
a. 25%
b. 50%
c. 75%
d. 80%
21. For this density curve, what percent of the observations lie between 0.5 and 1.2?
a. 25%
b. 35% ((1.2-0.5)*0.5 = 0.35)
c. 50%
d. 70%
22. Using the standard Normal distribution tables, what is the area under the standard Normal
curve corresponding to Z > –1.22?
a. 0.1151
b. 0.1112
c. 0.8849
d. 0.8888 (1 – P(Z < –1.22) = 1 – 0.1112 = 0.8888)
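The areas in questions 20 to 22 can be checked numerically. This sketch assumes the density curve is flat at height 0.5 over X from 0 to 2 (the figure is not reproduced here, but the explanation for question 21 uses that height), and it uses Python's `statistics.NormalDist` in place of the printed Normal table.

```python
from statistics import NormalDist

# Assumption: uniform density of height 0.5 on the interval from 0 to 2.
height = 0.5
area_above_1_5 = (2.0 - 1.5) * height       # question 20: width times height
area_0_5_to_1_2 = (1.2 - 0.5) * height      # question 21: width times height

# Question 22: area under the standard Normal curve to the right of z = -1.22.
z = -1.22
left_area = NormalDist().cdf(z)             # P(Z < -1.22)
right_area = 1 - left_area                  # P(Z > -1.22)

print(area_above_1_5, round(area_0_5_to_1_2, 2), round(left_area, 4), round(right_area, 4))
```

The computed Normal areas agree with the table values 0.1112 and 0.8888 to four decimal places.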
23. Fill in the blank. When creating a scatterplot, one should use the ___x (horizontal)___ axis for the explanatory variable.
Use the following to answer questions 24 and 25:
A researcher measured the height (in feet) and volume of usable lumber (in cubic feet) of
32 cherry trees. The goal is to determine if the volume of usable lumber can be estimated
from the height of a tree. The results are plotted below:
[Figure: scatterplot of Volume (vertical axis, ticks 10 to 80) against Height (horizontal axis, ticks 60 to 90)]
24. ___Volume___ is the response variable in this study.
25. Select all descriptions that apply to the scatterplot.
a. There is a positive association between height and volume.
b. There is a negative association between height and volume.
c. There is an outlier in the plot. (Apparently)
d. The plot is skewed to the left.
26. A college newspaper interviews a psychologist about a proposed system for rating the
teaching ability of faculty members. The psychologist says, “The evidence indicates that
the correlation between a faculty member’s research productivity and teaching rating is
close to zero.” What would be a correct interpretation of this statement?
a. Good researchers tend to be poor teachers and vice versa.
b. Good teachers tend to be poor researchers and vice versa.
c. Good researchers are just as likely to be good teachers as they are bad teachers. Likewise for poor researchers.
d. Good research and good teaching go together.
27. A phenomenon is observed many, many times under identical conditions. The proportion of
times a particular event A occurs is recorded. What does this proportion represent?
a. The probability of the event A.
b. The distribution of the event A.
c. The correlation of the event A.
d. The variance of the event A.
28. Suppose we roll a red die and a green die. Let R be the event that the number of spots
showing on the red die is three or less, and G be the event that the number of spots
showing on the green die is more than three. The events R and G
are _________.
a. disjoint
b. complements
c. independent
d. reciprocals
29. Consider the following scatterplot of two variables x and y:
[Figure: scatterplot of Y (vertical axis, ticks 0.00 to 0.25) against X (horizontal axis, ticks 0.00 to 1.00)]
What can we conclude from this graph?
a. The correlation between x and y must be close to 1 because there is nearly a perfect relationship between them.
b. The correlation between x and y must be close to –1 because there is nearly a perfect relationship between them, but it is not a straight-line relation.
c. The correlation between x and y is close to 0. (correlation measures linear relationship)
d. The correlation between x and y could be any number between –1 and +1. Without knowing the actual values, we can say nothing more.
30. An outlier is ______.
a. a point in a scatterplot that follows the same pattern as the other points
b. a point in a scatterplot that does not follow the same pattern as the other points
c. Both of the above.
d. Neither A nor B.
31. Match the four graphs labeled A, B, C, and D, with the following four possible values of
the correlation coefficient: –0.9, –0.7, 0.4, 0.95. Assume all four graphs are made on the
same scale.
A) –0.7
B) 0.95
C) 0.4
D) –0.9
Use the following to answer questions 32–34:
If you draw an M&M candy at random from a bag of the candies, the candy you draw will
have one of six colors. The probability of drawing each color depends on the proportion of
each color among all candies made. Assume the table below gives the probabilities for the
color of a randomly chosen M&M:
Color         Brown   Red   Yellow   Green   Orange   Blue
Probability   0.3     0.3   ?        0.1     0.1      0.1
32. What is the probability of drawing a yellow candy?
a. 0.1 (1-0.3-0.3-0.1-0.1-0.1 = 0.1)
b. 0.2
c. 0.3
d. Impossible to determine from the information given.
33. What is the probability of not drawing a red candy?
a. 0.3
b. 0.6
c. 0.7 (1-0.3 = 0.7)
d. 0.9
34. What is the probability that you draw neither a brown nor a green candy?
a. 0.3
b. 0.6 (1-(0.3+0.1) = 1-0.4 = 0.6)
c. 0.7
d. 0.9
Use the following situation to answer questions 35–36:
A study was conducted in a large population of adults concerning eyeglasses for correcting
reading vision. Based on an examination by a qualified professional, the individuals were
judged as to whether or not they needed to wear glasses for reading. In addition it was
determined whether or not they were currently using glasses for reading. The following
table provides the proportions found in the study:
                           Used glasses for reading
                           Yes     No
Judged to need glasses
  Yes                      0.42    0.18
  No                       0.04    0.36
35. What is the probability that the selected adult is judged to need eyeglasses but does not use
them for reading?
a. 0.42
b. 0.18
c. 0.54
d. 0.60
e. 0.36
36. Suppose two adults are selected from the population independently and at random. What is
the probability that both were judged to need eyeglasses and neither was using them for
reading?
a. 0.36
b. 0.1296
c. 0.0324 (0.18*0.18 = 0.0324)
d. 0.9216
e. 0.18
37. Suppose that A and B are two independent events. The probability that event A occurs is 0.4
(i.e., P(A) = 0.4), and that B occurs is P(B) = 0.2. What is the probability that A does not
occur and B also does not occur?
a. 0.92
b. 0.40
c. 0.60
d. 0.08
e. 0.48 ((1-0.4)*(1-0.2) = 0.6*0.8 = 0.48)
Use the following to answer questions 38–40.
The probability distribution of random variable, X, is defined as follows:
X             0    1    2    3    4
Probability   0    .3   .1   .3   .3
38. The table above describes a random variable that is ______.
a. discrete
b. continuous
c. both discrete and continuous
d. None of the above.
39. The expected value of the probability distribution is __2.6_______. (0*0+1*.3+2*.1+3*.3+4*.3 = 2.6)
40. The P(X = 3) = ___0.3_____________________
Use the following to answer questions 41–44:
Suppose that a college determines the following distribution for X = number of courses
taken by a full-time student this semester:
Value of X    3      4    5      6
Probability   0.07   ?    0.25   0.28
41. The probability for X = 4 is missing. What is it?
a. 0.07
b. 0.40 (1-0.07-0.25-0.28 = 0.40)
c. 0.25
d. 0.50
42. What is the average number of courses full-time students at this college take this semester?
a. 4 classes
b. 4.74 classes (3*.07+4*.4+5*.25+6*.28 = 4.74)
c. 4.26 classes
d. 5 classes
43. What is the standard deviation of the number of courses full-time students at this college take
this semester?
a. 0.89 classes
b. 0.94 classes (sqrt(.07*(3-4.74)^2+.4*(4-4.74)^2+.25*(5-4.74)^2+.28*(6-4.74)^2) = 0.94)
c. 1 class
d. 23.36 classes
44. What is P(X > 4.74)?
a. 0.25
b. 0.28
c. 0.53 (P(X=5 or 6) = 0.25+0.28 = 0.53)
d. Impossible to calculate, because X cannot be 4.74
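The calculations in questions 41 to 44 follow directly from the definition of a discrete probability distribution. The sketch below fills in the missing probability and then computes the mean, standard deviation, and P(X > mean); the dictionary name `dist` is illustrative.

```python
from math import sqrt

# Number of courses -> probability, with the value for X = 4 still missing.
dist = {3: 0.07, 5: 0.25, 6: 0.28}

# Probabilities must sum to 1, which determines P(X = 4).
dist[4] = 1 - sum(dist.values())

# Mean: sum of value times probability.
mu = sum(x * p for x, p in dist.items())

# Standard deviation: square root of the probability-weighted squared deviations.
var = sum(p * (x - mu) ** 2 for x, p in dist.items())
sd = sqrt(var)

# P(X > mu): only whole numbers of courses above the mean contribute.
p_above_mean = sum(p for x, p in dist.items() if x > mu)

print(round(dist[4], 2), round(mu, 2), round(sd, 2), round(p_above_mean, 2))
```

Question 44 is answerable even though X can never equal 4.74, because the event X > 4.74 is simply X = 5 or X = 6.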
Use the following to answer questions 45–48:
Suppose that A and B are two independent events with P(A) = 0.3 and P(B) = 0.3.
45. What is P(A and B)?
a. 0.09 (0.3*0.3 = 0.09)
b. 0.52
c. 0.51
d. 0.60
46. What is P(A or B)?
a. 0.09
b. 0.52
c. 0.51 (0.3+0.3-0.09 = 0.51)
d. 0.60
47. What is P(A and B^c)?
a. 0.09
b. 0.49
c. 0.21 (0.3*(1-0.3) = 0.3*0.7 = 0.21)
d. 0.60
48. What is P(A^c or B^c)?
a. 0.40
b. 0.91 ((1-0.3)+(1-0.3)-(1-0.3)*(1-0.3) = 0.7+0.7-(0.7*0.7) = 1.4-0.49 = 0.91)
c. 0.49
d. 1.40
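Questions 45 to 48 all rest on the multiplication rule for independent events and the general addition rule. The sketch below works through them; it also checks the last answer a second way, using the fact that "not A or not B" is the complement of "A and B" (De Morgan's law), which is an added cross-check rather than the method shown in the exam.

```python
p_a, p_b = 0.3, 0.3   # A and B are independent

# Multiplication rule for independent events.
p_a_and_b = p_a * p_b

# General addition rule: P(A or B) = P(A) + P(B) - P(A and B).
p_a_or_b = p_a + p_b - p_a_and_b

# If A and B are independent, so are A and the complement of B.
p_a_and_not_b = p_a * (1 - p_b)

# Addition rule applied to the two complements.
p_not_a_or_not_b = (1 - p_a) + (1 - p_b) - (1 - p_a) * (1 - p_b)

# Cross-check: complement of "A and B".
p_check = 1 - p_a_and_b

print(round(p_a_and_b, 2), round(p_a_or_b, 2),
      round(p_a_and_not_b, 2), round(p_not_a_or_not_b, 2), round(p_check, 2))
```

Both routes to question 48 give 0.91.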
Use the following to answer questions 49-50:
The scores of individual students on the American College Testing (ACT) Program
Composite College Entrance Examination have a Normal distribution with mean 18.6 and
standard deviation 6.0. At Northside High, 36 seniors take the test. Assume the scores at
this school have the same distribution as national scores.
49. What is the mean of the sampling distribution of the sample mean score for a random
sample of 36 students?
a. 1.0
b. 6.0
c. 3.1
d. 18.6
50. What is the standard deviation of the sampling distribution of the sample mean score
for a random sample of 36 students?
a. 1.0 (6.0/sqrt(36) = 6.0/6 = 1.0)
b. 6.0
c. 3.1
d. 18.6
Use the following to answer questions 51 and 52:
Let X represent the SAT score of an entering freshman at University X. The random
variable X is known to have a N(1200, 90) distribution. Let Y represent the SAT score of an
entering freshman at University Y. The random variable Y is known to have a N(1215, 110)
distribution. A random sample of 100 freshmen is obtained from each university. Let X̄ =
the sample mean of the 100 scores from University X, and Ȳ = the sample mean of the 100
scores from University Y.
51. What is the probability that X̄ will be less than 1190?
a. 0.0116
b. 0.4090
c. 0.1335 (N(1200,90/sqrt(100))=N(1200,9); Z=(1190-1200)/9=-1.11; 0.1335 from Table A)
d. 0.4562
52. What is the probability that Ȳ will be less than 1190?
a. 0.0116 (N(1215,110/sqrt(100))=N(1215,11); Z=(1190-1215)/11=-2.27; 0.0116 from Table A)
b. 0.4090
c. 0.1335
d. 0.4562
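The two sampling-distribution probabilities in questions 51 and 52 can be checked with `statistics.NormalDist`. This is a sketch assuming each sample mean is Normal with the population mean and standard deviation sigma/sqrt(n); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 100

# Sampling distributions of the two sample means: same mean, sigma / sqrt(n).
xbar_dist = NormalDist(mu=1200, sigma=90 / sqrt(n))    # N(1200, 9)
ybar_dist = NormalDist(mu=1215, sigma=110 / sqrt(n))   # N(1215, 11)

# Standardized values of 1190 under each sampling distribution.
z_x = (1190 - 1200) / xbar_dist.stdev
z_y = (1190 - 1215) / ybar_dist.stdev

p_x = xbar_dist.cdf(1190)   # P(sample mean at University X < 1190)
p_y = ybar_dist.cdf(1190)   # P(sample mean at University Y < 1190)

print(round(z_x, 2), round(p_x, 4), round(z_y, 2), round(p_y, 4))
```

The z-scores are about −1.11 and −2.27. The computed probabilities differ slightly from the table values 0.1335 and 0.0116 because the table rounds z to two decimals before lookup.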
53. As you increase the margin of error of a confidence interval, _________. (Note: Assume the sample size is fixed.)
a. the confidence level increases.
b. the confidence level decreases.
c. the confidence level remains the same.
Use the following to answer question 54:
The scores on the Wechsler Intelligence Scale for Children (WISC) are thought to be
Normally distributed with a standard deviation of σ = 10. A simple random sample of 25
children is taken, and each is given the WISC. The mean of the 25 scores is x̄ = 104.32.
54. Based on these data, what is a 95% confidence interval for μ?
a. 104.32 ± 0.78
b. 104.32 ± 3.29
c. 104.32 ± 3.92 (1.96*10/sqrt(25) = 3.92)
d. 104.32 ± 19.60
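The margin of error in question 54 can be sketched as z* times sigma/sqrt(n). The code below assumes the population standard deviation is 10 and obtains the 95% critical value from `statistics.NormalDist` rather than from a printed table.

```python
from math import sqrt
from statistics import NormalDist

sigma = 10       # assumed known population standard deviation
n = 25           # sample size
xbar = 104.32    # sample mean

# Critical value for 95% confidence: 2.5% in each tail, so the 0.975 quantile.
z_star = NormalDist().inv_cdf(0.975)

margin = z_star * sigma / sqrt(n)
interval = (xbar - margin, xbar + margin)

print(round(z_star, 2), round(margin, 2), tuple(round(v, 2) for v in interval))
```

The critical value is about 1.96, giving a margin of about 3.92 and an interval of roughly 100.40 to 108.24.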