Math 10043
Practice Problems For Exam 1
Chapters 1, 2, 3, & 10
Problems 1 – 32: Circle T for True or F for False. Answers are at the end.
T F (1) The only true average is the mean, x̄ = Σx / n.
T F (2) The number of leaves on a stem-and-leaf display represents the number of data values in the data set.
T F (3) Some sets of numerical data may have more than one mean.
T F (4) The median is the same value as the second quartile.
T F (5) Describing the behavior of a sample is the ultimate objective of statistical analysis.
T F (6) The average size of a population of trees represents a parameter.
T F (7) An outlier is an unusually large or small data value.
T F (8) One’s height is an example of qualitative data.
T F (9) The unit of measure for the standard z-scores is in standard deviations from the mean.
T F (10) An individual’s blood type is an example of continuous data.
T F (11) Standard deviation of a grouped frequency distribution is an approximation of the standard deviation for raw data.
T F (12) The number of children in a family is an example of continuous data.
T F (13) The Empirical Rule states that approximately 95% of the data will lie within three standard deviations of the mean.
T F (14) The time it takes a swimmer to swim one lap is an example of discrete data.
T F (15) The mean GPA at Joel’s school is x̄ = 2.83, with a standard deviation s = 0.47. If Joel’s z-score from his GPA is z = 2.21, then his actual GPA must be 1.32.
T F (16) For the normal distribution, approximately 84% of the data is greater than a value that is one standard deviation below the mean.
T F (17) One’s blood type is an example of qualitative data.
T F (18) The mean is not strongly affected by outliers.
T F (19) On a standardized test, a student’s z-score was near zero. This means that the student’s actual test score was near the standard deviation.
T F (20) A useful way to compare two different types of data is to compare variances.
T F (21) The number of children in a family is an example of discrete data.
T F (22) A statistic is a numerical characteristic of a sample.
SPRING 2013
T F (23) Given a mean winter temperature at Bear Claw, Alaska of –13° F, and s = 12.5° F, the z-score for the 2° F temperature recorded on January 1, 2000 would be z = -0.88.
T F (24) If we know that the distribution of data is bell-shaped, we may use the Empirical Rule.
T F (25) The mean of a sample always divides the data into two equal halves.
T F (26) For the normal distribution, approximately 5% of the data is less than a value that is two standard deviations below the mean. [HINT: A sketch might be helpful.]
T F (27) A measure of central tendency describes how widely the data are dispersed about a central value.
T F (28) If the correlation coefficient has a value near -1, then the two variables have a weak relationship.
T F (29) The correlation coefficient, r, always has a value 0 ≤ r ≤ 1.
T F (30) The scatter plot is used to measure the strength of the relationship between two variables.
T F (31) If the correlation coefficient, r, has a value near zero, then the variables are not linearly related.
T F (32) The regression equation is used to display the relationship between two quantitative variables.
Problems 33 – 34: Multiple Choice. Each has one correct answer.
33. During a series of fitness tests at school, Megan was told that her z-score for sit-ups was 1.25, and
that the mean for the school was 70 sit-ups. Megan knows that she did 76 sit-ups during the test. Find
the standard deviation of sit-ups for the test.
(A) 0.21
(B) 4.75
(C) 4.8
(D) 6.25
(E) 7.5
(F) None of these
34. Bottles of wine from 10 to 20 years old were selected at random and sold at a wholesale auction.
The ages (in years) and corresponding prices (in dollars) yielded a correlation coefficient of r = 0.46. The
line of best fit was ŷ = 9.64 + 2.83x . If it is appropriate, use the regression equation to predict the price
of a bottle of wine that is 18 years old.
(A) Not appropriate because of extrapolation
(B) $60.58
(C) $176.35
(D) Not appropriate because r is weak
(E) None of these
Problems 35 – 58: Show all work.
35. Twenty snow blowers were filled with one gallon of gasoline and allowed to run until the tank was
empty. The times (in minutes) that the snow blowers operated are shown below. (a) Construct a
frequency distribution using a class width of five. Include relative frequency in the table. (b) Sketch a
FREQUENCY histogram of the results. (c) Construct a repeated stem-and-leaf display of the data.
65 72 70 59 60 63 65 66 53 66
68 62 63 70 68 58 75 60 70 76
36. Below are heating costs in dollars for a sample of two-bedroom apartments for one month. Construct
a comparative, repeated stem-and-leaf display contrasting heating costs for gas and electricity.
HEATED BY GAS
26 24 30 28 29 34 20 31
19 32 23 28 26 22 27 18
HEATED BY ELECTRICITY
36 34 42 26 32 25 40 37
31 49 39 35 41 34 43 38
37. Find the mean, median, mode, and midrange of the test scores given below. You may use your
calculator to sort the data, and to check your answers (for mean and median).
63 48 91 79 83 91 79 81 68 99
38. Find the mean, median, mode, midrange, range, and standard deviation of the values given:
9 12 4 6 6 8 7
39. Find the mean, median, mode, midrange, range, interquartile range, and standard deviation of the
values given:
16 14 12 13 4 15 10 18 8 24
40. The following shows the class sizes for several mathematics classes at TCU. (a) Complete the
distribution. (b) How many classes were sampled? (c) Sketch a histogram of the distribution.
(d) Find the approximate mean of the data. (e) Find the approximate standard deviation. No work
required for parts d and e.
Class Size    Number of classes    Relative Frequency
5 – 9         8
10 – 14       4
15 – 19       2
20 – 24       5
25 – 29       4
30 – 34       8
35 – 39       2
40 – 44       14
45 – 49       4
41. A food corporation packages a tin of almonds with an advertised weight of 170 grams. A sample of
tins yields the weights given below. (a) Find the mean and standard deviation of the weights.
Later it was discovered that the scale was five grams low; in other words, each weight should be raised
by 5 grams. (b) How will this affect the mean? (c) How will this affect the standard deviation?
153 171 182 178 163 167 164 151 172 160 158 160
42. The heights in feet of the 14 tallest buildings in Minneapolis are given in the data set below. Find the
mean, median, mode, midrange, range, and standard deviation of these heights.
960 416 775 403 668 366 579
356 561 355 447 340 440 337
43. Test scores received by 24 students are listed below. Find the mean, median, mode, midrange,
range, and standard deviation of the test scores. You may let your calculator find the median for you.
[HINT: Have your calculator SORT the data in order to find the mode.]
63 77 32 71 71 81 85 62
94 96 84 77 94 61 63 90
54 60 75 87 81 71 94 76
44. A study gave the following frequency distribution for the IQs of a group of children. (a) Find the
mean and standard deviation of the IQs. Use 1-VAR STAT only! No work to show on part a.
(b) Sketch a histogram of the data.
IQ         NUMBER OF CHILDREN
60-69      1
70-79      5
80-89      13
90-99      22
100-109    28
110-119    23
120-129    14
130-139    3
140-149    2
150-159    1
45. In 2011, the average age of a car or truck in the U.S. was 10.8 years. Suppose the standard
deviation of the age of such vehicles is 2.3 years. Find the boundary ages of these vehicles within one,
two, and three standard deviations of the mean.
46. Women’s heights are normally distributed with a mean of 65.5 in. and a standard deviation of 2.5 in.
What is the approximate percentage of women’s heights between: (a) 63.0 in. and 68.0 in.?
(b) 60.5 in. and 70.5 in.?
47. The Emperor penguin has a mean height of 1.2 meters, with a standard deviation of approximately
0.25 meters. Assuming that penguin heights are normally distributed, (a) approximately what percentage
of all such penguins will have heights less than 1.7 meters? (b) Suppose that 75 penguins are sampled.
How many should have heights between 0.95 and 1.45 meters?
48. The time it takes a second-grader to complete a standardized test is normally distributed with an
average of 84 minutes, and a standard deviation of 7 minutes. (a) What percentage of second graders
will complete the test in between 70 and 98 minutes? (b) In approximately 68% of the cases, the testing
time will fall between what two times (i.e. what two boundary values)? (c) What percentage of second
graders will complete the test in less than 63 minutes?
49. Two individuals are on a reducing diet. The first, weighing 178 pounds, belongs to an age group for
which the mean weight is 146 pounds with a standard deviation of 14 pounds. The second, who weighs
193 pounds, belongs to an age group for which the mean weight is 160 pounds with a standard deviation
of 17 pounds. Which of these individuals is more seriously overweight for his or her age group? Explain.
50. The giant panda has a mean length (or height) of 132 centimeters, with an approximate standard
deviation of 7.6 centimeters. The red panda has a mean length of 61 centimeters, with an approximate
standard deviation of 3.5 centimeters. A zoo has one of each variety of panda. Their giant panda is 127
centimeters long; their red panda is 59 centimeters long. Which is smaller for its species? Explain.
51. The mean height of adult males is 69 inches, with a standard deviation of 2.8 inches. The mean
height of adult females is 65.5 inches, with a standard deviation of 2.5 inches. My father is 66 inches tall.
My mother is 63 inches tall. Which of these individuals is taller for his or her gender? Justify your
answer.
52. Find the five-number summary and interquartile range of each of the following data sets:
a. 45 42 56 61 38 49 57 69 33 78 45
b. 15 39 15 43 16 31 20 38 21 26 35 27
53. A random sample of ten custom homes listed for sale in an exclusive subdivision in Phoenix, Arizona
provided the following information on size and price, where x denotes size, in hundreds of square feet,
and y denotes listed price, in thousands of dollars. (a) Find and interpret the correlation coefficient.
(b) State the regression equation. (c) Use the regression equation you found to predict the listed price
of a 3500 square foot home. NOTE: On part c, if it is not appropriate to make such a prediction, do not
make the prediction--instead, write a sentence explaining why it is not appropriate.
x    26    27    37    29    29    34    31    40    22    24
y    298   207   390   290   224   305   326   375   195   290
54. In order to establish whether the length of a minnow (x) is related to its age (y), a biological study of
minnows was conducted. The data were used to calculate a correlation coefficient of r = 0.45. Write a
four-part statement to interpret the correlation coefficient given.
55. Students took a math competency test at the beginning of a statistics course. The competency score
(from 0 to 50) and the course grade (as a percent) are given below. (a) Sketch a scatter plot of the data.
To determine whether there is a relationship between math competency test score and final grade in the
statistics course, (b) find and interpret the correlation coefficient and (c) find the regression equation.
(d) Use the regression equation from part c to predict a student's course grade when the competency
score was 28, if it is appropriate to do so. If it is not appropriate, write a sentence explaining why.
x = Competency score    40  36  42  33  44  35  38  42  45  40
y = Course grade        78  80  90  72  95  75  77  83  90  80
56. In order to establish if the age of a baby (x) is related to the average number of hours it sleeps daily
(y), several babies’ ages and sleep times were recorded. The data was used to calculate a correlation
coefficient of r = -0.89. Write a statement to interpret r.
57. A study was performed to investigate the relationship between a secretary’s typing speed x and his
or her reading speed y (both in words per minute). Typing speeds and reading speeds from several
secretaries were sampled, yielding a correlation coefficient of r = 0.48. Write a complete statement to
interpret the correlation coefficient.
58. A study was conducted recording the diastolic blood pressure—x, and systolic blood pressure—y,
(both measured in millimeters of mercury) for a group of women. (a) Sketch a scatter plot of the data.
(b) Find and interpret the correlation coefficient. (c) Find the regression equation. (d) Use the
regression equation you found to predict the systolic blood pressure of a woman whose diastolic blood
pressure is 75 millimeters of mercury. NOTE: On part d, if it is not appropriate to make such a prediction,
do not make the prediction--instead, write a sentence explaining why it is not appropriate.
Diastolic – x    76   70   82   90   68   70   62   60   67   72   80
Systolic – y     122  102  118  126  108  130  104  118  130  116  122
EXAM 1 PRACTICE PROBLEMS – ANSWERS
(Histograms & scatter plots are on the last page)
1. F Median, mode, and midrange are also called averages.
2. T
3. F . . . may have more than one mode.
4. T
5. F . . . the behavior of a population . . .
6. T
7. T
8. F Height, a measure, is an example of continuous data.
9. T
10. F Blood type, a category, is an example of qualitative data.
11. T
12. F Number of children is a count, and is thus discrete data.
13. F . . . approximately 99.7% . . .
14. F Time, a measure, is an example of continuous data.
15. F His actual GPA is 3.87. Plug the three given values into the z-score formula, and solve for x.
16. T (Use a sketch to verify.)
17. T
18. F The median is not strongly affected by outliers.
19. F . . . near the mean.
20. F . . . calculate and compare z-scores.
21. T
22. T
23. F z = 1.20. When we subtract a negative value, we add: 2 - (-13) = 2 + 13.
24. T
25. F The median is the physical center of the data set. This is ONLY true for the
mean when the data set is normally distributed, or where the mean and median
just happen to be equal.
26. F 2.5%. A sketch might help with this.
27. F A measure of dispersion . . .
28. F . . . a strong negative correlation.
29. F r always has a value -1 ≤ r ≤ 1.
30. F The correlation coefficient measures the strength . . .
31. T
32. F A scatter plot is used to display the relationship . . .
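The corrections for 15 and 23 both come from the z-score formula. A worked version is sketched here, using only the values stated in those two problems:

```latex
\[
z = \frac{x - \bar{x}}{s}
\quad\Longrightarrow\quad
x = \bar{x} + z\,s
\]
\[
\text{(15)}\quad x = 2.83 + (2.21)(0.47) = 2.83 + 1.0387 \approx 3.87
\]
\[
\text{(23)}\quad z = \frac{2 - (-13)}{12.5} = \frac{15}{12.5} = 1.20
\]
```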
33. C
34. D
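The arithmetic behind these two choices can be sketched as follows. For 33 the z-score formula is solved for s. For 34 the plug-in value is shown only to explain where choice (B) comes from: an age of 18 years lies inside the sampled 10 to 20 year range, so extrapolation is not the issue, and the key rejects the prediction because r = 0.46 is weak.

```latex
\[
\text{(33)}\quad s = \frac{x - \bar{x}}{z} = \frac{76 - 70}{1.25} = 4.8
\]
\[
\text{(34)}\quad \hat{y} = 9.64 + 2.83(18) = 9.64 + 50.94 = 60.58
\]
```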
35. (a)
Minutes of operation    Number of Snow Blowers    Relative frequency
50 – 54                 1                         0.05
55 – 59                 2                         0.10
60 – 64                 5                         0.25
65 – 69                 6                         0.30
70 – 74                 4                         0.20
75 – 79                 2                         0.10
(c)
STEMS    LEAVES
5L       3
5H       98
6L       03320
6H       558866
7L       0020
7H       56
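As a cross-check on the table in 35(a), here is a minimal Python sketch that tallies the run times into classes of width five. The function name `frequency_table` and the starting limit of 50 are illustrative choices, not part of the handout.

```python
from collections import Counter

# Run times in minutes from problem 35.
times = [65, 72, 70, 59, 60, 63, 65, 66, 53, 66,
         68, 62, 63, 70, 68, 58, 75, 60, 70, 76]

def frequency_table(data, start, width):
    """Return (lower, upper, count, relative frequency) rows for integer data."""
    # Class index k holds the values from start + k*width up to start + k*width + width - 1.
    counts = Counter((x - start) // width for x in data)
    rows = []
    for k in range(max(counts) + 1):
        lower = start + k * width
        count = counts.get(k, 0)
        rows.append((lower, lower + width - 1, count, count / len(data)))
    return rows

table = frequency_table(times, start=50, width=5)
for lower, upper, count, rel in table:
    print(f"{lower} - {upper}: {count:2d}  {rel:.2f}")
```

The printed counts and relative frequencies should match the six rows of the answer table.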
36.
   GAS    STEMS    ELECTRICITY
    89    1H
  3024    2L
879866    2H       65
  2140    3L       4214
          3H       67958
          4L       2013
          4H       9
37. x̄ = 78.2 ; x̃ = 80 ; bimodal: 79, 91 ; midrange: 73.5
38. x̄ = 7.4 ; x̃ = 7 ; mode = 6 ; midrange = 8 ; range = 8 ; s = 2.6
39. x̄ = 13.4 ; x̃ = 13.5 ; no mode ; midrange = 14 ; range = 20 ; iqr = 6 ; s = 5.5
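The summary measures in 37 through 39 can be checked with Python's standard `statistics` module. This is only a sketch: the helper name `describe` is made up for illustration, and `stdev` is the sample standard deviation (dividing by n - 1), which is assumed to be what s means on this handout.

```python
from statistics import mean, median, multimode, stdev

def describe(data):
    """Summary measures used in problems 37-39 (sample standard deviation)."""
    modes = multimode(data)
    return {
        "mean": round(mean(data), 1),
        "median": median(data),
        # If every value occurs once, multimode returns all of them: report no mode.
        "modes": [] if len(modes) == len(data) else sorted(modes),
        "midrange": (min(data) + max(data)) / 2,
        "range": max(data) - min(data),
        "s": round(stdev(data), 1),
    }

scores_37 = [63, 48, 91, 79, 83, 91, 79, 81, 68, 99]
values_38 = [9, 12, 4, 6, 6, 8, 7]
values_39 = [16, 14, 12, 13, 4, 15, 10, 18, 8, 24]

for data in (scores_37, values_38, values_39):
    print(describe(data))
```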
40. (a)
Class Size    Number of classes    Relative Frequency
5 – 9         8                    0.157
10 – 14       4                    0.078
15 – 19       2                    0.039
20 – 24       5                    0.098
25 – 29       4                    0.078
30 – 34       8                    0.157
35 – 39       2                    0.039
40 – 44       14                   0.275
45 – 49       4                    0.078
(b) n = 51 (d) x̄ = 28.7 (e) s = 13.7
41. (a) x̄ = 164.9 s = 9.5 (b) x̄ is raised by 5 grams. (c) s is unchanged.
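The effect of the five-gram scale correction on the almond-tin weights can be confirmed numerically. The sketch below uses the twelve weights from the problem and the sample standard deviation. The variable names are illustrative.

```python
from statistics import mean, stdev

# Almond-tin weights in grams, as recorded by the scale that read low.
weights = [153, 171, 182, 178, 163, 167, 164, 151, 172, 160, 158, 160]
corrected = [w + 5 for w in weights]  # add the missing five grams to every tin

print(round(mean(weights), 1), round(stdev(weights), 1))
print(round(mean(corrected), 1), round(stdev(corrected), 1))
```

Adding a constant to every value moves the mean by that constant and leaves every deviation from the mean, and therefore s, unchanged.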
42. x̄ = 500.2 ft. ; x̃ = 428 ft. ; no mode ; midrange = 648.5 ft. ; range = 623 ft. ; s = 188.1 ft.
43. x̄ = 75.0 ; x̃ = 76.5 ; bimodal: 71, 94 ; midrange = 64 ; range = 64 ; s = 15.3
44. x̄ = 105.04 s = 16.4
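The 1-VAR STAT result for 44 can be reproduced by treating every child in a class as having the class-midpoint IQ. The sketch below assumes that midpoint convention and a sample (n - 1) standard deviation.

```python
from math import sqrt

# Class limits and frequencies from problem 44.
classes = [(60, 69), (70, 79), (80, 89), (90, 99), (100, 109),
           (110, 119), (120, 129), (130, 139), (140, 149), (150, 159)]
freqs = [1, 5, 13, 22, 28, 23, 14, 3, 2, 1]

midpoints = [(lo + hi) / 2 for lo, hi in classes]
n = sum(freqs)
grouped_mean = sum(f * m for f, m in zip(freqs, midpoints)) / n
# Sample variance with every value in a class treated as the class midpoint.
variance = sum(f * (m - grouped_mean) ** 2 for f, m in zip(freqs, midpoints)) / (n - 1)
grouped_sd = sqrt(variance)

print(n, round(grouped_mean, 2), round(grouped_sd, 1))
```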
45. Within one s of x̄ : 8.5 years to 13.1 years; Within two s of x̄ : 6.2 years to 15.4 years;
Within three s of x̄ : 3.9 years to 17.7 years
46. (a) 68% (b) 95%
47. (a) 97.5% (b) 51 penguins
48. (a) 95% (b) between 77 and 91 minutes (c) 0.15%
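Answers 46 through 48 all read percentages off the Empirical Rule sketch. A small lookup table makes the bookkeeping explicit. The names `PERCENT_BELOW` and `percent_between` are hypothetical, and the helper only makes sense when both boundaries sit a whole number of standard deviations from the mean.

```python
# Approximate percentage of a normal distribution lying below
# (mean + k standard deviations), from the 68-95-99.7 Empirical Rule.
PERCENT_BELOW = {-3: 0.15, -2: 2.5, -1: 16.0, 0: 50.0, 1: 84.0, 2: 97.5, 3: 99.85}

def percent_between(value_low, value_high, mean, sd):
    """Empirical Rule percentage between two values a whole number of sd from the mean."""
    k_low = round((value_low - mean) / sd)
    k_high = round((value_high - mean) / sd)
    return PERCENT_BELOW[k_high] - PERCENT_BELOW[k_low]

share_47b = percent_between(0.95, 1.45, mean=1.2, sd=0.25)  # problem 47(b)
print(share_47b, round(share_47b / 100 * 75))               # percent, then penguins out of 75
print(percent_between(70, 98, mean=84, sd=7))               # problem 48(a)
print(PERCENT_BELOW[round((63 - 84) / 7)])                  # problem 48(c): below 63 minutes
```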
49. The first person is more overweight for his/her age group because the z-score is higher; his/her
weight is more standard deviations above the mean.
50. The giant panda is smaller for its species because its z-score is lower; its height is more standard
deviations below the mean.
51. My mother is taller for her gender because her z-score is higher; her height is fewer standard
deviations below the mean.
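The comparisons in 49 through 51 rest on the z-scores themselves, which the key does not list. A short sketch that computes them from the values in the problems follows. The function name `z_score` is illustrative.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Problem 49: weights in pounds.
first = z_score(178, 146, 14)
second = z_score(193, 160, 17)
# Problem 50: lengths in centimeters.
giant = z_score(127, 132, 7.6)
red = z_score(59, 61, 3.5)
# Problem 51: heights in inches.
father = z_score(66, 69, 2.8)
mother = z_score(63, 65.5, 2.5)

for name, z in [("first", first), ("second", second), ("giant", giant),
                ("red", red), ("father", father), ("mother", mother)]:
    print(f"{name}: z = {z:.2f}")
```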
52. (a) 5#-summary: 33, 42, 49, 61, 78 ; iqr = 19
(b) 5#-summary: 15, 18, 26.5, 36.5, 43; iqr = 18.5
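The quartiles in 52 match the median-of-halves method, in which the overall median is left out of both halves when n is odd. The sketch below assumes that convention. Other quartile definitions, including the default of Python's `statistics.quantiles`, can give different values for part (b).

```python
from statistics import median

def five_number_summary(data):
    """Min, Q1, median, Q3, max, with quartiles taken as medians of the two halves."""
    s = sorted(data)
    n = len(s)
    half = n // 2
    lower = s[:half]
    # For odd n the middle value belongs to neither half.
    upper = s[half + 1:] if n % 2 else s[half:]
    return (s[0], median(lower), median(s), median(upper), s[-1])

set_a = [45, 42, 56, 61, 38, 49, 57, 69, 33, 78, 45]
set_b = [15, 39, 15, 43, 16, 31, 20, 38, 21, 26, 35, 27]

for data in (set_a, set_b):
    low, q1, med, q3, high = five_number_summary(data)
    print((low, q1, med, q3, high), "iqr =", q3 - q1)
```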
53. (a) r = 0.79 There is a moderately strong, positive correlation between the square footage and the
listed price of homes in a subdivision of Phoenix, Arizona. (b) ŷ = 9.18x + 15.50 (c) For x = 35 (hundred
square feet), ŷ = 336.8 thousands of dollars, or $336,800
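The calculator output for 53 can be reproduced from the defining sums. This is a sketch written against the ten (x, y) pairs in the problem. The variable names are illustrative.

```python
from math import sqrt

# Problem 53: x = size (hundreds of square feet), y = price (thousands of dollars).
x = [26, 27, 37, 29, 29, 34, 31, 40, 22, 24]
y = [298, 207, 390, 290, 224, 305, 326, 375, 195, 290]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

r = sxy / sqrt(sxx * syy)          # correlation coefficient
slope = sxy / sxx                  # least-squares slope
intercept = y_bar - slope * x_bar  # least-squares intercept

print(f"r = {r:.2f}")
print(f"y-hat = {slope:.2f}x + {intercept:.2f}")
print(f"prediction at x = 35: {slope * 35 + intercept:.1f}")
```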
54. There is a weak, positive correlation between the length and age of minnows.
55. (b) r = 0.88 There is a fairly strong, positive correlation between the grade on a math competency
test and course grade in statistics for college students. (c) ŷ = 1.66x + 16.49 (d) x = 28: It is not
appropriate to predict because of extrapolation; 28 is below the smallest sampled competency score, 33.
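The extrapolation test used in 55(d), and passed in 53(c), is a range check on the sampled x-values. A minimal sketch, with a hypothetical helper name:

```python
def within_sampled_range(x_new, x_sample):
    """True when x_new lies inside the range of x-values used to fit the line."""
    return min(x_sample) <= x_new <= max(x_sample)

competency = [40, 36, 42, 33, 44, 35, 38, 42, 45, 40]  # problem 55
sizes = [26, 27, 37, 29, 29, 34, 31, 40, 22, 24]       # problem 53

print(within_sampled_range(28, competency))  # 55(d): competency score of 28
print(within_sampled_range(35, sizes))       # 53(c): 3500 square feet
```

Passing this check is necessary but not sufficient: answers 34 and 58 decline to predict inside the sampled range because r is weak.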
56. There is a fairly strong, negative correlation between the age and sleep time of babies.
57. There is a weak, positive correlation between a secretary’s typing speed and his/her reading speed.
58. (b) r = 0.37 There is a weak, positive correlation between the diastolic blood pressure and systolic
blood pressure for women. (c) ŷ = 0.40x + 88.56 (d) It is not appropriate to make the prediction
because r is weak.
HISTOGRAMS & SCATTER PLOTS
[Figures not reproduced: histograms for 35b, 40c, and 44b; scatter plots for 55a and 58a.]