Test 1.v2 - La Sierra University

Math 251, 10 October 2003, Exam I
Name: ANSWERS
Instructions: Complete each of the following eight questions, and please explain and justify all
appropriate details in your solutions in order to obtain maximal credit for your answers.
1. (6 pts) Classify the type of sampling used in the following examples.
(a) To maintain quality control, a tire manufacturer tests every 100th tire that comes off of the assembly
line in its plant.
ANS: This is a systematic sample
(b) To conduct a poll, the Join Arnold team randomly chose 8 different prefixes in California (the first
3 digits of the telephone number) and called all households from those prefixes.
ANS: This is a cluster sample. The population was divided into groups, some groups were randomly
selected and every member of the selected groups was surveyed.
(c) To determine student attitudes toward worship requirements at La Sierra, President Geraty gave
questionnaires to ten randomly selected students from each of the following groups: Freshmen,
Sophomores, Juniors, Seniors and Graduate Students.
ANS: This is a stratified sample. The population was divided into groups, and a random sample from
each group was selected.
2. (6 pts) Categorize the following data according to level: nominal, ordinal, interval, or ratio.
(a) The quality of a restaurant’s food: poor, average, good.
ANS: Ordinal—the responses can be ranked, but differences between the ranks do not make sense.
(b) The outdoor temperature in degrees Fahrenheit.
ANS: Interval. Differences between temperatures make sense, but ratios do not.
(c) The length of time of the drive home.
ANS: Ratio—differences and ratios are meaningful.
3. (2 pts) In a set of data with more than two values, how does decreasing the smallest number affect
the mean? How would it affect the median?
ANS: The mean is decreased, but the median remains unchanged.
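A minimal Python sketch of this answer, using a small hypothetical data set (the values below are illustrative and not taken from the exam): lowering only the smallest value pulls the mean down but leaves the middle of the ordered list, and hence the median, where it was.

```python
from statistics import mean, median

# Hypothetical data set with more than two values (illustrative only).
data = [4, 7, 9, 12, 15]

# Decrease only the smallest value; every other value is unchanged.
lowered = [data[0] - 10] + data[1:]

mean_before, mean_after = mean(data), mean(lowered)
median_before, median_after = median(data), median(lowered)

print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```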
4. (6 pts) How will the mean, standard deviation, and coefficient of variation in Population 1
below compare with those in Population 2 below? Explain why, but do not compute the means,
standard deviations or coefficients of variation. Notice that the data values in Population 2 are 5 times
the data values in Population 1.
Pop1:    5   10   15   20   25   30   40   45   50   75   80   95
Pop2:   25   50   75  100  125  150  200  225  250  375  400  475
ANS: The mean and standard deviation of the second population are 5 times the mean and standard
deviation of the first, because both the average and the distance of each data value from the mean
in the second population are exactly 5 times those in the first. However, the coefficients of variation
of the two populations are equal, since the factors of 5 in the mean and standard deviation cancel
when computing the ratio for the coefficient of variation.
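The scaling argument can be checked numerically. The sketch below is only a verification aid (the question itself asks for reasoning, not computation); it uses the population formulas from Python's standard `statistics` module, and the helper name `cv` is an assumption introduced here.

```python
from statistics import mean, pstdev

pop1 = [5, 10, 15, 20, 25, 30, 40, 45, 50, 75, 80, 95]
pop2 = [5 * x for x in pop1]  # every value multiplied by 5


def cv(values):
    """Coefficient of variation (population form), as a percentage."""
    return pstdev(values) / mean(values) * 100


mean_ratio = mean(pop2) / mean(pop1)    # expected to be 5
sd_ratio = pstdev(pop2) / pstdev(pop1)  # expected to be 5

print(mean_ratio, sd_ratio, cv(pop1), cv(pop2))
```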
5. Consider the data (which are systolic blood pressures of 25 subjects):
 95   98  102  106  108  110  112  118  118  120
124  126  126  128  130  130  132  134  136  138
142  152  166  168  184
(a) (2 pts) What class width should be chosen if you would like to have 8 classes?
ANS: First, the range divided by the number of classes is (184 – 95)/8 = 11.125. Now go up to the next
whole number to make sure all data are covered in the 8 classes, so we choose a class width of 12.
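The same calculation as a short Python sketch. It assumes the convention of always moving up to the next whole number (so an exact quotient would also be increased by one), which is why `floor(...) + 1` is used rather than a plain ceiling.

```python
import math

high, low, n_classes = 184, 95, 8

raw_width = (high - low) / n_classes      # range divided by number of classes
class_width = math.floor(raw_width) + 1   # go up to the next whole number

print(raw_width, class_width)
```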
(b) (8 pts) Complete the following table for this data given that the first class has limits 95—109
Lower   Upper   Lower      Upper                              Cumulative   Relative
Limit   Limit   Boundary   Boundary   Midpoint   Frequency    Frequency    Frequency
 95     109      94.5      109.5      102        5             5           .20
110     124     109.5      124.5      117        6            11           .24
125     139     124.5      139.5      132        9            20           .36
140     154     139.5      154.5      147        2            22           .08
155     169     154.5      169.5      162        2            24           .08
170     184     169.5      184.5      177        1            25           .04
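The table entries can be reproduced with a short Python sketch. It assumes a class width of 15, which is what the given first-class limits of 95 and 109 imply, and six classes, which is enough to cover the largest value, 184. The dictionary keys are names chosen here for illustration.

```python
data = [95, 98, 102, 106, 108, 110, 112, 118, 118, 120,
        124, 126, 126, 128, 130, 130, 132, 134, 136, 138,
        142, 152, 166, 168, 184]

first_lower, width, n_classes = 95, 15, 6  # width implied by limits 95 to 109

rows = []
cumulative = 0
for i in range(n_classes):
    lower = first_lower + i * width
    upper = lower + width - 1
    freq = sum(lower <= x <= upper for x in data)  # count values in the class
    cumulative += freq
    rows.append({
        "limits": (lower, upper),
        "boundaries": (lower - 0.5, upper + 0.5),
        "midpoint": (lower + upper) / 2,
        "freq": freq,
        "cum_freq": cumulative,
        "rel_freq": freq / len(data),
    })

for row in rows:
    print(row)
```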
(c) (10 pts) Find the median, Q1, and Q3 for the above data. Draw a box and whisker plot for the data.
You may draw it horizontally if you prefer.
ANS: The median is in the (25 + 1)/2th place, i.e. the 13th place, so the median is 126.
The first quartile is the median of the first 12 numbers, which is 111 (average of 6th and 7th data).
The third quartile is the median of the highest 12 numbers which is 137 (average of 19th and 20th data).
High: 184
Third Quartile: 137
Median: 126
First Quartile: 111
Low: 95
See text for method of constructing box and whisker plot. The lower whisker starts at 95 and goes to
111, the low edge of the box is at 111, the upper edge is at 137, and the line in the box is at 126. The
upper whisker starts at 137 and goes up to 184.
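The five-number summary can be checked with the sketch below. It follows the method used in the answer: the quartiles are the medians of the lower and upper halves, with the overall median excluded when the number of data values is odd. Other quartile conventions (including the default in `statistics.quantiles`) may give slightly different values.

```python
from statistics import median

data = sorted([95, 98, 102, 106, 108, 110, 112, 118, 118, 120,
               124, 126, 126, 128, 130, 130, 132, 134, 136, 138,
               142, 152, 166, 168, 184])

n = len(data)
half = n // 2

med = median(data)
lower_half = data[:half]                              # values below the median position
upper_half = data[half + 1:] if n % 2 else data[half:]  # values above it

q1 = median(lower_half)
q3 = median(upper_half)

print("Low:", data[0], "Q1:", q1, "Median:", med, "Q3:", q3, "High:", data[-1])
```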
(d) (5 pts) Construct a relative frequency histogram for the data using the table in (b).
[Figure: Relative Frequency Histogram. Vertical axis: Relative Frequency (0 to 0.4 in steps of 0.05).
Horizontal axis: Systolic Pressure, marked at the class boundaries 94.5, 109.5, 124.5, 139.5, 154.5, 169.5, 184.5.]
6. (6 pts) At a large university, 3000 students wrote a mathematics placement test one day. Given that
$\sum x = 86{,}250$ and $\sum x^2 = 2{,}521{,}875$ for these test scores, find the mean, the population standard
deviation, and the coefficient of variation.
ANS: The mean is $\mu = \sum x / N = 86{,}250/3000 = 28.75$.
$SS_x = \sum x^2 - (\sum x)^2/N = 2{,}521{,}875 - (86{,}250)^2/3000 = 42{,}187.5$
The population standard deviation is $\sigma = \sqrt{SS_x/N} = \sqrt{42{,}187.5/3000} = 3.75$.
The coefficient of variation is $CV = (\sigma/\mu) \times 100\% = (3.75/28.75) \times 100\% \approx 13.04\%$.
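The computation from the two sums can be sketched in Python as follows. The variable names are chosen here for illustration; the formulas are the population formulas used in the answer.

```python
import math

n = 3000
sum_x = 86_250       # sum of the scores
sum_x2 = 2_521_875   # sum of the squared scores

mean = sum_x / n
ss_x = sum_x2 - sum_x ** 2 / n   # sum of squared deviations from the mean
sigma = math.sqrt(ss_x / n)      # population standard deviation
cv = sigma / mean * 100          # coefficient of variation, in percent

print(mean, ss_x, sigma, round(cv, 2))
```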
7. (4 pts) A population is known to have a mean of 80 and standard deviation of 5. Use Chebyshev’s
theorem to find the interval in which you would expect to find at least 8/9 of the data.
ANS: Chebyshev's theorem says that at least $1 - 1/k^2$ of the data lie within $k$ standard deviations of
the mean; with $k = 3$ this fraction is $1 - 1/9 = 8/9$. Therefore, we compute the interval
$\mu \pm 3\sigma = 80 \pm 3(5)$, which is (65, 95). So at least 8/9 of all data in the
population should be in the interval (65, 95).
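As a sketch of the same reasoning in Python: solve $1 - 1/k^2 = 8/9$ for $k$, then form the interval. `Fraction` is used only to keep the arithmetic exact; the variable names are assumptions made for this example.

```python
import math
from fractions import Fraction

mu, sigma = 80, 5
target = Fraction(8, 9)            # required minimum proportion of the data

# Chebyshev: proportion within k standard deviations is at least 1 - 1/k^2,
# so k = sqrt(1 / (1 - target)).
k = math.sqrt(1 / (1 - target))

interval = (mu - k * sigma, mu + k * sigma)
print(k, interval)
```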
8. (5 pts) Professor Henry Wiggins decided to study the ages of the students attending the classes he
taught. He constructed the following frequency distribution for ages in years.
x (age in years):   18-21   22-25   26-29
f (frequency):      65      25      10
Please help Professor Wiggins by estimating the mean and sample standard deviation for the ages of
students in his classes.
ANS: Use the approximations $\sum x \approx \sum xf$ and $\sum x^2 \approx \sum x^2 f$, where on the right-hand side
$x$ denotes the class midpoints and $f$ the class frequencies. Then
$\sum x \approx 19.5 \cdot 65 + 23.5 \cdot 25 + 27.5 \cdot 10 = 2130$
$\sum x^2 \approx 19.5^2 \cdot 65 + 23.5^2 \cdot 25 + 27.5^2 \cdot 10 = 46{,}085$
and
$SS_x \approx 46{,}085 - (2130)^2/100 = 716$
Therefore, the mean is approximately 2130/100 = 21.3, and the standard deviation is approximately
$s = \sqrt{SS_x/(n-1)} = \sqrt{716/99} \approx 2.689$
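The grouped-data estimate can be reproduced with the following sketch, which treats every age in a class as if it were equal to the class midpoint. The tuple layout `(lower limit, upper limit, frequency)` is an assumption made for this example.

```python
import math

# (lower limit, upper limit, frequency) for each age class
classes = [(18, 21, 65), (22, 25, 25), (26, 29, 10)]

midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]
freqs = [f for _, _, f in classes]
n = sum(freqs)

sum_xf = sum(x * f for x, f in zip(midpoints, freqs))       # approximates sum of x
sum_x2f = sum(x * x * f for x, f in zip(midpoints, freqs))  # approximates sum of x^2

mean = sum_xf / n
ss_x = sum_x2f - sum_xf ** 2 / n
s = math.sqrt(ss_x / (n - 1))    # sample standard deviation

print(mean, ss_x, round(s, 3))
```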