Download Test 1.v1 - La Sierra University

Math 251, 10 October 2003, Exam I
Name: ANSWERS
Instructions: Complete each of the following eight questions, and please explain and justify all
appropriate details in your solutions in order to obtain maximal credit for your answers.
1. (6 pts) Classify the type of sampling used in the following examples.
(a) To conduct a poll, the Join Arnold team randomly chose 8 different prefixes in California (the first
3 digits of the telephone number) and called all households from those prefixes.
ANS: This is a cluster sample. The population was divided into groups, some groups were randomly
selected and every member of the selected groups was surveyed.
(b) To maintain quality control, a tire manufacturer tests every 100th tire that comes off of the assembly
line in its plant.
ANS: This is a systematic sample.
(c) To determine student attitudes toward worship requirements at La Sierra, President Geraty gave
questionnaires to ten randomly selected students from each of the following groups: Freshmen,
Sophomores, Juniors, Seniors and Graduate Students.
ANS: This is a stratified sample. The population was divided into groups, and a random sample from
each group was selected.
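The three sampling designs in Problem 1 can be contrasted with a small sketch. The population below (20 people tagged with 4 groups), the seed, and all variable names are hypothetical and chosen only for illustration; they are not part of the exam.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 20 people, each tagged with one of 4 groups.
population = [(person_id, person_id % 4) for person_id in range(20)]
groups = {}
for person, group in population:
    groups.setdefault(group, []).append(person)

# Cluster sample: pick some groups at random, survey EVERY member of them.
chosen_groups = rng.sample(sorted(groups), 2)
cluster_sample = [p for g in chosen_groups for p in groups[g]]

# Systematic sample: take every k-th member of the ordered population.
k = 5
systematic_sample = [p for p, _ in population[k - 1::k]]

# Stratified sample: take a random sample from EVERY group.
stratified_sample = [p for g in sorted(groups) for p in rng.sample(groups[g], 2)]

print(cluster_sample)
print(systematic_sample)
print(stratified_sample)
```

The distinction to notice is that the cluster sample uses all members of a few groups, while the stratified sample uses a few members of all groups.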
2. (6 pts) Categorize the following data according to level: nominal, ordinal, interval, or ratio.
(a) The time someone goes to bed.
ANS: Interval—differences in times make sense, but ratios do not.
(b) Length of time to complete a marathon.
ANS: Ratio—differences and ratios are meaningful.
(c) The condition of a highway: poor, acceptable, good.
ANS: Ordinal—the responses can be ranked, but differences between the ranks do not make sense.
3. (2 pts) In a set of data with more than two values, how does increasing the largest number affect the
mean? How would it affect the median?
ANS: The mean is increased, but the median remains unchanged.
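A quick numerical check of this answer, using a made-up data set (the values below are hypothetical and do not come from the exam):

```python
from statistics import mean, median

# Hypothetical data with more than two values.
data = [2, 4, 6, 8, 10]
# Same data, but with the largest value increased.
bumped = data[:-1] + [100]

mean_before, mean_after = mean(data), mean(bumped)
median_before, median_after = median(data), median(bumped)

print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```

The mean moves because every value contributes to the sum, while the median depends only on the middle position, which the largest value does not occupy.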
4. (6 pts) How will the mean, standard deviation, and coefficient of variation of Population 1
below compare with those of Population 2? Explain why, but do not compute them. Notice that the
data values in Population 2 are 10 times the data values in Population 1.
Pop1:    5   10   15   20   25   30   40   45   50   75   80   95
Pop2:   50  100  150  200  250  300  400  450  500  750  800  950
ANS: The mean and standard deviation of the second population are 10 times the mean and standard
deviation of the first, because both the average of the data and the distances of the data from the
mean in the second population are exactly 10 times those in the first. However, the coefficients of
variation of the two populations are equal, since the factors of 10 in the mean and standard deviation
cancel when computing the ratio for the coefficient of variation.
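Although the problem asks for an explanation rather than a computation, the claim is easy to confirm numerically. This is a minimal sketch using Python's standard `statistics` module; the helper name `cv` is introduced here for illustration.

```python
from statistics import mean, pstdev

pop1 = [5, 10, 15, 20, 25, 30, 40, 45, 50, 75, 80, 95]
pop2 = [10 * x for x in pop1]  # every value scaled by 10


def cv(values):
    """Coefficient of variation as a percentage: (sigma / mu) * 100."""
    return pstdev(values) / mean(values) * 100


print(mean(pop1), mean(pop2))      # second mean is 10 times the first
print(pstdev(pop1), pstdev(pop2))  # second standard deviation is 10 times the first
print(cv(pop1), cv(pop2))          # equal: the factors of 10 cancel
```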
5. Consider the data (which are systolic blood pressures of 25 subjects):
105  108  110  110  112  112  116  118  118  120
126  126  128  130  130  130  132  134  136  140
146  152  166  188  190
(a) (2 pts) What class width should be chosen if you would like to have 8 classes?
ANS: First, the range divided by the number of classes is (190 - 105)/8 = 10.625. Now go up to the
next whole number to make sure all data are covered by the 8 classes, so we choose a class width of 11.
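The class-width rule can be written as a short calculation. This sketch assumes the convention of always moving up to the next whole number (so an exact quotient would also be increased by one); check your textbook if it rounds differently in that case.

```python
import math

low, high, classes = 105, 190, 8

raw_width = (high - low) / classes
# Assumed convention: always go up to the next whole number.
class_width = math.floor(raw_width) + 1

print(raw_width, class_width)
```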
(b) (8 pts) Complete the following table for this data given that the first class has limits 105-119.

Lower   Upper   Lower      Upper                             Cumulative   Relative
Limit   Limit   Boundary   Boundary   Midpoint   Frequency   Frequency    Frequency
105     119     104.5      119.5      112        9           9            .36
120     134     119.5      134.5      127        9           18           .36
135     149     134.5      149.5      142        3           21           .12
150     164     149.5      164.5      157        1           22           .04
165     179     164.5      179.5      172        1           23           .04
180     194     179.5      194.5      187        2           25           .08
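The table entries can be reproduced directly from the 25 data values. This is a sketch under the assumption, implied by the given first class 105-119, that each class covers 15 whole-number values; the dictionary keys are names chosen here for illustration.

```python
data = [105, 108, 110, 110, 112, 112, 116, 118, 118, 120,
        126, 126, 128, 130, 130, 130, 132, 134, 136, 140,
        146, 152, 166, 188, 190]

width = 15         # implied by the first class, 105 to 119
first_lower = 105
num_classes = 6

rows = []
cumulative = 0
for i in range(num_classes):
    lower = first_lower + i * width
    upper = lower + width - 1
    freq = sum(lower <= x <= upper for x in data)
    cumulative += freq
    rows.append({
        "limits": (lower, upper),
        "boundaries": (lower - 0.5, upper + 0.5),  # halfway to neighboring classes
        "midpoint": (lower + upper) / 2,
        "freq": freq,
        "cum_freq": cumulative,
        "rel_freq": freq / len(data),
    })

for row in rows:
    print(row)
```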
(c) (10 pts) Find the median, Q1, and Q3 for the above data. Draw a box and whisker plot for the data.
You may draw it horizontally if you prefer.
ANS: The median is in the ((25 + 1)/2)th place, i.e. the 13th place, so the median is 128.
The first quartile is the median of the lowest 12 numbers, which is 114 (average of the 6th and 7th data values).
The third quartile is the median of the highest 12 numbers, which is 138 (average of the 19th and 20th data values).
High: 190
Third Quartile: 138
Median: 128
First Quartile: 114
Low: 105
See text for method of constructing box and whisker plot. The lower whisker starts at 105 and goes to
114, the low edge of the box is at 114, the upper edge is at 138, and the line in the box is at 128. The
upper whisker starts at 138 and goes up to 190.
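The five-number summary can be checked with a few lines of code. This sketch follows the method used in the answer (the quartiles are the medians of the lower and upper halves, with the overall median excluded when the count is odd); library routines such as `statistics.quantiles` may use other conventions and give slightly different quartiles.

```python
from statistics import median

data = sorted([105, 108, 110, 110, 112, 112, 116, 118, 118, 120,
               126, 126, 128, 130, 130, 130, 132, 134, 136, 140,
               146, 152, 166, 188, 190])

n = len(data)
half = n // 2

med = median(data)
lower_half = data[:half]
# For an odd count, skip the middle value itself.
upper_half = data[half + 1:] if n % 2 else data[half:]

q1 = median(lower_half)
q3 = median(upper_half)

print(min(data), q1, med, q3, max(data))
```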
(d) (5 pts) Construct a relative frequency histogram for the data using the table in (b).
[Figure: Relative Frequency Histogram. Vertical axis: Relative Frequency, from 0 to 0.4 in steps of 0.05. Horizontal axis: Systolic Pressure, with class boundaries 104.5, 119.5, 134.5, 149.5, 164.5, 179.5, 194.5.]
6. (6 pts) At a large university, 4000 students wrote a mathematics placement test one day. Given that
Σx = 85,400 and Σx² = 1,904,290 for these test scores, find the mean, the population standard deviation,
and the coefficient of variation.
ANS:
The mean is: 85,400/4000 = 21.35
SSx = 1,904,290 - (85,400)²/4000 = 81,000
The population standard deviation is σ = (SSx/N)^(1/2) = (81,000/4000)^(1/2) = 4.5
The coefficient of variation is CV = (σ/μ) · 100% = (4.5/21.35) · 100% ≈ 21.08%
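The arithmetic in Problem 6 can be verified from the two given sums alone. The variable names below are chosen for this sketch.

```python
import math

n = 4000
sum_x = 85_400       # sum of the scores
sum_x2 = 1_904_290   # sum of the squared scores

mu = sum_x / n
ss_x = sum_x2 - sum_x ** 2 / n   # computational formula for the sum of squares
sigma = math.sqrt(ss_x / n)      # population standard deviation: divide by N
cv = sigma / mu * 100            # coefficient of variation, in percent

print(mu, ss_x, sigma, round(cv, 2))
```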
7. (4 pts) A population is known to have a mean of 50 and standard deviation of 15. Use Chebyshev’s
theorem to find the interval in which you would expect to find at least 8/9 of the data.
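ANS: Chebyshev’s theorem says that at least 1 - 1/k² of the data lies within k standard deviations of
the mean. Setting 1 - 1/k² = 8/9 gives k = 3, so at least 8/9 of the data lies within 3 standard deviations of the mean.
Therefore, we compute the interval μ ± 3σ = 50 ± 3(15), which is (5, 95). So at least 8/9 of all data in the population
should be in the interval (5, 95).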
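The step from the proportion 8/9 to 3 standard deviations comes from solving 1 - 1/k² = 8/9 for k. A minimal sketch of that calculation, using `Fraction` to keep 8/9 exact:

```python
import math
from fractions import Fraction

mu, sigma = 50, 15
proportion = Fraction(8, 9)

# Chebyshev: at least 1 - 1/k^2 of the data lie within k standard deviations.
# Solving 1 - 1/k^2 = proportion for k gives k = sqrt(1 / (1 - proportion)).
k = math.sqrt(1 / (1 - proportion))

interval = (mu - k * sigma, mu + k * sigma)
print(k, interval)
```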
8. (5 pts) Professor Henry Wiggins decided to study the ages of the children attending the nursery at
his school. He constructed the following frequency distribution for ages in months.
x:   10-19   20-29   30-39
f:   20      55      25
Please help Professor Wiggins by estimating the mean and sample standard deviation for the ages of
children at the nursery.
ANS: Use the formulas Σx ≈ Σxf and Σx² ≈ Σx²f, where on the right-hand side we use the class
midpoints and frequencies. Then
Σx ≈ 14.5·20 + 24.5·55 + 34.5·25 = 2500
Σx² ≈ 14.5²·20 + 24.5²·55 + 34.5²·25 = 66,975
and
SSx ≈ 66,975 - (2500)²/100 = 4475
Therefore, the mean is approximately 2500/100 = 25, and the standard deviation is approximately
s = (SSx/(n-1))^(1/2) = (4475/99)^(1/2) ≈ 6.7232448
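The grouped-data estimate can be reproduced by replacing each class with its midpoint. The sketch below uses variable names chosen for illustration and divides by n - 1 because a sample standard deviation is requested.

```python
import math

# (class limits, frequency) for ages in months
classes = [((10, 19), 20), ((20, 29), 55), ((30, 39), 25)]

n = sum(f for _, f in classes)
midpoints = [(lo + hi) / 2 for (lo, hi), _ in classes]
freqs = [f for _, f in classes]

sum_xf = sum(m * f for m, f in zip(midpoints, freqs))
sum_x2f = sum(m ** 2 * f for m, f in zip(midpoints, freqs))

ss_x = sum_x2f - sum_xf ** 2 / n
mean_est = sum_xf / n
s_est = math.sqrt(ss_x / (n - 1))  # sample standard deviation: divide by n - 1

print(mean_est, s_est)
```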