Practice problems with solutions 3 - Victoria Vernon, Empire State

Practice 3. Describing data
1. The blood types for 30 people who agreed to participate in a medical study were as follows. Find the
mode(s) of the blood types.
O A A O A AB O B A O
A O A B O O O AB A A
A B O A A O O B O O
SOLUTION: O
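As a check on the answer, the tally can be reproduced with a short Python sketch. It assumes the third row of the data reads as ten single-letter blood types, which is what a sample of 30 people requires; the variable names are illustrative only.

```python
from collections import Counter

# Blood types as listed in the problem. Assumption: the third row is read
# as ten single-letter types so that the sample has 30 values.
rows = [
    "O A A O A AB O B A O",
    "A O A B O O O AB A A",
    "A B O A A O O B O O",
]
blood_types = " ".join(rows).split()

counts = Counter(blood_types)
top = max(counts.values())
# Keep every type that reaches the highest count, in case of a tie.
modes = sorted(t for t, c in counts.items() if c == top)

print(counts)
print(modes)
```

Under that reading, O appears 13 times, A 11 times, B 4 times and AB 2 times, so O is the only mode.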
2. The federal government requires a car manufacturer to have a minimum miles per gallon (mpg)
average over the models it makes. Suppose that the average mpg for the four models manufactured
is 24 mpg and the government standard is 32 mpg. The manufacturer will add a model. What mileage must
the new model get so that the manufacturer meets the government requirement?
SOLUTION: 64 mpg
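The reasoning behind this answer: five models averaging 32 mpg need a total of 5 * 32 = 160, the four existing models contribute 4 * 24 = 96, so the new model must supply the difference. A minimal Python sketch, assuming the requirement is a simple arithmetic mean over models (the function name is hypothetical):

```python
def required_mpg(current_avg, n_models, target_avg):
    """Mileage one added model needs so the new simple average hits the target."""
    # Total needed across n_models + 1 models, minus the total already present.
    return target_avg * (n_models + 1) - current_avg * n_models

new_model = required_mpg(current_avg=24, n_models=4, target_avg=32)
print(new_model)  # 64
```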
3. 25 randomly selected students were asked the number of movies they watched the previous week. The
results are as follows:
# of movies    Frequency    Relative Frequency
0              5
1              9
2              6
3              4
4              1
a. Find the sample mean, x̄
b. Find the sample standard deviation, s
c. Construct a histogram of the data.
d. Complete the relative frequency column of the chart.
e. Find the first quartile.
f. Find the median.
g. Find the third quartile.
h. Construct a box plot of the data.
i. What percent of the students saw fewer than three movies?
j. Find the 40th percentile.
k. Find the 90th percentile.
SOLUTION
a. 1.48
b. 1.12
e. 1
f. 1
g. 2
h. (Box plot not reproduced.)
i. 80%
j. 1
k. 3
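The numerical answers above can be reproduced from the frequency table with a short Python sketch. The percentile rule used here is an assumption, since the problem does not name one: locate position p% of n, average the two neighboring values when that position is a whole number, and otherwise round up. Other conventions can give slightly different quartiles.

```python
import statistics

# Frequency table from the problem: number of movies -> number of students.
freq = {0: 5, 1: 9, 2: 6, 3: 4, 4: 1}
n = sum(freq.values())

# Expand the table into a sorted list of the 25 individual observations.
data = sorted(x for x, f in freq.items() for _ in range(f))

mean = statistics.mean(data)
s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
rel_freq = {x: f / n for x, f in freq.items()}

def percentile(sorted_data, p):
    """Assumed locator rule for an integer p with 0 < p < 100."""
    q, r = divmod(p * len(sorted_data), 100)
    if r == 0:
        # Position is a whole number: average that value and the next one.
        return (sorted_data[q - 1] + sorted_data[q]) / 2
    # Otherwise round the position up; index q is that value (0-based).
    return sorted_data[q]

q1, median, q3 = (percentile(data, p) for p in (25, 50, 75))
below_three = sum(f for x, f in freq.items() if x < 3)
pct_below_three = 100 * below_three / n

print(round(mean, 2), round(s, 2))
print(rel_freq)
print(q1, median, q3, percentile(data, 40), percentile(data, 90))
print(pct_below_three)
```

The relative frequencies for part d come out as 0.20, 0.36, 0.24, 0.16 and 0.04.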
4. Suppose that a publisher conducted a survey asking adult consumers the number of fiction paperback
books they had purchased in the previous month. The results are summarized in the table below.
# of books    Freq.
0             18
1             24
2             24
3             22
4             15
5             10
7             5
9             1
a. Are there any outliers in the data? Use an appropriate numerical test involving the IQR to identify
outliers, if any, and state your conclusion. If a data value is identified as an outlier, what should be done
about it?
b. Are any data values more than 2 standard deviations away from the mean? In some situations,
statisticians may use this criterion to identify data values that are unusual compared to the other data
values. (Note that this criterion is most appropriate for data that are mound-shaped and symmetric,
rather than for skewed data.)
c. Do parts (a) and (b) of this problem give the same answer?
d. Examine the shape of the data. Which part, (a) or (b), of this question gives a more appropriate result
for this data?
e. Based on the shape of the data, which is the most appropriate measure of center for this data: mean,
median or mode?
SOLUTION
a. Q1 = 1, Q3 = 4, IQR = 4 – 1 = 3
Q1 – 1.5*IQR = 1 – 1.5(3) = -3.5
Q3 + 1.5*IQR = 4 + 1.5(3) = 8.5
The data value of 9 is larger than 8.5. The purchase of 9 books in one month is an outlier.
The outlier should be investigated to see whether it reflects an error or some other problem in the data; the
decision to include or exclude it then depends on the particular situation. If the value is correct,
it should remain in the data set. If there is a problem with the value, it
should be corrected or removed. For example, if the value was recorded incorrectly (perhaps a 9
was miscoded and the correct value was 6), it should be corrected. If it was an error but the
correct value is not known, it should be removed from the data set.
b. xbar – 2s = 2.45 – 2*1.88 = -1.31
xbar + 2s = 2.45 + 2*1.88 = 6.21
Using this method, the five data values of 7 books purchased and the one data value of 9 books
purchased would be considered unusual.
c. No: part (a) identifies only the value of 9 to be an outlier but part (b) identifies both 7 and 9.
d. The data are skewed (to the right). It is more appropriate to use the method involving the IQR
in part (a), which identifies only the one value of 9 books purchased as an outlier. Note that part (b) remarks
that identifying unusual data values by the criterion of being more than 2 standard deviations
away from the mean is most appropriate when the data are mound-shaped and symmetric.
e. The data are skewed to the right. For skewed data it is more appropriate to use the median as a
measure of center.
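Both tests in parts (a) and (b) can be checked against the frequency table with a short Python sketch. The quartile rule is an assumption (position p% of n, rounded up unless it is a whole number), chosen because it reproduces Q1 = 1 and Q3 = 4; the names are illustrative only.

```python
import statistics

# Frequency table from the problem: number of books -> number of consumers.
freq = {0: 18, 1: 24, 2: 24, 3: 22, 4: 15, 5: 10, 7: 5, 9: 1}
data = sorted(x for x, f in freq.items() for _ in range(f))

def percentile(sorted_data, p):
    """Assumed locator rule for an integer p with 0 < p < 100."""
    q, r = divmod(p * len(sorted_data), 100)
    if r == 0:
        return (sorted_data[q - 1] + sorted_data[q]) / 2
    return sorted_data[q]

# Part (a): fences at 1.5 * IQR beyond the quartiles.
q1, q3 = percentile(data, 25), percentile(data, 75)
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
iqr_outliers = sorted({x for x in data if x < low_fence or x > high_fence})

# Part (b): values more than 2 sample standard deviations from the mean.
mean = statistics.mean(data)
s = statistics.stdev(data)
two_sd_unusual = sorted({x for x in data if abs(x - mean) > 2 * s})

print(low_fence, high_fence, iqr_outliers)
print(round(mean, 2), round(s, 2), two_sd_unusual)
```

The IQR fences flag only 9, while the 2-standard-deviation rule flags both 7 and 9, which matches the answer to part (c).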
5. The normal monthly precipitation (in inches) for August is listed for 20 different U.S. cities. Find
the median of the data.
3.5 1.6 2.4 3.7 4.1
3.9 1.0 3.6 4.2 3.4
3.7 2.2 1.5 4.2 3.4
2.7 0.4 3.7 2.0 3.6
SOLUTION: 3.45 in.
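With 20 values, the median is the average of the 10th and 11th values after sorting, here 3.4 and 3.5. A minimal Python check using the standard library:

```python
import statistics

# August precipitation (inches) for the 20 cities, as listed in the problem.
precip = [3.5, 1.6, 2.4, 3.7, 4.1,
          3.9, 1.0, 3.6, 4.2, 3.4,
          3.7, 2.2, 1.5, 4.2, 3.4,
          2.7, 0.4, 3.7, 2.0, 3.6]

ordered = sorted(precip)
# For an even count, statistics.median averages the two middle values.
median = statistics.median(precip)

print(ordered[9], ordered[10])
print(round(median, 2))
```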