Math 150 Review for Exam 1
For problems 1 through 9 suppose that the following numbers represent changes (before minus
after) in cholesterol level in mg/dL in patients after they were put on a vegetarian diet.
23, 25, -8, 36, 25, 12, -5, 24, 115, 20
1. (a) What are the cases?
(b) What is the variable?
(c) What type of variable is it?
2. Find, label, and interpret the minimum and the maximum.
3. Find, label, and interpret the quartiles.
4. Name, compute, and interpret 3 measures of central tendency.
5. Name, compute, and interpret 4 measures of spread.
6. (a) Compute and interpret the z-score for each patient.
(b) What is the smallest z-score?
(c) What is the largest z-score?
(d) Which z-score is furthest from the mean?
7. Find the 5-number summary.
8. Make each of the following types of graphs if possible.
(a) stemplot
(b) dotplot
(c) bar graph
(d) histogram
(e) boxplot
(f) stem-and-leaf plot
(g) box-and-whiskers plot
9. What do the graphs tell you about the distribution of cholesterol changes?
10. Suppose that 3 of the patients had brown hair, 3 had gray hair, 1 had red hair, 2 had black
hair, and 1 had blond hair. Make and interpret a bar graph on the hair color of the patients.
11. Suppose that speeds of cars in miles per hour were recorded in a school zone as follows.
during school hours: 32, 37, 43, 28, 29, 32, 28
after school hours: 40, 48, 36, 38, 40
(a) Compute a 5-number summary for each time period.
(b) Make a boxplot for each time period using a single number scale.
(c) Interpret the results.
12. Pat scored 70 on a math test, 6 on a history test, 45 on a typing test, and 17 on an English
test. The math test had a mean of 73 and a standard deviation of 5.
The history test had a mean of 7 and a standard deviation of 1.
The typing test had a mean of 40 and a standard deviation of 8.
The English test had a mean of 16 and a standard deviation of 2.
(a) Compute Pat's z-score for each test.
(b) In which subject did Pat do his/her best compared to the other students?
(c) In which subject did Pat do his/her second best compared to the other students?
(d) In which subject did Pat do his/her worst compared to the other students?
Math 150 Answers to Review for Exam 1
1.(a) the patients
(b) the change in cholesterol level.
(c) numerical, measurement, quantitative (any of these three terms is acceptable).
2. The minimum or smallest value for this set of patients is -8 mg/dL. The maximum or
largest value is 115 mg/dL. Thus generally we would expect the change in cholesterol level for
patients on this diet to be between -8 mg/dL (a slight increase) and 115 mg/dL (a large
decrease).
3. The first quartile Q1 (also called the lower quartile QL) is 12 mg/dL. The third quartile Q3
(also called the upper quartile QU) is 25 mg/dL. Thus the change in cholesterol level of roughly
one quarter of the patients is less than 12 mg/dL, the change in cholesterol level of roughly one
quarter of the patients is more than 25 mg/dL, and the change in cholesterol level of the
remaining one half of patients is between 12 mg/dL and 25 mg/dL.
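The quartiles above can be checked with a short Python sketch. The helper name `quartiles` is hypothetical, and the convention it uses (Q1 and Q3 are the medians of the lower and upper halves of the sorted data, with the overall median included in both halves when the number of values is odd) is an assumption; other textbooks and software use different conventions and may give slightly different quartiles.

```python
from statistics import median

# Changes in cholesterol (before minus after), in mg/dL, from the problem statement.
changes = [23, 25, -8, 36, 25, 12, -5, 24, 115, 20]

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumed convention: for an odd number of values the overall
    median belongs to both halves.
    """
    data = sorted(values)
    half = (len(data) + 1) // 2          # number of values in each half
    return median(data[:half]), median(data[-half:])

q1, q3 = quartiles(changes)
print("Q1 =", q1, "mg/dL, Q3 =", q3, "mg/dL")
```

With ten values there is no middle value to share, so the two halves are simply the five smallest and the five largest changes.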
4. The mean is 26.7 mg/dL. Because all of the values are added in computing the mean, the
mean can be affected by outliers. The median or middle value is 23.5 mg/dL. Generally about
half of the data is above the median and half of the data is below the median. The median is not
affected by outliers. The mode or most frequently occurring value is 25 mg/dL. The mode is
more haphazard and less reliable than the mean and median but does have one advantage in that
it can be used for categorical data.
For this data set the mean, the median, and the mode are all in the 20s, which means that
the cholesterol level of a typical patient on this diet would drop by 20-something mg/dL.
The fact that the mean is more than the median indicates that the distribution of cholesterol
changes is skewed right. In other words there is a patient or a few patients with very big
cholesterol changes that cause(s) the mean to be higher than the median.
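As a quick check, the three measures of central tendency can be computed with Python's standard `statistics` module. This is only a sketch of the arithmetic, not part of the original answer key.

```python
from statistics import mean, median, mode

# Changes in cholesterol (before minus after), in mg/dL, from the problem statement.
changes = [23, 25, -8, 36, 25, 12, -5, 24, 115, 20]

center_mean = mean(changes)      # sum of the values divided by how many there are
center_median = median(changes)  # average of the two middle values (n is even)
center_mode = mode(changes)      # the only value that occurs more than once

print("mean:", center_mean, "median:", center_median, "mode:", center_mode)

# A mean larger than the median is consistent with a right-skewed distribution.
print("mean exceeds median:", center_mean > center_median)
```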
5. The range of 123 mg/dL gives the difference between the highest change in cholesterol level
and the lowest change in cholesterol level. The range has the advantage of being easy to compute
but it is very much affected by outliers and thus is less reliable than other measures of spread.
The interquartile range (IQR) of 13 mg/dL (Q3 minus Q1) gives the spread of the changes in
cholesterol level for the middle half of the data. It has the advantage and disadvantage of being
completely unaffected by outliers. The standard deviation of 33.96092526 mg/dL indicates how much a
typical value differs from the mean. It is the square root of the variance. The variance of
1153.344444 mg²/dL² is essentially the average of the squared deviations from the mean (the sum
of the squared deviations divided by n - 1 = 9). The standard
deviation and the variance are equivalent in the sense that if one is big then the other will be big
and if one is small then the other will be small. Both the standard deviation and the variance are
somewhat affected by outliers.
All four measures of spread give an idea of how spread out the data is. The fact that the
range is more than 4 times as big as the IQR means that there is an outlier or outliers.
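The four measures of spread can be reproduced with a minimal Python sketch. It assumes the sample versions of the variance and standard deviation (divisor n - 1), which is what `statistics.variance` and `statistics.stdev` compute and what matches the values quoted above; the quartiles are taken as the medians of the lower and upper halves of the sorted data.

```python
from statistics import median, stdev, variance

# Changes in cholesterol (before minus after), in mg/dL, from the problem statement.
changes = [23, 25, -8, 36, 25, 12, -5, 24, 115, 20]
data = sorted(changes)

# Range: largest value minus smallest value.
value_range = data[-1] - data[0]

# IQR: Q3 minus Q1. With an even number of values the halves do not overlap.
half = len(data) // 2
iqr = median(data[half:]) - median(data[:half])

# Sample variance (divisor n - 1) and its square root, the standard deviation.
var = variance(changes)
sd = stdev(changes)

print("range:", value_range, "IQR:", iqr)
print("variance:", round(var, 6), "standard deviation:", round(sd, 8))
print("range / IQR:", round(value_range / iqr, 2))
```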
6. (a) see the table below (b) -1.021762503 st.devs.
(c),(d) 2.600046946 st.devs.
change       z-score         interpretation
 23 mg/dL    -0.10894874     about 0.11 standard deviations below the mean
 25 mg/dL    -0.050057529    about 0.05 standard deviations below the mean
 -8 mg/dL    -1.021762503    about 1.02 standard deviations below the mean
 36 mg/dL     0.273844129    about 0.27 standard deviations above the mean
 25 mg/dL    -0.050057529    about 0.05 standard deviations below the mean
 12 mg/dL    -0.432850398    about 0.43 standard deviations below the mean
 -5 mg/dL    -0.933425687    about 0.93 standard deviations below the mean
 24 mg/dL    -0.079503134    about 0.08 standard deviations below the mean
115 mg/dL     2.600046946    about 2.60 standard deviations above the mean
 20 mg/dL    -0.197285555    about 0.20 standard deviations below the mean
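The z-scores in the table follow from z = (value - mean) / standard deviation. A short Python sketch of that computation, assuming the sample standard deviation (divisor n - 1) as in the answer to problem 5:

```python
from statistics import mean, stdev

# Changes in cholesterol (before minus after), in mg/dL, from the problem statement.
changes = [23, 25, -8, 36, 25, 12, -5, 24, 115, 20]

m = mean(changes)    # 26.7 mg/dL
s = stdev(changes)   # sample standard deviation, about 33.96 mg/dL

# z-score: how many standard deviations a value lies above (+) or below (-) the mean.
z_scores = [(x - m) / s for x in changes]

for x, z in zip(changes, z_scores):
    side = "above" if z > 0 else "below"
    print(f"{x:>4} mg/dL  z = {z: .9f}  about {abs(z):.2f} standard deviations {side} the mean")

smallest = min(z_scores)
largest = max(z_scores)
furthest = max(z_scores, key=abs)   # largest in absolute value
```

Here the largest z-score is also the one furthest from the mean, which is why parts (c) and (d) share an answer.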
7. min -8 mg/dL, Q1 12 mg/dL, median 23.5 mg/dL, Q3 25 mg/dL, max 115 mg/dL
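8.(a),(f) [stemplot (stem-and-leaf plot): figure not included]
8.(b) [dotplot: figure not included]
8.(c) For numerical data a histogram rather than a bar graph is used.
8.(d) [histogram: figure not included]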
8.(e),(g) [boxplot (box-and-whiskers plot): figure not included]
9. There is one main cluster of patients whose cholesterol level dropped between 20 and 25
mg/dL or close to that after being on the diet. There is one very major (extreme) outlier, a
patient whose cholesterol level dropped 115 mg/dL after being on the diet. This extreme outlier
causes the overall distribution to be skewed right. Without this outlier the overall distribution
would be skewed left. There are perhaps two minor outliers, two patients whose cholesterol
levels actually increased slightly after being on the diet, by 8 mg/dL and 5 mg/dL respectively. If
one disregards all three outliers then the distribution is approximately symmetric. The
distribution does not appear to be granular.
10. [bar graph of hair color: figure not included]
From the graph the most common hair colors among the patients were brown and gray. Since
hair color is not a numerical variable, the colors could be arranged in different orders to get
different shapes. Thus it is not meaningful to talk about the shape of this graph.
11.(a) Mph during school hours: min 28, Q1 28.5, median 32, Q3 34.5, max 43
Mph after school hours: min 36, Q1 38, median 40, Q3 40, max 48
11.(b) [boxplots for both time periods on a single number scale: figure not included]
11.(c) From the boxplots it is apparent that generally cars traveled more slowly during school
hours but there was also more variation in speeds during school hours. Both sets of data were
skewed right.
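The comparison in 11.(c) can be backed up numerically. The sketch below uses a hypothetical helper, `five_number_summary`, and assumes the quartile convention that reproduces the values in 11.(a): Q1 and Q3 are the medians of the lower and upper halves, with the overall median included in both halves when the number of values is odd.

```python
from statistics import median

# Recorded speeds in miles per hour, from the problem statement.
during_school = [32, 37, 43, 28, 29, 32, 28]
after_school = [40, 48, 36, 38, 40]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using medians of the two halves."""
    data = sorted(values)
    half = (len(data) + 1) // 2   # the middle value joins both halves when n is odd
    return (data[0], median(data[:half]), median(data),
            median(data[-half:]), data[-1])

for label, speeds in [("during school hours", during_school),
                      ("after school hours", after_school)]:
    lo, q1, med, q3, hi = five_number_summary(speeds)
    print(f"{label}: min {lo}, Q1 {q1}, median {med}, Q3 {q3}, max {hi}, "
          f"range {hi - lo}, IQR {q3 - q1}")
```

The lower median during school hours reflects the slower speeds, while the larger range and IQR reflect the greater variation.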
12.(a) math -0.6, history -1, typing 0.625, English 0.5 standard deviations
(b) typing
(c) English
(d) history
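Pat's z-scores and the resulting ranking can be verified with a few lines of Python. This is a sketch of the arithmetic only; the dictionary layout is an illustrative choice.

```python
# (Pat's score, class mean, class standard deviation) for each test,
# taken from the problem statement.
tests = {
    "math": (70, 73, 5),
    "history": (6, 7, 1),
    "typing": (45, 40, 8),
    "English": (17, 16, 2),
}

# z-score: (score - mean) / standard deviation.
z = {name: (score - mu) / sd for name, (score, mu, sd) in tests.items()}

# Rank the subjects from best to worst relative to the other students.
ranked = sorted(z, key=z.get, reverse=True)

for name in ranked:
    print(f"{name}: z = {z[name]}")
```

Because z-scores put all four tests on a common scale, a raw score of 45 in typing can be compared directly with a raw score of 70 in math.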