Math 138 Summer 4 2013 Section 442 - Unit Test 1 Green Form, page 1 of 7
1. In 2010, as required by law every ten years, the US Census Bureau collected
information on people throughout the United States. They asked people
about age, race, place of birth, language spoken at home, and education level.
a. Identify the W’s. (6 points)
Who: People throughout the US
What: Person, age, race, place of birth, home language, education level
When: 2010
Where: Throughout the US
How: Collection by US Census Bureau (method not given)
Why: Required by law (other answers OK.)
b. Name the variables and specify for each variable whether it should be
treated as a quantitative or categorical variable. For the quantitative
variables, give the units. (5 points)
Person (Categorical), age (Quantitative, years), race (Categorical), place of
birth (Categorical), home language (Categorical), education level
(Categorical)
2. As a group, the Dutch are among the tallest people in the world. The average
Dutch man is 184 cm tall (just over 6 feet). If a Normal model is appropriate
and the standard deviation is 8 cm, use the 68-95-99.7 rule to approximate
what percentage of all Dutch men will be over 200 cm tall? Show your work.
(4 points)
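200 cm is 2 standard deviations above 184 cm, since (200 - 184)/8 = 2. About
95% of men are within 2 standard deviations of the mean (168 cm to 200 cm), so
about 5% are outside that range. This is 2.5% on each side. Hence,
approximately 2.5% are more than 200 cm tall.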
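As a cross-check on the rule-of-thumb answer, here is a minimal Python sketch (not part of the original key; the variable names are illustrative) that compares the 68-95-99.7 estimate with the tail area of the same Normal model, using only the standard library.

```python
from statistics import NormalDist

mean_cm, sd_cm = 184, 8  # values given in the question

# Number of standard deviations that 200 cm lies above the mean.
z = (200 - mean_cm) / sd_cm

# 68-95-99.7 rule: about 95% within 2 SD, so about 5% outside, half on each side.
rule_estimate = (1 - 0.95) / 2

# Tail area of the Normal model above 200 cm, for comparison.
exact_tail = 1 - NormalDist(mean_cm, sd_cm).cdf(200)

print(f"z = {z}, rule estimate = {rule_estimate:.3f}, model tail = {exact_tail:.4f}")
```

The model tail comes out slightly below 2.5% because the 95% in the rule is itself a rounded figure.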
3. The states differ greatly in the percent of their residents who were born
outside the United States. California leads with 26.5% foreign-born. The
following are the histogram and descriptive statistics for the distribution of
foreign-born residents in the 50 states.
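[Histogram: "foreign born people living in the United States by state".
Horizontal axis: percent of foreign born (ticks at 0, 6, 12, 18, 24).
Vertical axis: Frequency (ticks at 0, 5, 10, 15, 20).
Bar-count labels that survive extraction: 21, 9, 6, 5, 3, 2, 2, 1, 1, 0.]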
Descriptive Statistics: percent of foreign born
Variable  N   N*  Mean   StDev  Minimum  Q1     Median  Q3      Maximum
C2        50  0   7.144  5.682  1.100    2.875  5.150   10.525  26.200
Answer the following questions. Each correct answer will receive two
credits.
(a) Which measure of center is best to use in describing this distribution?
Median
(b) Give a reason for your answer to (a).
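The data are skewed to the right (the mean, 7.144, is well above the median,
5.150), and the median is resistant to the extreme values.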
(c) Which measure of spread is best to use in describing this distribution?
IQR (Interquartile Range)
(d) Assume that the bins start at -1.5 and the bin width is 3. How many
states have between 7.5 and 16.5 percent of foreign-born residents?
6 + 5 + 2 = 13.
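The bin arithmetic behind part (d) can be sketched in Python. This is an illustration only: the bin edges follow from the stated start (-1.5) and width (3), while the three counts are simply the ones quoted in the key's answer, not values recomputed from the histogram.

```python
start, width = -1.5, 3.0

# Bin edges: -1.5, 1.5, 4.5, ..., 28.5 (enough to cover the maximum of 26.2).
edges = [start + k * width for k in range(11)]
bins = list(zip(edges[:-1], edges[1:]))

# Bins lying entirely between 7.5 and 16.5 percent.
covered = [(lo, hi) for lo, hi in bins if lo >= 7.5 and hi <= 16.5]

# Counts for those three bins as quoted in the key's answer (assumption).
counts_from_key = [6, 5, 2]
total_states = sum(counts_from_key)

print(covered, total_states)
```

The interval from 7.5 to 16.5 spans exactly three bins, which is why three bar heights are added.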
4. A study in Sweden looked at former elite soccer players, people who had
played soccer but not at the elite level, and people of the same age who did
not play soccer. Here is a contingency table with level of soccer play and
whether or not they had arthritis of the hip or knee by their mid-fifties.
              Arthritis  No arthritis  Total
Elite            10           61         71
Non-elite         9          206        215
Did not play     24          548        572
Total            43          815        858
a. What percent of the people had arthritis? Give the fraction and the
percent. (3 points)
43/858 = 5.01%
b. What percent of elite players had arthritis? Give the fraction and the
percent. (3 points)
10/71 = 14.08%
c. What percent of those with no arthritis did not play soccer? Give the
fraction and the percent. (3 points)
548/815 = 67.24%
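The three percentages differ only in which total goes in the denominator. A short Python sketch (the dictionary layout and names are illustrative, not from the original) makes that explicit.

```python
# Counts from the contingency table: (arthritis, no arthritis) per group.
table = {
    "Elite": (10, 61),
    "Non-elite": (9, 206),
    "Did not play": (24, 548),
}

total_arthritis = sum(a for a, _ in table.values())   # column total
total_no_arthritis = sum(n for _, n in table.values())  # column total
grand_total = total_arthritis + total_no_arthritis

# a. Overall percent: divide by the grand total.
pct_arthritis = 100 * total_arthritis / grand_total

# b. Percent of elite players: divide by the Elite row total.
elite_a, elite_n = table["Elite"]
pct_elite_arthritis = 100 * elite_a / (elite_a + elite_n)

# c. Percent of the no-arthritis group: divide by that column total.
pct_no_arthritis_did_not_play = 100 * table["Did not play"][1] / total_no_arthritis

print(round(pct_arthritis, 2), round(pct_elite_arthritis, 2),
      round(pct_no_arthritis_did_not_play, 2))
```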
5. The heights of women aged 20 to 29 are approximately Normal with a mean
of 64 inches and a standard deviation of 2.7 inches.
a. Which height is more unusual, a 70 inch woman or a 56 inch tall
woman? Use z-scores to compare. (5 points)
70": (70 - 64)/2.7 = 2.22
56": (56 - 64)/2.7 = -2.96
The 56" woman is more unusual because her height is farther from
the mean (2.96 standard deviations versus 2.22).
b. What information do you get from the z-scores that the actual heights
do not give? (5 points)
The number of standard deviations from the mean.
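The z-score comparison can be checked with a few lines of Python. This is a sketch with an illustrative helper name (`z_score`), not part of the original key.

```python
mean_in, sd_in = 64, 2.7  # values given in the question


def z_score(height_in):
    """Number of standard deviations a height lies from the mean."""
    return (height_in - mean_in) / sd_in


z_70 = z_score(70)
z_56 = z_score(56)

# The more unusual height is the one with the larger absolute z-score.
more_unusual = 70 if abs(z_70) > abs(z_56) else 56

print(round(z_70, 2), round(z_56, 2), more_unusual)
```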
6. Scores on the ACT test for the 2004 high school graduating class had mean
20.9 and standard deviation 4.8. In all, 1,171,460 students in this class took
the test, and 1,052,490 of them had scores of 27 or lower.
a. If the distribution of scores were Normal, what percent of scores
would be 27 or lower? (4 points)
Normalcdf(-99999, 27, 20.9, 4.8) = 0.8981
b. What percent of the actual scores were 27 or lower? (2 points)
1052490/1171460 = 0.898
c. Does the normal distribution describe the actual data well? Explain
your answer using the results of parts a and b. (4 points)
Yes. (a) and (b) agree to three decimal places.
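For readers without a calculator's normalcdf function, the same comparison can be sketched in Python with the standard library's `NormalDist`. The variable names are illustrative.

```python
from statistics import NormalDist

mean_score, sd_score = 20.9, 4.8
total_students = 1_171_460
at_or_below_27 = 1_052_490

# Part a: proportion at or below 27 under a Normal model.
model_proportion = NormalDist(mean_score, sd_score).cdf(27)

# Part b: proportion at or below 27 in the actual data.
actual_proportion = at_or_below_27 / total_students

print(f"model: {model_proportion:.4f}, actual: {actual_proportion:.4f}")
```

Both proportions round to 0.898, which is the agreement part c relies on.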
7. The Centers for Disease Control and Prevention lists causes of death for men
in the United States during 2004.
Cause                            Percent
Heart disease                    27.2
Cancer                           24.2
Accidents                        6.1
Circulation disease and stroke   5.0
Respiratory diseases             5.0
Influenza and pneumonia          2.3
Other causes
a. What percent of deaths were from causes not listed here? (3 points)
100 - (27.2 + 24.2 + 6.1 + 5.0 + 5.0 + 2.3) = 100 - 69.8 = 30.2%
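The "other causes" slice is whatever remains after the listed percentages are subtracted from 100. A minimal Python sketch (dictionary name illustrative) that also guards against floating-point noise by rounding:

```python
listed_percents = {
    "Heart disease": 27.2,
    "Cancer": 24.2,
    "Accidents": 6.1,
    "Circulation disease and stroke": 5.0,
    "Respiratory diseases": 5.0,
    "Influenza and pneumonia": 2.3,
}

# The categories of a pie chart must account for 100% of the deaths.
other_causes = round(100 - sum(listed_percents.values()), 1)

print(other_causes)
```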
b. Create a pie chart for these data. (5 points) I used StatCrunch here
because I wanted an accurate plot. Anything that approximates this will
be OK.
8. It is well-known that the cost of goods and services continues to rise
with the passage of time. A slice of pizza is no exception. The following
chart gives the cost of a slice of pizza in New York City for selected
years.
Year                      1960   1973   1985   1995   2002   2003
Cost of a slice of pizza  $0.15  $0.35  $1.00  $1.25  $1.75  $2.00
Dependent Variable: Pizza
Independent Variable: Year
Pizza = -82.12306 + 0.041889444 Year
Sample size: 6
R (correlation coefficient) = 0.9728
Plot of residuals:
a. Make a scatterplot of the data. Is a linear relationship reasonable?
Explain. (4 points)
Again, I used StatCrunch for an accurate picture. A linear
relationship is possible because the points generally appear
to fall along a line.
b. Find the correlation coefficient for these points. Is a linear
relationship reasonable? Explain. (4 points)
r (correlation coefficient) = 0.9728. A linear relationship is reasonable
because r is close to 1.
c. Look at the residual plot. Is a linear relationship reasonable? Explain.
(3 points)
Generally yes because there is no discernible pattern. With a sample
size this small, though, it is hard to tell.
d. What is the regression line for predicting the price of a slice of pizza in
a given year? (1 point)
Pizza = -82.12306 + 0.041889444 Year. Given above.
e. What is the predicted cost of a slice of pizza in 2002? (3 points)
If the equation is used as written, 1.7396, or $1.74.
f. What is the residual for 2002? Interpret the residual in context. (4
points)
Residual = actual - predicted = $1.75 - $1.74 = $0.01. The line
underpredicted the actual cost for 2002 by about one cent.
g. Interpret the slope in context. (3 points)
For each additional year, the cost of a slice of pizza goes up by an average
of $0.042 (about 4.2 cents).
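The regression output quoted in this question can be reproduced from the six data points. The following is a minimal least-squares sketch in plain Python (names are illustrative; the key itself used StatCrunch), with the residual taken as actual minus predicted.

```python
from math import sqrt

years = [1960, 1973, 1985, 1995, 2002, 2003]
costs = [0.15, 0.35, 1.00, 1.25, 1.75, 2.00]

n = len(years)
mean_x = sum(years) / n
mean_y = sum(costs) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in years)
syy = sum((y - mean_y) ** 2 for y in costs)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, costs))

slope = sxy / sxx                      # dollars per year
intercept = mean_y - slope * mean_x
r = sxy / sqrt(sxx * syy)              # correlation coefficient

predicted_2002 = intercept + slope * 2002
residual_2002 = 1.75 - predicted_2002  # actual minus predicted

print(f"Pizza = {intercept:.5f} + {slope:.9f} Year, r = {r:.4f}")
print(f"predicted 2002 = {predicted_2002:.4f}, residual = {residual_2002:.4f}")
```

A positive residual means the actual price sits above the line, that is, the line underpredicts for that year.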
9. The following table shows the number per 1000 of incoming ninth graders in
selected Eastern states who graduate in four years with a standard High
School diploma for the years 2007 and 2012. Construct a boxplot for the
data for each year and compare the distributions.
2007   2012
729    737
795    801
863    853
609    805
714    784
822    770
793    735
769    610
       751
a. Make side-by-side box plots for the data. Label the boxplots with the
five-number summary. (2 for the drawing; 10 for the five-number summary.
Credit will be deducted for summary statistics that are not part of the
five-number summary.)
[Hand-drawn side-by-side boxplots for 2007 and 2012; the handwritten labels
are not legible in the scan.]
b. Are there any outliers? Compare using the 1.5 IQR rule. (4 points)
2007: Q3 + 1.5*IQR = 808.5 + (1.5*87) = 939
No outlier (maximum is 863)
Q1 - 1.5*IQR = 721.5 - (1.5*87) = 591
No outlier (minimum is 609)
2012: Q3 + 1.5*IQR = 803 + (1.5*67) = 903.5
No outlier (maximum is 853)
Q1 - 1.5*IQR = 736 - (1.5*67) = 635.5
610 is an outlier; no others
c. Which data are more variable? Give a reason for your answer. (2 points for
reason; none for answering 2007 or 2012 with no explanation.)
2007. Its range is greater (863 - 609 = 254 versus 853 - 610 = 243), and so
is its IQR (87 versus 67).
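The five-number summaries and 1.5 IQR fences can be sketched in Python. Two assumptions are made here and should be read as such: the data split (eight values for 2007, nine for 2012) is the one consistent with the quartiles quoted in the key, and the quartiles are computed as medians of the lower and upper halves, excluding the overall median when the count is odd. Other quartile conventions give slightly different values. Function names are illustrative.

```python
from statistics import median

# Assumed column split; it reproduces the quartiles quoted in the key.
rates_2007 = [729, 795, 863, 609, 714, 822, 793, 769]
rates_2012 = [737, 801, 853, 805, 784, 770, 735, 610, 751]


def five_number_summary(values):
    """Min, Q1, median, Q3, max using the median-of-halves convention."""
    s = sorted(values)
    half = len(s) // 2
    lower, upper = s[:half], s[-half:]  # middle value dropped when n is odd
    return s[0], median(lower), median(s), median(upper), s[-1]


def outliers(values):
    """Values outside the 1.5 IQR fences, plus the fences themselves."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    flagged = [v for v in values if v < low_fence or v > high_fence]
    return flagged, (low_fence, high_fence)


for year, rates in (("2007", rates_2007), ("2012", rates_2012)):
    print(year, five_number_summary(rates), outliers(rates))
```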