Psych 315, Winter 2017, Homework 4 Answer Key
Due Friday, January 27 either in section or in your TA’s mailbox by 4pm.
ID
Name
Section [AA Adriana] [AB Adriana] [AC Kelly] [AD Kelly]
The scatterplot below plots students’ heights and their mothers’ heights for the 25 students who chose Green as their favorite
color.
You can download the Excel file containing the data for this homework here: HW4datav1.xls
Round all answers to 2 decimal places.
[Scatterplot: mother's height (in) on the vertical axis versus student's height (in) on the horizontal axis]
1) Calculate the correlation. You don’t have to show your work and you can use Excel’s ’correl’ function.
r = 0.39
2) Use Excel or your calculator to find the equation of the regression line and draw it on the scatterplot. You’ll have to
calculate the standard deviations and the means of X and Y. Remember, the slope is:
$m = r\left(\frac{s_y}{s_x}\right)$ and the y-intercept is: $b = \bar{Y} - (m)(\bar{X})$
mean of x: 64.72, mean of y: 62.72
$s_x = 3.21$, $s_y = 4.5$
Slope: $m = (0.39)\left(\frac{4.5}{3.21}\right) = 0.55$
Intercept: $b = 62.72 - (0.55)(64.72) = 27.12$
$Y = 0.55X + 27.12$
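For anyone checking the arithmetic outside Excel, here is a minimal Python sketch of the same calculation. It assumes the rounded summary statistics given above and, like the key, rounds the slope to 2 decimal places before computing the intercept. The variable names are illustrative only.

```python
# Summary statistics copied from the answer above (already rounded).
r = 0.39
mean_x, mean_y = 64.72, 62.72   # mean student height, mean mother height (in)
s_x, s_y = 3.21, 4.5            # standard deviations of X and Y (in)

# Slope: m = r * (s_y / s_x), rounded to 2 decimal places as in the key.
m = round(r * s_y / s_x, 2)

# Intercept: b = mean of Y minus slope times mean of X.
b = round(mean_y - m * mean_x, 2)

print(f"Y = {m}X + {b}")   # prints: Y = 0.55X + 27.12
```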
3) Use Excel or your calculator to find the standard error of the estimate:
$S_{yx} = \sqrt{\frac{\sum (Y - Y')^2}{n}}$
Y’ = 62.87, 61.22, 65.62, 63.42, 60.67, 61.77, 63.97, 60.67, 63.97, 66.17, 64.52, 62.87, 62.32, 62.32, 61.22, 64.52, 62.32, 62.87,
60.12, 61.22, 62.32, 60.12, 66.72, 61.77 and 62.32
Y-Y’ = 0.13, 5.78, 2.38, -3.42, 2.33, 1.23, 1.03, 2.33, 0.03, 4.83, -10.52, 1.13, -4.32, -0.32, 3.78, 3.48, 1.68, -2.87, -12.12, 1.78,
0.68, 1.88, -1.72, 3.23 and -2.32
$\sum (Y - Y')^2 = 0.02 + 33.41 + \dots + 5.38 = 428.54$
$S_{yx} = \sqrt{\frac{428.54}{25}} = 4.14$
4) Use the correlation as another way of calculating the standard error of the estimate. Your answer should be close, but
not exactly the same due to rounding error.
$S_{yx} = S_y\sqrt{1 - r^2}$
$(4.5)\sqrt{1 - 0.39^2} = 4.14$ inches
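The two routes to the standard error of the estimate can be compared side by side. This is a small sketch that assumes the sum of squared residuals (428.54), n = 25, and the rounded values of the standard deviation of Y and of r from the earlier problems. The variable names are made up for illustration.

```python
import math

n = 25
sum_sq_resid = 428.54    # sum of (Y - Y')^2 from problem 3
s_y, r = 4.5, 0.39       # rounded values from problems 1 and 2

# Route 1 (problem 3): square root of the mean squared residual.
syx_from_residuals = math.sqrt(sum_sq_resid / n)

# Route 2 (problem 4): S_y * sqrt(1 - r^2).
syx_from_r = s_y * math.sqrt(1 - r ** 2)

print(round(syx_from_residuals, 2), round(syx_from_r, 2))
```

Before rounding, the two values differ slightly (about 4.140 versus 4.144), which is the rounding error the question mentions. Both round to 4.14.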
5) Use the regression line to predict the mother’s height for a student that is 62 inches tall.
Y = mX+b = (0.55)(62) + 27.12 = 61.22 inches
6) Assuming homoscedasticity, find the range of mothers’ heights that covers the middle 50% of the heights of mothers of women
who are 62 inches tall. Hint: The heights of the mothers of women who are 62 inches tall should be distributed normally with a
mean determined by the regression line (problem 5) and a standard deviation equal to the standard error of the estimate
(problem 4).
The Mother’s heights should be distributed normally with
a mean of 61.22 and a standard deviation of 4.14
Using table A, the z-scores covering the middle 50 percent
of the normal distribution are z = +/- 0.67
Converting to heights, the range is between
61.22-(0.67)(4.14) and 61.22+(0.67)(4.14)
which is between 58.45 and 63.99 inches.
7) Repeat problems 5 and 6 but for students that are 67 inches tall. Note, because of homoscedasticity, the range above and
below the predicted height should not change.
Y = mx+b = (0.55)(67) + 27.12 = 63.97 inches
The Mother’s heights should be distributed normally with
a mean of 63.97 and a standard deviation of 4.14
Using table A, the z-scores covering the middle 50 percent
of the normal distribution are z = +/- 0.67
Converting to heights, the range is between
63.97-(0.67)(4.14) and 63.97+(0.67)(4.14)
which is between 61.20 and 66.74 inches.
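Problems 5 through 7 follow the same recipe, so they can be sketched as one small Python function. This assumes the rounded slope, intercept, and standard error from the earlier problems and the table value z = 0.67. The function name `middle_50` is hypothetical.

```python
from statistics import NormalDist

m, b = 0.55, 27.12   # regression line from problem 2
syx = 4.14           # standard error of the estimate from problem 4
z = 0.67             # table A value for the middle 50 percent

def middle_50(student_height):
    """Return (low, high) bounds on mother's height for a given student height."""
    predicted = m * student_height + b   # mean from the regression line
    half_width = z * syx                 # same for every X under homoscedasticity
    return predicted - half_width, predicted + half_width

for x in (62, 67):
    low, high = middle_50(x)
    print(f"X = {x}: between {low:.2f} and {high:.2f} inches")

# Cross-check of the table value: the 75th percentile of the standard normal.
z_exact = NormalDist().inv_cdf(0.75)
```

The half-width (0.67)(4.14) is the same in both cases, which is why the band around the regression line has constant height.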
8) You should see that for any student’s height, the middle 50% of the corresponding mothers’ heights should fall within the
same range above and below the regression line.
Draw two parallel lines on the scatterplot, one above and one below the regression line that should cover the middle 50% of
the mother’s heights. Use the values from problems 6 and 7 as points on the lines.
9) Find the percent of data points that fall between these two parallel lines. How closely does it match 50%?
15 of the 25 points fall between the parallel lines
This is $100\left(\frac{15}{25}\right) = 60$ percent of the points.
This is pretty close.
10) The correlation between SAT scores and IQ is around 0.5. Assume that SAT scores are normally distributed with a
mean of 915 and a standard deviation of 88.24, and IQ scores are normally distributed with a mean of 100 and a standard
deviation of 15.
a) Find the equation of the regression line that predicts IQs from SAT score. Hint: use the equations from problem 2. Give
your answer in slope-intercept form.
Let X be SAT scores, and Y be IQ
The slope is $r\left(\frac{s_y}{s_x}\right) = (0.5)\left(\frac{15}{88.24}\right) = 0.08$
The line goes through the means, so:
Y = (0.08)(X-915) + 100
Y = (0.08)X + 26.8
b) What is the expected IQ of a student with a SAT score of 1000?
IQ = (0.08)(1000)+ 26.8 = 106.8
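Parts a and b can be checked with a short Python sketch. It assumes the values stated in the problem and rounds the slope to 2 decimal places before using it, as the key does. The variable names are illustrative.

```python
r = 0.5
mean_sat, sd_sat = 915, 88.24
mean_iq, sd_iq = 100, 15

# Slope rounded to 2 decimal places, as in the key.
slope = round(r * sd_iq / sd_sat, 2)

# The line goes through (mean_sat, mean_iq), which fixes the intercept.
intercept = round(mean_iq - slope * mean_sat, 2)

predicted_iq = round(slope * 1000 + intercept, 2)
print(f"Y = {slope}X + {intercept}; IQ at SAT 1000 = {predicted_iq}")

# For comparison only: the same prediction without rounding the slope.
unrounded_prediction = mean_iq + r * sd_iq * (1000 - mean_sat) / sd_sat
```

Because the slope is small and SAT scores are large, rounding the slope matters here: carrying the unrounded slope gives about 107.22 rather than 106.8.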
c) What is the proportion of variance of Y explained by X (the coefficient of determination)?
The coefficient of determination is $r^2 = 0.5^2 = 0.25$
d) What is the total variance in the IQ scores?
The variance is the standard deviation squared: $15^2 = 225$
e) From parts c and d, calculate the amount of variance in IQ scores that is explained by SAT scores.
The amount of variance explained by SAT scores is equal
to the total amount of variance in IQ scores multiplied
by the proportion of variance accounted for, which is $r^2$.
(225)(0.25) = 56.25
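Parts c through e amount to splitting the IQ variance into explained and unexplained pieces. Here is a minimal sketch, assuming only the values given in the problem. The unexplained piece is not asked for in the homework and is shown only to connect back to the formula from problem 4.

```python
import math

r = 0.5
sd_iq = 15

r_squared = r ** 2                  # part c: proportion of variance explained
total_variance = sd_iq ** 2         # part d: total variance in IQ
explained = total_variance * r_squared      # part e
unexplained = total_variance - explained    # what SAT scores leave unexplained

print(r_squared, total_variance, explained, unexplained)

# The square root of the unexplained variance equals S_y * sqrt(1 - r^2),
# the standard error of the estimate formula from problem 4.
syx = math.sqrt(unexplained)
```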
11) Explain why the correlation between parents’ heights and all students’ heights might be lower than the correlations
you’d find for just the female or male students. Draw a picture if it helps.
While there may be strong correlations within each gender,
combining students leads to added variance in the students’
heights that is not explained by the parents’ heights.
This leads to an overall lower correlation for the whole group
than for the correlations within each gender.
[Figure, two panels plotting student's height (in) against parent's height (in). Students by gender (Male, Female): r = 0.84 and r = 0.85. All students: r = 0.52.]
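The effect in problem 11 can be reproduced with a toy example. The heights below are invented for illustration and are not the class data. Both groups share the same parents' heights, but the male students are shifted upward, which adds variance in student height that parent height cannot explain.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed from sums of squares and cross-products."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data (inches), chosen only to illustrate the effect.
parent = [60, 64, 68, 72]
female = [60, 63, 63, 66]
male = [66, 69, 69, 72]    # same pattern as female, shifted up 6 inches

r_female = pearson_r(parent, female)
r_male = pearson_r(parent, male)
r_all = pearson_r(parent + parent, female + male)

print(f"female r = {r_female:.2f}, male r = {r_male:.2f}, all r = {r_all:.2f}")
```

In this made-up data the within-gender correlations are equal and high, while the pooled correlation is noticeably lower, the same pattern as in the figure.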
12) Explain why the correlation between students’ heights and video game playing time might be stronger for the whole
group than for the correlations within male and female students. Again, draw a picture if it helps.
Suppose there is no correlation between height and video game playing
within each gender. But since men play games more than women, and
men are taller than women, the combined distribution is correlated.
[Figure, two panels plotting video game playing (hours/week) against student's height (in). Students by gender (Male, Female): r = 0.00 and r = 0.01. All students: r = 0.72.]
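The reverse effect in problem 12 can also be shown with a toy example. The numbers below are invented for illustration and are not the class data. Within each group, height and playing time are unrelated by construction, but the two groups differ in both means.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed from sums of squares and cross-products."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: every height is paired with both a low and a high
# playing time, so the within-group correlation is exactly zero.
female_height = [62, 62, 64, 64, 66, 66]   # inches
female_hours = [1, 3, 1, 3, 1, 3]          # hours/week
male_height = [68, 68, 70, 70, 72, 72]
male_hours = [5, 7, 5, 7, 5, 7]

r_female = pearson_r(female_height, female_hours)
r_male = pearson_r(male_height, male_hours)
r_all = pearson_r(female_height + male_height, female_hours + male_hours)

print(f"female r = {r_female:.2f}, male r = {r_male:.2f}, all r = {r_all:.2f}")
```

The pooled correlation comes entirely from the difference between the group means, not from any relationship within either group.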