Test 1 - La Sierra University

Math 251, 15 October, Exam I
Results from this test:
Number of students writing test: 26
Mean: 78.8%
Standard Deviation: 11.4%
Low: 57.6%
First Quartile: 68.5%
Median: 78.8%
Third Quartile: 88.8%
High: 95.6%
Name: (Partial Answers)
Instructions: Complete each of the following eight questions, and please explain and
justify all appropriate details in your solutions in order to obtain maximal credit for your
answers.
1. (2 pts) What is your birthday (Month & Day)? (This data will be used in class later so
please enter your true birthday)
2. (2 pts) If your instructor were to compute the class mean of this test when it is graded,
and use it to estimate the average for all tests taken by this class this quarter, would this
be an example of descriptive or inferential statistics? Explain.
Inferential – using a sample mean (one test) to estimate the population mean (all tests
taken by this class this quarter).
3. (a) (2 pts) In a survey of a sample of parents, 53% said they protect their children from
sun exposure using sunscreen. Is 53% a parameter or statistic? Explain.
Statistic – it is a numerical property of the sample of parents.
(b) (2 pts) In a union’s vote, 55% voted in favor of ratifying a contract proposal. Is 55% a
statistic or parameter? Explain.
Parameter – it is a numerical property of the population of union members.
(c) (1 pt) A study on attitudes about smoking is conducted at a college. The students are
divided by class, and then a random sample is selected from each class. What type of
sampling technique is this (e.g. simple random, convenience, stratified, systematic,
cluster)?
Stratified – a random sample is taken from each class (stratum).
4. (5 points) (True or False)
(a) T The median is (generally) to the left of the mean in data that is skewed to the
right.
(b) T The Empirical Rule for bell-shaped distributions says that about 95% of the
data lies within two standard deviations of the mean.
(c) F The 70th percentile of a set of data is the number so that 70% of the data lie
above that number, and 30% of the data are below that number.
(Correct: approximately 70% of the data lie below, and 30% above.)
(d) T The z-score for a number 4 standard deviations below the mean is -4.
(e) F Chebychev’s Theorem says that exactly 8/9 of the data in any distribution will
lie within 3 standard deviations of the mean. (Not exactly 8/9 but at least 8/9.)
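The "at least" in part (e) comes from the bound 1 - 1/k^2 for data within k standard deviations of the mean. A minimal sketch of that bound follows; the function name chebyshev_lower_bound is only an illustrative choice, not something from the exam.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum fraction of data within k standard deviations of the mean."""
    if k <= 1:
        # For k <= 1 the bound 1 - 1/k^2 is zero or negative, so it says nothing.
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2


# Lower bounds for a few values of k; k = 3 gives the 8/9 from part (e).
for k in (2, 3, 4):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.4f} of the data")
```

For k = 3 the bound is 8/9 (about 0.889). It is a minimum that holds for any distribution, which is why "exactly 8/9" is false.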
5. At a large university, 5000 students wrote a mathematics placement test one day. Suppose
that Σx = 306,250 and Σ(x – μ)² = 451,250 for these test scores.
(a) (4 pts) Find the mean and population standard deviation for these scores.
Mean = 306,250/5000 = 61.25
Population Standard Deviation = (451,250/5000)^(1/2) = 9.5
(b) (2 pts) Find the test score that is 2 standard deviations below the mean.
61.25 – 2(9.5) = 42.25
(c) (2 pts) If the distribution is normal (bell-shaped), according to the empirical rule, what
is the approximate percentile of a score that is two standard deviations above the mean?
50% + 47.5% = 97.5%, therefore, the 97.5th percentile
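The arithmetic in question 5 can be checked with a short script. This is only a sketch that starts from the two sums given in the question; the variable names are chosen here for illustration.

```python
import math

n = 5000               # number of students
sum_x = 306_250        # sum of the scores
sum_sq_dev = 451_250   # sum of squared deviations from the mean

mean = sum_x / n                      # part (a): mean
pop_sd = math.sqrt(sum_sq_dev / n)    # part (a): population standard deviation
two_sd_below = mean - 2 * pop_sd      # part (b): score 2 SDs below the mean

# Part (c): the empirical rule puts about 95% of the data within 2 SDs,
# so half of that (47.5%) lies between the mean and 2 SDs above it.
percentile_two_sd_above = 50 + 95 / 2

print(mean, pop_sd, two_sd_below, percentile_two_sd_above)
```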
6. Consider the following data of 26 numbers.
33   66   90
35   70   90
47   72   93
48   76   94
51   78   96
57   80   97
60   82
64   84
64   85
65   89
(a) (2 pts) Find the median of the data.
Because 26 is even, the median is the average of the values in the 13th place (26/2 = 13)
and the 14th place of the sorted data; therefore the median is (72+76)/2 = 74.
(b) (4 pts) Given that Q1 = 61, and Q3 = 88 find the IQR and construct a box and whisker
plot for the data.
IQR = 88 – 61 = 27. To make the box plot, note that L=33, Q1 = 61, Q2 = 74, Q3 = 88,
and H = 97 (see text for further details on box plot).
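As a check on parts (a) and (b), the sketch below sorts the 26 values and recomputes the median, IQR, and five-number summary. It takes Q1 = 61 and Q3 = 88 as given in the question rather than recomputing them, since quartile conventions differ between textbooks and software.

```python
import statistics

# The 26 data values from question 6.
data = [33, 66, 90, 35, 70, 90, 47, 72, 93, 48, 76, 94, 51,
        78, 96, 57, 80, 97, 60, 82, 64, 84, 64, 85, 65, 89]

ordered = sorted(data)
n = len(ordered)

# n is even, so the median averages the 13th and 14th sorted values
# (indices 12 and 13 when counting from zero).
median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

q1, q3 = 61, 88      # quartiles as given in the question
iqr = q3 - q1

# Low, Q1, median, Q3, High: the five values a box-and-whisker plot needs.
five_number_summary = (ordered[0], q1, median, q3, ordered[-1])

print(median, iqr, five_number_summary)
print(statistics.median(data))   # library cross-check of the median
```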
(c) (6 pts) Construct a relative frequency histogram for the data where the first class has
limits 30-44, be sure to list all class limits and boundaries, and class width.
Limits     Boundaries     Frequency     Relative Freq.
30-44      29.5-44.5      2             2/26 = .077
45-59      44.5-59.5      4             4/26 = .154
60-74      59.5-74.5      7             7/26 = .269
75-89      74.5-89.5      7             7/26 = .269
90-104     89.5-104.5     6             6/26 = .231
Class Width = 15
See text for details on the histogram: the class boundaries (or midpoints) should be on the
horizontal axis, and the heights of the bars should be the relative frequencies.
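The frequency table in part (c) can be reproduced by counting the values that fall within each pair of class limits. This is a minimal sketch; the variable names are illustrative.

```python
# The 26 data values from question 6.
data = [33, 66, 90, 35, 70, 90, 47, 72, 93, 48, 76, 94, 51,
        78, 96, 57, 80, 97, 60, 82, 64, 84, 64, 85, 65, 89]

width = 15                             # class width
lower_limits = [30, 45, 60, 75, 90]    # lower class limits, first class 30-44

# Frequency: how many values lie between each lower and upper limit, inclusive.
counts = [sum(lo <= x <= lo + width - 1 for x in data) for lo in lower_limits]
relative = [c / len(data) for c in counts]

for lo, c, r in zip(lower_limits, counts, relative):
    hi = lo + width - 1
    # Boundaries sit half a unit outside the limits for whole-number data.
    print(f"{lo}-{hi}   {lo - 0.5}-{hi + 0.5}   {c}   {r:.3f}")
```

The relative frequencies sum to 1, which is a quick sanity check on the table.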
7. A doctor is interested in the relationship between age (x) and blood pressure (y) in
men.
So far the doctor has collected the following data.
Age (x)    Blood Pressure (y)
16         109
25         122
39         143
45         132
49         199
57         175
64         185
70         199
For this data: Σx = 365, Σx² = 19073, Σy = 1264, Σy² = 208690, Σxy = 61807
(a) (4 pts) Find the equation of the least squares regression line.
m = (8*61807 – 365*1264)/(8*19073 – 365^2) = 1.7095924
y-intercept: b = 1264/8 – 1.7095924*365/8 = 79.999845
The line equation is:
y = 1.7095x + 79.9998
(b) (2 pts) Use the regression line equation to predict the blood pressure of a 40-year-old
man.
Ans: 1.7095*40+79.9998 = 148.4 (plug x=40 in line, and compute y)
(c) (2 pts) At what age is a man’s expected blood pressure 140?
Ans: (140 – 79.9998)/1.7095 = 35.1 years (plug y = 140 in line, and solve for x).
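Parts (a) through (c) can be checked from the summary sums alone. The sketch below applies the usual least squares formulas for slope and intercept; the helper names predict and age_for are made up here for illustration.

```python
n = 8                           # number of (age, blood pressure) pairs
sum_x, sum_x2 = 365, 19073      # sum of ages and of squared ages
sum_y, sum_xy = 1264, 61807     # sum of pressures and of age*pressure products

# Least squares slope and intercept from the summary sums.
m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x**2)
b = sum_y / n - m * sum_x / n


def predict(age: float) -> float:
    """Blood pressure predicted by the regression line at a given age."""
    return m * age + b


def age_for(pressure: float) -> float:
    """Age at which the regression line reaches a given blood pressure."""
    return (pressure - b) / m


print(f"y = {m:.4f}x + {b:.4f}")
print(f"predicted pressure at age 40: {predict(40):.1f}")
print(f"age for pressure 140: {age_for(140):.1f}")
```

Using the unrounded slope and intercept gives the same one-decimal answers as the hand calculation, 148.4 and 35.1.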
(d) (2 pts) The correlation coefficient for this data is .888. Does this indicate that there is a
good linear fit? Explain.
It represents a pretty good fit. The closer the number is to 1, the better the fit to a line of
positive slope, while a correlation coefficient close to 0 indicates that there is practically
no linear correlation.
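The quoted value of .888 can be reproduced from the same summary sums using the standard computational formula for the correlation coefficient. This is a sketch for checking the number, with illustrative variable names.

```python
import math

n = 8                            # number of (age, blood pressure) pairs
sum_x, sum_x2 = 365, 19073       # sum of ages and of squared ages
sum_y, sum_y2 = 1264, 208690     # sum of pressures and of squared pressures
sum_xy = 61807                   # sum of age*pressure products

# Pearson correlation coefficient from the summary sums.
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x**2) * (n * sum_y2 - sum_y**2))
r = numerator / denominator

print(f"r = {r:.3f}")
```

The result rounds to .888, matching the value stated in part (d).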
8. (2 pts) In studying the relation between hours of TV watched per week (x) and GPA
(y), it was found that GPAs tended to decrease as the hours of TV watched increased.
Would you expect the correlation coefficient to be positive or negative for the data
collected? Explain.
A negative correlation coefficient, because as the x’s increase, the y’s decrease.