STA 225: Exam 1
Instructor: Theo Ogunyemi
July 19, 2004
Name ___________________________
Student # ________________________
This is a closed book exam. Show all your work for full credit. Good luck!
PART I (Multiple Choice, 2 points each). Circle the letter of the best response for each question.
1. You compute the correlation coefficient between hours of TV watched each week and grade-point index
for a group of college students, and you obtain r = –1.65. This means that
(a) you made an arithmetic mistake.
(b) students who watch more TV tend to get lower grades.
(c) students who watch more TV tend to get higher grades.
(d) there is little association between TV viewing and grade-point index.
2. In the regression equation, the value that gives the amount by which Y changes for a one-unit change
in X is called the
(a) coefficient of correlation.
(b) coefficient of determination.
(c) slope.
(d) intercept.
3. The letter grades given in a course are A, B, C, D, or F. This grade is an example of
(a) a continuous variable
(b) a categorical variable
(c) a discrete variable
(d) none of the above
4. During the first week of August, 15 women and 25 men joined a weight control club. The average age of
the women was 32 and the average age of the men was 28. What was the average age for the entire
group?
(a) 29.5
(b) 30
(c) 30.5
(d) 31
(e) none of the above
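A quick arithmetic check for Question 4, as a minimal sketch: the mean of a combined group is the size-weighted average of the group means, not the simple average of the two means. The helper name `combined_mean` is illustrative and not part of the exam.

```python
# Illustrative check of Question 4: weight each group mean by its group size.
def combined_mean(sizes, means):
    """Return the overall mean of several groups, given their sizes and means."""
    total = sum(n * m for n, m in zip(sizes, means))  # sum of all ages
    return total / sum(sizes)                         # divided by total head count

# 15 women with mean age 32 and 25 men with mean age 28 (values from the question).
group_mean = combined_mean([15, 25], [32, 28])
print(group_mean)
```

Because there are more men than women, the result sits closer to 28 than to 32.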
5. The correlation coefficient measures
(a) whether there is a relation between two variables.
(b) whether or not a scatterplot shows an interesting pattern.
(c) whether a cause and effect relation exists between two variables.
(d) the strength of a straight line relation between two variables.
6. Random variable Y takes values that are at most y means:
(a) Y > y
(b) Y ≥ y
(c) Y ≤ y
(d) Y > y
7. Consider the following statements concerning a box plot:
(i) it can be constructed from data on a nominal scale.
(ii) it can be used to identify skewness in the data.
(iii) it can be used to identify outliers.
Which of the above statements are false?
(a) only (i)
(b) only (ii)
(c) only (ii) and (iii)
(d) only (i) and (ii)
(e) none of the above
8. A set of data is found to have a sample standard deviation of 25. Suppose 9 were added to each of the
numbers in the data. The standard deviation of the resulting data
(a) would now be 34.
(b) would now be 28.
(c) would still be 25.
(d) cannot be determined.
9. Suppose a straight line is fit to data having response variable y and explanatory variable x. Predicting
values of y for values of x outside the range of the observed data is called
(a) contingency
(b) extrapolation
(c) causation
(d) correlation
10. In a survey it was reported that “IQ is negatively correlated with family size”. This means that
(a) a mistake has been made, since correlation cannot be negative.
(b) persons from larger families tend to have higher IQ scores.
(c) family size has no effect on IQ, since the correlation is negative.
(d) persons from larger families tend to have lower IQ scores.
11. A distribution of home prices is skewed to the right. Consider the following statements concerning this
distribution:
(i) the mean is larger than the median.
(ii) half the area of the histogram of the distribution falls to the left of the median.
(iii) the variable is on the nominal scale of measurement.
Which of the above statements are true?
(a) only (i)
(b) only (ii)
(c) only (i) and (ii)
(d) only (ii) and (iii)
(e) all of them
12. If P(−z ≤ Z ≤ z) = 0.754, where Z ~ N(0, 1), the value of z is
(a) 1.61
(b) 0.161
(c) 1.16
(d) 1.96
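One way to check Question 12 without a printed table, sketched with Python's standard-library `statistics.NormalDist`: a central area of 0.754 leaves (1 − 0.754)/2 in each tail, so z is the standard normal quantile at (1 + 0.754)/2. The variable names are illustrative.

```python
from statistics import NormalDist

central_area = 0.754                  # P(-z <= Z <= z) from the question
upper_cdf = (1 + central_area) / 2    # P(Z <= z) = central area + lower tail
z = NormalDist().inv_cdf(upper_cdf)   # standard normal quantile

print(round(z, 2))
```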
PART II: Show all the necessary steps for full credit.
1. (10 pts.) The number of chapters in ten math books published by a company is as follows:
12, 8, 11, 10, 7, 10, 15, 13, 14, 9
Use the following information in your calculations: Σx_i = 109 and Σx_i² = 1249.
(a) Find the measures of location: mean, median, and mode.
(b) Find the measures of variation: sample variance and sample standard deviation.
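The computations in Problem 1 can be checked with a short sketch. It recomputes the two given sums from the data and then applies the shortcut formula s² = (Σx_i² − (Σx_i)²/n)/(n − 1); the variable names are illustrative, and the median and mode come from the standard-library `statistics` module.

```python
import math
import statistics

chapters = [12, 8, 11, 10, 7, 10, 15, 13, 14, 9]
n = len(chapters)

# Recompute the sums supplied in the problem statement.
sum_x = sum(chapters)                    # should match the given 109
sum_x2 = sum(x * x for x in chapters)    # should match the given 1249

# (a) Measures of location.
mean = sum_x / n
median = statistics.median(chapters)     # average of the two middle values (n is even)
mode = statistics.mode(chapters)         # most frequent value

# (b) Measures of variation, using the shortcut (computational) formula.
sample_var = (sum_x2 - sum_x ** 2 / n) / (n - 1)
sample_sd = math.sqrt(sample_var)

print(mean, median, mode)
print(round(sample_var, 4), round(sample_sd, 4))
```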
2. (15 pts.) The following are annual salaries (in thousands) of 19 employees of a corporation.
21.0  19.0  16.5  22.4  18.6  18.6  50.0
17.8  19.3  18.0  17.0  27.0  18.6  18.4
21.7  19.1  20.5  16.4  20.7
(a) Construct a stem and leaf plot for the data. Include a key on how to interpret your plot.
(b) Determine the quartiles Q1, Q2, and Q3.
(c) Does the data set contain any outliers? Compute the inner fences and outer fences, and identify mild
and extreme outliers, if any. Draw a box plot (indicating outliers, if any).
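A hedged sketch for checking Problem 2. Quartile conventions differ between textbooks; this one assumes the (n + 1)p position rule, which is what `statistics.quantiles(..., method="exclusive")` implements, and the usual 1.5 × IQR (inner) and 3 × IQR (outer) fences. The stem-and-leaf printout uses whole thousands as stems and tenths as leaves, and lists only stems that contain data.

```python
import statistics

salaries = [21.0, 19.0, 16.5, 22.4, 18.6, 18.6, 50.0, 17.8, 19.3, 18.0,
            17.0, 27.0, 18.6, 18.4, 21.7, 19.1, 20.5, 16.4, 20.7]

# (a) Stem-and-leaf plot. Key: 16 | 4 means 16.4 (thousand).
stems = {}
for x in sorted(salaries):
    tenths = round(x * 10)                        # e.g. 16.4 -> 164
    stems.setdefault(tenths // 10, []).append(tenths % 10)
for stem, leaves in stems.items():                # empty stems are not printed
    print(f"{stem:2d} | {''.join(str(leaf) for leaf in leaves)}")

# (b) Quartiles under the (n + 1)p convention (an assumption, see above).
q1, q2, q3 = statistics.quantiles(salaries, n=4, method="exclusive")
iqr = q3 - q1

# (c) Fences and outliers.
inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
outer = (q1 - 3.0 * iqr, q3 + 3.0 * iqr)
extreme = sorted(x for x in salaries if x < outer[0] or x > outer[1])
mild = sorted(x for x in salaries
              if (x < inner[0] or x > inner[1]) and x not in extreme)

print("quartiles:", q1, q2, q3)
print("inner fences:", inner, "outer fences:", outer)
print("mild outliers:", mild, "extreme outliers:", extreme)
```

A different quartile convention may shift Q1 and Q3 slightly, which in turn moves the fences.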
3. (10 pts.) The army reports that the distribution of head circumference among soldiers is
approximately normal with mean 22.8 inches and standard deviation 1.1 inches.
(a) Helmets are mass-produced for all except the smallest 3% and the largest 3% of head sizes.
Soldiers in the smallest or largest 3% get custom-made helmets. What head sizes get
custom-made helmets?
(b) What percentage of the soldiers will have head sizes between 20 inches and 24 inches?
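Problem 3 can be checked numerically with the standard-library `statistics.NormalDist`, as a minimal sketch. Part (a) asks for the 3rd and 97th percentiles of N(22.8, 1.1); part (b) is a difference of two cumulative probabilities. The variable names are illustrative, and table-based answers may differ slightly because of rounding z.

```python
from statistics import NormalDist

head = NormalDist(mu=22.8, sigma=1.1)   # head circumference in inches

# (a) Custom helmets go to the smallest 3% and the largest 3% of head sizes.
lower_cutoff = head.inv_cdf(0.03)       # sizes below this get custom helmets
upper_cutoff = head.inv_cdf(0.97)       # sizes above this get custom helmets

# (b) Percentage of soldiers with head size between 20 and 24 inches.
pct_between = 100 * (head.cdf(24) - head.cdf(20))

print(round(lower_cutoff, 2), round(upper_cutoff, 2))
print(round(pct_between, 1))
```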
4. (10 pts.) The gestation period for human births (in days) is assumed to have an approximate Normal
distribution with mean 266 and standard deviation 16.
(a) What is the 15th percentile of gestation period?
(b) What gestation period is exceeded by 90% of births?
(c) Below what gestational period do 70% of births fall?
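All three parts of Problem 4 are percentile lookups on N(266, 16). The sketch below uses `statistics.NormalDist.inv_cdf`; note that a value "exceeded by 90% of births" has only 10% of births below it, so it is the 10th percentile. Variable names are illustrative.

```python
from statistics import NormalDist

gestation = NormalDist(mu=266, sigma=16)   # gestation period in days

p15 = gestation.inv_cdf(0.15)              # (a) 15th percentile
exceeded_by_90 = gestation.inv_cdf(0.10)   # (b) 90% above means 10% below
below_70 = gestation.inv_cdf(0.70)         # (c) 70% of births fall below this

print(round(p15, 1), round(exceeded_by_90, 1), round(below_70, 1))
```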
5. (10 pts.) (a) The weights of ripe watermelons grown at Mr. Smith’s farm are normally distributed
with a standard deviation of 2.8 pounds. Find the mean weight of Mr. Smith’s ripe
watermelons if only 3% weigh less than 15 pounds.
(b) A machine fills containers with a mean weight per container of 16.0 ounces. If no more than
5 percent of the containers are to weigh less than 15.8 ounces, what must the standard deviation
of the weights equal? (Assume normality.)
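Both parts of Problem 5 solve the standardization x = μ + zσ for an unknown parameter. A minimal sketch, with illustrative variable names: in (a) z is the 3rd-percentile z-score and μ is unknown; in (b) z is the 5th-percentile z-score and σ is unknown, and the value computed is the largest standard deviation that keeps the lower tail at or below 5%.

```python
from statistics import NormalDist

std_normal = NormalDist()   # standard normal, mean 0 and sd 1

# (a) P(X < 15) = 0.03 with sigma = 2.8: solve 15 = mu + z * sigma for mu.
z_03 = std_normal.inv_cdf(0.03)        # negative z-score
mean_weight = 15 - z_03 * 2.8

# (b) mu = 16.0 and P(X < 15.8) <= 0.05: solve 15.8 = 16.0 + z * sigma for sigma.
z_05 = std_normal.inv_cdf(0.05)        # negative z-score
max_sd = (15.8 - 16.0) / z_05

print(round(mean_weight, 2), round(max_sd, 4))
```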
6. (21 pts.) A criminology student, interested in predicting the age at incarceration (Y) using the age at first
police contact (X), collected the following data:
Age at first contact (X):   11  17  13  12  15  10  12
Age at incarceration (Y):   21  20  20  19  18  23  20
(a) Draw a scatter plot of the data.
(b) Find the equation of the least squares regression line using the information below:
n = 7, Σx_i = 90, Σx_i² = 1192, Σy_i = 141, Σy_i² = 2855, Σx_i y_i = 1799
(Show the details of your work.)
(c) Using the least squares regression line in part (b), predict the age at incarceration for an individual
who at age 14 had his first contact with police.
(d) Predict the age at incarceration for an individual who at age 25 had his first contact with police.
Comment on the appropriateness of using the line in (b) for this prediction.
(e) Calculate the correlation coefficient between X and Y.
(f) What percent of variation in Y is accounted for by X?
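Parts (b) through (f) of Problem 6 can be checked from the summary sums alone. The sketch below uses the corrected sums of squares S_xx = Σx² − (Σx)²/n, S_yy = Σy² − (Σy)²/n, and S_xy = Σxy − (Σx)(Σy)/n, then slope = S_xy/S_xx, intercept = ȳ − slope·x̄, and r = S_xy/√(S_xx·S_yy). The names `predict`, `s_xx`, and so on are illustrative.

```python
import math

# Summary statistics given in the problem.
n = 7
sum_x, sum_x2 = 90, 1192
sum_y, sum_y2 = 141, 2855
sum_xy = 1799

# Corrected sums of squares and cross-products.
s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# (b) Least squares line: y-hat = intercept + slope * x.
slope = s_xy / s_xx
intercept = sum_y / n - slope * (sum_x / n)

def predict(x):
    """Predicted age at incarceration for age x at first police contact."""
    return intercept + slope * x

# (c) x = 14 lies inside the observed range of X (10 to 17).
pred_14 = predict(14)
# (d) x = 25 lies far outside that range, so this prediction is an extrapolation.
pred_25 = predict(25)

# (e) Correlation coefficient and (f) percent of variation explained.
r = s_xy / math.sqrt(s_xx * s_yy)
pct_explained = 100 * r ** 2

print(round(intercept, 3), round(slope, 4))
print(round(pred_14, 2), round(pred_25, 2))
print(round(r, 3), round(pct_explained, 1))
```

The sign of r matches the sign of the slope, since both share the numerator S_xy.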