Math 251, Review for Final, Autumn 2002
The following questions are samples of the types of questions that may be on the final. There
may be questions on the final from topics not represented here. For further review, look at your
old tests and reviews, assigned homework, etc.
Material covered since the third test will probably make up about 30% of the points on the final.
This material includes hypothesis tests for means (large samples and small samples), hypothesis
tests for proportions, hypothesis tests for the difference of two population means, the chi-square
test for goodness of fit, and analysis of variance. The rest of the test will consist of questions
chosen from the other material covered throughout the quarter.
1. (a) Which type of random variable is the number of consumers refusing to answer a telephone
survey and what possible values can it take?
(b) How many bridge hands are there that have 4 aces? What is the probability of getting such a
hand?
2. For events A and B in a sample space S, we are told P(A) = .5 and P(B) = .3 and
P(A and B) = .15. Which of the following is true?
(a) A and B are independent events.
(b) P(A or B) = .8
(c) A and B are mutually exclusive events.
(d) All of the above.
3. Which of the following is true about a binomial random variable for n trials with probability of
success on each trial given as p?
(a) The probability of n successes is p^n.
(b) Its variance is equal to np(1-p).
(c) The probability of no successes is (1-p)^n.
(d) All of the above.
4. An hypothesis test on the mean reports a P-value of .031. Which of the following is true?
(a) The null hypothesis should be accepted if the level of significance is .03.
(b) The null hypothesis should be rejected if the level of significance is .05.
(c) There is almost a 97% chance of making a Type I error.
(d) All of the above.
5. If a 95% confidence interval for the population mean has length 12 when the sample size is
100, what would the length of a 95% confidence interval from the same population be if the
sample size were 1600?
(a) 12
(b) 48
(c) 3
(d) 6
6. A two-tailed hypothesis test on the mean of an approximately normal population is
conducted with a sample size of n=10. For what t-values should the null hypothesis be rejected
given that the level of significance is .05?
(a) t ≤ -1.96 or t ≥ 1.96
(b) t ≤ -2.262 or t ≥ 2.262
(c) t ≤ -1.833 or t ≥ 1.833
(d) t ≤ -2.228 or t ≥ 2.228
7. (a) Given the data 9, 12, 15, 17, 17, 19, 23, 44, 57, 61, 63, 70, find the mean, median, range, and
mode.
(b) If your score is at the 81st percentile on a national exam which was taken by 200,000 people,
approximately how many of those 200,000 test takers scored higher than you?
8. In a state with 459,341 voters, a poll of 2300 voters finds that 45 percent support the
Republican candidate, where in reality, unknown to the pollster, 42 percent support the
Republican candidate.
(a) What is the value of the statistic of interest?
(b) What is the value of the parameter of interest?
(c) Describe the population of interest.
(d) In general, is it true that given a certain population, the parameter of interest will not change
under repeated sampling? Explain.
9. (a) According to Chebychev’s theorem, how much data from any distribution can be more than
3 standard deviations from the mean?
(b) Given a population of size 4,800 with unknown distribution, at least how many data values
are within 4 standard deviations of the mean?
10. The following ranked data represent the number of miles driven each day by a salesman over
a 30-day period.
31  37  43  44  44  55  58  65  65  66
71  74  75  75  78  81  81  81  82  84
86  86  87  89  89  92  92  93  99  101
Construct a relative frequency histogram for these data whose first class has class limits 30-44.
11. Consider the sample of 30 numbers
31  37  43  44  44  55  58  65  65  66
71  74  75  75  78  81  81  81  82  84
86  86  87  89  89  92  92  93  121  133
for which Σx = 2258 and Σx^2 = 184670 (or Σ(x - x̄)^2 = 14,717.86667). Find:
(a) the sample mean
(b) the sample variance
(c) the sample standard deviation
(d) Given that Q1= 65, Q2= 79.5 and Q3= 87, construct a boxplot for the data.
(e) Find the interquartile range for the data.
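A minimal Python sketch for checking question 11 against a hand calculation. It uses only the sums and quartiles quoted in the question; the variable names are illustrative.

```python
import math

# Summary values quoted in question 11
n = 30
sum_x = 2258
sum_x2 = 184670

mean = sum_x / n
# Shortcut formula: sum of squared deviations = sum(x^2) - (sum x)^2 / n
ss = sum_x2 - sum_x**2 / n
variance = ss / (n - 1)          # sample variance divides by n - 1
std_dev = math.sqrt(variance)

# Interquartile range from the quartiles given in part (d)
q1, q3 = 65, 87
iqr = q3 - q1

print(round(mean, 2), round(variance, 2), round(std_dev, 2), iqr)
```

The shortcut formula reproduces the sum of squared deviations given in the question, which is a useful consistency check before dividing by n - 1.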
12. (True or False)
(a) The median is a resistant measure because it is not influenced by extreme observations.
(b) The mean is a resistant measure because extreme measures on one side average out with
those on the other side.
(c) The mean and median are equal in a symmetric distribution.
(d) The mean is usually to the right of the median in a distribution that is skewed to the
right.
13. The following represent scores of a group of 15 students on Math and English tests.
Scores on English test:
73  75  77  77  78  79  80  81  82  83  84  85  85  86  89
Scores on Math test:
72  75  79  83  84  85  87  88  90  91  92  93  93  97  98
(a) Construct stem-and-leaf plots for both tests, splitting each of the stems 7, 8, and 9 into two
parts, with leaves 0-4 on one part and 5-9 on the other.
(b) Which test scores seem to have a higher standard deviation? Explain. Don't compute!
14. Suppose the distribution of test scores for a certain test is normal with μ = 70 and σ = 12.
Suppose that 500 students wrote the test.
(a) What test score would have a z-score of -2.25?
(b) What score would put a student at the 90th percentile?
(c) Approximately what number of students would have scores between 60 and 90?
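A short Python sketch of the normal-distribution calculations in question 14, assuming the standard library's `statistics.NormalDist` (Python 3.8 or later) in place of a printed z-table. Table lookups round z to two decimals, so hand answers may differ slightly.

```python
from statistics import NormalDist

scores = NormalDist(mu=70, sigma=12)
n_students = 500

# (a) score = mean + z * standard deviation
score_at_z = scores.mean + (-2.25) * scores.stdev

# (b) 90th percentile via the inverse cumulative distribution function
p90 = scores.inv_cdf(0.90)

# (c) proportion between 60 and 90, scaled to the number of students
share = scores.cdf(90) - scores.cdf(60)
expected_count = n_students * share

print(score_at_z, round(p90, 1), round(expected_count))
```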
15. A study of behavior of a large number of drug offenders after treatment for drug abuse
suggests that the likelihood of conviction within a two-year period after treatment may depend on
the offender's education. The proportions of the total number of cases falling into four
education/conviction categories are shown in the following table:
                                 Convicted    Not Convicted
10 or more years of education       .10           .30
9 or less years of education        .27           .33
Suppose a single offender is selected from the treatment program. Define the events:
A: The offender has 10 or more years of education.
B: The offender is convicted within 2 years of completion of treatment.
Find:
(a) P(A or B)
(b) P(A and B)
(c) P(B|A)
(d) The probability that neither A nor B occurs.
(e) Are A and B independent?
(f) Are A and B mutually exclusive?
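One way to organize question 15 is to derive everything from the four joint proportions in the table. The sketch below is illustrative; the variable names are not from the course text.

```python
# Joint proportions from the table in question 15
p_a_and_b = 0.10        # 10+ years of education, convicted
p_a_not_b = 0.30        # 10+ years of education, not convicted
p_nota_b = 0.27         # 9 or less years, convicted
p_nota_notb = 0.33      # 9 or less years, not convicted

p_a = p_a_and_b + p_a_not_b            # marginal P(A)
p_b = p_a_and_b + p_nota_b             # marginal P(B)
p_a_or_b = p_a + p_b - p_a_and_b       # addition rule
p_b_given_a = p_a_and_b / p_a          # conditional probability
p_neither = 1 - p_a_or_b               # complement of (A or B)

# Independence requires P(A and B) = P(A) * P(B)
independent = abs(p_a_and_b - p_a * p_b) < 1e-9
# Mutually exclusive events cannot occur together
mutually_exclusive = p_a_and_b == 0

print(p_a_or_b, p_b_given_a, p_neither, independent, mutually_exclusive)
```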
16. A business employs 600 men and 400 women. Five percent of the men and 10% of the
women have been working there for more than 20 years. If an employee is selected by chance,
what is the probability the employee is male, given that the length of employment is more than
20 years?
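Question 16 is an application of Bayes' rule. A minimal sketch using exact fractions, with names chosen for illustration:

```python
from fractions import Fraction

p_male = Fraction(600, 1000)
p_female = Fraction(400, 1000)
p_long_given_male = Fraction(5, 100)      # more than 20 years, given male
p_long_given_female = Fraction(10, 100)   # more than 20 years, given female

# Total probability of more than 20 years of employment
p_long = p_male * p_long_given_male + p_female * p_long_given_female

# Bayes' rule: P(male | more than 20 years)
p_male_given_long = p_male * p_long_given_male / p_long

print(p_male_given_long)
```

Equivalently, 30 of the 600 men and 40 of the 400 women have more than 20 years, so the answer is 30 out of 70.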
17. (a) How many permutations are there of 30 objects taken 3 at a time?
(b) In how many ways can a gold medal, silver medal and bronze medal be awarded to 30
competitors in a fencing competition?
(c) How many menu possibilities are there in a restaurant that offers 5 different appetizers, 6
salads, 12 main dishes and 10 desserts if one choice is made from each category?
(d) Suppose that a large shipment of CDs contains 5% defective CDs. Suppose a customer
chooses 2 of these CDs at random. What is the probability that:
i) Both CDs will be good?
ii) Both CDs will be defective?
iii) Exactly one CD is defective?
iv) At least one CD is defective?
v) At least one CD is good?
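The counting rules in question 17 can be sketched in Python as follows. Part (d) assumes the shipment is large enough that the two selections can be treated as independent, each with a 5% chance of being defective; that assumption is not stated explicitly in the question.

```python
import math

# (a), (b): ordered selections of 3 out of 30
permutations = math.perm(30, 3)

# (c): multiplication rule, one choice per category
menus = 5 * 6 * 12 * 10

# (d): two independent selections (assumed), 5% defective
p_def = 0.05
p_good = 1 - p_def
both_good = p_good**2
both_def = p_def**2
exactly_one = 2 * p_def * p_good       # defective first or defective second
at_least_one_def = 1 - both_good       # complement of "both good"
at_least_one_good = 1 - both_def       # complement of "both defective"

print(permutations, menus, both_good, both_def, exactly_one)
```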
18. A jury pool consists of 13 men and 15 women. What is the probability that a randomly
chosen jury from this pool will consist of 5 men and 7 women?
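A sketch of the combination count for question 18, assuming a 12-person jury (5 men plus 7 women) drawn without replacement from the 28-person pool:

```python
import math

# Ways to pick 5 of 13 men and 7 of 15 women
favorable = math.comb(13, 5) * math.comb(15, 7)
# Ways to pick any 12 of the 28 people in the pool
total = math.comb(28, 12)

probability = favorable / total
print(favorable, total, round(probability, 4))
```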
19. Let x be the random variable that represents the number of heads observed when
5 fair coins are tossed. Make a probability distribution for x, and find the probability that
one will get more than 3 heads when tossing five fair coins.
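The distribution in question 19 is binomial with n = 5 and p = 1/2. A small sketch that tabulates it with exact fractions:

```python
import math
from fractions import Fraction

n = 5
# P(x = k) = C(n, k) / 2^n for a fair coin
dist = {k: Fraction(math.comb(n, k), 2**n) for k in range(n + 1)}

# "More than 3 heads" means 4 or 5 heads
p_more_than_3 = dist[4] + dist[5]

for k, p in dist.items():
    print(k, p)
print(p_more_than_3)
```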
20. Consider the random variable whose probability distribution is given by
the following table.
x       3     7     8     11
p(x)   .1    .3    .45    ?
(a) Is this a discrete or continuous random variable?
(b) Find P(x = 11).
(c) Construct a probability histogram for p(x), and compute the expected value of x and the
standard deviation of x.
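A minimal sketch of the calculations in question 20. The missing probability follows from the requirement that the probabilities sum to 1; the rest uses the definitions of expected value and variance for a discrete random variable.

```python
import math

# Probabilities given in the table
p = {3: 0.10, 7: 0.30, 8: 0.45}
# The probabilities must sum to 1, which fixes P(x = 11)
p[11] = 1 - sum(p.values())

mean = sum(x * px for x, px in p.items())
variance = sum((x - mean) ** 2 * px for x, px in p.items())
std_dev = math.sqrt(variance)

print(round(p[11], 2), round(mean, 2), round(std_dev, 3))
```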
21. The following sample data concerns the number of years a student studied German in school
versus their score on a proficiency test.
Years (x)        3   4   4   2   5   3   4   5   3   2
Test Score (y)  57  78  72  58  89  63  73  84  75  48
Note: Σx = 35, Σy = 697, Σx^2 = 133, Σy^2 = 50085, Σxy = 2554
(a) Find the equation of the least squares line for this data.
(b) Use your line from (a) to predict the score on the proficiency test of a person who had 3.5
years of German.
(c) Use the regression line in (a) to predict the number of years of German required to achieve a
proficiency score of 75.
(d) Compute the correlation coefficient r for this data. What does this coefficient suggest about a
linear relationship between number of years German was studied in school and test scores for this
sample? That is, determine whether it is a good fit, and whether it indicates a positive or
negative linear relationship.
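The least squares line and correlation coefficient in question 21 can be computed directly from the sums in the note. The sketch below uses the usual shortcut formulas; the function names are illustrative.

```python
import math

# Sums quoted in the note to question 21
n = 10
sum_x, sum_y = 35, 697
sum_x2, sum_y2, sum_xy = 133, 50085, 2554

ss_xy = sum_xy - sum_x * sum_y / n
ss_xx = sum_x2 - sum_x**2 / n
ss_yy = sum_y2 - sum_y**2 / n

slope = ss_xy / ss_xx
intercept = sum_y / n - slope * sum_x / n
r = ss_xy / math.sqrt(ss_xx * ss_yy)

def predict_score(years):
    """Predicted test score for a given number of years, part (b)."""
    return intercept + slope * years

def years_for_score(score):
    """Invert the fitted line to get years for a target score, part (c)."""
    return (score - intercept) / slope

print(round(slope, 3), round(intercept, 3), round(r, 3))
print(round(predict_score(3.5), 1), round(years_for_score(75), 2))
```

Since 3.5 is the sample mean of x, the prediction in part (b) equals the sample mean of y, because the least squares line always passes through the point of means.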
22. Records of Cascade Airlines (a.k.a. “Crashcade” and now defunct) showed that on average 10%
of prospective passengers do not claim their reservations on a certain flight.
Suppose that they booked 21 passengers for 20 seats on that flight.
(a) Find the mean and standard deviation for the number of passengers who will claim a
reservation.
(b) Find the probability that all passengers who show up for the flight will receive a seat.
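A sketch for question 22, assuming each of the 21 booked passengers claims a reservation independently with probability .9, so that the number who show up is binomial:

```python
import math

n, p = 21, 0.90      # booked passengers, probability each one shows up

mean = n * p
std_dev = math.sqrt(n * p * (1 - p))

# Everyone gets a seat unless all 21 show up for the 20 seats
p_all_seated = 1 - p**n

print(round(mean, 1), round(std_dev, 3), round(p_all_seated, 4))
```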
23. A developer wishes to test whether the mean depth of water below the surface in a large
development tract is less than 500 feet. For the sample data, n = 32 test holes, the sample mean
was 486 feet, and the standard deviation was s = 53 feet. Complete the test using the P-value
approach, and report the conclusion for a 1% level of significance.
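A sketch of the P-value approach for question 23, treating n = 32 as a large sample so that the test statistic is approximately standard normal. The hypotheses assumed here are H0: μ = 500 versus Ha: μ < 500.

```python
import math
from statistics import NormalDist

n, xbar, s, mu0 = 32, 486, 53, 500

# Large-sample z statistic with s standing in for sigma
z = (xbar - mu0) / (s / math.sqrt(n))

# Left-tailed test: P-value is the area to the left of z
p_value = NormalDist().cdf(z)
reject_at_1_percent = p_value < 0.01

print(round(z, 2), round(p_value, 4), reject_at_1_percent)
```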
24. A vendor was concerned that a soft drink machine was not dispensing 6 ounces per cup, on
average. A sample size of 40 gave a mean amount per cup of 5.95 ounces and a standard
deviation of .15 ounce.
(a) Find the P-value
(b) For which of the following levels of significance would the null hypothesis be rejected?
(c) For each case in part (b), what type of error has possibly been committed?
(d) Find a 98% confidence interval for the mean amount of soda dispensed per cup.
(e) Supposing that the population standard deviation is σ = .15, what sample size would be
needed so that the margin of error in a 98% confidence interval is E = .01?
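A sketch of parts (a), (d) and (e) of question 24, assuming a two-tailed large-sample z test of H0: μ = 6 and using `statistics.NormalDist` for the critical value instead of a table. Table values rounded to two decimals may give slightly different answers.

```python
import math
from statistics import NormalDist

n, xbar, s, mu0 = 40, 5.95, 0.15, 6.0
std_normal = NormalDist()
se = s / math.sqrt(n)

# (a) two-tailed P-value
z = (xbar - mu0) / se
p_value = 2 * std_normal.cdf(-abs(z))

# (d) 98% confidence interval: 1% in each tail
z98 = std_normal.inv_cdf(0.99)
ci = (xbar - z98 * se, xbar + z98 * se)

# (e) sample size for margin of error E, rounded up
sigma, E = 0.15, 0.01
n_needed = math.ceil((z98 * sigma / E) ** 2)

print(round(p_value, 4), tuple(round(v, 3) for v in ci), n_needed)
```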
25. On June 7, 1999 a poll on the USA Today website showed that out of 2000 respondents, 71%
felt that Andre Agassi deserved to be ranked among the greatest tennis
players ever.
(a) Assuming that the 2000 respondents form a random sample of the population of tennis fans,
construct a 95% confidence interval for the proportion of all tennis fans who feel that Andre
Agassi should be ranked among the greatest tennis players ever.
(b) Based on (a), would you be comfortable in saying that the poll is accurate to within plus or
minus 2 percent 19 times out of 20? Explain.
(c) In actuality, the survey was based on voluntary responses from readers of the USA Today
sports website. Do you think the 2000 respondents actually formed a random sample? Explain.
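A sketch of the large-sample confidence interval for a proportion in question 25(a), under the question's assumption that the respondents form a random sample:

```python
import math
from statistics import NormalDist

n, p_hat = 2000, 0.71

z95 = NormalDist().inv_cdf(0.975)            # about 1.96
margin = z95 * math.sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - margin, p_hat + margin)

print(round(margin, 4), tuple(round(v, 3) for v in ci))
```

Comparing the margin of error with .02 is the basis for part (b).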
26. (a) Suppose that a February Gallup poll of 1200 randomly selected voters found that 53
percent support George W. Bush's energy policy. Conduct an hypothesis test at a level of
significance of = .01 to test whether the true voter population support for George W. Bush's
energy policy in February was greater than 50 percent.
(b) Report the P-value of the test in (a) and give a practical interpretation of it.
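A sketch of the one-proportion z test in question 26, assuming H0: p = .50 versus Ha: p > .50 and a standard error computed under the null hypothesis:

```python
import math
from statistics import NormalDist

n, p_hat, p0, alpha = 1200, 0.53, 0.50, 0.01

# Standard error uses the hypothesized proportion p0
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Right-tailed test: P-value is the area to the right of z
p_value = 1 - NormalDist().cdf(z)
reject = p_value < alpha

print(round(z, 2), round(p_value, 4), reject)
```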
27. A brand of paint claims that in one coat, 1 gallon will cover at least 350 square feet on
average. A random sample of ten 1-gallon cans produced the following data.
Area Covered (Square Feet): 342, 378, 358, 364, 381, 392, 339, 356, 386, 347
Note: for this data Σx = 3643 and Σx^2 = 1330395
(a) Conduct the hypothesis test:
H0: = 350 vs. Ha:  > 350
at a level of significance of significance of = .05. Be sure to state critical region, test statistic
and conclusion in your answer.
(b) Construct a 99% confidence interval for the mean.
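A sketch of the small-sample t test and confidence interval in question 27. The standard library has no t distribution, so the critical values below are taken from a standard t-table for 9 degrees of freedom (1.833 for a one-tailed α = .05, 3.250 for a 99% interval); they are entered by hand, not computed.

```python
import math
import statistics

data = [342, 378, 358, 364, 381, 392, 339, 356, 386, 347]
n = len(data)
mu0 = 350

xbar = statistics.mean(data)
s = statistics.stdev(data)          # sample standard deviation (n - 1)
se = s / math.sqrt(n)

# (a) right-tailed test: reject H0 when t exceeds the critical value
t = (xbar - mu0) / se
t_crit_05 = 1.833                   # table value, df = 9, one tail .05
reject = t > t_crit_05

# (b) 99% confidence interval
t_crit_005 = 3.250                  # table value, df = 9, each tail .005
ci = (xbar - t_crit_005 * se, xbar + t_crit_005 * se)

print(round(t, 2), reject, tuple(round(v, 1) for v in ci))
```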
28. In a 1993 survey of 50 Education graduates and 50 Social Science graduates, the following
data were obtained for their average starting salaries.
Major              Mean      St. Dev.
Education          22,554    2225
Social Sciences    20,348    2375
(a) Find a point estimate for the difference in average starting salaries for Education and Social
Science majors.
(b) Let μ1 be the population mean salary for the Education graduates and μ2 be the population
mean salary for the Social Science graduates.
Report the P-value for the hypothesis test H0: μ1 - μ2 = 1200 versus Ha: μ1 - μ2 > 1200.
(c) Based on (b), do you think there is sufficient evidence to believe that μ1 is at least $1200
greater than μ2? Explain.
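A sketch of the two-sample large-sample z test in question 28, assuming independent samples and using the sample standard deviations in place of the population values:

```python
import math
from statistics import NormalDist

n1, mean1, s1 = 50, 22554, 2225     # Education
n2, mean2, s2 = 50, 20348, 2375     # Social Sciences
d0 = 1200                           # hypothesized difference under H0

point_estimate = mean1 - mean2
se = math.sqrt(s1**2 / n1 + s2**2 / n2)
z = (point_estimate - d0) / se

# Right-tailed test
p_value = 1 - NormalDist().cdf(z)

print(point_estimate, round(z, 2), round(p_value, 4))
```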
29. Suppose that the probability is .91 that a person who has reservations for a certain opera will
show up, and the decision of one person is independent from that of another. Suppose the opera
has sold 1243 tickets. What is the probability that at least 1140 people will show up for the opera?
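A sketch of the normal approximation to the binomial for question 29. The continuity correction (using 1139.5 in place of 1140) is one common convention; an answer computed without it will differ slightly.

```python
import math
from statistics import NormalDist

n, p = 1243, 0.91

mean = n * p
std_dev = math.sqrt(n * p * (1 - p))

# P(X >= 1140) approximated by the area to the right of 1139.5
approx = NormalDist(mu=mean, sigma=std_dev)
p_at_least_1140 = 1 - approx.cdf(1139.5)

print(round(mean, 2), round(std_dev, 2), round(p_at_least_1140, 3))
```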
30. (a) If you were to conduct an hypothesis test to determine if the means from several different
populations are equal using the method of analysis of variance, what assumptions would you
make on the populations? What distribution would you use to conduct your test?
(b) Do problem 3, p. 532. (See Answer in Text)
31. (a) A local radio station claims that 15 percent of all people in Riverside say it is their
favorite station, 65 percent of all people in Riverside listen to it occasionally, while 20 percent
never listen to it. Suppose you surveyed 200 randomly selected people in
Riverside and found that of those 200 people, 20 claimed it was their favorite station, 131 said
they listen to it occasionally, while 49 never listen to it. Conduct an hypothesis test at a level of
significance of .05 to determine whether the station's claim is correct. Make sure to state the
rejection region for your test.
(b) What are the assumptions one must make when using the chi-square test for Goodness-of-Fit?
(c) For further practice, see, e.g. problem 3, p. 500.
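A sketch of the chi-square goodness-of-fit statistic for question 31(a). The critical value 5.991 (2 degrees of freedom, α = .05) is taken from a standard chi-square table and entered by hand, since the standard library does not provide the chi-square distribution.

```python
observed = [20, 131, 49]            # favorite, occasionally, never
claimed = [0.15, 0.65, 0.20]        # proportions claimed by the station
n = sum(observed)

expected = [n * p for p in claimed]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1
critical = 5.991                    # table value, df = 2, alpha = .05
reject = chi2 > critical

print(expected, round(chi2, 3), df, reject)
```

All expected counts here are at least 5, which is the usual rule of thumb for the chi-square approximation and is relevant to part (b).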
32. List conditions that are needed on the population and on the random sample(s) in order to
make inferences in the following settings. In some cases, there may be no conditions required; in
that case, just write "none."
(a) Confidence interval for a mean from a large sample.
(b) Hypothesis test on a mean using a small sample.
(c) Hypothesis test on a proportion.
(d) Hypothesis test concerning two means from large independent samples.
33. Confidence intervals for variance and standard deviation. Do problem #11 on p. 307. See text
for answer.