Math 138 Final Exam Review
Although there are many problems on this review, it does not fully cover all the material
in MATH 138. For additional review problems, review your homework problems,
projects, previous quizzes and exams, and other classroom handouts.
1.
Classify the variable as categorical or quantitative: amount of monthly
electric bill in dollars.
2.
A real estate company kept a database on the apartments in a certain city. The
percentages of various types of apartments are listed below.
Type         Percent
Studio       15.9
1-bedroom    25.5
2-bedroom    45.8
3-bedroom    10.1
What percentage of the apartments in the city are 1-bedroom or 2-bedroom
apartments?
3.
A survey of patients at a hospital classified the patients by gender and blood
type, as seen in the table.
              Gender
Blood type    Male    Female
A             105     93
B             98      84
O             160     145
AB            15      18
a. What percentage of the patients with type-B blood are male?
b. What percentage of the female patients have type-O blood?
c. What percentage of the patients are male and have type-A blood?
d. What percentage of the patients are female or have type-O blood?
e. If you know that a patient has type-AB blood, what is the probability the
patient is a female?
4.
The numbers of days off that 30 police detectives took in a given year are
provided below. Create a histogram of the data.
10  1  3  5  4  7  0  6  6  1
 5  1  0  9 11  1  5  7 10  1
 5  4  1  7  7 11  1  5  6  0
5.
Describe what these boxplots tell you about the relationship between literacy
rates on the different continents.
[Figure: side-by-side boxplots, "Percents of literacy rates by continent";
vertical axis from 10 to 100; groups: Africa percents, South American percents,
Europe percents, Asian percents]
6.
A manufacturer records the number of errors each work station makes during
the week. Create a dotplot of the data.
6 3 2 3 5 2 1
0 2 5 4 2 0
7.
The stem-and-leaf diagram shows the ages of males playing basketball at a
public gym over the course of a day. Describe the shape, center, spread, and
unusual features of the distribution.
4 | 8 9
4 | 0 1 2 3
3 | 6 6 8 8 9
3 | 0 0 0 1 4 4
2 | 6 7 9 9 9
2 |
1 | 5 5 5 5 6 6 6 6 6 6 7 7 7
1 | 2 3 4 4 4 4
0 |
0 |
8.
Here are the test scores of 32 students. Construct a boxplot for the given data.
32 37 41 44 46 48 53 55
56 57 59 63 65 66 68 69
70 71 74 74 75 77 78 79
80 82 83 86 89 92 95 99
9.
The ages of the 21 members of a track and field team are listed below.
Identify potential outliers, if there are any.
15 18 18 19 22 23 24
24 24 25 25 26 26 27
28 28 30 32 33 40 42
10.
Suppose a computer chip manufacturer rejects 15% of the chips produced
because they fail presale testing. If you test 4 chips, what is the probability
that not all of the chips fail?
11.
Assume that 11% of people are left-handed. If we select 10 people at random,
find the probability that exactly 3 are lefties.
12.
Of the coffee makers sold in an appliance store, 6.0% have either a faulty
switch or a defective cord, 2.0% have a faulty switch, and 0.8% have both
defects. What percent of the coffee makers will have a defective cord?
13.
In a certain college, 33% of the physics majors belong to ethnic minorities. If
10 students are selected at random from the physics majors, what is the
probability that no more than 6 belong to an ethnic minority?
14.
The random variable x is the number of houses sold by a realtor in a single
month at the Sendsom’s Real Estate Office. Its probability distribution is as
follows. Find the mean and standard deviation for the probability distribution.
x, Houses sold    P(x), probability
0                 0.24
1                 0.01
2                 0.12
3                 0.16
4                 0.01
5                 0.14
6                 0.11
7                 0.21
15.
Suppose you buy 1 ticket for $1 out of a lottery of 100 tickets where the prize
for the one winning ticket is to be $50. What is your expected value?
16.
Two different tests are designed to measure employee productivity and
dexterity. Several employees are randomly selected and tested with the
following results.
Productivity    Dexterity
23              49
25              53
28              59
21              42
21              47
25              53
26              55
30              63
34              67
36              75
a. How do you know if a linear regression is appropriate? Explain in at least
two ways.
b. If a linear regression is appropriate, find the equation for the regression
line.
c. If an employee has productivity rating of 29, what would you expect
his/her dexterity to be?
d. What is the residual for productivity of 23?
e. Interpret the residual in (d).
17.
A tax auditor has a pile of 191 tax returns of which he would like to select 17
for a special audit. Describe a method for selecting the sample which involves
systematic sampling.
18.
At a college there are 120 freshmen, 90 sophomores, 110 juniors, and 80
seniors. A school administrator selects a random sample of 12 of the
freshmen, 9 of the sophomores, 11 of the juniors and 8 of the seniors. She
then interviews all the students selected. Identify the type of sampling used in
this example.
19.
On a multiple choice test with 16 questions, each question has four possible
answers, one of which is correct. For students who guess at all the answers,
what are the mean and standard deviation for the number of correct answers?
20.
The March 2000 Consumer Reports compared various brands of supermarket
enchiladas in cost and sodium content. Use the scatterplot and part of the
regression analysis to answer the questions.
[Figure: fitted line plot of sodium content (mg), axis from 500 to 1750, versus
cost (per serving), axis from 1.0 to 3.0]
Sodium content (mg) = 2185 - 607.0 Cost (per serving)
S = 250.702    R-Sq = 77.3%    R-Sq(adj) = 74.0%
a. Is a linear model appropriate here? Explain.
b. What is the correlation between cost and sodium content?
c. How much sodium would you expect if the cost is $2.90?
21.
Assume that 25% of students at a university wear contact lenses. We
randomly select 200 students. What are the mean and standard deviation of the
proportion of students in this group who may wear contact lenses?
22.
The volumes of soda in quart soda bottles can be described by a Normal
model with a mean of 32.3 oz and a standard deviation of 1.2 oz. What
percentage of bottles can we expect to have a volume less than 32 oz?
23.
The number of hours per week that high school seniors spend on computers is
normally distributed, with a mean of 4 hours and a standard deviation of 2
hours. 60 students are chosen at random. Let y be the mean number of hours
spent on the computer for this group. Find the probability that y is between
4.2 and 4.4 hours.
24.
A researcher wishes to estimate the proportion of fish in a certain lake that is
inedible due to pollution of the lake. How large a sample should be tested in
order to be 99% confident that the true proportion of inedible fish is estimated
to within 6%?
25.
A mayoral election race is tightly contested. In a random sample of 2200
likely voters, 1144 said that they were planning to vote for the current mayor.
Based on a 95% confidence interval, would you claim that the mayor will win
a majority of the votes? Explain.
26.
A skeptical paranormal researcher claims that the proportion of Americans
that have seen a UFO, p, is less than 4%. Identify the Type II error in this
context.
27.
7 of 8,500 people vaccinated against a certain disease later developed the
disease. 18 of 10,000 people vaccinated with a placebo later developed the
disease. Test the claim that the vaccine is effective in lowering the incidence
of the disease. Use a significance level of 0.02.
28.
Suppose the proportion of sophomores at a particular college who purchased
used textbooks in the past year is p_s and the proportion of freshmen at the
college who purchased used textbooks in the past year is p_f. A study found
that a 95% confidence interval for p_s - p_f is (0.235, 0.427). Does this
interval suggest that sophomores are more likely than freshmen to buy used
textbooks? Explain.
29.
In the past, the mean running time for a certain type of flashlight battery has
been 8.5 hours. That manufacturer has introduced a change in the production
method and wants to perform a hypothesis test to determine whether the mean
running time has changed as a result. The hypotheses are:
H0: μ = 8.5 hours
HA: μ ≠ 8.5 hours
Explain a Type I error.
30.
A coach uses a new technique to train gymnasts. 7 gymnasts were randomly
selected and their competition scores were recorded before and after the
training. The results are shown below.
Subject    A     B     C     D     E     F     G
Before     9.4   9.5   9.6   9.6   9.4   9.6   9.6
After      9.5   9.7   9.6   9.5   9.5   9.9   9.4
Do the data suggest that the training technique is effective in raising the
gymnasts’ scores? Perform a hypothesis test at the 5% significance level.
31.
Test the claim that female college students tend to gain weight during their
freshman year of college. The mean weight is given by μ = 132 lb. Sample data
are summarized as
n = 20
ȳ = 137 lb
s = 14.2 lb
Use a significance level of α = 0.1.
32.
A laboratory tested twelve chicken eggs and found that the mean amount of
cholesterol was 240 milligrams with s = 19.8 milligrams. Construct a 95%
confidence interval for the true mean cholesterol content of all such eggs.
Interpret.
33.
Suppose you have obtained a confidence interval for μ, but wish to obtain a
greater degree of precision. Which of the following would result in a
narrower confidence interval?
a. Increasing the sample size while keeping the confidence level fixed
b. Decreasing the sample size while keeping the confidence level fixed
c. Increasing the confidence level while keeping the sample size fixed
d. Decreasing the confidence level while keeping the sample size fixed
34.
A car insurance company performed a study to determine whether an
association exists between age and the frequency of car accidents. They
obtained the following sample data. Perform a test to see if there is an
association between age and frequency of car accidents. Use α = 0.05.
                                       Age group
Number of accidents in past 3 years    Under 25    25-45    Over 45    Total
0                                      74          89       82         245
1                                      18          8        12         38
More than 1                            8           3        6          17
Total                                  100         100      100        300
35.
Using the data below and a 0.05 significance level, test the claim that the
responses occur with percentages of 15%, 20%, 25%, 25%, and 15%
respectively.
Response     A     B     C     D     E
Frequency    12    15    16    18    19
Answers:
1.
quantitative
2.
71.3%
3.
a. 98/182 = 0.5385, or 53.85%
b. 145/340 = 0.4265, or 42.65%
c. (378/718)(105/378) = 105/718 = 0.1462, or 14.62%
d. 340/718 + 305/718 - 145/718 = 500/718 = 0.6964, or 69.64%
e. 18/33 = 0.5455
4.
Answers will vary in that the size of the bins may vary. One possible
answer is given.
[Figure: histogram of days off; horizontal axis "days off" with ticks from 0.0
to 10.5 in steps of 1.5, vertical axis "Frequency" from 0 to 7]
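Because the bin width is a free choice, it can help to tabulate the frequencies before drawing the bars. The following is a minimal sketch, assuming the 30 values from question 4 are typed in as a list; the helper name binned_counts and the bin width of 2 are illustrative choices, not part of the original answer.

```python
from collections import Counter

# Days off for the 30 detectives, copied from question 4.
days_off = [10, 1, 3, 5, 4, 7, 0, 6, 6, 1, 5, 1, 0, 9, 11,
            1, 5, 7, 10, 1, 5, 4, 1, 7, 7, 11, 1, 5, 6, 0]


def binned_counts(values, width):
    """Count values in bins [k*width, (k+1)*width), keyed by the left edge.

    Hypothetical helper for illustration; assumes non-negative integers.
    """
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))


# A text histogram with bins 0-1, 2-3, ..., 10-11 (width 2 is an assumption).
for left, freq in binned_counts(days_off, 2).items():
    print(f"{left:>2} to {left + 1:>2}: {'*' * freq}")
```

Calling binned_counts(days_off, 1) gives the count for each individual value, which is the table one would use for a histogram with one bar per value.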
5. The boxplots indicate lower literacy rates in Africa in general. The literacy rates
in Asia, Europe and South America are all much higher, with most rates in the
90% range. Europe has an outlier and Asia has a couple of outliers that are in the
same range as some of the higher African rates.
6.
[Figure: dotplot of errors for the week; horizontal axis "errors for the week"
from 0 to 6]
7. The distribution is unimodal and skewed to the right (larger numbers). It is
centered at 26 years and has a standard deviation of 11.33 years. There is a gap
between 18 and 25.
8.
[Figure: boxplot of test scores; vertical axis "test scores" from 30 to 100]
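A boxplot is drawn from the five-number summary, so the numbers behind the picture can be checked directly. The sketch below is one way to do that, assuming quartiles are taken as the medians of the lower and upper halves of the sorted data (leaving out the overall median when the count is odd). That convention is an assumption here; other software conventions can give slightly different quartiles. The function names are illustrative.

```python
from statistics import median

# Test scores of the 32 students, copied from question 8.
test_scores = [32, 37, 41, 44, 46, 48, 53, 55,
               56, 57, 59, 63, 65, 66, 68, 69,
               70, 71, 74, 74, 75, 77, 78, 79,
               80, 82, 83, 86, 89, 92, 95, 99]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Skip the middle value when n is odd so both halves have equal size.
    upper = data[half + 1:] if n % 2 else data[half:]
    return data[0], median(lower), median(data), median(upper), data[-1]


def fences(q1, q3):
    """Return the lower and upper 1.5 * IQR fences."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def outliers(values):
    """List the values that fall outside the 1.5 * IQR fences."""
    _, q1, _, q3, _ = five_number_summary(values)
    low, high = fences(q1, q3)
    return [v for v in sorted(values) if v < low or v > high]


print(five_number_summary(test_scores))
print(outliers(test_scores))
```

The same outliers function applies to any list of values, so it can also be used to check an outlier question once the data are typed in.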
9.
The 5-number summary is 15, 22.5, 25, 29, 42. The IQR is 29 - 22.5 = 6.5.
The fences are 22.5 - 1.5 × 6.5 = 12.75 and 29 + 1.5 × 6.5 = 38.75. Both 40
and 42 are outside the upper fence and would be outliers.
10.
P(not all fail) = 1 - P(all fail) = 1 - (0.15)^4 = 0.9995
11.
P(X = 3) = 0.0706
12.
P(defective cord) = 0.06 - 0.02 + 0.008 = 0.048, or 4.8%
13.
P(X ≤ 6) = 0.9815
14.
The mean is 3.6 houses. The standard deviation is 2.62 houses.
15.
The expected value is -$0.50.
16.
a. The scatterplot looks fairly linear. The correlation coefficient is 0.9861.
Since that is close to 1, a linear model is probably appropriate. The residual
plot does not have a pattern. We can see from both the residual plot and the
graph of the regression line that the line fits the scatterplot better in the
middle than at either end.
b. ŷ = 1.905x + 5.055
c. The dexterity would be 60.3.
d. The residual for 23 is 0.12959.
e. The data value for 23 is 0.12959 above the predicted value on the
regression line.
17.
The tax auditor could number the returns 1 through 191. He could then
use a random number table to select a number at random between 1 and
11. Starting with that number, he could list every 11th number until he has
17 numbers. He could then select the tax returns corresponding to the
numbers listed.
18.
This type of sampling is stratified sampling.
19.
The mean is 4 correct answers. The standard deviation is 1.732 correct
answers.
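Answers 10, 11, 13 and 19 all rest on the binomial model: a fixed number of independent trials with the same success probability. As a rough check on those values, here is a small sketch using only the Python standard library; the function names are illustrative, and the inputs are the figures stated in the questions.

```python
from math import comb, sqrt


def binom_pmf(k, n, p):
    """P(X = k) for a binomial count with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


def binom_cdf(k, n, p):
    """P(X <= k), summing the individual probabilities from 0 through k."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


def binom_mean_sd(n, p):
    """Mean n*p and standard deviation sqrt(n*p*(1-p)) of a binomial count."""
    return n * p, sqrt(n * p * (1 - p))


# Question 10: not all 4 chips fail, with a 15% failure rate.
print(round(1 - binom_pmf(4, 4, 0.15), 4))

# Question 11: exactly 3 lefties among 10 people, with 11% left-handed.
print(round(binom_pmf(3, 10, 0.11), 4))

# Question 13: no more than 6 of 10 physics majors, with p = 0.33.
print(round(binom_cdf(6, 10, 0.33), 4))

# Question 19: 16 guessed questions, each with a 1-in-4 chance of being right.
mean, sd = binom_mean_sd(16, 0.25)
print(mean, round(sd, 3))
```

Summing the probabilities directly is fine for small n like these; for large n a statistics library or a Normal approximation would be the more usual tool.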
20.
a. The graph shows a negative linear association.
b. The correlation coefficient is -0.8792.
c. The amount of sodium would be 424.7 mg.
21.
The mean for the proportion is 0.25. The standard deviation is 0.0306.
22.
We expect about 40.13% of bottles to have a volume less than 32 oz.
23.
The probability that the mean is between 4.2 and 4.4 hours is 0.1586.
24.
You would have to sample 461 fish using a z-value of 2.575.
25.
The confidence interval is (0.49912, 0.54088). This confidence interval
includes some values smaller than 50%, so we cannot claim that the mayor will
win a majority of the votes.
26.
The error of failing to reject the claim that the true proportion is at least
4% when it is actually less than 4%.
27.
H0: p_vaccinated = p_placebo
HA: p_vaccinated < p_placebo
The test statistic is z = -1.80 and the P-value is 0.036. We fail to reject the
null hypothesis. There is not sufficient evidence to support the claim that
the vaccine is effective in lowering the incidence of the disease at this
significance level.
28.
Yes. Since the entire interval lies above 0, there is evidence that sophomores
are more likely than freshmen to buy used textbooks.
29.
The manufacturer would decide that the mean battery life is more than 8.5
hours when in fact it is equal to 8.5 hours.
30.
H0: μ_d = 0
HA: μ_d < 0
The test statistic is t = -0.880 and the P-value is 0.206. We fail to reject
the null. At the 5% significance level, the data do not provide sufficient
evidence to conclude that the training technique is effective in raising the
gymnasts’ scores. (Define the difference as the before scores minus the
after scores.)
31.
H0: μ = 132
HA: μ > 132
The test statistic is t = 1.5747 and the P-value is 0.0659. Since our P-value
is less than α = 0.10, we reject the null. There is sufficient evidence that
the mean weight of college females increases their freshman year.
32.
We are 95% confident that the mean cholesterol content of the eggs is
between 227.42 milligrams and 252.58 milligrams.
33.
A and D
34.
H0: The number of accidents is independent of the age group.
HA: The number of accidents is not independent of the age group.
The test statistic is χ² = 7.6149 and the P-value is 0.10675 with df = 4.
Since the P-value is greater than α = 0.05, we fail to reject the null
hypothesis. There is not sufficient evidence of an association between age
and frequency of car accidents.
35.
H0: The responses occur according to the stated percentages.
HA: The responses do not occur according to the stated percentages.
The test statistic is χ² = 5.146. The P-value is 0.2727. We fail to reject
the null hypothesis. There is not sufficient evidence to warrant rejection
of the claim that the responses occur according to the stated percentages.
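The two chi-square answers (34 and 35) can be checked by computing the statistic as the sum of (observed - expected)² / expected over all cells. The sketch below does this with the standard library only. For the P-value it uses the closed form of the chi-square upper tail that holds when the degrees of freedom are even, which covers both problems since each has df = 4; that shortcut, like the function names, is a choice made for this illustration rather than the method used in the answer key.

```python
from math import exp, factorial


def chi2_upper_tail_even_df(x, df):
    """P(chi-square with df degrees of freedom > x), valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2
    return exp(-half) * sum(half**k / factorial(k) for k in range(df // 2))


def goodness_of_fit(observed, proportions):
    """Return (statistic, df) for a chi-square goodness-of-fit test."""
    n = sum(observed)
    expected = [n * p for p in proportions]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1


def independence(table):
    """Return (statistic, df) for a chi-square test on a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (obs - expected) ** 2 / expected
    return stat, (len(table) - 1) * (len(col_totals) - 1)


# Question 35: responses A-E against the claimed percentages.
stat35, df35 = goodness_of_fit([12, 15, 16, 18, 19],
                               [0.15, 0.20, 0.25, 0.25, 0.15])
print(round(stat35, 3), df35, round(chi2_upper_tail_even_df(stat35, df35), 4))

# Question 34: rows are 0, 1, more than 1 accidents; columns are age groups.
stat34, df34 = independence([[74, 89, 82], [18, 8, 12], [8, 3, 6]])
print(round(stat34, 4), df34, round(chi2_upper_tail_even_df(stat34, df34), 5))
```

For odd degrees of freedom the closed form does not apply, and a chi-square table or a statistics package would be needed for the P-value.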