Name
Honors Math 3 review problems
Chapter 3 page 1
Review: Statistics
For a topic list in textbook order, see page viii at the front of the textbook. For a list of the major
things you learned how to do in each part of the chapter, see the bullet points on pp. 294-295.
Here’s another outline of the material in which related skills and concepts are grouped together.





Statistics for a data set
o Find the mean, mean absolute deviation, variance, and standard deviation.
o Prove and use this alternate formula for variance: $\sigma^2 = \overline{x^2} - (\bar{x})^2$.
o Prove and use that means and variances are additive.
Repeated experiments
o Given the statistics (mean, variance, and/or standard deviation) for a single experiment, find
the statistics for the experiment repeated n times.
o For repeated Bernoulli trials, find the probability that the total has a specific value (using the
formula involving a combination number).
o For repeated Bernoulli trials, find the mean, variance, and standard deviation.
Assessing the effectiveness of a treatment
o Identify appropriate randomization methods for selecting the treatment and control groups.
o Recognize outcomes that show strong evidence, possible evidence, or little or no
evidence that a treatment is effective.
Sample surveys
o Identify appropriate methods for random sampling.
o Given a sample proportion, use simulation results to assess what values are plausible for the
population parameter.
o Estimate margins of error with 95% or 99+% confidence.
o Understand that correlation does not imply causation.
o Understand the differences between sample surveys, observational studies, and experiments.
Probability distributions
o Make probability histograms.
o For normal distributions, apply the 68/95/99+% rule and use the normalpdf and normalcdf
functions to answer probability questions.
o Apply the Central Limit Theorem to identify normal distributions.
More review problems
At our last class we worked on the review problems on pp. 294-295. Here are a few more review
problems to complete today.
1. The following table shows the distribution of days in the hospital after birth for 57 new
mothers.

Days in Hospital      1    2    3    4    5
Number of Mothers    14   34    5    3    1

a. Calculate the mean, variance, and standard deviation for the above data.
b. Does this data have a normal distribution? Explain why or why not.
2. A student scores 60 on a math test that has a mean of 54 and a standard deviation of 3, and
she scores 80 on a history test with a mean of 75 and a standard deviation of 2. On which test
did she do better compared to the rest of the class?
3. A manufacturer produces a large number of toasters. From past experience, the manufacturer
knows that approximately 2% are defective. In a quality control procedure, we randomly
select 20 toasters for testing.
a. Determine the probability that exactly one of the toasters is defective.
b. Find the probability that at most two of the toasters are defective. Include enough details
so that it can be understood how you arrived at your answer.
c. Find the mean and standard deviation for the random variable x in the toaster problem.
Make sure to define the random variable x.
4. The average number of pounds of meat a person consumes a year is 218.4 pounds. Assume
that the standard deviation is 25 pounds and the distribution is approximately normal.
a. Find the probability that one person selected at random consumes less than 224 pounds
per year.
b. If a sample of 40 individuals is selected, find the probability that the mean for the sample
will be less than 224 pounds per year.
5. When you add a constant c to each element in a data set, what happens to the mean and
what happens to the variance? Prove your claims.
6. The travel times for the MBTA 76 bus from Hanscom to Alewife have a normal distribution
with a mean of 39 minutes with a standard deviation of 3.5 minutes.
Answer the following questions, using appropriate calculator functions where needed.
a. Suppose that the MBTA wishes to publish a range of possible travel times for this route,
such that 95% of the trips will fall in the range. Find this range.
b. What percent of this bus route’s trips have a travel time between 35 and 45 minutes?
c. For each of the following travel times, calculate the z score, then state whether the travel
time would be fairly typical, highly unusual, or in-between:
40 minutes, 30 minutes, 35 minutes, 45 minutes, 60 minutes
d. Tony, Sasha, and Derman separately traveled this bus route and calculated their z-scores:
z = 0.3 for Tony, z = –1.5 for Sasha, and z = 2.1 for Derman. Find their travel times.
e. Using an appropriate calculator function, approximate the probability that the travel time
on this bus route, rounded to the nearest minute, will be 38 minutes.
f. If you were to repeat part e for each of the times from 30 minutes to 50 minutes, then
make a probability histogram, name and sketch the shape that this histogram would have.
g. An MBTA executive is investigating a complaint that trips on this bus sometimes take
more than 45 minutes. What percent of the trips take more than 45 minutes?
h. Suppose that the MBTA wishes to revise its range of possible travel times for this route,
such that 97% of the trips will fall in the range. Find this range.
7. In this problem you’ll need to combine two ideas from the chapter: the statistics of repeated
Bernoulli trials (page 249) and calculating the standard deviation as a proportion or
percentage of the number of trials (page 265, for example).
Suppose that a Bernoulli trial with probability p is repeated n times.
a. In terms of p and n, calculate the standard deviation as a proportion of the number of
trials. Hint: calculate $\sigma / n$, then simplify.
b. In part a you found the standard deviation as a proportion of the number of trials. What is
the limit of this result when n gets larger and larger?
c. Explain how the result of part a applies to 1,000,000 flips of a 60%/40% unfair coin.
8. A company offers a course for students to help them prepare for the state standardized test.
The company claims the course will improve students’ ability to pass the standardized test.
A researcher would like to test this claim. Forty students volunteer to participate in the study
and are divided equally into two groups: the Treatment Group and the Control Group.
a. Describe an effective and valid way to randomize the participants into the Treatment
Group and Control Group.
b. Describe what should occur if a participant is in the Treatment Group or Control Group.
c. Overall, 25 of the participants in the study passed the state standardized test. Describe a
simulation you could perform to represent this scenario.
d. An appropriate simulation has been carried out 120 times and the following histogram
shows the results.
Complete the table below for each of the following conditions:
i. Little or no evidence the course improved students’ ability to pass the test.
ii. Possible evidence the course improved students’ ability to pass the test.
iii. Strong evidence the course improved students’ ability to pass the test.
            Treatment Group   Control Group   Total
Passed                                          25
Failed                                          15
Total             20               20           40
e. In the experiment, 14 out of the 20 participants in the Treatment Group passed the test.
Do you believe the claim that the course improved students’ ability to pass the test?
Explain.
Answers
1. a. mean = 2, variance = 40/57 ≈ 0.702, standard deviation ≈ 0.838
b. No. The graph is not symmetrical about the mean.
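The values in part a can be checked with a short script. This is only a sketch in Python (not part of the original worksheet, which assumes a graphing calculator); the variable names are illustrative, and it uses the population variance (dividing by n), computed both from the deviations and from the alternate formula.

```python
# Frequency table from problem 1: days in hospital -> number of mothers.
counts = {1: 14, 2: 34, 3: 5, 4: 3, 5: 1}

n = sum(counts.values())                                  # total number of mothers
mean = sum(x * f for x, f in counts.items()) / n          # frequency-weighted mean
mean_sq = sum(x * x * f for x, f in counts.items()) / n   # mean of the squares

# Population variance two ways: mean squared deviation, and the
# alternate formula (mean of the squares minus the square of the mean).
variance = sum(f * (x - mean) ** 2 for x, f in counts.items()) / n
variance_alt = mean_sq - mean ** 2
std_dev = variance ** 0.5

print(n, mean, round(variance, 3), round(variance_alt, 3), round(std_dev, 3))
```

Both variance calculations should agree up to floating-point rounding.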
2. Find z-scores for each test score. Math: $z = (60 - 54)/3 = 2$, History: $z = (80 - 75)/2 = 2.5$.
This means that in math her score is 2 standard deviations above the mean, while in history her
score is 2.5 standard deviations above the mean. Therefore, compared to the rest of the class,
she did better in history.
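3. a. P(exactly 1 defective) = $\binom{20}{1}(0.02)^{1}(0.98)^{19} \approx 0.272$
b. P(at most 2 defective) = P(exactly 0 defect.) + P(exactly 1 defect.) + P(exactly 2 defect.)
P(at most 2 defective) = $(0.98)^{20} + \binom{20}{1}(0.02)(0.98)^{19} + \binom{20}{2}(0.02)^{2}(0.98)^{18} \approx 0.668 + 0.272 + 0.053 \approx 0.993$
c. Let X be the number of defective toasters out of 20 tested.
mean of X: 20(0.02) = 0.4, standard deviation of X: $\sqrt{20(0.02)(0.98)} \approx 0.626$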
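The binomial calculations for problem 3 can be reproduced as follows. This is a sketch in Python rather than the calculator workflow the worksheet assumes; `binomial_pmf` is a helper name introduced here for illustration.

```python
from math import comb, sqrt


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent Bernoulli trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 20, 0.02  # 20 toasters tested, 2% defect rate

p_exactly_one = binomial_pmf(1, n, p)                          # part a
p_at_most_two = sum(binomial_pmf(k, n, p) for k in range(3))   # part b: k = 0, 1, 2
mean = n * p                                                   # part c
std_dev = sqrt(n * p * (1 - p))

print(round(p_exactly_one, 3), round(p_at_most_two, 3), mean, round(std_dev, 3))
```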
4. a. normalcdf(0, 224, 218.4, 25) ≈ 0.589
b. By the CLT, the sample mean will be approximately normally distributed with mean
218.4 and standard deviation $25/\sqrt{40} \approx 3.953$. normalcdf(0, 224, 218.4, 3.953) ≈ 0.922.
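For readers without a calculator at hand, the two normalcdf results in problem 4 can be approximated with Python's standard-library `statistics.NormalDist`. This is a sketch, not the worksheet's method; it uses the full left tail instead of a lower bound of 0, which changes the result only negligibly here.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 218.4, 25  # pounds of meat per year

# Part a: one person selected at random.
# cdf(224) covers everything below 224; the calculator answer starts at 0,
# but the area below 0 is vanishingly small for this distribution.
p_one = NormalDist(mu, sigma).cdf(224)

# Part b: mean of a sample of 40. By the Central Limit Theorem the sample
# mean has standard deviation sigma / sqrt(40).
sigma_mean = sigma / sqrt(40)
p_mean = NormalDist(mu, sigma_mean).cdf(224)

print(round(p_one, 3), round(sigma_mean, 3), round(p_mean, 3))
```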
5. The mean is increased by c. The variance stays the same.
Proof (mean): Suppose x1, …, xn have a mean of $\bar{x}$. Then x1+c, …, xn+c have a mean of
$\frac{(x_1 + c) + \cdots + (x_n + c)}{n} = \frac{x_1 + \cdots + x_n}{n} + \frac{nc}{n} = \bar{x} + c$.
Proof (variance): Since each data value increases by c and the mean also increases by c,
all of the deviations are unchanged so the variance remains the same.
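The variance argument can also be written out symbolically. The following is a sketch that assumes the population variance (dividing by n); the label $y_i$ for the shifted data is introduced here only for convenience.

```latex
\begin{align*}
y_i &= x_i + c, \qquad \bar{y} = \bar{x} + c \\
\operatorname{Var}(y)
  &= \frac{1}{n}\sum_{i=1}^{n} (y_i - \bar{y})^2
   = \frac{1}{n}\sum_{i=1}^{n} \bigl((x_i + c) - (\bar{x} + c)\bigr)^2
   = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2
   = \operatorname{Var}(x)
\end{align*}
```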
6. a. 32 to 46 minutes
b. normalcdf(35,45,39,3.5) ≈ 0.830, so 83%.
c. A fast way to get these z-scores is to press [Y=], enter Y1 = (X–39)/3.5, then use the
table. Answers: 0.286 (typical), –2.571 (unusual), –1.143 (typical), 1.714 (in-between),
6 (very unusual).
d. Calculate 39+3.5z for each z value. Answers: 40.05, 33.75, and 46.35 minutes.
e. normalpdf(38,39,3.5) ≈ 0.109 or normalcdf(37.5,38.5,39,3.5) ≈ 0.109, or 10.9%.
f. “normal distribution” or “bell curve”
g. You need to pick some large number for the upper end of the interval. You’ll get roughly
the same answer regardless of the choice: normalcdf(45,1000,39,3.5) ≈ 0.0432 = 4.32%.
h. Experiment with normalcdf(39 – n, 39 + n, 39, 3.5) for various values of n.
(You could try them one-by-one, or make a table on your calculator.)
The choice n = 7.6 gives normalcdf(31.4, 46.6, 39, 3.5) ≈ 0.970 = 97%.
Answer: 31.4 to 46.6 minutes.
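Several parts of problem 6 can be cross-checked in Python with `statistics.NormalDist`. This is a sketch under the problem's assumptions (mean 39, standard deviation 3.5), not the calculator procedure described above. For part h it uses `inv_cdf` to find the range directly instead of experimenting, on the assumption that the range is centered on the mean, so 1.5% of trips fall in each tail.

```python
from statistics import NormalDist

bus = NormalDist(mu=39, sigma=3.5)  # travel time in minutes

# Part b: proportion of trips between 35 and 45 minutes.
p_35_45 = bus.cdf(45) - bus.cdf(35)

# Part c: z-scores for the listed travel times.
z_scores = {t: (t - bus.mean) / bus.stdev for t in (40, 30, 35, 45, 60)}

# Part d: travel times recovered from z-scores.
times = [bus.mean + bus.stdev * z for z in (0.3, -1.5, 2.1)]

# Part e: a time that rounds to 38 minutes lies between 37.5 and 38.5.
p_38 = bus.cdf(38.5) - bus.cdf(37.5)

# Part g: proportion of trips longer than 45 minutes.
p_over_45 = 1 - bus.cdf(45)

# Part h: central 97% of trips, leaving 1.5% in each tail.
low, high = bus.inv_cdf(0.015), bus.inv_cdf(0.985)

print(round(p_35_45, 3), round(p_38, 3), round(p_over_45, 4))
print({t: round(z, 3) for t, z in z_scores.items()})
print([round(t, 2) for t in times], round(low, 1), round(high, 1))
```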
7. a. $\frac{\sigma}{n} = \frac{\sqrt{np(1-p)}}{n} = \sqrt{\frac{p(1-p)}{n}}$
b. $\sqrt{\frac{p(1-p)}{n}}$ gets smaller and smaller as n gets larger and larger. The limit is 0.
c. $\sqrt{\frac{0.6 \cdot 0.4}{1000000}} \approx 4.899 \times 10^{-4}$, or 0.0004899, or 0.04899%
8. a. Sample: Put all 40 names in a hat, pull 20 names and assign to Treatment Group, rest to
Control Group.
b. The Treatment Group should attend the course and the Control Group should not.
c. Sample: Put the numbers 1-40 in a hat, pull 25 numbers, and count how many are odd
(this represents how many students in the Treatment Group passed).
d. i. Around 12 or 13 passed in the Treatment Group
ii. Around 10-11 or 14-15 passed in the Treatment Group
iii. Around fewer than 9 or more than 16 passed in the Treatment Group
e. 14 students passing in the Treatment Group falls in the “possible evidence” range, so you
should not believe the claim without further research.
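The hat-drawing simulation from part c can be sketched in code. The Python below is an illustration, not the simulation that produced the worksheet's histogram: the function name, the seed, and the use of odd numbers to stand for the Treatment Group follow the sample answer above and are otherwise assumptions.

```python
import random


def simulate_treatment_passes(trials=120, seed=0):
    """Repeat the hat drawing: pull 25 of the numbers 1-40 and count the odd ones.

    The 20 odd numbers stand for the Treatment Group, so each count is the
    number of Treatment Group passes if the course had no effect.
    """
    rng = random.Random(seed)  # fixed seed so reruns give the same draws
    results = []
    for _ in range(trials):
        drawn = rng.sample(range(1, 41), 25)
        results.append(sum(1 for number in drawn if number % 2 == 1))
    return results


results = simulate_treatment_passes()

# Tally how often each count occurred, as a text version of a histogram.
for passes in range(min(results), max(results) + 1):
    print(passes, "#" * results.count(passes))

# How often did chance alone give 14 or more passes in the Treatment Group?
print(sum(r >= 14 for r in results), "of", len(results), "trials")
```

Comparing the observed 14 passes against this tally is the same reasoning used in parts d and e.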