Download Semester 1 Final Review - Phoenix Union High School District

AP Statistics Final Review
Name: _____________________________
Semester 1
Date: ______________
Experiments & Observations/ Descriptive Statistics/ Linear Regression/ Normal Model/ Probability
Experiments & Observations Review
There is no recovery from poorly collected data.
So the first priority in a study is properly collecting and organizing the data to avoid the common
pitfalls. On the advanced placement exam, using the standard vocabulary is paramount to earning a top
score. Fully, yet concisely, explaining the methods as well as the reasons behind the methods is
important.
So what’s important and why?
Randomization – to reduce bias – def. the use of chance or probability during the selection process
Types of bias
1. voluntary response bias – when only those that choose to participate do participate. Those that choose
to participate usually feel very strongly one way or the other.
2. response bias – when participants are put in a position that makes them uncomfortable responding
truthfully. If a teacher asks for a show of hands of those who have ever cheated on a test many would
not raise their hands even if they have cheated. Poorly worded questions would also lead to response
bias. For instance, the question “Do you prefer essay questions or tricky worded multiple choice
questions” would lead many to respond in favor of essay questions.
3. undercoverage bias – when certain groups are left out of a survey often due to the difficulty in
including them. For instance, high school dropouts are rarely surveyed on issues of teenage opinion
since most surveys are done at schools.
4. selection bias – when one group is more heavily studied than any other group. If only members of the
Sierra Club are surveyed on their opinions of saving the rain forest, the results will be strongly skewed
in an environmental direction.
To avoid bias we must randomly select subjects or experimental units from the population being
studied. There are 4 basic systems of random selection we have studied.
simple random samples – the best method overall – number ALL possible subjects in the
population. Then use a random number generator or table of random digits to select a specified number
from the population. All possible combinations are equally likely. The chance of getting a biased group is
small and taken into consideration with a statistic called the sampling error or standard deviation of the
sample. Ask students to tell you what “the idiot factor” is.
stratified random samples – when we first group the subjects by some similar characteristic then take
a random sample from each group. For instance, first group the subjects by gender and then
randomly select 20 males and 20 females. This is done for comparison purposes.
systematic random sample – often done for convenience. Theoretically, line up the subjects and
choose every, say, 10th one. Since you are alphabetically listed in my grade book, I could simply go
down the list and choose every 5th student for a study.
cluster sampling – first splitting the population into similar groups, then completing a census of the
groups selected. For instance, second block Westwood students are separated into clusters (classes).
Randomly select 5 classes and survey everyone in each of the 5 classes.
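The four selection systems above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the roster, the two strata, and the ten classes are hypothetical, the function names are made up for this sketch, and the systematic sample uses a random starting point, which the notes above do not specify.

```python
import random

def simple_random_sample(population, n, rng):
    """Every possible group of n subjects is equally likely to be chosen."""
    return rng.sample(population, n)

def stratified_sample(strata, n_per_stratum, rng):
    """Take a separate simple random sample from each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def systematic_sample(population, k, rng):
    """Pick a random start among the first k subjects, then take every k-th one."""
    start = rng.randrange(k)
    return population[start::k]

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters, then take a census of each chosen cluster."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [subject for name in chosen for subject in clusters[name]]

rng = random.Random(1)  # fixed seed so the sketch is repeatable
students = [f"student_{i:03d}" for i in range(1, 101)]      # hypothetical roster
strata = {"male": students[:50], "female": students[50:]}   # hypothetical split
classes = {f"class_{c}": students[c * 10:(c + 1) * 10] for c in range(10)}

srs = simple_random_sample(students, 10, rng)
strat = stratified_sample(strata, 5, rng)
syst = systematic_sample(students, 10, rng)
clus = cluster_sample(classes, 2, rng)
```

Note that only the first function gives every group of 10 students the same chance of being the sample; the other three restrict which groups can occur.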
Blocking – to reduce variation – def. creating groups that are similar with respect to a particular variable
Blocking is when groups that are already similar in some way are grouped together. This technique
helps control certain lurking or confounding variables and limits the variation in the study statistics.
(Note: blocking in an experiment is pretty much the same as stratifying when you choose a sample. It
means you group subjects by something like gender, age, grade level, political party affiliation, since
these differences often give different results due to the nature of the group.)
Control group – to reduce the effects of confounding variables – def. a group that receives no
treatment or a placebo treatment
Blindness – to reduce bias – def. when the subject, the evaluator, or both (double blind) do not know
which treatment is being administered. This is done so neither the subject nor the researcher can bias the
study for or against the new drug. (Bias is often not intentional. We humans cannot help it.)
Describing confounding variables: When there is uncertainty with regard to which variable is causing
an effect, we say the variables are confounded. IMPORTANT: In order to receive credit for a
confounding variable, you must describe how it confounds the data AND relate the results to BOTH
groups.
Generalizability: Results may only be generalized to the population from which the subjects were randomly selected. If we study
only Westwood students we may draw conclusions about only Westwood students, not all high school
students.
Experiments versus Observational Studies: Experiments impose a treatment on the subject or
experimental unit. Only a well designed, controlled experiment can show a causal relationship. One
must randomly separate a control group from the experimental group for comparison. The control group
may receive no treatment, a placebo, or an alternate treatment.
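Random separation into a control group and an experimental group can be sketched as a shuffle followed by a split. This is a minimal illustration, not a prescribed procedure: the subject labels, the group names, and the equal group sizes are assumptions made for the example.

```python
import random

def randomly_assign(subjects, rng):
    """Shuffle the subjects by chance, then split them into two equal groups."""
    shuffled = subjects[:]      # copy so the original list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

rng = random.Random(2)  # fixed seed, used only so the sketch is repeatable
subjects = [f"subject_{i}" for i in range(1, 21)]  # hypothetical volunteers
groups = randomly_assign(subjects, rng)
```

In a blocked design the same function would be applied separately inside each block, so that every block contributes subjects to both groups.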
1. Dr. Bicep is studying muscle growth. He randomly selects 30 patients to add instant protein to their
daily diet and 30 patients to eat as they normally would. Both groups are required to hit the weight
training room three times a week. The hypothesis is that the instant protein group will increase their
muscle mass more than the group without the extra protein.
a) What is the treatment imposed in this experiment?
b) Describe a possible confounding variable.
c) Describe a possible observational study for the same problem.
2. Design a study
a) Marine iguanas do not really pay attention to humans. Historically they have had no reason to fear
them. Now with the influx of tourists, the iguanas are becoming more timid. Conservationists are
interested in the distance at which an iguana begins to show alarm with and without exposure to
tourists. Alarm is shown by a rapid head movement accompanied by a low clicking sound. Design an
experiment to determine the distance at which iguanas become alarmed by human contact.
b) On the Galapagos islands, both marine iguanas and land iguanas are present. How could your
design above be improved to include this knowledge? Why is this change necessary?
3. 2004 #2
Researchers who are studying a new shampoo formula plan to compare the condition of hair for
people who use the new formula with the condition of hair for people who use the current formula.
Twelve volunteers are available to participate in this study. Information on these volunteers
(numbered 1 through 12) is shown in the table below.
Volunteer   Gender   Age
1           Male     21
2           Female   20
3           Male     47
4           Female   60
5           Female   62
6           Male     61
7           Male     58
8           Female   44
9           Male     44
10          Female   24
11          Male     23
12          Female   46
a) These researchers want to conduct an experiment involving the two formulas (new and current) of
shampoo. They believe that the condition of hair changes with age but not gender. Because researchers
want the size of the blocks in an experiment to be equal to the number of treatments, they will use
blocks of size 2 in their experiment. Identify the volunteers (by number) that would be included in each
of the six blocks and give the criteria you used to form the blocks.
b) Other researchers believe that hair condition differs with both age and gender. These researchers will
also use blocks of size 2 in their experiment. Identify the volunteers (by number) that would be included
in each of the six blocks and give the criteria you used to form the blocks.
c) The researchers in part (b) decide to select three of the six blocks to receive the new formula and to
give the other three blocks the current formula. Is this an appropriate way to assign treatments? If so,
describe a method for selecting the three blocks to receive the new formula. If not, describe an
appropriate method for assigning treatments.
4. In one study subjects were randomly given either 500 or 1000 milligrams of vitamin C daily, and
the number of colds they came down with during a winter season was noted. In a second study
people responded to a questionnaire asking about the average number of hours they sleep per night
and the number of colds they came down with during a winter season.
A) The first study was an experiment without a control group, while the second was an observational
study.
B) The first study was an observational study, while the second was a controlled experiment.
C) Both studies were controlled experiments.
D) Both studies were observational studies.
E) None of the above is a correct statement.
5. Ann Landers, who wrote a daily advice column appearing in newspapers across the country, once
asked her readers, “If you had it to do over again, would you have children?” Of the more than
10,000 readers who responded, 70% said no. (I’m certain your parents would say yes!) What does
this show?
A) The survey is meaningless because of voluntary response bias.
B) No meaningful conclusion is possible without knowing something more about the characteristics
of her readers.
C) The survey would have been more meaningful if she had picked a random sample of the 10,000
readers who responded.
D) The survey would have been meaningful if she had used a control group.
E) This was a legitimate sample drawn from her readers and of sufficient size to allow the
conclusion that most of her readers who are parents would have second thoughts about having
children.
6. To survey the opinions of bleacher fans at Wrigley Field, a surveyor plans to select every one-hundredth fan entering the bleachers one afternoon. Will this result in a simple random sample of
Cub fans who sit in the bleachers?
A) Yes, because each bleacher fan has the same chance of being selected.
B) Yes, but only if there is a single entrance to the bleachers.
C) Yes, because the 99 out of 100 bleacher fans who are not selected will form a control group.
D) Yes, because this is an example of systematic sampling, which is a special case of simple random
sampling.
E) No, because not every sample of the intended size has an equal chance of being selected.
7. A study is made to determine whether studying Latin helps students achieve higher scores on the
verbal section of the SAT exam. In comparing records of 200 students, half of whom have taken at
least 1 year of Latin, it is noted that the average SAT verbal score is higher for those 100 students
who have taken Latin than for those who have not. Based on this study, guidance counselors begin
to recommend Latin for students who want to do well on the SAT exam. Which of the following are
true statements?
I. While this study indicates relation, it does not prove causation.
II. There could well be a confounding variable responsible for the seeming relationship.
III. Self-selection here makes drawing the counselors’ conclusion difficult.
A) I and II
B) I and III
C) II and III
D) I, II, and III
E) None of these gives a true complete set
8. A researcher planning a survey of heads of households in a particular state has census lists for
each of the 23 counties in that state. The procedure will be to obtain a random sample of 10 heads of
households from each of the 23 counties. Which of the following is a true statement about the
resulting sample?
I. This is not a proper study because children were not included.
II. This stratified random sample is a type of simple random sample because subjects were
randomly selected from each county.
III. This is not a simple random sample because all possible groups of 230 subjects did not have the
same probability of being selected.
IV. This study may give important information about the similarities and differences of the 23
counties.
A) III and IV
B) I and II
C) I and III
D) I, II, and III
E) None of these gives a complete set
9. A nutritionist believes that having each player take a vitamin pill before a game enhances the
performance of the football team. During the course of one season, each player takes a vitamin pill
before each game, and the team achieves a winning season for the first time in several years. Is this
an experiment or an observational study?
A) An experiment, but with no reasonable conclusion possible about cause and effect.
B) An experiment, thus making cause and effect a reasonable conclusion.
C) An observational study, because there was no use of a control group.
D) An observational study, but a poorly designed one because randomization was not used.
E) An observational study, thus allowing a reasonable conclusion of association but not of cause and
effect.
10. Researchers were interested to know whether internal vehicle temperatures vary by outside
temperatures. To evaluate this, temperature rise was measured continuously over a 60-minute period
in a dark sedan on 16 different clear, sunny days with outside temperatures ranging from 72ºF to
96ºF. The researchers’ method of analysis is best described as
A) a census
B) a survey
C) an observational study
D) a randomized comparative experiment
E) a single-blind randomized comparative experiment
11. Respondents to a randomly distributed questionnaire answered the question, “Do you agree that
nuclear weapons should never be used because they are immoral?” The study that uses the results of
this questionnaire will most likely suffer from which type(s) of bias?
A) undercoverage
B) voluntary response
C) response
D) nonresponse
E) all of the above
12. In a certain community, 20% of cable subscribers also subscribe to the company’s broadband
service for their Internet connection. You would like to design a simulation to estimate the
probability that one of six randomly selected subscribers has the broadband service. Using digits 0
through 9, which of the following assignments would be appropriate to model this situation?
A) Assign even digits to broadband subscribers and odd digits to cable-only subscribers.
B) Assign 0 and 1 to broadband subscribers and 2,3,4,5,6,7,8, and 9 to cable-only subscribers.
C) Assign 0,1, and 2 to broadband subscribers and 3,4,5,6,7,8, and 9 to cable-only subscribers.
D) Assign 1,2,3,4,5, and 6 to broadband subscribers and 7,8,9, and 0 to cable-only subscribers.
E) Assign 0,1, and 2 to broadband subscribers; 3,4,5, and 6 to cable-only subscribers; and ignore
digits 7,8, and 9.
13. A cause-and-effect relationship between two variables can best be determined from which of the
following?
A) A survey conducted using a simple random sample of individuals.
B) A survey conducted using a stratified random sample of individuals.
C) When the two variables have a correlation coefficient near 1 or −1.
D) An observational study where the observational units are chosen randomly.
E) A controlled experiment where the observational units are assigned randomly to treatments.
14. Which of the following is a true statement about experimental design?
A) Replication is a key component in experimental design. Thus, an experiment needs to be
conducted on repeated samples before generalizing results.
B) Control is a key component of experimental design. Thus, a control group that receives a placebo
is a requirement for experimentation.
C) Randomization is a key component in experimental design. Randomization is used to reduce
bias.
D) Blocking eliminates the effects of all lurking variables.
E) The placebo effect is a concern for all experiments.
15. An experimenter believes that two new exercise programs are more effective than any current
exercise routines and wishes to compare the effectiveness of these two new exercise programs on
physical fitness. The experimenter is trying to determine whether or not a control group, which
follows neither of these new programs but continues with current exercise routines, would be
beneficial. Which of the following can be said about the addition of a control group?
A) A control group would eliminate the placebo effect.
B) A control group would eliminate the need for blinding in the study.
C) A control group would allow the experimenter to determine which of the two exercise programs
improves physical fitness the most.
D) A control group would allow the experimenter to determine if either of the exercise programs is
more effective than current programs for physical fitness.
E) There would be no added benefit to having a control group.
16. A drug company wishes to test a new drug. A researcher assembles a group of volunteers and
randomly assigns them to one of two groups---one to take the drug and one to take a placebo. In
addition, the company wants the experiment to be double-blind. What is the meaning of double-blind in this situation?
A) The volunteers in both groups are blindfolded when they take the drug or placebo.
B) The volunteers in both groups do not know whether they are taking the drug or the placebo.
C) Neither the volunteers nor the drug company executives know which volunteers are taking the
drug and which are taking the placebo.
D) Neither the volunteers nor the evaluator know which volunteers are taking the drug and which are
taking the placebo.
E) As long as the subjects are randomly assigned to the two groups, there is no need to make the
experiment double-blind.
17. A psychologist from Austin, Texas interested in sleep’s effect on the ability to learn randomly
selects high school students from the area’s local high schools. Half the students are randomly
selected to sleep between 7 and 8 hours a night while the remaining half sleep as they normally
would. At the end of the study, those students who got between 7 and 8 hours of sleep a night
scored significantly higher on tests over the curriculum studied at their school. From this the
psychologist can conclude
A) Austin high school students getting between 7 and 8 hours of sleep per night score higher on tests
over their curriculum.
B) Students in the United States who get between 7 and 8 hours of sleep per night get better grades.
C) Austin students who sleep between 7 and 8 hours per night get better grades.
D) High school students getting between 7 and 8 hours of sleep per night score higher on tests over
their curriculum.
E) Since a placebo was not included in the study, no significant conclusions can be made.
Descriptive Statistics Review
Key Concepts
a) What is the difference between quantitative and categorical data?
b) I can create graphical displays of data such as a dot plot, stem and leaf plot, histogram, bar chart,
pie chart, contingency table, cumulative frequency plot.
c) I can calculate a five number summary, mean and standard deviation.
d) I can describe a data set by addressing shape, center and spread and describing any unusual
features.
e) I can rescale a set of data so the data can be more easily understood.
f) I can calculate a z-score.
g) I can develop a normal model so quantitative data can be interpreted.
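Concepts (c), (e), and (f) can be checked numerically with Python's standard `statistics` module. The sketch below uses hypothetical scores and a made-up rescaling rule. Note also that `statistics.quantiles` uses its default "exclusive" method, which may not match the quartile convention of a given textbook or calculator.

```python
import statistics

scores = [62, 70, 71, 75, 78, 80, 83, 85, 88, 91, 98]  # hypothetical test scores

mean = statistics.mean(scores)
sd = statistics.stdev(scores)                        # sample standard deviation (divides by n - 1)
q1, median, q3 = statistics.quantiles(scores, n=4)   # quartile cut points
five_number = (min(scores), q1, median, q3, max(scores))

def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Rescaling with the hypothetical rule new = 0.5 * old + 10:
# adding 10 shifts the measures of center but not the spread,
# while multiplying by 0.5 scales both center and spread.
rescaled = [0.5 * x + 10 for x in scores]
rescaled_mean = statistics.mean(rescaled)
rescaled_sd = statistics.stdev(rescaled)
```

Comparing `rescaled_mean` with `0.5 * mean + 10` and `rescaled_sd` with `0.5 * sd` shows that only the multiplier affects the standard deviation.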
1. School administrators collect data on students attending the school. Which of the following
variables is quantitative?
A) class (freshman, soph., junior, senior)
B) grade point average
C) whether the student is in AP classes
D) whether the student has taken the SAT
E) none of these
2. A professor has kept records on grades that students have earned in his class. If he wants to
examine the percentage of students earning the grades A, B, C, D, and F during the most recent
term, which kind of plot could he make?
A) boxplot
B) timeplot
C) dotplot
D) pie chart
E) histogram
3. We might choose to display data with a stemplot rather than a boxplot because a stemplot
I. reveals the shape of the distribution.
II. is better for large data sets.
III. displays the actual data.
A) I only
B) II only
C) III only
D) I and III
E) I, II, and III
4. Two sections of a class took the same quiz. Section A had 15 students who had a mean
score of 80, and Section B had 20 students who had a mean score of 90. Overall, what was the
approximate mean score for all of the students on the quiz?
A) 84.3
B) 85.0
C) 85.7
D) none of these
E) It cannot be determined.
5. Your Stats teacher tells you your test score was the 3rd quartile for the class. Which is true?
I. You got 75% on the test.
II. You can’t really tell what this means without knowing the standard deviation.
III. You can’t really tell what this means unless the class distribution is nearly Normal.
A) none of these
B) I only
C) II only
D) III only
E) II and III
6. The SPCA collects the following data about the dogs they house. Which is categorical?
A) breed
B) age
C) weight
D) number of days housed
E) veterinary costs
7. Which is true of the data whose distribution is shown?
I. The distribution is skewed to the right.
II. The mean is probably smaller than the median.
III. We should summarize with mean and standard deviation.
A) I only
B) II only
C) I and II
D) II and III
E) I, II, and III
8. The advantage of making a stem-and-leaf display instead of a dotplot is that a stem-and-leaf display
A) satisfies the area principle.
B) shows the shape of the distribution better than a dotplot.
C) preserves the individual data values.
D) A stem-and-leaf display is for quantitative data, while a dotplot shows categorical data.
E) none of these
9. Suppose that a Normal model described student scores in a history class. Parker has a
standardized score (z-score) of +2.5. This means that Parker
A) is 2.5 points above average for the class.
B) is 2.5 standard deviations above average for the class.
C) has a standard deviation of 2.5.
D) has a score that is 2.5 times the average for the class.
E) None of the above.
10. Suppose a Normal model describes the number of pages printer ink cartridges last. If we
keep track of printed pages for the 47 printers at a company’s office, which must be true?
I. The page counts for those ink cartridges will be normally distributed.
II. The histogram for those page counts will be symmetric.
III. 95% of those page counts will be within 2 standard deviations of the mean.
A) none of these
B) I only
C) II only
D) II and III
E) I, II, and III
11. The five-number summary of credit hours for 24 students in a statistics class is:
Min    Q1     Median   Q3     Max
13.0   15.0   16.5     18.0   22.0
Which statement is true?
A) There are no outliers in the data.
B) There is at least one low outlier in the data.
C) There is at least one high outlier in the data.
D) There are both low and high outliers in the data.
E) None of the above.
12. Cats and dogs The table shows whether students in an introductory statistics class like dogs
and/or cats.
                     Like Cats
                     Yes    No     Totals
Like Dogs   Yes      194    110    304
            No       21     10     31
Totals               215    120    335
a. What is the marginal distribution (in %) of “liking dogs”? ___________________
b. What is the conditional distribution (in %) of “liking dogs” for students who like cats?
___________________
c. What kind of display(s) would you use to examine the association between “liking dogs” and
“liking cats”? (Just name a graph.) ___________________
d. Do “liking dogs” and “liking cats” appear to be independent? Give statistical evidence
to support your conclusion.
13. Book sales A publishing company pays its sales staff $600 a week plus a commission of $0.50
per book sold. For example, a salesman who sold 440 books earned 600 + 0.50(440) = $820.
Statistic            Books Sold   $ Earned
Mean                 640          _____
Standard deviation   360          _____
IQR                  450          _____
Maximum              1420         _____
a. The table shows summary statistics for the number of books the large sales staff sold
last week. Fill in the table to show the statistics for the pay these people earned.
14. Commuting to work The table shows how a company’s employees commute to work.
                    Transportation
Job class      Car    Bus    Train   Total
Management     26     20     44      90
Labor          56     106    168     330
Total          82     126    212     420
a. What is the marginal distribution (in %) of mode of transportation?
Car _____ Bus _____ Train _____
b. What is the conditional distribution (in %) of mode of transportation for management?
Car _____ Bus _____ Train _____
c. What kind of display would you use to show the association between job class and mode of
transportation? (Just name a graph.) _______________________________
d. Do job classification and mode of transportation appear to be independent? Give statistical
evidence to support your conclusion.
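Marginal and conditional distributions from a two-way table can be computed as in the sketch below. The table is hypothetical (grade level by lunch choice) and the function names are made up for illustration; it is not the data from the exercises above.

```python
# Hypothetical two-way table: rows are grade level, columns are lunch choice.
table = {
    "junior": {"cafeteria": 30, "off campus": 20},
    "senior": {"cafeteria": 15, "off campus": 35},
}

def marginal_distribution(table):
    """Percent of ALL subjects that fall in each column category."""
    grand_total = sum(sum(row.values()) for row in table.values())
    columns = list(next(iter(table.values())))
    return {c: 100 * sum(row[c] for row in table.values()) / grand_total
            for c in columns}

def conditional_distribution(table, row_name):
    """Percent in each column category among the subjects of one row only."""
    row = table[row_name]
    row_total = sum(row.values())
    return {c: 100 * count / row_total for c, count in row.items()}

marginal = marginal_distribution(table)
juniors = conditional_distribution(table, "junior")
seniors = conditional_distribution(table, "senior")
```

If the two variables were independent, each conditional distribution would be roughly the same as the marginal distribution. Here the junior and senior percentages differ noticeably from each other and from the marginal, which is evidence of an association.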
Linear Regression Review
Explanatory variables – attempt to explain the observed outcome → the independent variable (x)
Response variable – measures the outcome of the study → the dependent variable (y) because it
depends on what x is
1. Study the scatter plot to the right. [Scatter plot not reproduced; two of its points are labeled A and B.]
a) Would it be most appropriate to remove case A or case B?
b) Do the points have a positive or negative association, why?
c) What does the “least-squares regression” line mean?
d) If a set of points has a least squares regression equation of y = 2.3x + 17, what is the residual of
the actual point (3,19.5)?
e) Would this point be above or below the linear regression line?
What to do with a set of quantitative bivariate data
1) make a scatter plot and look at it
2) find linear regression equation
3) make a residual plot to check linear regression equation
4) if regression equation has a pattern, try to straighten the curve using a transformation (We
haven’t done this yet)
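Steps 2 and 3 above can be sketched without any plotting library. The data below are hypothetical and the function name is made up for this example. A residual is computed as observed y minus predicted y.

```python
def least_squares(xs, ys):
    """Return slope, intercept, and correlation r of the least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar    # the line passes through (x_bar, y_bar)
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r

xs = [1, 2, 3, 4, 5]                # hypothetical explanatory values
ys = [2.1, 3.9, 6.2, 7.8, 10.1]     # hypothetical responses

slope, intercept, r = least_squares(xs, ys)

# Residual = observed y - predicted y. A positive residual means the
# point lies above the line; a negative one means it lies below.
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
```

Plotting `residuals` against `xs` gives the residual plot of step 3. The residuals of a least-squares line always sum to zero, so what matters is whether they show a pattern such as a curve.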
2. If the linear regression equation to find the temperature on top of Flattop Mountain based on the
temperature of Denver is y = 1.2x − 31 where x is Denver’s temperature and y is the temperature on
the top of Flattop Mountain, describe what the coefficients mean.
a) If r = 0.87, what does it mean?
b) If r² = 0.7569, what does it mean?
3. A least squares regression line was fitted to the weights (in pounds) versus age (in months) of a
group of many young children. The equation of the line is
4. A wildlife biologist is interested in the relationship between the number of chirps per minute for
5. Residuals are
(A) possible models not explored by the researcher.
(B) variation in the response variable that is explained by the model.
(C) the difference between the observed response and the values predicted by
the model.
(D) data collected from individuals that is not consistent with the rest of the group.
(E) a measure of the strength of the linear relationship between x and y
6. Child development researchers studying growth patterns of children collect data on
the heights of fathers and sons. The correlation between the fathers’ heights and the
heights of their 16-year-old sons is most likely to be…
(A) near −1.0
(B) near 0
(C) near +0.7
(D) exactly +1.0
(E) somewhat greater than +1.0
7. If the coefficient of determination r² is calculated as 0.49, then the correlation
coefficient
(A) cannot be determined without the data
(B) is − 0.70
(C) is 0.2401
(D) is 0.70
(E) is 0.7599
8. Regression output gives the equation of the regression line as
Normal Model Review
Characteristics of a normal model
• The shape is unimodal, symmetric, and mound-shaped.
• The mean is equal to the median.
• A normal model is continuous, although it is often used to approximate a discrete distribution
like a histogram.
• The shape, center, and spread of a normal model can be quickly given by writing N(μ, σ).
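A normal model N(μ, σ) can be explored with `statistics.NormalDist` from the Python standard library. The sketch below uses a hypothetical model N(50, 5), not one taken from the exercises, and the helper names are made up for this example. It computes a z-score and the percent of the area within 1, 2, and 3 standard deviations of the mean.

```python
from statistics import NormalDist

model = NormalDist(mu=50, sigma=5)   # hypothetical model N(50, 5)

def z_score(x, model):
    """Standardize x: how many standard deviations it lies from the mean."""
    return (x - model.mean) / model.stdev

def percent_within(k, model):
    """Percent of the area between mu - k*sigma and mu + k*sigma."""
    low = model.mean - k * model.stdev
    high = model.mean + k * model.stdev
    return 100 * (model.cdf(high) - model.cdf(low))

z = z_score(60, model)   # 60 is two standard deviations above the mean
within = {k: percent_within(k, model) for k in (1, 2, 3)}
```

The values in `within` come out close to 68%, 95%, and 99.7%, which is where the 68-95-99.7 (empirical) rule comes from.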
Z-scores
1. A normal distribution of scores has a standard deviation of 10. Find the z-scores corresponding to
each of the following values:
a. A score of 60, where the mean score of the sample data values is 40.
b. A score that is 30 points below the mean.
c. A score of 80, where the mean score of the sample data values is 30.
d. A score of 20, where the mean score of the sample data values is 50.
2. IQ scores have a mean of 100 and a standard deviation of 16. Albert Einstein reportedly had an IQ
of 160.
a. What is the difference between Einstein’s IQ and the mean?
b. How many standard deviations is that?
c. Convert Einstein’s IQ score to a z score.
d. If we consider “usual” IQ scores to be those that convert to z scores between -2 and 2, is Einstein’s IQ
usual or unusual?
3. Women’s heights have a mean of 63.6 inches and a standard deviation of 2.5 inches. Find the z score
corresponding to a woman with a height of 70 inches and determine whether the height is unusual.
4. Three students take equivalent stress tests. Which is the highest relative score (meaning which has
the largest z score value)?
a. A score of 144 on a test with a mean of 128 and a standard deviation of 34.
b. A score of 90 on a test with a mean of 86 and a standard deviation of 18.
c. A score of 18 on a test with a mean of 15 and a standard deviation of 5.
5. Sketch a normal curve with the given mean and standard deviation. Calculate and label μ + σ, μ − σ,
μ + 2σ, μ − 2σ, μ + 3σ, and μ − 3σ.
a. Mean of 15, standard deviation of 2.
b. Mean of 7.8, standard deviation of 1.2.
6. By the empirical rule, what percentage of the area under the normal curve lies
a. To the right of μ?
b. Between μ − 2σ and μ + 2σ?
c. To the right of μ + 3σ?
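The empirical rule's 68%, 95%, and 99.7% are rounded areas under the standard normal curve. As a rough check, the sketch below computes the areas within 1, 2, and 3 standard deviations of the mean using Python's standard-library statistics.NormalDist; the dictionary name within is just an illustrative choice.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area between -k and +k standard deviations for k = 1, 2, 3.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} sd: {area:.4f}")
# within 1 sd: 0.6827
# within 2 sd: 0.9545
# within 3 sd: 0.9973
```

Tail areas follow from symmetry: the area outside an interval is split evenly between the two tails.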
7. The mean life of a tire is 30,000 km. The standard deviation is 2000 km.
a. 68% of all tires will have a life between _________km and _______ km.
b. 95% of all tires will have a life between _______km and ______ km.
8. Given an approximately normal distribution with a mean of 159 and a standard deviation of 70.
a. What percent of values are within the interval (89, 299)?
b. What percent of values are within the interval (19, 159)?
c. What interval contains 99.7% of all values?
d. What percent of values are above 229?
e. What percent of values are outside the interval (19, 229)?
9. The heights of male students are normally distributed with a mean of 170 cm and a standard deviation
of 8 cm. Find the percentage of male students whose height is: (Draw and label a bell curve to help)
a. between 162 cm and 170 cm
b. between 170 cm and 186 cm
c. between 178 cm and 186 cm
d. less than 162 cm
e. less than 154 cm
f. greater than 162 cm
10. Suppose that the wrapper of a certain candy bar lists its weight as 2.13 ounces. Naturally, the
weights of individual bars vary somewhat. Suppose that the weights of these candy bars vary
according to a normal distribution with mean μ = 2.2 ounces and standard deviation σ = 0.04 ounces.
a. What proportion of candy bars weigh less than the advertised weight?
b. What proportion of candy bars weigh more than 2.25 ounces?
c. What proportion of candy bars weigh between 2.2 and 2.3 ounces?
11. Sample data from the National Center for Health Statistics reveal that weights of American men
aged 20 – 29 have a mean of about 175 pounds and a standard deviation of about 35 pounds. For
women the mean is about 140 pounds and the standard deviation is about 30 pounds.
a. If these distributions are roughly normal, what percentage of men would you expect to weigh less
than 150 pounds? Less than 200 pounds? Less than 250 pounds?
12. A trucking firm determines that its fleet of trucks averages 12.4 miles per gallon with
a standard deviation of 1.2 miles per gallon on cross-country hauls. What is the probability that one
of the trucks averages fewer than 10 miles per gallon?
(A) 0.0082
(B) 0.0228
(C) 0.4772
(D) 0.5228
(E) 0.9772
13. A factory dumps an average of 2.43 tons of pollutants into a river every week. If the standard
deviation is 0.88 tons, what is the probability that in a week more than 3 tons are dumped?
(A) 0.2578
(B) 0.2843
(C) 0.6500
(D) 0.7157
(E) 0.7422
14. An electronic product takes an average of 3.4 hours to move through an assembly line. If the
standard deviation is 0.5 hour, what is the probability that an item will take between 3 and 4 hours?
(A) 0.2119
(B) 0.2295
(C) 0.3270
(D) 0.3811
(E) 0.6730
15. An educational testing service has designed a new test of mechanical aptitude. Scores on this test
are normally distributed with μ = 400 and σ = 60.
a) What score would you need to be in the top 15%?
b) What score represents the 45th percentile?
c) If 200 students at McCallum took the test, how many would you expect to score below 300?
16. The length of human pregnancies from conception to birth varies according to a distribution that
is approximately normal with mean 266 days and standard deviation 16 days. How long would a
pregnancy last to be in the 85th percentile?
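Percentile questions run the normal model backwards: start from an area, find the z-score that cuts it off, then unstandardize with value = μ + zσ. The sketch below is an illustration only, assuming a hypothetical model N(50, 10) and the 90th percentile (neither is from the questions above); it uses inv_cdf from Python's standard-library statistics.NormalDist.

```python
from statistics import NormalDist

# Hypothetical model and percentile (illustrative assumptions).
model = NormalDist(mu=50, sigma=10)
percentile = 0.90

# Step 1: z-score with 90% of the area to its left.
z_cut = NormalDist().inv_cdf(percentile)

# Step 2: unstandardize, value = mu + z * sigma.
cutoff = model.mean + z_cut * model.stdev

# The same answer in one step, straight from the model.
direct = model.inv_cdf(percentile)

print(round(z_cut, 2), round(cutoff, 1), round(direct, 1))
```

For a "top 15%" style question, the area to the left is 1 minus the top proportion, so the percentile passed to inv_cdf would be 0.85.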
17. An educational testing service has designed a new test of mechanical aptitude. Scores on this test
are normally distributed with μ = 400 and σ = 60.
a) What score would you need to be in the top 15%?
b) What score represents the 45th percentile?
c) If 200 students at McCallum took the test, how many would you expect to score below 300?
18. Cucumbers grown on a certain farm have weights with a standard deviation of 2 ounces. What is
the mean weight if 85% of the cucumbers weigh less than 16 ounces?
(A) 13.92
(B) 14.30
(C) 14.40
(D) 14.88
(E) 15.70
19. A coffee machine can be adjusted to deliver any fixed number of ounces of coffee. If the
machine has a standard deviation in delivery equal to 0.4 ounce, what should be the mean setting so
that an 8-ounce cup will overflow only 0.5% of the time?
(A) 6.97 ounces
(B) 7.22 ounces
(C) 7.34 ounces
(D) 7.80 ounces
(E) 9.03 ounces
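When the standard deviation and a percentile are known but the mean is not, the z-score formula can be rearranged to μ = x − zσ. The sketch below is a minimal illustration with hypothetical numbers (σ = 5, with 90% of values below 40), not the values in the two questions above.

```python
from statistics import NormalDist

# Hypothetical setup (illustrative assumptions).
sigma = 5        # known standard deviation
x = 40           # known cutoff value
p_below = 0.90   # proportion of values below the cutoff

# z-score that has p_below of the area to its left.
z = NormalDist().inv_cdf(p_below)

# Rearranged z-score formula: z = (x - mu) / sigma, so mu = x - z * sigma.
mu = x - z * sigma

# Check: a normal model with this mean should put 90% of values below x.
check = NormalDist(mu, sigma).cdf(x)
print(round(mu, 2), round(check, 2))
```

If the question states the proportion above the cutoff instead, convert it to the proportion below before looking up z.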
20. Some descriptive statistics for a set of data describing the relationship between average expenses
and pet adoption.
A dog owner has a standardized score of z = 1.2. How much money did this owner spend? How about a cat
owner with a standardized score of z = -0.8?
Probability Review
The probability of an event is its long-run relative frequency.
We usually cannot predict a particular individual outcome; however, we can estimate the probability of it
happening in the long run. For example, we know that if you flip a fair coin, the proportion of heads settles
near 0.5 over many flips. However, we cannot predict with certainty whether the coin will land on heads or
tails on any single flip.
A phenomenon consists of trials. Each trial has an outcome. Outcomes combine to make events.
Law of Large Numbers – the long-run relative frequency of repeated independent events settles down to the
true probability as the number of trials increases.
Remember that random phenomena do not compensate for what happened in the past; landing on heads
6 times in a row does not make the next flip more likely to land on tails (the probability is still 0.5).
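A short simulation can make the Law of Large Numbers concrete. The sketch below is only an illustration: it assumes a fair coin, uses Python's standard-library random module, and fixes an arbitrary seed so the run is repeatable. The checkpoint sizes are arbitrary choices.

```python
import random

random.seed(1)  # arbitrary fixed seed so the run is repeatable

checkpoints = (10, 100, 1_000, 10_000, 100_000)
heads = 0
running = {}  # number of flips -> relative frequency of heads so far

for n in range(1, 100_001):
    heads += random.random() < 0.5  # True counts as 1, False as 0
    if n in checkpoints:
        running[n] = heads / n

for n, freq in running.items():
    print(f"{n:>7} flips: relative frequency of heads = {freq:.4f}")
```

Early relative frequencies can wander noticeably, while later ones sit close to 0.5. Each flip is still independent, so a streak never changes the probability of the next outcome.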
Probability Rules
The basics
Suppose P(A) = .35, P(B) = .6, and P(A ∩ B) = .27. Determine
a.__________ P(Aᶜ) (Aᶜ represents the complement of A)
b.__________ P(A  B)
c.__________ P(A  B)
d.__________ are A and B independent events? (yes or no)
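The basic rules used in this kind of problem are the complement rule, the general addition rule, the definition of conditional probability, and the independence check P(A and B) = P(A)·P(B). The sketch below applies them to hypothetical probabilities (0.5, 0.4, and 0.1 are illustrative, not the values given above).

```python
import math

# Hypothetical probabilities (illustrative assumptions).
p_a = 0.5
p_b = 0.4
p_a_and_b = 0.1

# Complement rule: P(not A) = 1 - P(A).
p_not_a = 1 - p_a

# General addition rule: P(A or B) = P(A) + P(B) - P(A and B).
p_a_or_b = p_a + p_b - p_a_and_b

# Conditional probability: P(A | B) = P(A and B) / P(B).
p_a_given_b = p_a_and_b / p_b

# Independence check: A and B are independent only if P(A and B) = P(A) * P(B).
independent = math.isclose(p_a_and_b, p_a * p_b)

print(p_not_a, round(p_a_or_b, 2), round(p_a_given_b, 2), independent)
```

An equivalent independence check is whether P(A | B) equals P(A); here 0.25 differs from 0.5, so these hypothetical events are dependent.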
1: Suppose the probability that a construction company will be awarded a certain contract is .25, the
probability that it will be awarded a second contract is .21, and the probability that it will get both
contracts is .13. What is the probability that the company will win at least one of the two contracts?
2: A researcher interested in eye color versus success in a math program collected the following data
from a random sample of 2000 high school students.
          brown    blue    total
fail        190      10      200
pass       1710      90     1800
total      1900     100     2000
a) What is the probability that a student from this group fails the math program?
b) What is the probability that a student from this group fails the math program given that he/she has blue
eyes?
c) Are blue eyes and failing the math program independent or dependent?
3: Of the 60 obese teenagers in a recent study, 15 had type II diabetes, 20 had high blood pressure, and 10
had both high blood pressure and type II diabetes. Suppose one of these 60 obese teenagers is randomly
selected.
a) Given that the teenager has type II diabetes, what is the probability that he or she also has high blood
pressure?
b) If the obese teenager does NOT have high blood pressure, what is the probability that he or she also
does not have type II diabetes?
4: The probability that Michael misses a free throw shot is .1. If he goes to the line to shoot three free
throws (due to a foul on a three-point shot),
a) What is the probability that Michael misses all three shots? What assumptions did you make in order to
calculate this probability?
b) What is the probability that Michael makes at least one of the three shots?
c) What is the probability that Michael makes the first shot but not the second or third?
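Problems like this rest on the multiplication rule for independent events, and "at least one" is usually easiest through the complement. The sketch below is an illustration with hypothetical numbers (a success probability of 0.8 on each of 4 independent trials), not the free-throw values above.

```python
import math

# Hypothetical setup (illustrative assumptions): 4 independent trials,
# each succeeding with probability 0.8.
p_success = 0.8
p_fail = 1 - p_success
trials = 4

# Multiplication rule for independent events: multiply the probabilities.
p_all_fail = p_fail ** trials

# "At least one success" is the complement of "all trials fail".
p_at_least_one = 1 - p_all_fail

# A specific ordered sequence: success first, then failure on the rest.
p_success_then_fails = p_success * p_fail ** (trials - 1)

print(round(p_all_fail, 4), round(p_at_least_one, 4), round(p_success_then_fails, 4))
```

The multiplication step is valid only if the trials are independent, which is the assumption to state when a question asks what was assumed.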
5: Of the 10,000 freshmen at the University of Texas, 7000 must take English, 6000 must take History,
and 5000 must take both. Suppose that a student is randomly selected.
a) What is the probability that the selected student must take English?
b) What is the probability that the selected student must take both English and History?
c) Suppose you learn that the selected student must take English, what is the probability that this student
must take both English and History?
d) Are the outcomes “must take English” and “must take History” independent? Explain.
e) Answer the question posed in part d if only 4200 of the students must take both English and History.
6: Two office assistants at Penny Lane High School are responsible for getting the daily tardy list to
the appropriate principals by 3:00pm daily. Rudy works on the lists 30% of the days and Fawn
works on the tardy lists 70% of the days. Rudy fails to get the lists to the correct principals in time
10% of the time (which would make him tardy with the tardy lists). Fawn, not much better, manages
to get the tardy lists to the correct principals 92% of the time. Let’s say you are Principal Sac and
the tardy list is late. What is the probability that today Rudy is responsible for the list?
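Reversing a conditional probability like this is usually done with a tree diagram: multiply along each branch, add the branches that lead to the observed outcome, then divide. The sketch below is an illustration with a hypothetical scenario (two suppliers, X and Y, with made-up shares and defect rates), not the numbers in the question above.

```python
import math

# Hypothetical scenario (illustrative assumptions): supplier X provides 60%
# of parts and supplier Y provides 40%; their defect rates are 5% and 10%.
p_x, p_y = 0.60, 0.40
p_defect_given_x = 0.05
p_defect_given_y = 0.10

# Multiply along each branch of the tree.
p_x_and_defect = p_x * p_defect_given_x
p_y_and_defect = p_y * p_defect_given_y

# Add the branches that end in a defect.
p_defect = p_x_and_defect + p_y_and_defect

# Reverse the condition: P(X | defect) = P(X and defect) / P(defect).
p_x_given_defect = p_x_and_defect / p_defect

print(round(p_defect, 2), round(p_x_given_defect, 4))
```

Note that when a rate is stated as a success percentage, it has to be converted to the failure rate (1 minus the success rate) before it goes on the tree.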
Questions 7 and 8 refer to the following situation. The classes of 1968 and 1998 held a joint reunion in 2008
at the local high school. Attendees were asked to complete a survey to determine what they did after
graduation. Here is the information obtained.
7. What is the probability that a randomly selected attendee graduated in 1998 and went into the
military?
(A) 0.072
(B) 0.127
(C) 0.303
(D) 0.596
(E) 0.669
8. What is the probability that a randomly selected 1968 graduate went to college after graduation?
(A) 0.245
(B) 0.253
(C) 0.560
(D) 0.592
(E) 0.755