AP Stats Test Review
•What are the four parts of the course?
Inference, Experimental Design, Probability, and Data
Analysis
•How many multiple choice and free response?
40 and 5
•Tell me about #6. What is its style?
Investigative Task. They will combine topics and ask you
to do something new. DO NOT LEAVE IT BLANK.
•How do you “describe a distribution”?
CUSS: Center, Unusual features, Shape, Spread
•When do you use a bar chart as opposed to a histogram?
A bar chart is for categorical data; a histogram is for
quantitative data.
(ex. Categorical would be favorite soda brands; quantitative
would be test scores.)
•What does r mean? What is its name?
It measures the strength and direction of the linear association
between two quantitative variables. It is called the
correlation coefficient.
(ex. There is a strong positive linear association between the #
of hot dogs eaten and # of sodas purchased.)
•What does r² mean? What is its name?
It tells how well the linear model makes predictions. It is called
the coefficient of determination.
(ex. 78.3% of the variation in the # of sodas purchased can be
explained by the approximate linear relationship with hot dogs
eaten.)
•What does the slope mean in context of the problem?
It is the letter b in ŷ = a + bx: the predicted change in y for
each one-unit increase in x.
(ex. For every additional hot dog eaten, we can expect an average
increase of 0.78 in sodas purchased.)
•What is the formula that involves slope, correlation, and
standard deviation?
b = r(s_y / s_x), where s_x and s_y are the standard deviations
of x and y.
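As a rough numerical check of the formula linking slope, correlation, and standard deviation, b = r(s_y / s_x), here is a minimal Python sketch. The hot dog and soda counts are made-up illustration data, not numbers from the course, and the helper name `correlation` is just a label chosen here.

```python
import statistics

# Hypothetical data: hot dogs eaten (x) and sodas purchased (y).
hot_dogs = [1, 2, 3, 4, 5]
sodas = [1, 3, 2, 5, 4]

def correlation(x, y):
    """Sample correlation r: sum of z-score products divided by n - 1."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    s_x, s_y = statistics.stdev(x), statistics.stdev(y)
    products = [((a - mean_x) / s_x) * ((b - mean_y) / s_y) for a, b in zip(x, y)]
    return sum(products) / (len(x) - 1)

r = correlation(hot_dogs, sodas)
# Slope from correlation and the two standard deviations: b = r * s_y / s_x
slope = r * statistics.stdev(sodas) / statistics.stdev(hot_dogs)
# Intercept: the line passes through (x-bar, y-bar)
intercept = statistics.mean(sodas) - slope * statistics.mean(hot_dogs)
r_squared = r ** 2

print(f"r = {r:.2f}, b = {slope:.2f}, a = {intercept:.2f}, r^2 = {r_squared:.2f}")
```

With this data set the two standard deviations happen to be equal, so the slope equals r. In general the ratio s_y / s_x rescales r into the units of the problem.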
•What do resistant and non-resistant mean? Name some measures
that are resistant and some that are non-resistant.
A resistant measure is not strongly affected by outliers; a
non-resistant measure is. Median and IQR are resistant.
Mean and StDev are non-resistant.
•Cumulative frequency and relative frequency: what do you
always convert these to?
A boxplot. Think of the percentiles of a boxplot.
•What does a “good” residual plot look like?
Randomly scattered, with no curved pattern.
•Name ways to plot univariate data.
Boxplot, dotplot, stemplot, histogram
•Name ways to plot bivariate data.
Scatterplot
•What is the meaning of least squares?
Minimizing the sum of the squared vertical distances (residuals)
between the observed points and the regression line.
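To make "least squares" concrete, the sketch below computes the sum of squared residuals for the least-squares line and for a few nudged lines. The data points and the sizes of the nudges are made up for illustration.

```python
# Hypothetical data points, made up for illustration.
xs = [1, 2, 3, 4, 5]
ys = [1, 3, 2, 5, 4]

def sum_squared_residuals(intercept, slope):
    """Sum of squared vertical distances from each point to the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

def least_squares_line():
    """Intercept and slope that minimize the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

best_intercept, best_slope = least_squares_line()
best_sse = sum_squared_residuals(best_intercept, best_slope)

# Nudging the line in any direction should only make the total larger.
nudges = [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.2), (0.0, -0.2)]
for d_intercept, d_slope in nudges:
    other = sum_squared_residuals(best_intercept + d_intercept, best_slope + d_slope)
    print(f"nudged line: {other:.2f}   least-squares line: {best_sse:.2f}")
```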
•What are 3 ways to check for normality?
Boxplot, stemplot, histogram, empirical rule, normal
probability plot, compare mean and median
•What is the difference between influential points and
outliers?
Outliers are unusual in the y-direction (large residuals).
Influential points are usually extreme in the x-direction, and
removing one noticeably changes the regression line.
•What is the empirical rule?
68-95-99.7: in an approximately normal distribution, the percents
of data within 1, 2, and 3 StDevs of the mean.
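The 68-95-99.7 figures can be recovered from the standard normal curve. This is a small sketch using Python's standard-library `statistics.NormalDist`; the function name `share_within` is a label chosen here.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k StDevs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} StDev: {share_within(k):.1%}")
```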
•What is the meaning of standard deviation?
Roughly the typical (average) distance of the observations from
the mean.
•What is the difference between blocking and stratifying?
Blocking is the word when doing an experiment and stratifying
is the word used in surveys.
•What is the purpose of blocking and stratifying?
Placing similar subjects in the same group reduces variability
within groups, so you can see whether different groups have
different effects or different opinions.
•What is the purpose of a control group?
To give a baseline for comparison, so you can see how much of an
effect the treatment is having.
•What are the three or four main elements of an experiment?
Randomization, Replication, and Control
•What is the difference between an observational study and
an experiment?
A treatment is imposed in an experiment. An observational study
only observes and records outcomes without imposing a treatment.
•Can you name the three major types of experimental design?
Block design, matched pairs, completely randomized
•When do you use matched pairs?
When your subjects can be used as their own control, as in a
before-and-after experiment.
•When do you use a block design?
When you have different groups of similar subjects.
•When do you use a completely randomized design?
When all your subjects are similar, so there is no reason to block.
•What does double blind mean? When do you employ such
a technique?
Neither the subjects nor the experimenter knows which treatment
is being given. Use it when the experimenter could possibly bias
the responses.
•What are the explanatory variable and response variable?
The explanatory variable is x and the response variable is y.
•Why randomize?
To minimize bias in selecting subjects.
•How do you calculate the number of treatments?
Draw a flow map and count the last column, or multiply:
blocks × variables × treatments.
•What is extrapolation?
Using the model to make predictions beyond the domain (x-values)
of the data. The model cannot be trusted there.
•What are the two calculations for outliers?
Q1 - 1.5 × IQR and Q3 + 1.5 × IQR. Anything below the first or
above the second is an outlier. You can also do a boxplot.
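Here is a short sketch of the two outlier calculations. It assumes the quartile convention of taking medians of the lower and upper halves (leaving out the middle value when n is odd); other software may use a slightly different quartile rule. The data values are made up.

```python
from statistics import median

def quartiles(data):
    """Q1 and Q3 as medians of the lower and upper halves of the sorted data."""
    ordered = sorted(data)
    half = len(ordered) // 2          # middle value is left out when n is odd
    return median(ordered[:half]), median(ordered[-half:])

def outlier_fences(data):
    """Lower fence Q1 - 1.5*IQR and upper fence Q3 + 1.5*IQR."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Hypothetical data set, made up for illustration.
values = [2, 4, 5, 6, 7, 8, 9, 10, 30]
low_fence, high_fence = outlier_fences(values)
outliers = [v for v in values if v < low_fence or v > high_fence]
print(low_fence, high_fence, outliers)
```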
•What is the meaning of a p-value?
The probability of getting a result at least as extreme as the
one observed, if Ho is true.
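As an illustration of what a p-value measures, this sketch runs a one-sided, one-sample z-test. Every number in it (the hypothesized mean, the known population StDev, the sample size, the sample mean) is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical one-sample z-test: Ho: mu = 100 vs Ha: mu > 100.
mu_0 = 100         # value claimed by Ho
sigma = 15         # population StDev, assumed known
n = 36             # sample size
sample_mean = 105  # observed sample mean

# Test statistic: how many standard errors the sample mean sits above mu_0.
z = (sample_mean - mu_0) / (sigma / sqrt(n))
# P(Z >= z) if Ho is true: the chance of a result at least this extreme.
p_value = 1 - NormalDist().cdf(z)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

For a two-sided alternative, the tail area would be doubled.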
•What is the meaning of a confidence interval in context?
In repeated samples of this size we can expect 95% of our
intervals to contain the true value.
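The "repeated samples" reading of a confidence interval can be tried out by simulation. This is only a sketch: the population mean, StDev, sample size, seed, and number of trials are all made up, and it uses a z-interval with sigma treated as known.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(1)  # fixed seed so the run is repeatable

TRUE_MEAN, SIGMA, N = 50, 10, 25        # hypothetical population and sample size
z_star = NormalDist().inv_cdf(0.975)    # critical value for 95% confidence
margin = z_star * SIGMA / sqrt(N)

def interval_captures_truth():
    """Draw one sample, build a 95% z-interval, check whether it contains TRUE_MEAN."""
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    x_bar = mean(sample)
    return x_bar - margin <= TRUE_MEAN <= x_bar + margin

trials = 2000
coverage = sum(interval_captures_truth() for _ in range(trials)) / trials
print(f"{coverage:.1%} of the intervals contained the true mean")
```

The printed share should land close to 95%, though it will wobble a little from seed to seed.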
•Name the 7-9 major tests we run?
Z-test, t-test, 1-prop z test,……
•Name the confidence intervals we run?
Z-interval, t-interval,….
•What is the difference between a Z and a T?
Whether or not the population stDev is known.
•Name the symbols that we use in these tests for the null
hypothesis and the alternative.
Ho, Ha
•Name the test statistic symbols
Z, t, χ²
•Name the conditions for all 7 tests
You do it.
•Describe the central limit theorem.
As the sample size gets larger, the sampling distribution of the
sample mean approaches the normal distribution, whatever the
shape of the population. For sample sizes larger than 30 we can
consider it approx. normal due to the CLT.
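A simulation sketch of the central limit theorem: sample means are drawn from a strongly right-skewed (exponential) population and then summarized. The population, the sample size of 40, the seed, and the number of repetitions are assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(2)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n draws from a right-skewed exponential population (mean 1, StDev 1)."""
    return mean([random.expovariate(1.0) for _ in range(n)])

n = 40
means = [sample_mean(n) for _ in range(5000)]

center = mean(means)
spread = stdev(means)
# For a roughly normal shape, about 68% of sample means fall within one StDev.
share_within_one = sum(abs(m - center) <= spread for m in means) / len(means)

print(f"mean of sample means  = {center:.3f} (population mean is 1)")
print(f"StDev of sample means = {spread:.3f} (theory: 1/sqrt({n}) = {1 / sqrt(n):.3f})")
print(f"share within one StDev = {share_within_one:.1%}")
```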
•How do you calculate the number of samples needed for a
mean or proportion?
Use the appropriate margin of error formula and solve for n.
Always round up.
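Solving the margin of error formula for n can be sketched as follows. The function names are labels chosen here, the example margins and sigma are made up, and p = 0.5 is used as the conservative guess when no estimate of the proportion is available.

```python
from math import ceil
from statistics import NormalDist

def critical_z(confidence):
    """Critical value z* for a two-sided interval at the given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def sample_size_for_proportion(margin, confidence=0.95, p_guess=0.5):
    """Solve margin = z* * sqrt(p(1-p)/n) for n, rounding up."""
    z = critical_z(confidence)
    return ceil((z / margin) ** 2 * p_guess * (1 - p_guess))

def sample_size_for_mean(margin, sigma, confidence=0.95):
    """Solve margin = z* * sigma / sqrt(n) for n, rounding up."""
    z = critical_z(confidence)
    return ceil((z * sigma / margin) ** 2)

print(sample_size_for_proportion(margin=0.03))    # 3-point margin at 95% confidence
print(sample_size_for_mean(margin=2, sigma=15))   # margin of 2 with sigma assumed to be 15
```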
•If you want to cut the standard deviation of the sampling
distribution (the standard error) in half, how large a sample
should you have?
Multiply your sample size by 4.
•What is a type I error and what are the consequences?
It is rejecting the null hypothesis when it’s true. The
probability of a type I error is the alpha level. You have to
read the problem to determine the consequences.
•What is a type II error and what are the consequences?
It is failing to reject the null when it’s false. Its probability
is β. You have to read the problem to determine the consequences.
•What is power?
It is the probability of successfully rejecting the null when it’s
false. Power = 1 - β
•What is the relationship between alpha, beta, and n?
Alpha is the probability of making a type I error. The
probability of a type II error is β. Power = 1 – β. Increasing the
alpha level and using a larger sample will increase the power
of a test.
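The effect of alpha and sample size on power can be checked with a small calculation. This sketch assumes a one-sided z-test with known sigma; the means, sigma, and sample sizes are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()

def power_one_sided_z(mu_0, mu_true, sigma, n, alpha):
    """Power of the test Ho: mu = mu_0 vs Ha: mu > mu_0 when the true mean is mu_true."""
    z_cutoff = Z.inv_cdf(1 - alpha)                # reject Ho when z exceeds this
    shift = (mu_true - mu_0) / (sigma / sqrt(n))   # distance from Ho in standard errors
    return 1 - Z.cdf(z_cutoff - shift)             # power = 1 - beta

base = power_one_sided_z(100, 105, 15, n=36, alpha=0.05)
bigger_sample = power_one_sided_z(100, 105, 15, n=72, alpha=0.05)
bigger_alpha = power_one_sided_z(100, 105, 15, n=36, alpha=0.10)

print(f"base power          = {base:.3f}")
print(f"with double the n   = {bigger_sample:.3f}")
print(f"with alpha = 0.10   = {bigger_alpha:.3f}")
```

Both changes raise the power, but the larger alpha pays for it with a higher chance of a type I error.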
•When do you pool?
When you can assume both populations have the same standard
deviation.
•What are the reference numbers for all of the different
confidence intervals?
1.645 = 90%
1.960 = 95%
2.576 = 99%
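These reference numbers are the z* critical values. They can be reproduced with the standard library's `NormalDist.inv_cdf`; the helper name `z_star` is a label chosen here.

```python
from statistics import NormalDist

def z_star(confidence):
    """Critical value z* that leaves (1 - confidence) / 2 in each tail."""
    return NormalDist().inv_cdf((1 + confidence) / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {z_star(level):.3f}")
```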
•What do bias and variability mean?
Bias has to do with center (mean/median): the estimates are
consistently off target. Variability is how spread out the data
are (StDev).
•What is the parameter of interest?
It is the true mean, true proportion, true slope of the
population. It is what we are trying to estimate, the reason
we take samples.
•Name 2 ways to shrink a confidence interval?
Increase sample size or lower your confidence level.
•What does independent mean?
One event has no effect on another event.
P(A and B) = P(A)P(B)
•What does mutually exclusive mean?
Two events cannot both happen
P(A and B) = 0
•What does expected value mean?
It is the mean.
•Which of the above has to do with and (multiply) problems?
Independent
•Which of the above has to do with or (addition) problems?
Mutually Exclusive
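To see the difference between independent and mutually exclusive events, this sketch lists every outcome of rolling two fair dice and checks both rules with exact fractions. The dice example and the event names are illustrations chosen here.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a true/false test on one outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def first_is_six(o):
    return o[0] == 6

def second_is_six(o):
    return o[1] == 6

def sum_is_two(o):
    return sum(o) == 2

def sum_is_twelve(o):
    return sum(o) == 12

# Independent ("and" problems): P(A and B) equals P(A) * P(B).
p_both_six = prob(lambda o: first_is_six(o) and second_is_six(o))
independent = p_both_six == prob(first_is_six) * prob(second_is_six)

# Mutually exclusive ("or" problems): P(A and B) = 0, so P(A or B) = P(A) + P(B).
p_two_and_twelve = prob(lambda o: sum_is_two(o) and sum_is_twelve(o))
p_two_or_twelve = prob(lambda o: sum_is_two(o) or sum_is_twelve(o))

print(independent, p_two_and_twelve, p_two_or_twelve)
```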
•How do you find the mean of a discrete random variable?
E(x) = ΣxiP(xi)
•How do you find the standard deviation of a discrete
random variable?
sqrt( Σ(xi – E(x))² P(xi) )
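Both formulas for a discrete random variable can be applied directly to a probability table. The distribution below is made up for illustration.

```python
from math import sqrt

# Hypothetical distribution: value of X mapped to its probability.
distribution = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}

def expected_value(dist):
    """E(x): sum of each value times its probability."""
    return sum(x * p for x, p in dist.items())

def standard_deviation(dist):
    """Square root of the probability-weighted squared distances from E(x)."""
    mu = expected_value(dist)
    return sqrt(sum((x - mu) ** 2 * p for x, p in dist.items()))

print(f"E(x) = {expected_value(distribution):.2f}")
print(f"StDev = {standard_deviation(distribution):.2f}")
```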
•What is a discrete random variable?
Something that can be counted.
(ex. The number of eggs in a basket)
•What is a continuous random variable? Give me an example.
An interval of numbers.
(ex. The range of temperatures for a city in the month of June)
•What is conditional probability?
The probability of an event happening given another event
has happened. P(A|B) = P(A and B)/P(B)
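The conditional probability formula is easiest to see on a two-way table. The counts below are invented for illustration, with A = "junior" and B = "drinks soda".

```python
from fractions import Fraction

# Hypothetical two-way table of counts: grade level by soda habit.
counts = {
    ("junior", "soda"): 30, ("junior", "no soda"): 20,
    ("senior", "soda"): 10, ("senior", "no soda"): 40,
}
total = sum(counts.values())

def prob(condition):
    """Share of all students whose (grade, habit) cell satisfies the condition."""
    return Fraction(sum(c for key, c in counts.items() if condition(key)), total)

p_soda = prob(lambda k: k[1] == "soda")                       # P(B)
p_junior_and_soda = prob(lambda k: k == ("junior", "soda"))   # P(A and B)
p_junior_given_soda = p_junior_and_soda / p_soda              # P(A|B)

print(p_junior_given_soda)
```

The same answer comes from reading the table directly: 30 juniors out of the 40 soda drinkers.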
•What is the mean of a binomial distribution?
np
•What is the standard deviation of a binomial distribution?
sqrt(np(1-p))
•What are the conditions for a binomial?
P.O.T.I. Copy them off the wall.
•What is the mean of a geometric distribution?
1/p
•What are the conditions for a geometric distribution?
Same as binomial but trials are not fixed and you go until first
success
•What is the standard deviation of a geometric distribution?
sqrt((1-p)/p²), which is the same as sqrt(1-p)/p
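The binomial and geometric shortcut formulas can be checked against a direct calculation from the probability table. This is a sketch: n = 10, p = 0.3, and a geometric success probability of 0.25 are arbitrary choices, and the geometric table is cut off at 500 trials, where the leftover probability is negligible.

```python
from math import comb, sqrt

def binomial_mean_sd(n, p):
    """Shortcut formulas: mean np and StDev sqrt(np(1-p))."""
    return n * p, sqrt(n * p * (1 - p))

def geometric_mean_sd(p):
    """Shortcut formulas: mean 1/p and StDev sqrt(1-p)/p."""
    return 1 / p, sqrt(1 - p) / p

def mean_sd_from_table(pairs):
    """Mean and StDev computed directly from (value, probability) pairs."""
    mu = sum(x * pr for x, pr in pairs)
    return mu, sqrt(sum((x - mu) ** 2 * pr for x, pr in pairs))

n, p = 10, 0.3
binomial_table = [(k, comb(n, k) * p**k * (1 - p) ** (n - k)) for k in range(n + 1)]

q = 0.25  # success probability for the geometric example
# P(first success on trial k) = (1-q)^(k-1) * q, truncated at k = 500.
geometric_table = [(k, (1 - q) ** (k - 1) * q) for k in range(1, 501)]

print(binomial_mean_sd(n, p), mean_sd_from_table(binomial_table))
print(geometric_mean_sd(q), mean_sd_from_table(geometric_table))
```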
•What is the formula for combining standard deviations?
For independent random variables, add their variances and then
take the square root.
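A quick sketch of combining standard deviations, assuming the two random variables are independent. The StDevs 3 and 4 are made-up numbers.

```python
from math import sqrt

def combined_sd(sd_x, sd_y):
    """StDev of X + Y (or X - Y) for independent X and Y: variances add, StDevs do not."""
    return sqrt(sd_x ** 2 + sd_y ** 2)

print(combined_sd(3, 4))  # variances 9 + 16 = 25, so the StDev is 5, not 3 + 4 = 7
```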
•What is a standard score?
Z-score: z = (value - mean)/StDev
•For a proportion problem, when is the standard deviation at
its largest?
When p = .50
•How do you find the median of a discrete random variable?
It is the middle value: the smallest value where the cumulative
probability reaches 0.5.
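One way to find the median of a discrete random variable is to add up probabilities until the running total reaches 0.5. The sketch below assumes that convention (some textbooks average two values when the total lands exactly on 0.5), and the distribution is made up.

```python
def discrete_median(dist):
    """Smallest value whose cumulative probability reaches 0.5."""
    cumulative = 0.0
    for x in sorted(dist):
        cumulative += dist[x]
        if cumulative >= 0.5:
            return x
    raise ValueError("probabilities should add up to 1")

# Hypothetical distribution: number of eggs mapped to its probability.
eggs = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}
print(discrete_median(eggs))  # running totals are 0.1, 0.3, 0.6, so the median is 2
```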
•What is replacement and non-replacement?
When sampling you place the subject/unit back in the
sampling pool or do not place the subject/unit back.
When sampling without replacement, the sample should not be
larger than 10% of the population it comes from if you want to
treat the observations as independent.
•Complement. What is it?
It is 1 – the probability of an event.
•How do you calculate payout?
It is like finding the expected value. Multiply each amount of
money you can win by the probability of winning that amount,
then add the results.
•What is the law of large numbers?
In the long run, the observed proportion of times an event
happens moves closer to its true probability (and a sample mean
moves closer to its expected value).
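The law of large numbers can be watched in a coin-flip simulation. This is a sketch with a fixed seed; the fair coin and the number of flips are assumptions chosen for illustration.

```python
import random

random.seed(3)  # fixed seed so the run is repeatable

def running_proportion_of_heads(flips):
    """Proportion of heads after each flip of a simulated fair coin."""
    heads = 0
    proportions = []
    for i in range(1, flips + 1):
        heads += random.random() < 0.5   # True counts as 1
        proportions.append(heads / i)
    return proportions

props = running_proportion_of_heads(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"after {n:>7,} flips: proportion of heads = {props[n - 1]:.4f}")
```

Early proportions can be far from 0.5, while the later ones settle close to it.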
•What are the degrees of freedom for each test we run?
n -1 for most, (r-1)(c-1) for chi-squared tests, n – 2 for
inference for regression