AP Stats Test Review
•What are the four parts of the course? Inference, Experimental Design, Probability, and Data Analysis
•How many multiple choice and free response? 40 multiple choice and 5 free response, plus the Investigative Task (#6)
•Tell me about #6. What is its style? Investigative Task: they will combine topics and ask you to do something new. DO NOT LEAVE IT BLANK.
•How do you “describe a distribution”? CUSS (Center, Unusual features, Shape, Spread)
•When do you use a bar chart as opposed to a histogram? Categorical vs. quantitative data (ex. Categorical would be favorite soda brands, quantitative would be test scores)
•What does r mean? What is its name? It measures the strength and direction of the linear association between two quantitative variables. It is called the correlation coefficient. (ex. There is a strong positive linear association between the # of hot dogs eaten and # of sodas purchased.)
•What does r² mean? What is its name? It tells how well the linear model makes predictions. It is called the coefficient of determination. (ex. 78.3% of the variation in the # of sodas purchased can be explained by the approximate linear relationship with hot dogs eaten.)
•What does the slope mean in context of the problem? It is the letter b, the predicted change in y for each 1-unit increase in x. (ex. For every additional hot dog eaten, we can expect the # of sodas purchased to increase by 0.78 on average.)
•What is the formula that involves slope, correlation, and standard deviation? b = r(sy / sx), where sx and sy are the standard deviations of x and y
•What do resistant and non-resistant mean? Name things that are non-resistant. Whether or not a statistic is affected by outliers. Median and IQR are resistant. Mean and StDev are non-resistant.
•Cumulative frequency and relative frequency, what do you always convert this to? A boxplot; think of the percentiles of a boxplot.
•What does a “good” residual plot look like? Randomly scattered, with no curved pattern
•Name ways to plot univariate data. Boxplot, dotplot, stemplot, histogram
•Name ways to plot bivariate data. Scatterplot
•What is the meaning of least squares? The regression line is the one that minimizes the sum of the squared residuals (the squared vertical distances from the observed points to the line).
•What are 3 ways to check for normality? Boxplot, stemplot, histogram, empirical rule, normal probability plot, compare mean and median
•What is the difference between influential points and outliers? Outliers are unusual in the y-direction (large residuals). Influential points are extreme in the x-direction and noticeably change the line when removed.
•What is the empirical rule? 68-95-99.7, the percents of data within 1, 2, and 3 StDevs of the mean in an approximately normal distribution.
•What is the meaning of standard deviation? Roughly the typical (average) distance of the observations from the mean.
•What is the difference between blocking and stratifying? Blocking is the word used when doing an experiment and stratifying is the word used in surveys.
•What is the purpose of blocking and stratifying? Placing people in similar groups to see if different groups have different effects or different opinions, which also reduces variability due to the grouping variable.
•What is the purpose of a control group? To provide a baseline for comparison, so you can see how much of an effect the treatment is having.
•What are the three or four main elements of an experiment? Randomization, Replication, and Control
•What is the difference between an observational study and an experiment? A treatment is imposed in an experiment. In an observational study, no treatment is imposed; the researcher only observes and records.
•Can you name the three major types of experimental design? Block design, matched pair, completely randomized
•When do you use matched pair? When your subjects can be used as their own control, such as a before and after experiment.
•When do you use a block design? When you have different groups of similar subjects.
•When do you use a completely randomized design? When your subjects are similar enough that no blocking is needed.
•What does double blind mean? When do you employ such a technique? Neither the subjects nor the experimenter know which treatment is being given. Use it when the experimenter could possibly bias the responses.
•What are the explanatory variable and response variable? X and Y
•Why randomize? To minimize bias in selecting subjects and assigning them to treatments.
•How do you calculate the number of treatments? Flow map and count the last column, or blocks x variables x treatments
•What is extrapolation? When you go beyond the domain (x) of your data to make predictions; your model cannot be trusted there.
•What are the two calculations for outliers? Anything below Q1 – 1.5 x IQR or above Q3 + 1.5 x IQR; you can also do a boxplot.
•What is the meaning of a p-value? The probability of getting a result at least as extreme as the one observed, if H0 is true.
•What is the meaning of a confidence interval in context? For a 95% confidence level: in repeated samples of this size, we can expect about 95% of our intervals to contain the true value.
•Name the 7-9 major tests we run. Z-test, t-test, 1-prop z test,……
•Name the confidence intervals we run. Z-interval, t-interval,….
•What is the difference between a Z and a T? Whether or not the population StDev is known.
•Name the symbols that we use in these tests for the null hypothesis and the alternative. H0, Ha
•Name the test statistic symbols. z, t, χ²
•Name the conditions for all 7 tests. You do it.
•Describe the central limit theorem. As the sample size gets larger, the sampling distribution of the sample mean approaches the normal distribution. For sample sizes larger than 30, we can consider it approx. normal due to the CLT.
•How do you calculate the number of samples needed for a mean or proportion? Use the appropriate margin of error formula and solve for n.
•If you want to cut the standard deviation of the sampling distribution (or the margin of error) in half, how large a sample should you have? Multiply your sample size by 4.
•What is a type I error and what are the consequences? It is rejecting the null hypothesis when it’s true. Its probability is the alpha level. You have to read the problem to determine the consequences.
•What is a type II error and what are the consequences? It is failing to reject the null when it’s false. Its probability is β. You have to read the problem to determine the consequences.
•What is power? It is the probability of correctly rejecting the null when it’s false. Power = 1 – β
•What is the relationship between alpha, beta, and n? Alpha is the probability of making a type I error. The probability of a type II error is β. Power = 1 – β. Increasing the alpha level or using a larger sample will each lower β and increase the power of a test.
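The outlier fences and the sample-size questions above can be sketched in a few lines of Python. This is only an illustration, not part of the review sheet: the function names are made up, the quartiles use Python's "inclusive" method (a calculator or textbook may use a slightly different quartile rule and give slightly different fences), and the default z* of 1.960 assumes 95% confidence.

```python
import math
import statistics

def outlier_fences(data):
    """Return (lower, upper) fences from the 1.5 x IQR rule.

    Assumption: quartiles come from statistics.quantiles with
    method="inclusive"; other quartile conventions may differ slightly.
    """
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def sample_size_for_proportion(margin, z_star=1.960, p_guess=0.5):
    """Smallest n with z* * sqrt(p(1-p)/n) <= margin.

    p_guess = 0.5 is the conservative choice (largest standard deviation).
    """
    return math.ceil(z_star ** 2 * p_guess * (1 - p_guess) / margin ** 2)

def sample_size_for_mean(margin, sigma, z_star=1.960):
    """Smallest n with z* * sigma / sqrt(n) <= margin (sigma assumed known)."""
    return math.ceil((z_star * sigma / margin) ** 2)

if __name__ == "__main__":
    print(outlier_fences([1, 2, 3, 4, 5, 6, 7, 8, 9]))
    print(sample_size_for_proportion(0.03))
    # Halving the margin of error takes about 4 times the sample size.
    print(sample_size_for_proportion(0.015))
    print(sample_size_for_mean(margin=2, sigma=10))
```

Always round the required sample size up, since rounding down would leave the margin of error slightly larger than requested.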
•When do you pool? When both populations can be assumed to have the same standard deviation.
•What are the reference numbers (critical z values) for all of the different confidence intervals? 1.645 = 90%, 1.960 = 95%, 2.576 = 99%
•What do bias and variability mean? Bias has to do with center (mean/med) and variability is how spread out the data is (StDev).
•What is the parameter of interest? It is the true mean, true proportion, or true slope of the population. It is what we are trying to estimate, the reason we take samples.
•Name 2 ways to shrink a confidence interval. Increase sample size or lower your confidence level.
•What does independent mean? One event has no effect on another event. P(A and B) = P(A)P(B)
•What does mutually exclusive mean? Two events cannot both happen. P(A and B) = 0
•What does expected value mean? It is the mean.
•Which of the above has to do with and (multiply) problems? Independent
•Which of the above has to do with or (addition) problems? Mutually Exclusive
•How do you find the mean of a discrete random variable? E(x) = Σ xi P(xi)
•How do you find the standard deviation of a discrete random variable? sqrt( Σ (xi – E(x))² P(xi) )
•What is a discrete random variable? A variable whose possible values can be counted. (ex. The number of eggs in a basket)
•What is a continuous random variable? Give me an example. A variable that can take any value in an interval of numbers. (ex. The temperature in a city in the month of June)
•What is conditional probability? The probability of an event happening given another event has happened. P(A|B) = P(A and B)/P(B)
•What is the mean of a binomial distribution? np
•What is the standard deviation of a binomial distribution? sqrt(np(1 – p))
•What are the conditions for a binomial? P.O.T.I. Copy them off the wall.
•What is the mean of a geometric distribution? 1/p
•What are the conditions for a geometric distribution? Same as binomial, but the number of trials is not fixed and you go until the first success.
•What is the standard deviation of a geometric distribution? sqrt((1 – p)/p²), which is the same as sqrt(1 – p)/p
•What is the formula for combining standard deviations? For independent variables, add their variances and then take the square root.
•What is a standard score? Z-score
•For a proportion problem, when is the standard deviation at its largest? When p = .50
•How do you find the median of a discrete random variable? It is the value in the middle of the distribution: the value at which the cumulative probability first reaches .50.
•What is replacement and non-replacement? When sampling, you either place the subject/unit back in the sampling pool or do not place the subject/unit back. When sampling without replacement, the sample should be no larger than 10% of the population it comes from so the trials can be treated as independent.
•Complement. What is it? It is 1 – the probability of an event.
•How do you calculate payout? It is like finding the expected value. Multiply each amount of money you can win by the probability of winning that amount, then add the results.
•What is the law of large numbers? In the long run, the observed proportion of times an event happens (or the sample mean) will move closer to its true probability (or expected value).
•What are the degrees of freedom for each test we run? n – 1 for most, (r – 1)(c – 1) for chi-squared tests on two-way tables (number of categories – 1 for goodness of fit), n – 2 for inference for regression
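The random-variable formulas above (mean and standard deviation of a discrete random variable, binomial, geometric, and combining standard deviations) can be checked numerically with a short Python sketch. The function names are illustrative only, and the combining rule assumes the two variables are independent.

```python
import math

def discrete_mean(values, probs):
    """E(x) = sum of xi * P(xi)."""
    return sum(x * p for x, p in zip(values, probs))

def discrete_sd(values, probs):
    """sqrt( sum of (xi - E(x))^2 * P(xi) )."""
    mu = discrete_mean(values, probs)
    return math.sqrt(sum((x - mu) ** 2 * p for x, p in zip(values, probs)))

def binomial_mean_sd(n, p):
    """Mean np and standard deviation sqrt(np(1 - p))."""
    return n * p, math.sqrt(n * p * (1 - p))

def geometric_mean_sd(p):
    """Mean 1/p and standard deviation sqrt(1 - p)/p."""
    return 1 / p, math.sqrt(1 - p) / p

def combined_sd(sd_x, sd_y):
    """SD of X + Y (or X - Y), assuming X and Y are independent."""
    return math.sqrt(sd_x ** 2 + sd_y ** 2)

if __name__ == "__main__":
    # Number of heads in 2 fair coin flips: values 0, 1, 2.
    values, probs = [0, 1, 2], [0.25, 0.5, 0.25]
    print(discrete_mean(values, probs), discrete_sd(values, probs))
    # The same distribution is binomial with n = 2, p = 0.5.
    print(binomial_mean_sd(2, 0.5))
    print(geometric_mean_sd(0.25))
    print(combined_sd(3, 4))
```

The coin-flip example is a handy self-check: the general discrete formulas and the binomial shortcuts should give the same mean and standard deviation.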