OBJECTIVES: STAT 200

1. Describing Data

1A: Summarizing with Graphs:
  1. Given a variable, determine whether it is categorical or quantitative.
  2. Given a variable, determine whether it is categorical (binary, ordinal, nominal) or quantitative (continuous, discrete).
  3. Given a variable, determine which choice of graph (pie, bar, histogram, stemplot, boxplot) would be most (least) appropriate.
  4. Given a set of raw data, use software to create the appropriate graphs.
  5. Given a histogram or boxplot, identify the distribution shape: left (negative) skew, right (positive) skew, or symmetric.
  6. Given a graphical summary, propose an explanation of the distribution of the data.
  7. Given some basic statistics of a variable (mean, median, standard deviation, quartiles), predict what distributional shape that variable would most likely take.

1B: Summarizing with Numbers:
  1. Given a set of raw data, calculate the principal summary statistics (mean, median, quartiles, interquartile range, variance, standard deviation) by using appropriate software.
  2. Given a set of raw data, calculate the principal summary statistics (mean, median, quartiles, and interquartile range) by hand.
  3. Explain how the mean and median are related for different shapes of a distribution (skewed left, skewed right, or symmetric).
  4. Understand the various measures for the center of data (mode, mean, and median).
  5. Understand the various measures for the spread of data (range, interquartile range, variance, and standard deviation).
  6. Given the variance of a set of data, know that its square root is the standard deviation of that set of data.
  7. Describe the effect on the mean and standard deviation of adding (subtracting) a constant to every observation or of multiplying (dividing) every observation by a constant.
  8. Explain the impact of outliers on summary statistics such as mean, median, and standard deviation.
  9. Know what summary measures constitute a five-number summary for a set of data.
  10. Given a five-number summary, be able to interpret the meaning of the quartiles.
  11. Given a variable and a percentile for that variable, provide the correct interpretation of that percentile.
  12. Given the mean and standard deviation for a bell-shaped set of data, correctly apply and interpret the Empirical Rule.

2. Data Collection:
  1. Given a study, identify population, sample, parameter, and statistic.
  2. Given a survey sample, determine whether the sample is a simple random sample, a cluster sample, a stratified sample, a voluntary response sample, or a convenience sample.
  3. Given a study, recognize typical forms of bias such as nonresponse, response, or selection.
  4. Given a study sample, describe the proper population represented by the sample.
  5. Given a study sample and population of interest, determine if the sample is representative of the population of interest.
  6. Given a study, determine whether an SRS, cluster sample, or stratified random sample method was used.
  7. Given a study's objective, decide which probability sampling method (SRS, cluster, or stratified) would be best to use.
  8. Given the sample size for a sample survey, calculate a conservative margin of error using 1/√n.

3. Experiments:
  1. Given a study, identify subjects, factors, and treatments.
  2. Display an understanding of the difference between random selection and random assignment.
  3. Understand the difference in conclusions resulting from an observational study and an experiment.
  4. Describe the advantages of an experiment over an observational study.
  5. Given a study, determine whether it is an observational study or an experiment.
  6. Given a study, determine if a causal conclusion can be made.
  7. When designing an experiment, describe the placebo effect.
  8. When designing an experiment, describe the experimenter effect.
  9. When designing an experiment, explain how a researcher can control for the placebo and experimenter effects.
  10. Given a study objective, explain why randomization should (or cannot) be used.
  11. Given a study objective, decide whether the experiment should be double-blinded.
  12. Given a study, determine whether a matched pairs experiment was used.

4. Basic Probability:
  1. Explain the difference between subjective, relative frequency, and classical probability.
  2. Given a scenario and probability, determine if the probability was a subjective, relative frequency, or classical probability.
  3. Given an experiment, describe the sample space.
  4. Given an event, list the outcomes that make up the event.
  5. Given a set of probabilities, determine if they are legitimate. That is, check the following criteria:
     1. The probability of any event is a number between zero and one.
     2. If we assign a probability to every possible outcome, the sum of these probabilities must be one.
  6. Given a probability distribution, find a specified probability for an event of interest.
  7. Find the probability of an event by summing the probabilities of the individual outcomes that make up the event.
  8. Find the probability that an event does not occur as one minus the probability that the event does occur.
  9. Given probabilities of events, use this information correctly to determine if the events are independent.
  10. Given event probabilities, correctly select and apply the appropriate conditional probability.
  11. Explain the difference between disjoint (mutually exclusive) and independent events.
  12. Given a set of events, determine if the events are disjoint (mutually exclusive).

5. Probability Distributions

5A: Discrete Distribution:
  1. Given a variable, determine whether it is discrete or continuous.
  2. Given a discrete probability distribution, find the probability of a missing event.
  3. Given a discrete probability distribution, find the expected value and standard deviation.
  4. Understand that expected value refers to the mean.
  5. Given a discrete probability distribution, create the cumulative distribution.
  6. Given a discrete probability (or cumulative) distribution, find various probabilities for given scenarios (for example "more than", "less than", "exactly").
  7. For a discrete (or cumulative) probability distribution, understand the effect of including "equal to". For example, explain the difference in the probability of an outcome when finding "less than" compared to "less than or equal to".

5B: Binomial Distribution:
  1. Given a study, determine if all conditions of a binomial experiment are satisfied.
  2. Given the parameters of a binomial experiment, calculate the mean and standard deviation.
  3. Given a binomial experiment, use software to find probabilities for a set of outcomes.
  4. Given a situation, determine whether it is binomial and, if it is, identify the probability of success and the number of trials.

5C: Continuous (Normal) Distribution:
  1. List the key characteristics of the normal distribution.
  2. Given a mean and standard deviation, use the Empirical Rule to find the percentage of the normal distribution within one, two, or three standard deviations of the mean.
  3. Given a mean, standard deviation, and observed value, calculate the standardized value (z-score).
  4. Given a mean, standard deviation, and observed value, use a normal table to find the corresponding probability.
  5. Given a mean, standard deviation, and observed value, use software to find the corresponding probability.
  6. Given a z-score, use a standard normal table to find the corresponding probability.
  7. Given a mean and standard deviation, find a specified percentile of the normal distribution.
  8. Given a mean, standard deviation, and percentile, calculate the appropriate observed value.
  9. For a continuous probability distribution, understand that including "equal to" makes no difference to the probability.

Sampling Distributions:
  1. Given a study, describe how the Central Limit Theorem applies.
  2. Given a study, describe how the rules for sample proportions apply.
  3. Describe the sampling distribution of the sample mean and sample proportion.
  4. Given a study, describe the sampling distribution of the sample mean as specifically as possible. This involves stating whether this distribution is at least approximately normal.
  5. Given a population mean, population standard deviation, sample size, and sample mean, calculate the z-score for the sample mean.
  6. Given a population mean and standard deviation, sample size, and sample mean, use a standard normal table to find the corresponding probability.
  7. Given a population mean and standard deviation, sample size, and sample mean, use software to find the corresponding probability.
  8. Given a population proportion and sample size, calculate the z-score for a sample proportion.
  9. Given a population proportion, sample size, and sample proportion, use a standard normal table to find the corresponding probability.

6. Confidence Intervals for One Mean and One Proportion:
  1. Given a study, determine whether the study meets the "simple" conditions under which inferences on a population mean or population proportion may be performed (for example, it satisfies the CLT or the rules for proportions).
  2. Given a study, describe the primary purpose for using a confidence interval.
  3. Distinguish between the standard error of a sample statistic and the margin of error for a confidence interval.
  4. Given a sample size, calculate the correct degrees of freedom for a one-mean confidence interval.
  5. Given a confidence level and sample size, select from a T-table the t-multiplier necessary to construct the confidence interval for a mean.
  6. Given a confidence level for a proportion, select the z-multiplier necessary to construct the confidence interval.
  7. Given a desired margin of error and confidence level, calculate the appropriate sample size for estimating a population proportion.
  8. Given a desired margin of error, confidence level, and standard deviation, calculate the appropriate sample size for estimating a population mean.
  9. Explain that confidence intervals are random quantities varying from sample to sample and that they may miss the true population parameter.
  10. Explain that the confidence level is the proportion of possible samples for which the confidence interval will capture the true parameter.
  11. Construct a one-mean confidence interval, by hand and with software, for a population mean when the standard deviation is unknown using: $\bar{x} \pm t_{1-\alpha/2,\,n-1} \cdot \frac{S}{\sqrt{n}}$
  12. Construct a confidence interval, by hand and with software, for a population proportion using the formula: $\hat{p} \pm Z_{1-\alpha/2} \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
  13. Given a study, interpret the result of a confidence interval in the context of the problem.
  14. Given a study and confidence interval for the population mean, describe how the following will affect the width of the confidence interval:
     1. Increasing the sample size
     2. Increasing the confidence level
     3. A larger population standard deviation
  15. Given a study and confidence interval for the population proportion, describe how the following will affect the width of the confidence interval:
     1. Increasing the sample size
     2. Increasing the confidence level

7. Tests of Hypothesis for One Mean and One Proportion:
  1. Given a study objective, determine whether significance testing is appropriate.
  2. Given a study, correctly apply the Steps to Hypothesis Testing: checking assumptions, setting hypotheses, stating alpha, calculating the test statistic, calculating the p-value, making the correct decision, and stating the overall conclusion.
  3. Given summary measures, calculate the appropriate T- or Z-test statistic.
  4. Given a study objective, choose appropriate null and alternative hypotheses, including determining whether the alternative should be one-sided or two-sided.
  5. Given a study and p-value, explain in context that the p-value is the probability of getting a sample statistic as extreme as or more extreme than what was seen in the sample, given that the null hypothesis is true.
  6. Given a test statistic, alternative hypothesis, and sample size, calculate the range for the p-value using a T-table.
  7. Given a test statistic and alternative hypothesis, calculate a p-value for a test of one proportion using a standard normal table.
  8. Given a study, interpret the results of a test of significance in context of the study.
  9. Given a study objective, significance level (α), and raw data, use software to perform the appropriate hypothesis test.
  10. Given software output and study objectives, determine the result of a hypothesis test. Explain the relationship between a confidence interval and a two-sided hypothesis test.
  11. Given software output and study objectives, determine if the appropriate significance test was conducted.

8. Inference for Two Means and Two Proportions:
  1. Given summary measures, calculate the appropriate T- or Z-test statistic.
  2. Given a study objective, choose appropriate null and alternative hypotheses.
  3. Given a study objective on the test of two means, determine if the test should use dependent (paired) means, independent means, or two proportions.
  4. Given two sample standard deviations for a test of two means, use a "rule of thumb" to determine if the variances can be pooled.
  5. Given the results of a study of two means, determine if the appropriate test (dependent or independent) was applied.
  6. Given the results of a two independent means study, determine if pooled (or unpooled) methods were used and if this choice was correct.
  7. Given the results of a test, explain how the confidence interval can be used to support the conclusion on whether or not there is a significant difference between the two groups.
  8. Given a study, interpret the results of a test of significance in context of the study.
  9. Given a study objective, significance level (α), and raw data, use software to perform the appropriate hypothesis test.

9. Comparing Two Quantitative Variables:

9A: Correlation and Scatterplots
  1. Given a study, distinguish between explanatory and response variables.
  2. Given a set of raw data, use software to create a scatterplot.
  3. Given a scatterplot, identify patterns such as positive and negative associations, non-linear patterns, and outliers.
  4. Given a correlation coefficient, determine if it is legitimate.
  5. Given two variables and their correlation coefficient, describe how the correlation changes if the units of either variable are changed.
  6. Given two variables (x and y), describe the correlation you would expect to find between x and y.
  7. Match given scatterplots with possible values of the correlation coefficient.
  8. Given a scatterplot with outliers, determine which could be considered influential outliers.

9B: Simple Linear Regression:
  1. Given a set of raw data, use software to find the simple linear regression equation.
  2. Given a set of raw data, use software to conduct a simple linear regression analysis.
  3. Given the objective of a simple linear regression study, determine which variable would be the response and which the explanatory (predictor).
  4. Given a simple linear regression equation, identify the response variable, y-intercept, slope, and explanatory variable.
  5. Given a simple linear regression equation, explain the interpretation of the slope in context of the data.
  6. Given simple linear regression output, identify the appropriate regression equation.
  7. Given two variables where one is designated Y and the other X, determine if a simple linear regression analysis would be appropriate.
  8. Explain the relationship between the slope of the regression line and the correlation coefficient.
  9. Given the least squares line and a value of x, calculate the predicted value of y.
  10. Understand that in a linear regression study the regression line is estimating the mean of the response for given levels of the predictor.
  11. Given the results of a simple linear regression study, explain the value of the coefficient of determination, R-squared, in context of the study.
  12. Given the results of a simple linear regression study, determine the correct hypotheses and decision for a test of the regression slope.
  13. Given the least squares regression line and an observation, calculate the residual for that observation.
  14. Given a simple linear regression study, explain why applying the equation to observations outside the data range of X would not be appropriate.
  15. Given a scatterplot, explain the effect of different outliers on the least squares line.
  16. State the appropriate assumptions for conducting a simple linear regression analysis.
  17. Use software to graphically test the assumption of error normality.

10. Comparing Two Categorical Variables:
  1. Given a set of raw data, use software to conduct a chi-square test of independence.
  2. Given the objectives of a test of independence, state the appropriate hypotheses.
  3. Given output from a chi-square test of independence, identify the appropriate response and explanatory variables.
  4. Given output from a chi-square test of independence, state the correct test of hypothesis decision.
  5. Given a 2 x 2 table of observed data, find the expected table and calculate the chi-square test statistic.
  6. When conducting a chi-square test of independence for an R x C table, explain how the degrees of freedom are calculated.
  7. Explain how a chi-square test for a 2 x 2 table is analogous to a test of two proportions.
  8. Given two variables, determine if a chi-square test of independence would be appropriate.
  9. State the appropriate assumptions needed to conduct a chi-square test of independence.

11. Comparing More Than Two Means: One-Way Analysis of Variance:
  1. Given a set of raw data, use software to conduct a one-way ANOVA.
  2. Given sample standard deviations for each group, apply a "rule of thumb" to test if the assumption of equal variances has been satisfied.
  3. Use graph(s) to check the assumption of normality.
  4. Given a one-way ANOVA study, state the correct hypotheses.
  5. Given results of a one-way ANOVA study, state the correct conclusion for the test of significance.
  6. Given results of a one-way ANOVA study, explain how the F-test statistic is calculated.
  7. Given the results of a one-way ANOVA, explain how the degrees of freedom for the F-test were calculated.
  8. Given a design of a one-way ANOVA study, calculate the three sets of degrees of freedom.
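
The one-way ANOVA objectives on the F-test statistic and the three sets of degrees of freedom can be illustrated with a minimal sketch. It assumes the standard textbook decomposition (between-groups, within-groups, total), which these objectives do not spell out. The function name `one_way_anova` and the data are hypothetical, chosen only for illustration, and the course's own software may report these quantities under different labels.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (df_between, df_within, df_total, F) for a list of samples.

    Hypothetical helper: assumes the usual one-way ANOVA decomposition.
    """
    k = len(groups)                        # number of groups
    n_total = sum(len(g) for g in groups)  # total number of observations
    grand_mean = mean([x for g in groups for x in g])

    # Between-groups sum of squares: each group's size times the squared
    # distance of its mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

    # Within-groups sum of squares: squared distance of every observation
    # from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    # The three sets of degrees of freedom.
    df_between = k - 1
    df_within = n_total - k
    df_total = n_total - 1  # equals df_between + df_within

    # F is the ratio of the two mean squares.
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return df_between, df_within, df_total, f_stat


# Made-up data: three groups of three observations each.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
df_b, df_w, df_t, f_stat = one_way_anova(groups)
print(df_b, df_w, df_t, round(f_stat, 4))
```

For this made-up data the group means are 2, 3, and 6, the between-groups sum of squares is 26, and the within-groups sum of squares is 6. The degrees of freedom are therefore 2, 6, and 8, and F = (26/2)/(6/6) = 13.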