Unit I Review Sheet
This is NOT a comprehensive list. However, if you understand how to do all of these
things, you are well prepared for the exam.
Chapter 1


Determine the context for data values using the W’s:
o Who, What (and in what units), When, Where, Why, and How
Classify a variable as categorical or quantitative and determine the units for
quantitative variables.
Chapter 2




Appropriately display categorical data using a frequency table, bar chart, or pie
chart.
Using a contingency table, determine marginal and conditional distributions
Determine if two categorical variables are independent (or, if not independent, are
“associated”) by comparing the conditional distributions of one variable across
the different categories of the other variable.
Use a contingency table to answer questions about the proportion of cases
satisfying certain conditions.
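The contingency-table items above can be sketched in a few lines of Python. This is only an illustration: the counts, the row and column labels, and the variable names are all made up, not taken from the course materials. It shows the marginal distributions (totals divided by the grand total) and the conditional distribution of one variable within each category of the other.

```python
# Hypothetical counts, made up for illustration only.
# Rows: class year. Columns: preferred study method.
table = {
    "Junior": {"Flashcards": 30, "Practice problems": 20},
    "Senior": {"Flashcards": 15, "Practice problems": 35},
}

grand_total = sum(sum(row.values()) for row in table.values())

# Marginal distribution of the row variable: row total / grand total.
row_marginal = {
    year: sum(counts.values()) / grand_total for year, counts in table.items()
}

# Marginal distribution of the column variable: column total / grand total.
methods = list(next(iter(table.values())))
col_marginal = {
    method: sum(table[year][method] for year in table) / grand_total
    for method in methods
}

# Conditional distribution of study method within each class year:
# each cell divided by its own row total.
conditional = {
    year: {method: n / sum(counts.values()) for method, n in counts.items()}
    for year, counts in table.items()
}

for year, dist in conditional.items():
    print(year, dist)
```

If the conditional distributions were the same in every row, the two variables would look independent. In this made-up table they differ (60% of juniors versus 30% of seniors prefer flashcards), which suggests an association.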
Chapter 3









Display quantitative data using a frequency distribution, histogram, relative
frequency histogram, or stem-and-leaf display
Describe the shape of a distribution with regard to peaks (number of modes),
symmetry/skewness, and unusual features such as outliers or gaps.
Select a suitable measure of center and a suitable measure of spread for a variable
based on information about its distribution.
Compute the mean and median by hand and using technology.
Compute the standard deviation and interquartile range by hand or using technology.
Create a five-number summary of a variable.
Understand how the shape of the distribution (symmetry/skewness and/or
presence of outliers) will affect where the mean and median are relative to one
another.
o Also understand how changes in data will affect the mean, median,
standard deviation and IQR.
Construct a modified boxplot using fences.
Use the 1.5 IQR rule to identify possible outliers
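As a rough sketch of the Chapter 3 computations, the Python below builds a five-number summary, the IQR, the fences, and the list of possible outliers. The data values and the helper name `five_number_summary` are made up for illustration. Quartile conventions differ between textbooks, calculators, and software; this sketch assumes the "median of each half" convention (excluding the overall median when the count is odd), so check it against the method used in class.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower_half = data[:half]              # values below the median position
    upper_half = data[half + (n % 2):]    # skip the median itself when n is odd
    return (
        data[0],
        statistics.median(lower_half),
        statistics.median(data),
        statistics.median(upper_half),
        data[-1],
    )

# Made-up data with one unusually large value.
data = [2, 4, 4, 5, 6, 7, 8, 9, 30]

minimum, q1, median, q3, maximum = five_number_summary(data)
iqr = q3 - q1

# 1.5 IQR rule: fences sit 1.5 IQRs beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

mean = statistics.mean(data)
sd = statistics.stdev(data)   # sample standard deviation (divides by n - 1)

print("five-number summary:", minimum, q1, median, q3, maximum)
print("IQR:", iqr, "fences:", lower_fence, upper_fence, "outliers:", outliers)
print("mean:", mean, "median:", median, "sd:", sd)
```

In this made-up data set the large value pulls the mean above the median, while the median and IQR barely notice it. That is the reason to prefer the median and IQR for skewed distributions or distributions with outliers.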
Chapter 4

Compare the distributions of two or more groups by comparing their shapes,
centers, and spreads.
Compare two or more groups by comparing their boxplots.
Chapter 5




Compare values from two different distributions using their z-scores.
Use Normal models (when appropriate) and the 68-95-99.7 Rule to estimate
the percentage of observations falling within one, two, or three standard
deviations of the mean.
Determine the percentages of observations that satisfy certain conditions by using
the Normal model and determine “extraordinary” values.
o (you can use z-scores and the Z-table, but I recommend using your calculator’s
normalcdf function for this)
Solve for the values that “cut off” a given percentage of the normal model
o (you can use the Z-Table and reverse solve the z-scores but I recommend
using your calculator’s invNorm function)
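For anyone checking calculator answers on a computer, here is a minimal Python sketch of the Chapter 5 skills. The helper names `z_score`, `normalcdf`, and `inv_norm` are hypothetical stand-ins chosen to mirror the calculator functions; they are built on `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The test scores, means, and standard deviations are made up.

```python
from statistics import NormalDist

def z_score(value, mean, sd):
    """Number of standard deviations a value lies from its mean."""
    return (value - mean) / sd

def normalcdf(lower, upper, mean=0.0, sd=1.0):
    """Proportion of a Normal model between lower and upper (hypothetical helper)."""
    model = NormalDist(mean, sd)
    return model.cdf(upper) - model.cdf(lower)

def inv_norm(area, mean=0.0, sd=1.0):
    """Value with the given proportion of the Normal model below it (hypothetical helper)."""
    return NormalDist(mean, sd).inv_cdf(area)

# Compare values from two different (made-up) distributions via z-scores.
z_a = z_score(650, mean=500, sd=100)
z_b = z_score(27, mean=21, sd=5)
print("z-scores:", z_a, z_b)

# Check the 68-95-99.7 Rule against the standard Normal model.
within_1 = normalcdf(-1, 1)
within_2 = normalcdf(-2, 2)
within_3 = normalcdf(-3, 3)
print("within 1, 2, 3 SDs:", within_1, within_2, within_3)

# Value that cuts off the lowest 90% of a Normal model with mean 500, SD 100.
cutoff_90 = inv_norm(0.90, mean=500, sd=100)
print("90th percentile:", cutoff_90)
```

The value with the larger z-score is the more unusual one relative to its own distribution, so here the 650 stands out more than the 27.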
Chapter 6


Use a scatterplot to determine if a linear correlation is suggested between two
variables and describe the correlation with regard to direction, form (i.e. linear)
and scatter (strength).
Compute the correlation of two variables and use it as part of the description of a
scatterplot.
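One way to see what the correlation measures is to compute it from z-scores: multiply each point's z-score for x by its z-score for y, add the products, and divide by n - 1. The Python below is a sketch of that calculation; the function name `correlation` and the data (hours studied and exam scores) are made up for illustration.

```python
import statistics

def correlation(xs, ys):
    """Correlation r as the sum of z-score products divided by (n - 1)."""
    n = len(xs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)  # sample SDs
    products = (
        ((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
        for x, y in zip(xs, ys)
    )
    return sum(products) / (n - 1)

# Made-up data: hours studied and exam score for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]

r = correlation(hours, scores)
print("r =", round(r, 3))
```

A value of r close to +1 or -1 indicates little scatter around a line, and the sign gives the direction. The number alone does not confirm that the form is linear, so still look at the scatterplot first.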
Chapter 7




Compute a linear equation (regression line) that models the relationship between
two variables.
Determine whether the slope of a regression line makes sense and interpret the
slope in the context of the problem. (e.g. for every unit increase in variable x, the
model predicts that variable y will increase/decrease by this amount)
Use regression to predict a value of y for a given x.
Find the residual for a given x and interpret it in the context of the problem (i.e.
did the linear model overestimate or underestimate the dependent variable?)
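The Chapter 7 steps can be sketched in Python as follows. The function name `fit_line` and the data are made up for illustration. The slope is computed as the sum of products of deviations divided by the sum of squared x deviations, which is algebraically the same as r times (SD of y) / (SD of x), and the intercept is chosen so the line passes through the point of means.

```python
import statistics

def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting y from x."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x   # line passes through (mean_x, mean_y)
    return slope, intercept

# Made-up data: hours studied and exam score for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]

slope, intercept = fit_line(hours, scores)

# Predict the score for a student who studied 2 hours.
x_value = 2
observed = 60                        # that student's actual score in the data
predicted = intercept + slope * x_value

# Residual = observed - predicted.
residual = observed - predicted

print("slope:", slope, "intercept:", intercept)
print("predicted:", predicted, "residual:", residual)
```

In this made-up example the slope says that for every additional hour studied, the model predicts the score will increase by about 6 points. The residual is positive, so the observed score is above the line and the model underestimated that student's score.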