Topic: Chapter 7 Scatterplots, Associations, and Correlation
Name: _____________________________________________________

Objectives: We will be able to use scatterplots to display the relationships between two sets of quantitative data.

What do we use to describe a distribution?
When describing distributions, we need to discuss shape, center, and spread. How we measure the center and spread of a distribution depends on its __________________.
The center of a distribution is a “typical” value. If the shape is unimodal and symmetric, a “typical” value is in the _______________________. If the shape is skewed, however, a “typical” value is not necessarily in the middle.

Skewed Distributions: How can we describe skewed distributions?
For _______________________ distributions, use the ______________________ to determine the ______________________ of the distribution and the ____________________________________________ to describe the ______________________ of the distribution.

The median:
o is the _______________ data value (when the data have been _______________) that divides the histogram into two equal _______________
o has the same _______________ as the data
o is _______________ to outliers (extreme data values)

The range:
o is the difference between the _______________ value and the _______________ value
o is a _______________, NOT an _______________
o is _______________ to outliers

The interquartile range:
o contains the _______________ of the data
o is the difference between the _______________ and _______________ quartiles
o is a _______________, NOT an _______________
o is _______________ to outliers

Symmetrical Distributions: How can we describe symmetrical distributions?
For _______________ distributions, use the _______________ to determine the _______________ of the distribution and the _______________ to describe the _______________ of the distribution.
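Before moving on to the mean and standard deviation, here is a minimal Python sketch of the median, range, and interquartile range described above. The data set is made up for illustration (it is not from these notes), and the quartile rule used by `statistics.quantiles` is only one of several conventions, so quartiles found by hand may differ slightly.

```python
from statistics import median, quantiles

# Hypothetical data, invented for illustration (not from the worksheet).
values = [13, 17, 18, 19, 20, 22, 24]
with_outlier = values + [90]  # same data plus one extreme value


def summarize(data):
    """Return (median, range, IQR) for a list of numbers."""
    ordered = sorted(data)
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = quantiles(ordered, n=4)
    data_range = ordered[-1] - ordered[0]  # max minus min, a single number
    iqr = q3 - q1                          # Q3 minus Q1, a single number
    return median(ordered), data_range, iqr


print("without extreme value (median, range, IQR):", summarize(values))
print("with extreme value    (median, range, IQR):", summarize(with_outlier))
```

Comparing the two printed lines shows which summaries move a lot and which barely move when the value 90 is added.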
The mean:
o is the arithmetic _______________ of the data values
o is the _______________ of a histogram
o has the same _______________ as the data
o is _______________ to outliers
o is given by the formula $\bar{x} = \frac{\sum x}{n}$

The standard deviation:
o measures the “typical” distance each data value is from the _______________
o Because some values are above the mean and some are below the mean, finding the sum of the deviations is not useful (positives cancel out negatives); therefore we first _______________ the deviations, then calculate an _______________ _______________. This is called the _______________. This statistic does not have the same units as the data, since we squared the deviations. Therefore, the final step is to take the _______________ of the variance, which gives us the _______________.
o is given by the formula $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
o is _______________ to outliers, since its calculation involves the _______________

How can we calculate standard deviation by hand?
It takes into account how far EACH data point is from the mean of the data set.
A high standard deviation shows that many of the data values are scattered far from the mean.
A low standard deviation shows that many of the data values are close to the mean.

Example: To find the standard deviation of the data set below, use

$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$

1) Find the mean of the data set: ___________
2) Fill in the deviations and the squared deviations:

   Original Values    Deviations    Squared Deviations
   6                  ________      ________
   9                  ________      ________
   11                 ________      ________
   14                 ________      ________

3) Now add up the squared deviations: $\sum (x - \bar{x})^2$ = ___________
4) Divide by n - 1: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$ = ___________
5) Finally take the square root: $s$ = ___________

Why do we call a box plot a five-number summary?

Create a box plot for the following information:
1) Max 47 years, Q3 22, Median 19, Q1 17, Min 13
2) 80, 82, 84, 86, 90, 81, 91, 82, 83, 77

Thinking about Variation:
________________________ is an important fundamental concept in Statistics. It helps us to be precise about what we don’t know. If the data values are scattered far from the center, the IQR and the standard deviation will be large.
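The by-hand steps in the standard deviation example can be checked with a short Python sketch. It assumes the example data set reads 6, 9, 11, 14 (the table in these notes is partly illegible, so treat those values as an assumption), and the function name `sample_std_by_hand` is made up for illustration. The result is compared with the standard library's `statistics.stdev`, which also divides by n - 1.

```python
import math
from statistics import stdev

# Assumed reading of the worksheet's example data set.
data = [6, 9, 11, 14]


def sample_std_by_hand(values):
    """Follow the by-hand steps: mean, deviations, squares, sum, divide, root."""
    n = len(values)
    mean = sum(values) / n                      # 1) find the mean
    deviations = [x - mean for x in values]     # 2) distance of each value from the mean
    squared = [d ** 2 for d in deviations]      #    square each deviation
    total = sum(squared)                        # 3) add up the squared deviations
    variance = total / (n - 1)                  # 4) divide by n - 1
    print("mean:", mean)
    print("deviations:", deviations)
    print("squared deviations:", squared)
    print("sum of squares:", total)
    print("variance:", variance)
    return math.sqrt(variance)                  # 5) take the square root


s = sample_std_by_hand(data)
print("standard deviation:", round(s, 3))
print("statistics.stdev:  ", round(stdev(data), 3))
```

Printing each intermediate list mirrors the columns of the table, so every row filled in by hand can be compared against the output.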
If the data values are __________ to the center, then these measures of _________________ will be small.

Shape, center, and spread:
You should always report the shape of a distribution and include a center and spread.
o Skewed: report the __________________
o You might want to include the _________________________________, but you should point out why the ________________________________ differ. The fact that the
o Symmetric: report the mean and standard deviation and possibly the median and IQR. The ___________ is usually a bit larger than the _______________________.
o If there are any clear outliers present and you are reporting the mean and standard deviation, report them with:

Summary:
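As a closing illustration of reporting center and spread, here is a minimal Python sketch that computes a five-number summary, the IQR, and the mean and standard deviation for the second box plot data set in these notes. The quartiles use the "median of each half" rule, which is an assumption because the notes do not say which quartile rule to use, and the function name `five_number_summary` is made up for illustration.

```python
from statistics import mean, median, stdev


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For an odd count, leave the middle value out of both halves.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]


# Second box plot data set from the notes.
scores = [80, 82, 84, 86, 90, 81, 91, 82, 83, 77]

low, q1, med, q3, high = five_number_summary(scores)
iqr = q3 - q1

print("five-number summary:", (low, q1, med, q3, high))
print("median and IQR:", med, iqr)
print("mean and standard deviation:", round(mean(scores), 2), round(stdev(scores), 2))
```

The five numbers printed first are the values needed to draw the box plot; the last two lines give the two center-and-spread pairs discussed in the reporting guidelines.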