SECTION 2.3 – HOW CAN WE DESCRIBE THE CENTER OF QUANTITATIVE DATA?

As guests leave Hershey Park, they are asked how many rides they have ridden in that day’s visit. This is an example of what type of variable?
1. Binary categorical
2. Discrete categorical
3. Binary quantitative
4. Discrete quantitative
5. Continuous quantitative

As guests leave Hershey Park, a random sample of them are asked how many rides they have ridden in that day’s visit. We will use the data for some of today’s work:
2 2 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 6 7 7 8 8 10 12 15

Graphs vs. Numerical Summaries
Graphs help give a sense of the shape of the distribution when using quantitative data.
Numerical summaries for quantitative data typically take two forms.
◦ Center
◦ Spread

The shape of the data
[Two slides of graphs; figures not reproduced]

Mean (Average)
The mean is the sum of all of the observations divided by the number of observations.
$\bar{x} = \dfrac{\sum x}{n}$
What is the mean number of rides ridden in the sample of Hershey Park visitors?

Visual interpretation of the mean
The “balance point”

Median
The median is the midpoint of the ordered observations. Half of the observations are above it and half are below it.
◦ For a data set with an odd number of observations, this is the middle number.
◦ For a data set with an even number of observations, we use the average of the two middle numbers.

Example 1
In a small class a teacher notes that the grades on a 10-point quiz for her 5 students are: 10, 10, 7, 6, 4.
◦ What is the mean?
◦ What is the median?

Example 2
In a small class a teacher notes that the grades on a 10-point quiz for her 6 students are: 10, 10, 9, 7, 6, 4.
◦ What is the mean?
◦ What is the median?

In the previous example, if the last score was a 1 instead of a 4, would the mean change?
1. Yes
2. No

In the previous example, if the last score was a 1 instead of a 4, would the median change?
1. Yes
2. No

Mean vs. Median
The mean takes into account the value of every observation. Thus the mean is sensitive to very large and very small values.
The median only takes into account the middle of the data. Thus values on either extreme do not affect it.
We call extreme values outliers. A numerical summary which is NOT sensitive to outliers is said to be resistant.

Mean vs. Median
When the shape is symmetric, the mean and the median are usually close.
When the shape is skewed left, the mean is typically to the left of the median.
When the shape is skewed right, the mean is typically to the right of the median.

Mean vs. Median
The mean is a good numerical summary when the data is symmetric and bell-shaped.
[Figure: Height of 25 women in a class]

Mean vs. Median
When the shape is irregular, the mean is not as meaningful a summary.
What could account for the irregular shape of this graph?
[Figure: Height of plants by color (red, pink, blue). Horizontal axis: height in centimeters. Vertical axis: number of plants, 0 to 5.]

Mode
For discrete data that takes on only a few values, the median can become uninformative. In this case the mode, which is the most frequently occurring value, is often used as the measure of center.

Example 3
25 faculty members at a university are asked how many children they have. 13 faculty members have no children, 6 faculty members have one child, and 6 faculty members have two children.
◦ The median number of children is 0.
◦ If the number of faculty members with one child were 0 and the number of faculty members with two children were 12, the median would still be 0.

SECTION 2.4 – HOW CAN WE DESCRIBE THE SPREAD OF QUANTITATIVE DATA?

Range
The range is the difference between the largest and the smallest observation.
If quiz scores are 5, 5, 7, 8, 9, 9, 10, then the range of scores is 10 – 5 = 5.
The range is not resistant, and it ignores most individual observations.

Standard Deviation
How far does each observation fall from the mean?
A deviation is the difference between the observation and the mean.
◦ Positive deviation when the value is above the mean.
◦ Negative deviation when the value is below the mean.
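As a minimal sketch of the deviation idea, the Python below computes each score's deviation from the mean for the quiz scores 10, 10, 7, 6, 4 from Example 1. The variable names (`scores`, `x_bar`, `deviations`) are illustrative choices, not part of the original slides.

```python
from statistics import mean

# Quiz scores from Example 1.
scores = [10, 10, 7, 6, 4]

# The mean is the sum of the observations divided by their count.
x_bar = mean(scores)

# A deviation is the observation minus the mean.
deviations = [x - x_bar for x in scores]

for x, d in zip(scores, deviations):
    if d > 0:
        position = "above the mean"
    elif d < 0:
        position = "below the mean"
    else:
        position = "equal to the mean"
    print(f"score {x:>2}: deviation {d:+.1f} ({position})")

# The positive and negative deviations cancel: their sum is zero,
# up to floating-point rounding.
total = sum(deviations)
```

Because the deviations always cancel out in this way, their plain average cannot be used to measure spread.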
Standard Deviation
We square the deviations to make all of them positive.
We then average the squared deviations to find the variance. (For sample data, the sum of squared deviations is usually divided by n − 1 rather than n.)
The standard deviation is the square root of the variance.
Using 1-Var Stats on your TI-83/84 you can get this and more for a data set.

Standard Deviation
Example 1, again
There are 5 students in class and the scores on a quiz are 10, 10, 7, 6, 4.
Determine the standard deviation of the scores “by hand.”
Determine the standard deviation of the scores using a TI calculator.

Standard Deviation
A large standard deviation indicates data that is spread out.
A small standard deviation indicates data that is bunched together.
Standard deviation is not resistant.
What would a data set with standard deviation 0 look like?

Empirical Rule
If the distribution of the data is bell-shaped, then approximately:
◦ 68% of the observations fall within one standard deviation of the mean
◦ 95% of the observations fall within two standard deviations of the mean
◦ 99.7% of the observations fall within three standard deviations of the mean

Example
We know that the heights of adult males follow a generally bell-shaped distribution with a mean of 68 inches and a standard deviation of 2.5 inches.
Sketch the shape of this distribution.
Suppose I have a random sample of adult males whose mothers breastfed them, and the average height in this sample is 73 inches. Does this seem like a significant result?
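The two calculations in this section can be sketched in Python, as a rough illustration rather than a definitive method. The first part works through the standard deviation of the Example 1 quiz scores step by step and checks it against Python's `statistics` module. The second part applies the Empirical Rule to the height example (mean 68 inches, standard deviation 2.5 inches). All variable names are illustrative assumptions. The last line only measures how far 73 inches is from the mean of individual heights; it does not by itself settle whether the result is significant.

```python
from statistics import mean, pstdev, stdev

# --- Standard deviation "by hand" for Example 1 ---
scores = [10, 10, 7, 6, 4]
x_bar = mean(scores)

# Square each deviation so every term is non-negative, then add them up.
sum_sq = sum((x - x_bar) ** 2 for x in scores)

# Dividing by n - 1 gives the sample version (reported as Sx on a TI-83/84);
# dividing by n gives the population version (reported as sigma-x).
sample_sd = (sum_sq / (len(scores) - 1)) ** 0.5
population_sd = (sum_sq / len(scores)) ** 0.5

# Cross-check against the standard library.
assert abs(sample_sd - stdev(scores)) < 1e-9
assert abs(population_sd - pstdev(scores)) < 1e-9

print(f"sample SD: {sample_sd:.4f}, population SD: {population_sd:.4f}")

# --- Empirical Rule for the height example ---
height_mean = 68.0  # inches, from the example
height_sd = 2.5     # inches, from the example
coverage = {1: "68%", 2: "95%", 3: "99.7%"}

# Interval of k standard deviations on either side of the mean.
intervals = {
    k: (height_mean - k * height_sd, height_mean + k * height_sd)
    for k in coverage
}
for k, (low, high) in intervals.items():
    print(f"about {coverage[k]} of heights fall between {low} and {high} inches")

# Distance of 73 inches from the mean, measured in standard deviations.
z = (73 - height_mean) / height_sd
print(f"73 inches is {z} standard deviations above the mean")
```

The two divisors give different answers for small data sets, so it helps to note which one a calculator or library function is reporting.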