1-4: Review of histograms, normal curve, empirical rule, central tendency, spread

1-4 Vocabulary:

Frequency Histogram: A graph that displays the classes on the horizontal axis and the frequencies of the classes on the vertical axis. Bars DO touch.
- Frequencies are represented by bars.

Relative-frequency Histogram: A graph that displays the classes on the horizontal axis and the relative frequencies of the classes on the vertical axis. Bars DO touch.
- Relative-frequency histograms are better for comparing data sets, since the vertical scale is always the same (0 to 1).

Measures of Center

Mode – The most common or popular value; used for categorical or numerical data
Advantage: Easy to calculate; can be used for qualitative data to see the most popular response/choice
Disadvantage: Not aimed at finding the center; not always relevant for quantitative data
Choose – When we want to know what shows up the most.

Mean – The average of every value in the set of data; used only for quantitative data
Advantage: Useful when there are NO outliers; quick to calculate
Disadvantage: Affected by outliers (a large outlier pulls the mean high; a small outlier pulls the mean low); may not be representative; cannot be used to identify outliers
Choose – When we want to know the middle, taking into account how big and small the numbers are.

Median – The middle number of an ordered set of data; useful when there are outliers; only for quantitative data
Advantage: Takes all numbers into account; can determine whether a number is above or below 50% of the data; not affected by outliers
Disadvantage: Tedious to calculate; cannot be used to identify outliers
Choose – When we want to know the middle and there may be an outlier affecting the mean.

Measures of Spread

Range – Difference between the maximum and the minimum
Advantage: Shows how far apart the numbers are; easy to compute
Disadvantage: Does not show what numbers are in the data set; only takes two numbers into account

IQR (Interquartile Range) – Difference between Q3 and Q1.
Advantage: Not affected by outliers; shows where the middle 50% of the data is
Disadvantage: Doesn’t take into account the first 25% of the data or the last 25% of the data

Standard Deviation – Measures how far, on average, numbers are from the mean; useful when the mean is the best measure of center
Advantage: Takes into account all observations
Disadvantage: Easily affected by extreme observations (outliers)

Example - Select an appropriate measure of center.
a) A student takes four exams in a biology class. His grades are 88, 75, 95, and 100. Which measure of center is the student likely to report?
b) The National Association of REALTORS publishes data on resale prices of U.S. homes. Which measure of center is most appropriate for such resale prices?
c) The 2005 Boston Marathon had two categories of official finishers: male and female, of which there were 10,894 and 6,655, respectively. Which measure of center should be used?
- Of the mean, median, and mode, the mode is the only measure of center that can be used for QUALITATIVE DATA.

Practice
H) A retail store had total sales of $436, $650, $530, $500, $650, $489, and $423 last week. Which measure of center would make the store’s sales last week appear the MOST PROFITABLE? Explain.
I) The heights of the 5 starting players on the basketball team at Goose Neck High School are 75 in, 74 in, 73 in, 70 in, and 68 in. Decide which measure of center best describes the data. Explain.
J) The monthly car payments for Brandon’s last 8 customers were $266, $285, $285, $285, $315, $325, $344. Decide which measure of center best describes the data. Explain.
K) At the end of the winter, the seven sweaters left in stock at the GAP Outlet store were brown, orange, green, orange, brown, orange, orange, and orange. Decide which measure of center is appropriate for this data. Explain.
L) The profits at 5 schools that sold Gems & Jewels products were $318.22, $440.79, $607.16, $1,090.38, $4,790.15.
Decide whether the mean or the median is the better measure of center for this data. Explain.

Rule of 3 Standard Deviations (Empirical Rule)
- Almost all of the observations in any data set lie within three standard deviations to either side of the mean.
- A data set that varies a lot will have a large standard deviation, so the interval 3 SD either side of the mean will be wide.
- A data set that varies very little will have a small standard deviation, so the interval 3 SD either side of the mean will be narrow.

If the distribution of the data is bell-shaped, the Empirical Rule says that . . .
o About 68% of observations lie within 1 standard deviation of the mean
o About 95% of observations lie within 2 standard deviations of the mean
o About 99.7% of observations lie within 3 standard deviations of the mean

Normal Distribution – A bell-shaped curve that is centered about the mean. The curve is symmetric, with its spread determined by its standard deviation.
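
The measures of center and spread in these notes, and the idea of counting observations within k standard deviations of the mean, can be tried out with a short Python sketch. This is only an illustration, not part of the original notes: the function names summarize and fraction_within are made up for this example, the data are the store sales from Practice H, the standard deviation is assumed to be the sample standard deviation, and the quartiles use the default method of Python's statistics.quantiles (other textbooks and calculators may give slightly different Q1 and Q3).

```python
import statistics


def summarize(data):
    """Return the measures of center and spread from the notes for a numeric list."""
    # quantiles(n=4) returns the three cut points Q1, Q2 (the median), Q3.
    q1, _, q3 = statistics.quantiles(data, n=4)
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "mode": statistics.mode(data),
        "range": max(data) - min(data),
        "iqr": q3 - q1,
        # Assumption: sample standard deviation (divides by n - 1).
        "stdev": statistics.stdev(data),
    }


def fraction_within(data, k):
    """Fraction of observations lying within k standard deviations of the mean."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    inside = [x for x in data if abs(x - mean) <= k * sd]
    return len(inside) / len(data)


# Store sales from Practice H.
sales = [436, 650, 530, 500, 650, 489, 423]

summary = summarize(sales)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {fraction_within(sales, k):.0%}")
```

For this data the mode ($650) is larger than both the mean and the median, which is the comparison Practice H asks about. The percentages printed for 1, 2, and 3 standard deviations only need to be close to 68%, 95%, and 99.7% when the data are bell-shaped; seven sales figures are too few to expect that.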