Review of Descriptive Graphs and Measures

Here is a quick review of what we have covered so far.
•Pie Charts
•Bar Charts
•Pareto
•Tables
•Dotplots
•Stem-and-leaf
•Histograms
•Ogives
•Boxplots
•Time Series
•Mean
•Median
•Mode
•Weighted mean
•Range
•IQR
•Variance and Standard Deviation
•Mean and Standard Deviation of a frequency distribution
•Median of a distribution
•Empirical Rule
•Z-scores
•Quartiles, percentiles

Here are some ways to display Categorical Data:

[Figures: Pie Chart (hours of detention), Pareto Chart (Hours of Detention), and Bar Graph; axes: offense, frequency]

Category   frequency   percent   cumulative
tardies    13          36.11%    13
no id      9           25.00%    22
talk       6           16.67%    28
walk       4           11.11%    32
throw      3           8.33%     35
lie        1           2.78%     36
total      36          100.00%

Stem-and-leaf
 6 | 7
 7 | 1 8
 8 | 2 5 6 7 7
 9 | 2 5 7 9 9
10 | 0 1 2 3 3 4 5 5 7 8 9
11 | 2 6 8
12 | 2 4 5

Low/High Stem-and-leaf
 6 | 7
 7 | 1
 7 | 8
 8 | 2
 8 | 5 6 7 7
 9 | 2
 9 | 5 7 9 9
10 | 0 1 2 3 3 4
10 | 5 5 7 8 9
11 | 2
11 | 6 8

Dot-plot or Line Plot
[Figure: dot plot of Phone minutes, axis from 66 to 126]

Boxplot
[Figure: boxplot, axis from 15 to 55]
Min = 17, Q1 = 30, median = 42, Q3 = 45, max = 55
Interquartile Range = 45 - 30 = 15

Scatterplot
[Figure: scatterplot of Final grade (y) against Absences (x)]

Absences (x)   Grade (y)
8              78
2              92
5              90
12             58
15             43
9              74
6              81

Measures of Central Tendency
•The mode is the value that occurs the most. There can be more than one mode.
•The median is the middle value in an ordered data set.
•The arithmetic mean is the center of gravity of the data set. It is obtained by summing all of the values and dividing by the number of values.

We can also find the mean of a frequency distribution, $\frac{\sum xf}{n}$.
This is usually easiest to do with a table:

x       f    xf
2       1    2
3       4    12
4       6    24
5       2    10
6       1    6
total   14   54

Mean $= \frac{\sum xf}{n} = 54/14 \approx 3.86$

For classes containing multiple values, use the midpoint of the class as the x.

Class     midpoint   f    xf
0-1.9     1          1    1
2-3.9     3          4    12
4-5.9     5          6    30
6-7.9     7          2    14
8-9.9     9          1    9
total                14   66

Mean $= \frac{\sum xf}{n} = 66/14 \approx 4.71$

The class with the highest frequency is called the modal class. In the table above, the modal class is 4-5.9, with a frequency of 6.

We estimate the median as the midpoint of the class it lies in. With 14 values, the median falls between the 7th and 8th values. The first two classes hold only 5 values and the first three hold 11, so the median lies in the class 4-5.9, and we estimate the median as 5.

Finally, there is the weighted mean, $\frac{\sum xw}{\sum w}$. Example: a course grade built from classwork/homework, quiz, and test scores.

x       weight   xw
86      .5       43
90      .25      22.5
76      .25      19
total            84.5

The weights sum to 1, so the weighted mean is 84.5.

Measures of Variation
•The range is the largest value minus the smallest value.
•The Interquartile range is the Third Quartile minus the First Quartile.

The Variance is:

$\sigma^2 = \frac{\sum (x - \mu)^2}{n}$

Example data set one: 1, 3, 5, 7, 8, 9, 9, 11, 12, 12, 15
The mean is about 8.36.
The variance is
[(1-8.36)^2 + (3-8.36)^2 + (5-8.36)^2 + (7-8.36)^2 + (8-8.36)^2 + (9-8.36)^2 + (9-8.36)^2 + (11-8.36)^2 + (12-8.36)^2 + (12-8.36)^2 + (15-8.36)^2]/11 ≈ 15.87
The Standard Deviation is the square root of the Variance: $\sqrt{15.87} \approx 3.98$.

The standard deviation is easier to find using a calculator with the function built in.

Example data set one: 1, 3, 5, 7, 8, 9, 9, 11, 12, 12, 15
TI-83:
•Put the data in L1.
•Press Stat.
•Cursor right to choose Calc.
•Press Enter for one-variable stats.
The mean, standard deviation and several other measures will be displayed.

The standard deviation can also be calculated using a table.
x       f    xf   x^2 f
2       1    2    4
3       4    12   36
4       6    24   96
5       2    10   50
6       1    6    36
total   14   54   222

$\sigma^2 = \frac{\sum x^2 f}{n} - \left(\frac{\sum xf}{n}\right)^2 = \frac{222}{14} - \left(\frac{54}{14}\right)^2 \approx .9796$

$\sigma = \sqrt{.9796} \approx .9897$

However, the calculator is still probably easier:

x       f
2       1
3       4
4       6
5       2
6       1
total   14

TI-83:
•Enter values in L1.
•Enter frequencies in L2.
•One-variable stats L1, L2.
This will give you the standard deviation of the frequency table.

Empirical Rule
The Empirical Rule for Normal Distributions:
•About 68% of all values fall within 1 standard deviation of the mean.
•About 95% of all values fall within 2 standard deviations of the mean.
•About 99.7% of all values fall within 3 standard deviations of the mean.

Example: A normal dataset has a mean of 50 and a standard deviation of 5. Between what two numbers does 95% of the data fall?
(50 - 2*5, 50 + 2*5) = (40, 60)

Percentiles
To find the percentile of a value:
•Count the number of data points that lie below the value.
•Divide this by the total number of data points.
•Convert to a percent (multiply by 100).

Reading a percentile chart:
[Figure: percentile chart "Age of Executives"; x-axis: age (0 to 100), y-axis: percentile]

Z-scores
Z-score: The number of standard deviations a data point is from the mean.
•Find the raw distance from the mean.
•Divide by the standard deviation.

Example: A data set has a mean of 50 and a SD of 5. What is the z-score of 62?
Z = (x - mean)/SD = (62 - 50)/5 = 12/5 = 2.4
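
The frequency-table standard deviation, the Empirical Rule example, and the z-score example from this review can be checked with a short Python sketch. This is only an illustration, not part of the original slides: the function names (freq_mean_sd, empirical_interval, z_score) are hypothetical, and the standard deviation is the population version (dividing by n), which is what the table calculation in this review uses.

```python
from math import sqrt

# Frequency table from the review: value x -> frequency f
freq = {2: 1, 3: 4, 4: 6, 5: 2, 6: 1}


def freq_mean_sd(table):
    """Return (mean, population standard deviation) of a frequency table."""
    n = sum(table.values())                             # total count, sum of f
    sum_xf = sum(x * f for x, f in table.items())       # sum of x*f
    sum_x2f = sum(x * x * f for x, f in table.items())  # sum of x^2*f
    mean = sum_xf / n
    variance = sum_x2f / n - mean ** 2                  # divide by n (population)
    return mean, sqrt(variance)


def empirical_interval(mean, sd, k):
    """Interval within k standard deviations of the mean."""
    return (mean - k * sd, mean + k * sd)


def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


mean, sd = freq_mean_sd(freq)
print(round(mean, 2), round(sd, 4))   # 3.86 0.9897
print(empirical_interval(50, 5, 2))   # (40, 60)
print(z_score(62, 50, 5))             # 2.4
```

A calculator or library may report the sample standard deviation (dividing by n - 1) alongside the population one, so a slightly larger value than .9897 for the same table is usually the sample version.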