Center
• Midrange: the average of the maximum and minimum values. Sensitive to outliers.
• Median: the middle value, which divides the data (the area of the histogram) into 2 equal halves. It resists outliers, but it varies more from sample to sample than the mean, so it is a less efficient measure.
• Mean: the sum of all the data values divided by the count, $\bar{y} = \frac{\sum y}{n}$. The point at which the histogram would balance. Outliers and skewness “pull” the mean in their direction.

Spread
• Range: the difference between the maximum and minimum values.
• Interquartile range (IQR): when you divide the data into quartiles, the difference between the third and first quartiles, IQR = Q3 - Q1, where Q1 is the 25th percentile and Q3 is the 75th percentile.
• 5-Number Summary: reports the minimum, Q1, median, Q3, and maximum values. Used to construct a boxplot.
• Constructing a boxplot:
– 1) Mark Q1, Q3, and the median to create your box.
– 2) Erect “fences” at 1.5 IQRs and at 3 IQRs below Q1 and above Q3. Values beyond the fences are outliers.
– 3) Add whiskers out to the most extreme values that lie within the fences.
– 4) Mark any outliers with special symbols.

County Size of New Jersey

• Interpretation of boxplots
– Median centered in the box: the middle half of the data is symmetric.
• If not, the middle half is skewed toward the side of the box that lies farther from the median.
– Whiskers the same length: the tails are symmetric.
• If not, skewed toward the side with the longer whisker.

Histogram: Aaron vs. Bonds
Barry Bonds: 16 25 24 19 33 28 34 46 37 33 42 40 37 34 49 73 46 45 45 5 26 28
Hank Aaron: 13 27 26 44 30 39 40 34 45 44 24 32 44 39 29 44 38 47 34 40 20 12 10

Comparative B & W (box-and-whisker plots)

• Deviation: how far each value is from the mean, $y - \bar{y}$.
• Variance: the sum of the squared deviations divided by $n - 1$, which is nearly their average: $s^2 = \frac{\sum (y - \bar{y})^2}{n - 1}$
• Standard Deviation: the square root of the variance: $s = \sqrt{\frac{\sum (y - \bar{y})^2}{n - 1}}$

Counting Chips

• Rules for center/shape/spread:
– Skewed data: report the median and IQR.
– Symmetric data: report the mean and standard deviation.
– If there are outliers: report the mean and standard deviation with and without the outliers.

MC ???
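The measures of center and spread above can be checked with a short computation. The following is a minimal sketch, not part of the original slides. It assumes the flattened Aaron vs. Bonds numbers alternate in groups of five (Bonds first, then Aaron), which gives 23 Hank Aaron values. The function names are hypothetical, and the quartiles come from Python's `statistics.quantiles` with its default method; other quartile conventions can give slightly different values for Q1 and Q3.

```python
import math
import statistics

# Hank Aaron values from the "Aaron vs. Bonds" slide. ASSUMPTION: the
# flattened numbers alternate in groups of five (Bonds, then Aaron).
aaron = [13, 27, 26, 44, 30, 39, 40, 34, 45, 44, 24, 32,
         44, 39, 29, 44, 38, 47, 34, 40, 20, 12, 10]


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    # statistics.quantiles uses the "exclusive" method by default;
    # other quartile conventions can give slightly different Q1 and Q3.
    q1, median, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, median, q3, max(values)


def fences(q1, q3):
    """Return the inner (1.5 IQR) and outer (3 IQR) fence pairs."""
    iqr = q3 - q1
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    outer = (q1 - 3 * iqr, q3 + 3 * iqr)
    return inner, outer


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((y - mean) ** 2 for y in values) / (n - 1)


low, q1, median, q3, high = five_number_summary(aaron)
inner, outer = fences(q1, q3)

# Values beyond the inner fences are flagged as outliers.
outliers = [y for y in aaron if y < inner[0] or y > inner[1]]

midrange = (low + high) / 2
mean = sum(aaron) / len(aaron)
variance = sample_variance(aaron)
std_dev = math.sqrt(variance)

print("5-number summary:", (low, q1, median, q3, high))
print("range:", high - low, "IQR:", q3 - q1)
print("inner fences:", inner, "outer fences:", outer)
print("outliers:", outliers)
print("midrange:", midrange, "mean:", round(mean, 2))
print("variance:", round(variance, 2), "std dev:", round(std_dev, 2))
```

Under these assumptions the 5-number summary is 10, 26, 34, 44, 47, so the IQR is 18, the inner fences sit at -1 and 71, and no value is flagged as an outlier. The mean (about 32.8) falls below the median (34), which agrees with the longer lower whisker: the data are skewed toward the low side, so the rules above would suggest reporting the median and IQR.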