Describing Distributions Numerically (continued)

Suppose a random sample of 25 grades from the first Math 2200 test was selected and put in order:

42, 55, 62, 67, 68, 70, 72, 72, 75, 76, 78, 78, 79, 80, 80, 83, 83, 85, 88, 88, 90, 91, 92, 94, 99

Another Measure of Spread

Inter-Quartile Range: IQR
-quartiles split the data into four ‘equal’ parts, so only 3 quartiles exist
   Q1 = lower quartile
   Q2 = middle quartile (the median)
   Q3 = upper quartile
-note that different technologies use different methods for finding quartiles

IQR = upper quartile – lower quartile
-not affected by outliers
-good for skewed data: we use the median, not the mean, when finding IQR

Q1 = _________ Q2 = __________ Q3 = __________ IQR = _______________

Summary: When describing a distribution, one should (if possible) report
(1) the measure of center used and its value
(2) the measure of spread used and its value
(3) a graph with an explanation of general shape, modes, gaps, outliers, etc.

Selecting a Measure of Center and Measure of Spread for a set of data

Mean and standard deviation are the most popular measures of center and spread. They work well for symmetrical data and data without outliers. If the data has outliers, you can report the mean and standard deviation with the outliers included and again without the outliers included.

Median and IQR may be used for skewed data or for data with outliers.

Another Method for Describing a Distribution: Five number summary and box plot
-gives a measure of center, a measure of spread, and an idea of the shape of the distribution
-can be used to identify outliers
-can be used to quickly compare multiple distributions

Five Number Summary: 5 numbers describing the distribution. They are: Minimum (smallest) value, Q1, Q2, Q3, & Maximum (largest) value

Box Plot: Graph describing a distribution using the five number summary.
-the graph consists of a box marking Q1, Q2, & Q3, and ‘whiskers’ showing the maximum and minimum values, unless outliers are present
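The quartiles, IQR, and five number summary for the 25 grades above can be checked with a short Python sketch. Since different technologies use different quartile methods, this sketch assumes one common convention: Q1 and Q3 are the medians of the lower and upper halves of the sorted data, with the overall median left out of both halves when the sample size is odd. The helper name `quartiles` is made up for this illustration and is not part of any library.

```python
from statistics import median

# The 25 ordered grades from the Math 2200 sample.
grades = [42, 55, 62, 67, 68, 70, 72, 72, 75, 76, 78, 78, 79,
          80, 80, 83, 83, 85, 88, 88, 90, 91, 92, 94, 99]


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention.

    Assumption: when the number of values is odd, the median itself
    is excluded from both halves. Other tools may do this differently.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), median(data), median(upper)


q1, q2, q3 = quartiles(grades)
iqr = q3 - q1
five_number_summary = (min(grades), q1, q2, q3, max(grades))

print("Q1 =", q1, " Q2 =", q2, " Q3 =", q3)
print("IQR =", iqr)
print("Five number summary:", five_number_summary)
```

Under this convention the sample gives Q1 = 71, Q2 = 79, Q3 = 88, and IQR = 17. A tool that includes the median in both halves would report Q1 = 72 instead, which is why a calculator or software package may not match a hand calculation exactly.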
Identifying Outliers: ‘Fences’ help identify outliers.
‘Fences’ are the values calculated from Q1 - 1.5(IQR) & Q3 + 1.5(IQR).
Values more than 1.5(IQR) below the 1st quartile and values more than 1.5(IQR) above the 3rd quartile may be considered outliers. Outliers should be denoted with a specified symbol.

Extra Optional Information on ‘Fences’
Values more than 3 IQRs below the 1st quartile and values more than 3 IQRs above the 3rd quartile may be considered ‘extreme outliers’, and may be denoted with a different symbol than other ‘non-extreme’ outliers.

Additional Information:

(1) Percentiles: divide the distribution up into 100 equal pieces.
Examples:
-the 95th percentile is the value separating the lower 95% of a distribution from the upper 5%
-the 25th percentile is the value separating the lower 25% of a distribution from the upper 75%

(2) Statistics estimate parameters
When utilizing a sample, we calculate statistics. Examples:
When utilizing a population, we calculate parameters. Examples:
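The fence rule can be applied to the 25 Math 2200 grades with a short Python sketch. It assumes the median-of-halves quartile convention (overall median excluded from both halves when the sample size is odd); other quartile methods would shift the fences slightly. The helper name `quartiles` and the variable names are made up for this illustration.

```python
from statistics import median

# The 25 ordered grades from the Math 2200 sample.
grades = [42, 55, 62, 67, 68, 70, 72, 72, 75, 76, 78, 78, 79,
          80, 80, 83, 83, 85, 88, 88, 90, 91, 92, 94, 99]


def quartiles(values):
    """Return (Q1, Q2, Q3); assumes the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), median(data), median(upper)


q1, _, q3 = quartiles(grades)
iqr = q3 - q1

# Fences for ordinary outliers: 1.5 IQRs beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Optional fences for 'extreme outliers': 3 IQRs beyond the quartiles.
lower_extreme = q1 - 3 * iqr
upper_extreme = q3 + 3 * iqr

outliers = [g for g in grades if g < lower_fence or g > upper_fence]
extreme_outliers = [g for g in grades if g < lower_extreme or g > upper_extreme]

# When outliers are present, the whiskers stop at the most extreme
# values that are still inside the fences.
inside = [g for g in grades if lower_fence <= g <= upper_fence]
whiskers = (min(inside), max(inside))

print("Fences:", lower_fence, upper_fence)
print("Outliers:", outliers)
print("Extreme outliers:", extreme_outliers)
print("Whisker ends:", whiskers)
```

With Q1 = 71 and Q3 = 88 the fences are 45.5 and 113.5, so the grade 42 is flagged as an outlier and the lower whisker stops at 55. The 3 IQR fences are 20 and 139, so no grade counts as an extreme outlier.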