Describing Distributions Numerically (continued)
Suppose a random sample of 25 grades from the first Math 2200 test were selected and put in order.
42, 55, 62, 67, 68, 70, 72, 72, 75, 76, 78, 78, 79, 80, 80, 83, 83, 85, 88, 88, 90, 91, 92, 94, 99
Another Measure of Spread
Inter-Quartile Range: IQR
-quartiles split the ordered data into four ‘equal’ parts, so only 3 quartiles exist
Q1 = lower quartile, Q2 = middle quartile (the median), Q3 = upper quartile
-note that different technologies use different methods for finding quartiles, so their results may differ slightly
IQR = upper quartile – lower quartile
-not affected by outliers
-good for skewed data: the quartiles are found using medians, not the mean
Q1 = _________
Q2 = __________
Q3 = __________
IQR = _______________
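As a rough check on the blanks above, the quartiles can be computed in Python. This is only a sketch: the helper `quartiles_by_halves` is a hypothetical name, and it assumes the "medians of the halves" method with the overall median left out of both halves when the count is odd. The second calculation uses the standard library's `statistics.quantiles` with `method="inclusive"` to show how a different method can give a different Q1.

```python
from statistics import median, quantiles

# The 25 ordered grades from the Math 2200 sample above
grades = [42, 55, 62, 67, 68, 70, 72, 72, 75, 76, 78, 78, 79,
          80, 80, 83, 83, 85, 88, 88, 90, 91, 92, 94, 99]


def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3) as medians of the lower half, all data, upper half.

    Assumption: for an odd count, the overall median belongs to neither half.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Skip the middle value when n is odd; otherwise start at the midpoint
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), median(data), median(upper)


q1, q2, q3 = quartiles_by_halves(grades)
iqr = q3 - q1
print("halves method:   ", q1, q2, q3, "IQR =", iqr)

# A different (interpolating) method, as some software uses
q1_inc, q2_inc, q3_inc = quantiles(grades, n=4, method="inclusive")
print("inclusive method:", q1_inc, q2_inc, q3_inc, "IQR =", q3_inc - q1_inc)
```

Under these assumptions the halves method gives Q1 = 71, Q2 = 79, Q3 = 88 and IQR = 17, while the inclusive method gives Q1 = 72 and IQR = 16. Use whichever method your course or calculator specifies.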
Summary: When describing a distribution, one should (if possible) report
(1) the measure of center used and its value
(2) the measure of spread used and its value
(3) give a graph with explanation of general shape, modes, gaps, outliers, etc.
Selecting a Measure of Center and Measure of Spread for a set of data
The mean and standard deviation are the most popular measures of center and spread. They work
well for symmetric data without outliers. If the data have outliers, you can report the
mean and standard deviation with the outliers included and again with the outliers excluded.
The median and IQR may be used for skewed data or for data with outliers.
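The effect of an outlier on each measure can be seen with the grade data. The sketch below assumes the lowest grade, 42, is the outlier (the 1.5(IQR) fence rule in these notes flags it under the common quartile methods) and compares the measures with and without it. The variable names are illustrative only.

```python
from statistics import mean, median, stdev

grades = [42, 55, 62, 67, 68, 70, 72, 72, 75, 76, 78, 78, 79,
          80, 80, 83, 83, 85, 88, 88, 90, 91, 92, 94, 99]

# Assumption: 42 is treated as the only outlier
without_outlier = [g for g in grades if g != 42]

mean_all, mean_trim = mean(grades), mean(without_outlier)
sd_all, sd_trim = stdev(grades), stdev(without_outlier)      # sample standard deviations
median_all, median_trim = median(grades), median(without_outlier)

print(f"mean:   {mean_all:.2f} with outlier, {mean_trim:.2f} without")
print(f"stdev:  {sd_all:.2f} with outlier, {sd_trim:.2f} without")
print(f"median: {median_all} with outlier, {median_trim} without")
```

The mean shifts by about 1.5 points and the standard deviation drops noticeably when 42 is removed, while the median moves only from 79 to 79.5. That is the sense in which the median is resistant to outliers.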
Another Method for Describing a Distribution: Five number summary and box plot.
-gives a measure of center, spread and an idea of the shape of the distribution
-can be used to identify outliers
-can be used to quickly compare multiple distributions
Five Number Summary: 5 numbers describing the distribution:
Minimum (smallest) value, Q1, Q2 (median), Q3, and Maximum (largest) value
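A five number summary for the grade data might be computed as follows. This is a sketch: `five_number_summary` is a hypothetical helper, and it relies on the default (exclusive) method of Python's `statistics.quantiles`, which is only one of several quartile conventions.

```python
from statistics import quantiles

grades = [42, 55, 62, 67, 68, 70, 72, 72, 75, 76, 78, 78, 79,
          80, 80, 83, 83, 85, 88, 88, 90, 91, 92, 94, 99]


def five_number_summary(values):
    """Return (minimum, Q1, Q2, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    # Assumption: default "exclusive" quartile method; other tools may differ
    q1, q2, q3 = quantiles(data, n=4)
    return data[0], q1, q2, q3, data[-1]


summary = five_number_summary(grades)
print(summary)
```

For these grades the result is minimum 42, Q1 71, Q2 79, Q3 88, maximum 99.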
Box Plot: Graph describing a distribution using the five number summary.
-the graph consists of a box drawn from Q1 to Q3 with a line at Q2, and ‘whiskers’ extending to the
maximum and minimum values, unless outliers are present (then each whisker stops at the most extreme value that is not an outlier)
Identifying Outliers:
‘Fences’ help identify outliers. The fences are the two values Q1 - 1.5(IQR) (the lower fence) and
Q3 + 1.5(IQR) (the upper fence). Values below the lower fence (more than 1.5(IQR) below the 1st quartile)
and values above the upper fence (more than 1.5(IQR) above the 3rd quartile) may be considered outliers.
On a box plot, outliers should be denoted with a specified symbol.
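The fence rule can be applied to the grade data as a quick check. The sketch below assumes the default quartile method of Python's `statistics.quantiles`. A different quartile method would shift the fences slightly.

```python
from statistics import quantiles

grades = [42, 55, 62, 67, 68, 70, 72, 72, 75, 76, 78, 78, 79,
          80, 80, 83, 83, 85, 88, 88, 90, 91, 92, 94, 99]

# Assumption: default "exclusive" quartile method
q1, _, q3 = quantiles(grades, n=4)
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Values strictly outside the fences may be considered outliers
outliers = [g for g in grades if g < lower_fence or g > upper_fence]

print("fences:", lower_fence, upper_fence)
print("possible outliers:", outliers)
```

With Q1 = 71 and Q3 = 88, the fences are 45.5 and 113.5, so the only grade flagged is 42.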
Extra Optional Information on ‘Fences’
Values more than 3 IQRs below the 1st quartile and values more than 3 IQRs above the
3rd quartile may be considered ‘extreme outliers’, and may be denoted with a different
symbol than other ‘non-extreme’ outliers.
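One way to separate ordinary outliers from extreme ones is to measure how far a value lies outside the box in units of IQR. The function `classify` below is a hypothetical helper written for illustration, and the grade of 15 in the usage lines is a made-up value, not part of the sample.

```python
from statistics import quantiles

grades = [42, 55, 62, 67, 68, 70, 72, 72, 75, 76, 78, 78, 79,
          80, 80, 83, 83, 85, 88, 88, 90, 91, 92, 94, 99]

# Assumption: default "exclusive" quartile method
q1, _, q3 = quantiles(grades, n=4)


def classify(value, q1, q3):
    """Label a value using the 1.5(IQR) and 3(IQR) rules."""
    iqr = q3 - q1
    # Distance outside the box; negative or zero when the value is inside it
    distance = max(q1 - value, value - q3)
    if distance > 3 * iqr:
        return "extreme outlier"
    if distance > 1.5 * iqr:
        return "outlier"
    return "not an outlier"


print(42, classify(42, q1, q3))
print(99, classify(99, q1, q3))
print(15, classify(15, q1, q3))  # 15 is a hypothetical grade, not in the sample
```

In this sample 42 is an ordinary outlier and no grade is extreme. A value would have to fall more than 51 points below Q1 or above Q3 to count as extreme.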
Additional Information:
(1) Percentiles: divide the distribution into 100 equal pieces.
Examples:
95th percentile is the value separating the lower 95% of a distribution from the upper 5%.
25th percentile is the value separating the lower 25% of a distribution from the upper 75%.
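Percentiles for the grade data can be estimated with `statistics.quantiles` by asking for 100 pieces. This is a sketch under the default (exclusive) interpolation method. With only 25 values the percentiles are rough estimates, and other software may report slightly different numbers.

```python
from statistics import quantiles

grades = [42, 55, 62, 67, 68, 70, 72, 72, 75, 76, 78, 78, 79,
          80, 80, 83, 83, 85, 88, 88, 90, 91, 92, 94, 99]

# n=100 returns the 99 cut points between the 100 pieces
cuts = quantiles(grades, n=100)

p25 = cuts[24]   # 25th percentile (index is percentile - 1)
p95 = cuts[94]   # 95th percentile

print("25th percentile:", p25)
print("95th percentile:", p95)
```

Under this method the 25th percentile is 71, the same as Q1, and the 95th percentile is 97.5.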
(2) Statistics estimate parameters
When utilizing a sample, we calculate statistics.
Examples:
When utilizing a population, we calculate parameters.
Examples: