Chapter 12
Describing Distributions with
Numbers
Median
• Example 1 data: 2 4 6
Median (M) = 4
• Example 2 data: 2 4 6 8
Median = 5 (avg. of 4 and 6)
• Example 3 data: 6 2 4
Median ≠ 2
(order the values: 2 4 6, so Median = 4)
Example: Finding the median
• Consider the following set of data:
16 19 24 25 25 33 33 34 34 37 37 40 42 45 45 46 46 49 73
• There are 19 observations, so the median
is the one in the middle, the 10th one. Hence
the median of these data is 37.
Example: Finding the median
• Now consider the following data where there
is an even number of observations:
3 9 9 22 29 32 32 33 39 39 42 49 52 58 65 70
• The ones in the middle are 33 and 39, and
hence the median is the average of these
values, 36.
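The odd and even cases above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original slides; the function name `median_of` and the variable names are assumptions made here.

```python
def median_of(values):
    """Return the median: the middle value of the ordered data,
    or the average of the two middle values when the count is even."""
    ordered = sorted(values)      # the data must be put in order first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                # odd count: a single middle value
        return ordered[mid]
    # even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


# The two data sets used in the examples above.
odd_data = [16, 19, 24, 25, 25, 33, 33, 34, 34, 37,
            37, 40, 42, 45, 45, 46, 46, 49, 73]
even_data = [3, 9, 9, 22, 29, 32, 32, 33,
             39, 39, 42, 49, 52, 58, 65, 70]

print(median_of(odd_data))    # 37
print(median_of(even_data))   # 36.0
print(median_of([6, 2, 4]))   # 4 (unordered input is sorted first)
```

Sorting inside the function guards against the mistake shown in Example 3, where the middle value of the unordered list is taken by accident.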
Quartiles
• Three numbers that divide the ordered data
into four equal-sized groups.
• Q1 has 25% of the data below it.
• Q2 has 50% of the data below it. (Median)
• Q3 has 75% of the data below it.
Example: Finding the quartiles
• Consider the following set of data:
16 19 24 25 25 33 33 34 34 37 37 40 42 45 45 46 46 49 73
• We already know that the median is 37. To find the first quartile we
consider the first half of this data, namely,
16 19 24 25 25 33 33 34 34
• There are 9 observations here, so the median is the 5th one, namely, 25.
This is the first quartile.
• To find the third quartile we consider the second half of the original data,
that is,
37 40 42 45 45 46 46 49 73
• There are again 9 observations and the median is the one in the middle,
which is 45. This is the third quartile.
Example: Finding the quartiles
• Next, consider the following set of data:
3 9 9 22 29 32 32 33 39 39 42 49 52 58 65 70
• We already know that the median is 36. To find the first quartile we
consider the first half of this data, namely,
3 9 9 22 29 32 32 33
• There are 8 observations here, so the median is the average of the 4th and
5th ones, namely, 25.5. This is the first quartile.
• To find the third quartile we consider the second half of the original data,
that is,
39 39 42 49 52 58 65 70
• There are again 8 observations and the median is again the average of the 4th
and 5th observations, that is, 50.5. This is the third quartile.
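The "median of each half" procedure used in the two quartile examples can be sketched as follows. The function name `quartiles_by_halves` is an assumption made for this illustration. When the number of observations is odd, the overall median is left out of both halves, as in the first example. Note that statistical software often uses other quartile conventions, so library results may differ slightly from this method.

```python
from statistics import median


def quartiles_by_halves(values):
    """Return (Q1, median, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                 # observations below the median position
    if n % 2 == 1:
        upper = ordered[half + 1:]         # odd n: skip the median itself
    else:
        upper = ordered[half:]             # even n: the upper half starts at `half`
    return median(lower), median(ordered), median(upper)


odd_data = [16, 19, 24, 25, 25, 33, 33, 34, 34, 37,
            37, 40, 42, 45, 45, 46, 46, 49, 73]
even_data = [3, 9, 9, 22, 29, 32, 32, 33,
             39, 39, 42, 49, 52, 58, 65, 70]

print(quartiles_by_halves(odd_data))    # (25, 37, 45)
print(quartiles_by_halves(even_data))   # (25.5, 36.0, 50.5)
```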
Example: Five-number summary and boxplot
• Here are Roger Maris’ home run counts for his 12 years in the Major
Leagues, arranged in order:
5 8 9 13 14 16 23 26 28 33 39 61
• Since there is an even number of observations, the median is the
average of the two middle values, that is, (16+23)/2 = 19.5
• The first quartile is the median of
5 8 9 13 14 16
that is (9+13)/2 = 11
• The third quartile is the median of the second half,
23 26 28 33 39 61
that is (28+33)/2 = 30.5
• The minimum and maximum values are obviously 5 and 61, respectively.
• Hence a five-number summary of these data is
5 11 19.5 30.5 61
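The five-number summary above can be reproduced with a short Python sketch. The function name `five_number_summary` is an assumption made here, and the quartiles follow the same median-of-halves method as the worked example.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For an odd count, the median itself is excluded from both halves.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return (ordered[0], median(lower), median(ordered),
            median(upper), ordered[-1])


# Roger Maris' home run counts from the example above.
maris = [5, 8, 9, 13, 14, 16, 23, 26, 28, 33, 39, 61]

print(five_number_summary(maris))   # (5, 11.0, 19.5, 30.5, 61)
```

These five values are the ones a boxplot draws: the box spans Q1 to Q3 with a line at the median, and the whiskers extend to the minimum and maximum.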
Example: Five-number summary and boxplot
• How does the boxplot of this distribution compare with those of the data
corresponding to Barry Bonds and Mark McGwire given in the following figure:
Example: Education and Income
• Interpret the following figure which contains the boxplots of income among adults
with different levels of education.
Average or Mean
$$ \bar{x} = \frac{1}{n}\left(x_1 + x_2 + \cdots + x_n\right) = \frac{1}{n}\sum_{i=1}^{n} x_i $$
Variance
$$ s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 $$
Comparing the Mean & Median
• The mean and median of data from a
symmetric distribution should be close
together. The actual (true) mean and
median of a symmetric distribution are
exactly the same.
• In a skewed distribution, the mean is
farther out in the long tail than is the
median [the mean is “pulled” in the
direction of the possible outlier(s)].
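A small Python sketch can illustrate the "pull" on the mean, using Roger Maris' home run counts from the earlier five-number-summary example. The modified list, in which the largest value 61 is replaced by 100, is hypothetical and serves only to show the effect of a more extreme observation.

```python
from statistics import mean, median

# Roger Maris' home run counts (from the earlier example).
maris = [5, 8, 9, 13, 14, 16, 23, 26, 28, 33, 39, 61]

maris_mean = mean(maris)       # 275 / 12
maris_median = median(maris)   # average of 16 and 23

print(round(maris_mean, 2), maris_median)   # 22.92 19.5

# Hypothetical data: push the largest value further into the right tail.
stretched = maris[:-1] + [100]

stretched_mean = mean(stretched)       # 314 / 12
stretched_median = median(stretched)   # still the average of 16 and 23

print(round(stretched_mean, 2), stretched_median)   # 26.17 19.5
```

The mean lies above the median in both lists, and only the mean moves when the extreme value changes. This is the sense in which the median resists outliers and the mean does not.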
Example: Computing the mean and standard deviation
• Consider the following data
16 25 24 19 33 25 34 46 37 33 42 40 37 34 49 73 46 45 45
• The mean is
$$ \bar{x} = \frac{16 + 25 + \cdots + 45}{19} = \frac{703}{19} = 37 $$
Example: Computing the mean and standard deviation
• The previous figure shows Barry Bonds’s home run counts,
with their mean and distance of one observation from the
mean indicated.
• The idea behind the standard deviation is to average these 19
distances. To find the standard deviation we can use the
following table
Example: Computing the mean and standard deviation
The average of these squared distances (the variance) is:
$$ s^2 = \frac{3036}{18} \approx 168.67 $$
Notice that we “average” by dividing by one less than
the number of observations.
Finally, the standard deviation is computed by taking
the square root of the variance:
$$ s = \sqrt{168.67} \approx 12.99 $$
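The whole calculation (mean, squared distances, variance, standard deviation) can be checked with a short Python sketch applied to the 19 observations listed in this example. The variable names are assumptions made here; the steps follow the formulas for the mean and variance given earlier.

```python
from math import sqrt
from statistics import stdev

# The 19 observations from the example.
data = [16, 25, 24, 19, 33, 25, 34, 46, 37, 33,
        42, 40, 37, 34, 49, 73, 46, 45, 45]

n = len(data)
x_bar = sum(data) / n                            # mean: 703 / 19

# Squared distance of each observation from the mean.
squared_devs = [(x - x_bar) ** 2 for x in data]

variance = sum(squared_devs) / (n - 1)           # divide by n - 1, not n
std_dev = sqrt(variance)

print(x_bar)                # 37.0
print(round(variance, 2))   # 168.67
print(round(std_dev, 2))    # 12.99

# The standard library's sample standard deviation uses the same n - 1 divisor.
print(abs(std_dev - stdev(data)) < 1e-9)   # True
```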
Key Concepts
• Numerical Summaries
– Center (mean, median)
– Spread (variance, standard deviation, quartiles)
– Five-number summary & Boxplots
• Choosing mean versus median
• Choosing standard deviation versus five-number summary
Exercise 12.7
• College tuitions. The following figure is a
stemplot of the tuition charged by 121
colleges in Illinois. The stems are thousands of
dollars and the leaves are hundreds of dollars.
• Find the five-number summary of Illinois
college tuitions.
• Would the mean tuition be clearly smaller
than the median, about the same, or larger?
Figure 11.7 Stemplot of the Illinois tuition and fee data. (Data from the Web site
www.collegeillinois.com/en/collegefunding/costs.htm. This figure was created using
the Minitab software package.)
Exercise 12.11
• The richest 1%. The distribution of individual
incomes in the U.S. is strongly skewed to the
right. In 2004, the mean and median incomes
of the top 1% of Americans were $315,000
and $1,259,000.
• Which of these numbers is the mean?
Exercise 12.31
• Raising Pay. A school system employs teachers at salaries
between $30,000 and $60,000. The teachers’ union and the
school board are negotiating the form of next year’s increase
in the salary schedule. Suppose that every teacher is given a
flat $1,000 raise.
• How much will the mean salary increase? The median salary?
• Will a flat $1,000 raise increase the spread as measured by
the distance between the quartiles?
• Will a flat $1,000 raise increase the spread as measured by
the standard deviation of the salaries?