Chapter 3
Descriptive Measures
Copyright © 2008 Pearson Education, Inc.
Slide 3-1
Breaking News
Kobe Sucks.
Slide 3-2
MEASURES OF CENTER
MEAN, MEDIAN, MODE
Slide 3-3
Definition 3.1
Mean of a Data Set
The mean of a data set is the sum of the observations
divided by the number of observations.
Slide 3-4
Mean (Average): Example
NBA Scoring Leaders (2008-2009), points per game
1. Dwyane Wade: 30.2
2. LeBron James: 28.4
3. Kobe Bryant: 26.8
4. Dirk Nowitzki: 25.9
5. Danny Granger: 25.8
Slide 3-5
Mean (Average): Example Continued
Mean is calculated as follows:
(30.2 + 28.4 + 26.8 + 25.9 + 25.8)/5 = 137.1/5 = 27.42
Slide 3-6
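A minimal Python sketch of Definition 3.1, applied to the five scoring averages listed above. The function name `mean` and the variable `points_per_game` are illustrative choices, not part of the slides.

```python
def mean(observations):
    """Sum of the observations divided by the number of observations."""
    if not observations:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(observations) / len(observations)


# Points per game for the five 2008-2009 scoring leaders.
points_per_game = [30.2, 28.4, 26.8, 25.9, 25.8]
print(round(mean(points_per_game), 2))  # 27.42
```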
Definition 3.2
Median of a Data Set
Arrange the data in increasing order.
• If the number of observations is odd, then the median is
the observation exactly in the middle of the ordered list.
• If the number of observations is even, then the median is
the mean of the two middle observations in the ordered list.
In both cases, if we let n denote the number of observations,
then the median is at position (n + 1) / 2 in the ordered list.
Slide 3-7
Median: Example
Yearly Salaries (Thousands)
75
85
90
100
Median: (85 + 90)/2 = 87.5
Slide 3-8
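Definition 3.2 can be sketched in Python as follows. The function name `median` is an illustrative choice; the odd and even cases follow the two bullets of the definition.

```python
def median(observations):
    """Middle value of the ordered data, or the mean of the two middle values."""
    ordered = sorted(observations)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd number of observations: the one exactly in the middle.
        return ordered[middle]
    # Even number of observations: mean of the two middle ones.
    return (ordered[middle - 1] + ordered[middle]) / 2


# Yearly salaries (in thousands) from the example above.
salaries = [75, 85, 90, 100]
print(median(salaries))  # 87.5
```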
Definition 3.3
Mode of a Data Set
Find the frequency of each value in the data set.
• If no value occurs more than once, then the data set has
no mode.
• Otherwise, any value that occurs with the greatest
frequency is a mode of the data set.
Slide 3-9
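A short sketch of Definition 3.3, assuming the observations are hashable values (numbers or labels). The function name `modes` and the sample data in the usage note are hypothetical; the function returns an empty list when no value occurs more than once.

```python
from collections import Counter


def modes(observations):
    """Every value that occurs with the greatest frequency, or [] if none repeats."""
    counts = Counter(observations)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # No value occurs more than once: the data set has no mode.
        return []
    return [value for value, count in counts.items() if count == highest]
```

For example, `modes([1, 2, 2, 3, 3])` gives `[2, 3]` (two modes), while `modes([1, 2, 3])` gives `[]`.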
Example 3.1
Ala spent one summer cutting grass for his mommy-duck.
Ala’s mom also hired some of his friends to help out. The
tables list typical weekly earnings for the two halves of
the summer. 13 worked during the first half, while 10
worked during the second half.
Table 3.1: Data Set I
Table 3.2: Data Set II
Slide 3-10
Solution Example 3.1
Table 3.4
Interpretation: The employees who worked in the
first half of the summer earned more, on average (a
mean salary of $483.85), than those who worked in
the second half (a mean salary of $474.00).
Slide 3-11
This figure shows the relative positions of the mean and
median for right-skewed, symmetric, and left-skewed
distributions. Note that the mean is pulled in the
direction of skewness, that is, in the direction of the
extreme observations. For a right-skewed distribution,
the mean is greater than the median; for a symmetric
distribution, the mean and the median are equal; and,
for a left-skewed distribution, the mean is less than the
median.
Figure 3.1
Slide 3-12
Examples: Measures of Center

Lizzie takes 4 exams in complex analysis. Her
grades are 80, 85, 90, 95.

Ala takes 4 exams in abstract algebra. His
grades are 0, 90, 92, 94.

Ten math Ph.D. students took oral exams in
measure theory (Pass/Fail).
Slide 3-13
Remarks: Measures of Center

Mean is sensitive to outliers (unusual values).

When outliers present, median often better
measure of center (Ala’s abstract algebra).

Mode is the only measure of center that can be used for qualitative data.
Slide 3-14
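To see the effect of an outlier, the two sets of exam grades above can be compared with Python's standard `statistics` module. This is only an illustration of the remark; the variable names are arbitrary.

```python
from statistics import mean, median

# Exam grades copied from the examples above.
lizzie = [80, 85, 90, 95]
ala = [0, 90, 92, 94]

for name, grades in (("Lizzie", lizzie), ("Ala", ala)):
    print(name, "mean:", mean(grades), "median:", median(grades))
```

Lizzie's mean and median are both 87.5. Ala's mean is 69 but his median is 91: the single grade of 0 drags the mean down while barely affecting the median.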
Summation Notation: Example
Ala’s abstract algebra scores: 0, 90, 92, 94.
$x_1 = 0$
$x_2 = 90$
$x_3 = 92$
$x_4 = 94$
Slide 3-15
Summation Notation: Example (cont.)
$\sum_{i=1}^{4} x_i = x_1 + x_2 + x_3 + x_4 = 0 + 90 + 92 + 94 = 276$
Slide 3-16
Summation Notation: Example (cont.)
To calculate the mean, let n = 4 (sample size):
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{276}{4} = 69$
Slide 3-17
Definition 3.4
Slide 3-18
MEASURES OF VARIATION
Is there a quantitative measure for determining
the “spread” in our data?
Slide 3-19
The “data sets” have the same Mean, Median, and Mode
yet clearly differ!
Measures of Variation or Measures of Spread
Figure 3.3
Slide 3-20
Definition 3.5
Range of a Data Set
The range of a data set is given by the formula
Range = Max – Min,
where Max and Min denote the maximum and minimum
observations, respectively.
Slide 3-21
Measures of Variation or Measures of Spread:
The Range
Team I has range 6 inches, Team II has range 17 inches.
Figure 3.4
Slide 3-22
Range: Example
Scores: 0, 90, 92, 94
Range: 94 – 0 = 94
Remark: Range does not take into account the
intermediate values.
Slide 3-23
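A small sketch of Definition 3.5 and the remark that follows it. The function name `data_range` and the second list `other_scores` are hypothetical; the second list is made up only to show that very different intermediate values can give the same range.

```python
def data_range(observations):
    """Range = Max - Min."""
    return max(observations) - min(observations)


scores = [0, 90, 92, 94]        # scores from the example above
other_scores = [0, 1, 2, 94]    # hypothetical data with the same Max and Min

print(data_range(scores))        # 94
print(data_range(other_scores))  # 94
```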
Deviations: Example
Weight    Deviation
150       150 – 160 = -10
160       160 – 160 = 0
170       170 – 160 = 10
The sum of the deviations is always zero:
$\sum (x - \bar{x}) = (-10 + 0 + 10) = 0$
Slide 3-24
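One way to see why the deviations always cancel, using the summation notation from the earlier slides (here n is the number of observations and $\bar{x}$ is the mean):

```latex
\begin{align*}
\sum_{i=1}^{n} (x_i - \bar{x})
  &= \sum_{i=1}^{n} x_i - n\bar{x} \\
  &= n\bar{x} - n\bar{x}
     \qquad \text{since } \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \\
  &= 0
\end{align*}
```

This is why the raw deviations cannot be averaged to measure spread.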
So, whatcha gonna do?
Slide 3-25
Squared Deviation
Weight    Deviation            Squared Deviation
150       150 – 160 = -10      $(-10)^2 = 100$
160       160 – 160 = 0        $0^2 = 0$
170       170 – 160 = 10       $10^2 = 100$
100 + 0 + 100 = 200 (sum of squares)
Slide 3-26
Sample Variance
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
Slide 3-27
Sample Variance (Example)
$s^2 = \frac{100 + 0 + 100}{3 - 1} = 100$
Slide 3-28
But aren’t the units squared?
How can we recover the original
units?
Slide 3-29
Sample Standard Deviation (Example)
$s = \sqrt{100} = 10$
Slide 3-30
Sample Standard Deviation
Roughly speaking, the data points differ from
the mean by about 10 units on average.
Note that the standard deviation is
expressed in the same units as the
original data.
Slide 3-31
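The steps from the last few slides (deviations, squared deviations, dividing by n - 1, then taking the square root) can be sketched in Python. The function names `sample_variance` and `sample_std_dev` are illustrative choices.

```python
from math import sqrt


def sample_variance(observations):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(observations)
    if n < 2:
        raise ValueError("the sample variance needs at least two observations")
    x_bar = sum(observations) / n
    return sum((x - x_bar) ** 2 for x in observations) / (n - 1)


def sample_std_dev(observations):
    """Square root of the sample variance, in the units of the original data."""
    return sqrt(sample_variance(observations))


weights = [150, 160, 170]
print(sample_variance(weights))  # 100.0
print(sample_std_dev(weights))   # 10.0
```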
Definition 3.6
Slide 3-32
Formula 3.1
Slide 3-33
Standard Deviation: the more variation, the larger the
standard deviation. Data set II has greater variation.
Table 3.10
Table 3.11
Slide 3-34
Data set II has greater variation and the visual clearly
shows that it is more spread out.
Figure 3.6: Data Set I
Figure 3.7: Data Set II
Slide 3-35
Exercise
1. Which data set has the larger standard
deviation: A={1, 3, 5} or B={2, 3, 4}?
2. By hand, calculate the sample standard
deviation for {1, 3, 5}.
Slide 3-36
Answers
1. Set A since the data points have more
variation.
2. s = 2 (the squared deviations 4, 0, 4 sum to 8; 8/(3 – 1) = 4, and the square root of 4 is 2)
Slide 3-37
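The exercise can be checked with `statistics.stdev` from the Python standard library, which computes the sample standard deviation (dividing by n - 1). The variable names are arbitrary.

```python
from statistics import stdev

set_a = [1, 3, 5]
set_b = [2, 3, 4]

print(stdev(set_a))  # 2.0
print(stdev(set_b))  # 1.0
```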
1986 NBA Finals
Hakeem Olajuwon Point Totals: 33, 21, 23, 20, 32, 19
Larry Bird Point Totals: 20, 32, 25, 21, 17, 29
Olajuwon (Mean): 24.7
Bird (Mean): 24.0
Olajuwon (Standard Deviation): 5.67
Bird (Standard Deviation): 5.23
How can we interpret the means and standard deviations?
Slide 3-38
Solutions
Olajuwon had a slightly higher points per game average
(M = 24.7) than Larry Bird (M = 24.0).
Larry Bird was slightly more consistent in game-to-game scoring
(Standard Deviation = 5.23) than Hakeem Olajuwon (Standard
Deviation = 5.67).
Slide 3-39
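A quick check of these figures with Python's `statistics` module. One observation, inferred from the arithmetic rather than stated on the slides: the quoted standard deviations are close to what the population formula (dividing by n, `pstdev`) gives, while the sample formula used earlier in the chapter (dividing by n - 1, `stdev`) gives somewhat larger values. Either way, Bird's scoring comes out as the more consistent.

```python
from statistics import mean, pstdev, stdev

olajuwon = [33, 21, 23, 20, 32, 19]
bird = [20, 32, 25, 21, 17, 29]

for name, points in (("Olajuwon", olajuwon), ("Bird", bird)):
    print(
        name,
        "mean:", round(mean(points), 1),
        "divide by n:", round(pstdev(points), 2),
        "divide by n - 1:", round(stdev(points), 2),
    )
```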
Chebychev’s Theorem
For ANY distribution:
At least 75% of the data lie within 2 standard
deviations of the mean.
At least 89% of the data lie within 3 standard
deviations of the mean.
Slide 3-40
Chebychev: Formula
In general: At least $1 - \frac{1}{k^2}$ lie within k standard
deviations of the mean.
At least $1 - \frac{1}{2^2}$, or $1 - \frac{1}{4} = 75\%$, lie within k = 2
standard deviations of the mean.
At least $1 - \frac{1}{3^2}$, or $1 - \frac{1}{9} \approx 89\%$, lie within k = 3
standard deviations of the mean.
Slide 3-41
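The general formula is easy to evaluate for other values of k. A minimal sketch; the function name `chebychev_lower_bound` is hypothetical, and the check on k reflects the fact that the bound says nothing useful for k of 1 or less.

```python
def chebychev_lower_bound(k):
    """Minimum fraction of the data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2


for k in (2, 3, 4):
    print(k, round(100 * chebychev_lower_bound(k), 1), "%")
```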
Chebychev: Example
Suppose we collect data from 1,000 individuals and
record the number of (long) Costco hotdogs they
consume yearly. Also, suppose the mean
obtained is 30, with a sample standard deviation
of 4 hotdogs. What can you conclude from
Chebychev?
Slide 3-42
Slide 3-43
Chebychev: Solution
We may conclude:
1. At least 75% of the participants (or 750)
consume between 22 and 38 hotdogs yearly.
2. At least 88.9% of the participants (or 889)
consume between 18 and 42 hotdogs yearly.
Slide 3-44
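The intervals in this solution come from mean ± k standard deviations. A sketch under that reading; the function name `chebychev_summary` is hypothetical. Note that 1 - 1/9 is about 88.9%, so the count guaranteed for k = 3 works out to 889 of the 1,000 participants; the 89% figure is a rounding.

```python
from math import ceil


def chebychev_summary(mean, std_dev, k, n):
    """Interval within k standard deviations and the minimum count it must contain."""
    fraction = 1 - 1 / k**2
    low = mean - k * std_dev
    high = mean + k * std_dev
    # Round up: a count of people must be a whole number at least this large.
    return low, high, ceil(fraction * n)


print(chebychev_summary(30, 4, 2, 1000))  # (22, 38, 750)
print(chebychev_summary(30, 4, 3, 1000))  # (18, 42, 889)
```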
Definition 3.7
Quartiles
Arrange the data in increasing order and determine the
median.
• The first quartile is the median of the part of the entire
data set that lies at or below the median of the entire data
set.
• The second quartile is the median of the entire data set.
• The third quartile is the median of the part of the entire
data set that lies at or above the median of the entire data
set.
Slide 3-45
Five Number Summary: Example
A={1, 2, 3, 4, 5, 6, 7}
Minimum: 1
First Quartile: 2.5 (median of 1, 2, 3, 4)
Second Quartile: 4
Third Quartile: 5.5 (median of 4, 5, 6, 7)
Maximum: 7
Slide 3-46
Definition 3.9
Five-Number Summary
The five-number summary of a data set is Min, Q1, Q2, Q3, Max.
Slide 3-47
Definition 3.8
Interquartile Range
The interquartile range, or IQR, is the difference between
the first and third quartiles; that is, IQR = Q3 – Q1.
Slide 3-48
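Definitions 3.7 to 3.9 can be sketched together in Python. The function names are illustrative. The sketch follows Definition 3.7 literally, so each half includes every observation at or below (or at or above) the median; for A = {1, 2, 3, 4, 5, 6, 7} that gives Q1 = 2.5 and Q3 = 5.5. A different convention, which leaves the median out of both halves, would give 2 and 6 instead.

```python
from statistics import median


def quartiles(observations):
    """Q1, Q2, Q3 using the 'at or below' / 'at or above' halves of Definition 3.7."""
    ordered = sorted(observations)
    q2 = median(ordered)
    lower_half = [x for x in ordered if x <= q2]
    upper_half = [x for x in ordered if x >= q2]
    return median(lower_half), q2, median(upper_half)


def five_number_summary(observations):
    """Min, Q1, Q2, Q3, Max."""
    q1, q2, q3 = quartiles(observations)
    return min(observations), q1, q2, q3, max(observations)


def interquartile_range(observations):
    """IQR = Q3 - Q1."""
    q1, _, q3 = quartiles(observations)
    return q3 - q1


a = [1, 2, 3, 4, 5, 6, 7]
print(five_number_summary(a))  # (1, 2.5, 4, 5.5, 7)
print(interquartile_range(a))  # 3.0
```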
Definition 3.10
Slide 3-49
Procedure 3.1
Slide 3-50
Definition 3.11
Slide 3-51
Definition 3.12
Slide 3-52
Figure 3.14
Definition 3.13
Slide 3-53
Parameter/Statistic
Parameter with Population
Statistic with Sample
Slide 3-54
Parameter or Statistic?
1. 1,000 students from Virginia Tech had an
average GPA of 1.3. What does “1.3”
represent?
2. The average American eats 80 cheeseburgers a
year. What does “80” represent?
Slide 3-55
Answers
1. Statistic: refers to a measurement of a sample.
2. Parameter: refers to a measurement of a
population.
Slide 3-56
Definition 3.14
Definition 3.15
Slide 3-57
Z-Scores
The standard score, or z-score, represents the
number of standard deviations a given value x
falls from the population mean. To find the z-score, we use the following formula:
$z = \frac{\text{value} - \text{mean}}{\text{s.d.}} = \frac{x - \mu}{\sigma}$
Slide 3-58
Z-Score (Remark)
Gives us a standard way of comparing values in
a given data set: the z-scores of a data set always have
mean 0 and standard deviation 1.
Slide 3-59
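The remark about mean 0 and standard deviation 1 can be checked numerically. This sketch reuses the weights 150, 160, 170 from the deviation slides and, as an assumption, standardizes with the sample mean and sample standard deviation from Python's `statistics` module.

```python
from statistics import mean, stdev

weights = [150, 160, 170]
x_bar = mean(weights)
s = stdev(weights)

# Standardize every observation: subtract the mean, divide by the standard deviation.
z_scores = [(x - x_bar) / s for x in weights]

print(z_scores)
print(mean(z_scores), stdev(z_scores))
```

The z-scores are -1, 0 and 1, which have mean 0 and standard deviation 1.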
Z-Score (Example 1)
The mean speed of vehicles along a highway is
56 mph with standard deviation 4 mph. The
speed of three cars is 62 mph, 47 mph, 56
mph. Find the z-score that corresponds to each
speed. What can you conclude?
Slide 3-60
Z-Score (Example 1)
X = 62 mph: z = (62 – 56)/4 = 1.5
X = 47 mph: z = (47 – 56)/4 = -2.25
X = 56 mph: z = (56 – 56)/4 = 0
Slide 3-61
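The same calculation as a small Python function. The name `z_score` and its parameter names are illustrative choices.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies from the mean."""
    if std_dev <= 0:
        raise ValueError("the standard deviation must be positive")
    return (value - mean) / std_dev


# Highway speeds from Example 1: mean 56 mph, standard deviation 4 mph.
for speed in (62, 47, 56):
    print(speed, "mph:", z_score(speed, 56, 4))
```

A positive result means the value is above the mean, a negative result means it is below.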
Z-Score (Interpretation)
From the z-scores, we can conclude that a speed
of 62 mph is 1.5 standard deviations above the
mean, a speed of 47 mph is 2.25 standard
deviations below the mean, and a speed of 56
mph is equal to the mean.
Slide 3-62
Z-Score (Example 2)
The midterms are graded for Math 241 for all 55
students. The average score is 85 with a
standard deviation of 5. Suppose Lucy receives
a score of 75 and Bobby receives a score of 90.
What are their corresponding z-scores? Interpret
the results.
Slide 3-63
Z-Score (Example 2)
Lucy: z = (75 – 85)/5 = -2
Bobby: z = (90 – 85)/5 = 1
Thus, Lucy’s score lies 2 standard deviations
below the mean, while Bobby’s score is 1
standard deviation above the mean.
Slide 3-64