Copyright © 2014, 2011, and 2008 Pearson Education, Inc.
Statistics for Business and
Economics
Chapter 2
Methods for Describing
Sets of Data
Contents
1. Describing Qualitative Data
2. Graphical Methods for Describing
Quantitative Data
3. Numerical Measures of Central Tendency
4. Numerical Measures of Variability
5. Using the Mean and Standard Deviation to
Describe Data
6. Numerical Measures of Relative Standing
7. Methods for Detecting Outliers: Box Plots
and z-scores
8. Graphing Bivariate Relationships
9. The Time Series Plot
10. Distorting the Truth with Descriptive
Techniques
Learning Objectives
1. Describe data using graphs
2. Describe data using numerical measures
2.1
Describing Qualitative Data
Key Terms
A class is one of the categories into which
qualitative data can be classified.
The class frequency is the number of
observations in the data set falling into a
particular class.
The class relative frequency is the class
frequency divided by the total number of
observations in the data set.
The class percentage is the class relative
frequency multiplied by 100.
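The key terms above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `class_summary` is hypothetical, and the raw responses are rebuilt from the majors counts (130, 20, 50) used in this chapter's summary table.

```python
from collections import Counter

def class_summary(observations):
    """Return {class: (frequency, relative frequency, percentage)}."""
    counts = Counter(observations)   # class frequency
    n = len(observations)            # total number of observations
    return {c: (f, f / n, 100 * f / n) for c, f in counts.items()}

# Illustrative raw data rebuilt from the majors counts (130, 20, 50).
responses = ["Accounting"] * 130 + ["Economics"] * 20 + ["Management"] * 50
summary = class_summary(responses)
for major, (freq, rel, pct) in summary.items():
    print(f"{major:<12}{freq:>5}{rel:>7.2f}{pct:>6.0f}%")
```

The relative frequencies always sum to 1, which is a quick check on any hand-built summary table.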
Data Presentation
[Diagram: data presentation methods by data type.]
• Qualitative Data: Summary Table, Bar Graph, Pie Chart, Pareto Diagram
• Quantitative Data: Dot Plot, Stem-&-Leaf Display, Frequency Distribution, Histogram
Summary Table
1. Lists categories & number of elements in category
2. Obtained by tallying responses in category
3. May show frequencies (counts), % or both
Major         Count
Accounting      130
Economics        20
Management       50
Total           200
(Each row is a category; counts are obtained by tallying responses.)
Bar Graph
[Figure: bar graph of frequency (vertical axis 0 to 150, starting at a zero point) by major: Acct., Econ., Mgmt.]
• Vertical bars for qualitative variables
• Equal bar widths
• Bar height shows frequency or %
• Percent also used
Pie Chart
1. Shows breakdown of total quantity into categories
2. Useful for showing relative differences
3. Angle size
• (360°)(percent)
• Example: (360°)(10%) = 36°
[Figure: pie chart of majors: Acct. 65%, Mgmt. 25%, Econ. 10% (36° slice).]
Pareto Diagram
Like a bar graph, but with the categories arranged by
height in descending order from left to right.
[Figure: Pareto diagram of frequency (vertical axis 0 to 150, starting at a zero point) by major, in descending order: Acct., Mgmt., Econ.]
• Vertical bars for qualitative variables
• Equal bar widths
• Bar height shows frequency or %
• Percent also used
Summary
Bar graph: The categories (classes) of the qualitative
variable are represented by bars, where the height of
each bar is either the class frequency, class relative
frequency, or class percentage.
Pie chart: The categories (classes) of the qualitative
variable are represented by slices of a pie (circle). The
size of each slice is proportional to the class relative
frequency.
Pareto diagram: A bar graph with the categories
(classes) of the qualitative variable (i.e., the bars)
arranged by height in descending order from left to
right.
Thinking Challenge
You’re an analyst for IRI. You want to show the
market shares held by Web browsers in 2006.
Construct a bar graph, pie chart, & Pareto diagram
to describe the data.
Browser              Mkt. Share (%)
Firefox              14
Internet Explorer    81
Safari               4
Others               1
Bar Graph Solution*
[Figure: bar graph of Market Share (%), vertical axis 0% to 100%, by Browser: Firefox, Internet Explorer, Safari, Others.]
Pie Chart Solution*
[Figure: pie chart of Market Share: Internet Explorer 81%, Firefox 14%, Safari 4%, Others 1%.]
Pareto Diagram Solution*
[Figure: Pareto diagram of Market Share (%), vertical axis 0% to 100%, by Browser in descending order: Internet Explorer, Firefox, Safari, Others.]
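The numbers behind the three solution graphs can be sketched as follows. This is an illustrative calculation only (no plotting library is assumed); the variable names are hypothetical, and the shares are the ones given in the Thinking Challenge.

```python
# Market shares (%) from the Thinking Challenge.
shares = {"Firefox": 14, "Internet Explorer": 81, "Safari": 4, "Others": 1}

# Pie chart: slice angle = (360 degrees)(percent).
angles = {browser: 360 * pct / 100 for browser, pct in shares.items()}

# Pareto diagram: the same bars, arranged by height in descending order.
pareto_order = sorted(shares, key=shares.get, reverse=True)

for browser in pareto_order:
    print(f"{browser:<18}{shares[browser]:>3}%  {angles[browser]:6.1f} degrees")
```

The slice angles add up to 360°, and the Pareto ordering puts Internet Explorer first, matching the solution slides.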
2.2
Graphical Methods for Describing
Quantitative Data
Dot Plot
1. Horizontal axis is a scale for the quantitative variable,
e.g., percent.
2. The numerical value of each measurement is located
on the horizontal scale by a dot.
Stem-and-Leaf Display
1. Divide each observation into stem value and leaf value
• Stems are listed in order in a column
• Leaf value is placed in corresponding stem row to right of bar
2. Data: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41
2 | 144677
3 | 028
4 | 1
(For example, 26 has stem 2 and leaf 6.)
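A stem-and-leaf display like the one above can be built with a short sketch. It assumes two-digit whole numbers, so the stem is the tens digit and the leaf is the units digit; the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Map each stem (tens digit) to its leaves (units digits) in increasing order."""
    rows = defaultdict(list)
    for value in sorted(data):
        stem, leaf = divmod(value, 10)
        rows[stem].append(leaf)
    return dict(rows)

data = [21, 24, 24, 26, 27, 27, 30, 32, 38, 41]
display = stem_and_leaf(data)
for stem in sorted(display):
    print(stem, "|", "".join(str(leaf) for leaf in display[stem]))
```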
Histogram
Class          Count
15.5 – 25.5    3
25.5 – 35.5    5
35.5 – 45.5    2
[Figure: histogram of the classes above. Horizontal axis: lower class boundaries 15.5, 25.5, 35.5, 45.5, 55.5. Vertical axis: count (frequency, relative frequency, or percent), 0 to 5. Bars touch.]
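Class frequencies for a histogram can be tallied with a sketch like the one below. It assumes equal-width classes and applies the slide's class boundaries (15.5, 25.5, 35.5, 45.5) to the ten values from the stem-and-leaf example; the function name `class_frequencies` is hypothetical.

```python
def class_frequencies(data, lower, width, k):
    """Count observations in k equal-width classes starting at `lower`."""
    counts = [0] * k
    for x in data:
        idx = int((x - lower) // width)
        if 0 <= idx < k:          # values outside the classes are skipped
            counts[idx] += 1
    return counts

data = [21, 24, 24, 26, 27, 27, 30, 32, 38, 41]
counts = class_frequencies(data, lower=15.5, width=10, k=3)
relative = [c / len(data) for c in counts]
print(counts, relative)
```

Boundaries ending in .5 guarantee that no whole-number observation falls exactly on a class boundary.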
Summary
Dot plot: The numerical value of each quantitative
measurement in the data set is represented by a dot on a
horizontal scale. When data values repeat, the dots are
placed above one another vertically.
Stem-and-leaf display: The numerical value of the
quantitative variable is partitioned into a “stem” and a
“leaf.” The possible stems are listed in order in a column.
The leaf for each quantitative measurement in the data set is
placed in the corresponding stem row. Leaves for
observations with the same stem value are listed in
increasing order horizontally.
Histogram: The possible numerical values of the
quantitative variable are partitioned into class intervals,
where each interval has the same width. These intervals
form the scale of the horizontal axis. The frequency or
relative frequency of observations in each class interval is
determined. A vertical bar is placed over each class
interval, with height equal to either the class frequency or
class relative frequency.
2.3
Numerical Measures
of Central Tendency
Thinking Challenge
[Figure: pay levels of $400,000, $70,000, $50,000, $30,000, and $20,000.]
... employees cite low pay - most workers earn only $20,000.
... President claims average pay is $70,000!
Two Characteristics
The central tendency of the set of
measurements–that is, the tendency of the data to
cluster, or center, about certain numerical values.
Central Tendency
(Location)
The variability of the set of measurements–that
is, the spread of the data.
Variation
(Dispersion)
Standard Notation
Measure    Sample       Population
Mean       $\bar{x}$    $\mu$
Size       $n$          $N$
Mean
1. Most common measure of central tendency
2. Acts as ‘balance point’
3. Affected by extreme values (‘outliers’)
4. Denoted $\bar{x}$ where
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$
Mean Example
Raw Data:
10.3 4.9 8.9 11.7 6.3 7.7
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + x_3 + x_4 + x_5 + x_6}{6} = \frac{10.3 + 4.9 + 8.9 + 11.7 + 6.3 + 7.7}{6} = 8.30$$
Median
1. Measure of central tendency
2. Middle value in ordered sequence
• If n is odd, middle value of sequence
• If n is even, average of 2 middle values
3. Position of median in sequence
$$\text{Positioning Point} = \frac{n+1}{2}$$
4. Not affected by extreme values
Median Example
Odd-Sized Sample
• Raw Data: 24.1 22.6 21.5 23.7 22.6
• Ordered: 21.5 22.6 22.6 23.7 24.1
• Position: 1 2 3 4 5
$$\text{Positioning Point} = \frac{n+1}{2} = \frac{5+1}{2} = 3.0$$
$$\text{Median} = 22.6$$
Median Example
Even-Sized Sample
• Raw Data: 10.3 4.9 8.9 11.7 6.3 7.7
• Ordered: 4.9 6.3 7.7 8.9 10.3 11.7
• Position: 1 2 3 4 5 6
$$\text{Positioning Point} = \frac{n+1}{2} = \frac{6+1}{2} = 3.5$$
$$\text{Median} = \frac{7.7 + 8.9}{2} = 8.30$$
Mode
1. Measure of central tendency
2. Value that occurs most often
3. Not affected by extreme values
4. May be no mode or several modes
5. May be used for quantitative or qualitative
data
Mode Example
• No Mode
Raw Data: 10.3 4.9 8.9 11.7 6.3 7.7
• One Mode
Raw Data: 6.3 4.9 8.9 6.3 4.9 4.9
• More Than 1 Mode
Raw Data: 21 28 28 41 43 43
Thinking Challenge
You’re a financial analyst
for Prudential-Bache
Securities. You have
collected the following
closing stock prices of new
stock issues: 17, 16, 21, 18,
13, 16, 12, 11.
Describe the stock prices
in terms of central
tendency.
Central Tendency Solution*
Mean
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_8}{8} = \frac{17 + 16 + 21 + 18 + 13 + 16 + 12 + 11}{8} = 15.5$$
Central Tendency Solution*
Median
• Raw Data: 17 16 21 18 13 16 12 11
• Ordered: 11 12 13 16 16 17 18 21
• Position: 1 2 3 4 5 6 7 8
$$\text{Positioning Point} = \frac{n+1}{2} = \frac{8+1}{2} = 4.5$$
$$\text{Median} = \frac{16 + 16}{2} = 16$$
Central Tendency Solution*
Mode
Raw Data:
17 16 21 18 13 16 12 11
Mode = 16
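The three central tendency solutions can be checked with Python's standard `statistics` module. This is a quick verification sketch, not the slides' method; `multimode` (Python 3.8+) returns every value tied for most frequent, so it also covers the "more than 1 mode" case.

```python
from statistics import mean, median, multimode

# Closing stock prices from the Thinking Challenge.
prices = [17, 16, 21, 18, 13, 16, 12, 11]

x_bar = mean(prices)        # balance point
m = median(prices)          # average of the 4th and 5th ordered values (n is even)
modes = multimode(prices)   # list of the most frequent value(s)

print(x_bar, m, modes)
```

For data in which no value repeats, `multimode` returns every value, which corresponds to the "no mode" case on the Mode Example slide.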
Summary of
Central Tendency Measures
Measure    Formula                   Description
Mean       $\sum x_i / n$            Balance Point
Median     Position $(n+1)/2$        Middle Value When Ordered
Mode       none                      Most Frequent
Shape
1. Describes how data are distributed
2. Measures of Shape
• Skew = lack of symmetry
Left-Skewed: Mean < Median
Symmetric: Mean = Median
Right-Skewed: Median < Mean
2.4
Numerical Measures
of Variability
Range
1. Measure of dispersion
2. Difference between largest & smallest
observations
Range = $x_{\text{largest}} - x_{\text{smallest}}$
3. Ignores how data are distributed
[Figure: two dot plots on a scale from 7 to 10; each has Range = 10 – 7 = 3.]
Variance &
Standard Deviation
1. Measures of dispersion
2. Most common measures
3. Consider how data are distributed
4. Show variation about mean ($\bar{x}$ or $\mu$)
[Figure: dot plot on a scale from 4 to 12 with $\bar{x} = 8.3$ marked.]
Standard Notation
Measure               Sample       Population
Mean                  $\bar{x}$    $\mu$
Standard Deviation    $s$          $\sigma$
Variance              $s^2$        $\sigma^2$
Size                  $n$          $N$
Sample Variance Formula
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n-1}$$
n – 1 in denominator!
Sample Standard Deviation
Formula
$$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} = \sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n-1}}$$
Variance Example
Raw Data:
10.3 4.9 8.9 11.7 6.3 7.7
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} \quad \text{where} \quad \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = 8.3$$
$$s^2 = \frac{(10.3 - 8.3)^2 + (4.9 - 8.3)^2 + \cdots + (7.7 - 8.3)^2}{6-1} = 6.368$$
Thinking Challenge
• You’re a financial analyst
for Prudential-Bache
Securities. You have
collected the following
closing stock prices of
new stock issues: 17, 16,
21, 18, 13, 16, 12, 11.
• What are the variance
and standard deviation
of the stock prices?
Variation Solution*
Sample Variance
Raw Data: 17 16 21 18 13 16 12 11
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} \quad \text{where} \quad \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = 15.5$$
$$s^2 = \frac{(17 - 15.5)^2 + (16 - 15.5)^2 + \cdots + (11 - 15.5)^2}{8-1} = 11.14$$
Variation Solution*
Sample Standard Deviation
$$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} = \sqrt{11.14} = 3.34$$
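The variance and standard deviation solutions can be reproduced directly from the sample formulas. The sketch below computes them by hand (note the n – 1 denominator) and cross-checks against the standard library's `statistics.variance` and `statistics.stdev`, which also use n – 1.

```python
from math import sqrt
from statistics import mean, stdev, variance

prices = [17, 16, 21, 18, 13, 16, 12, 11]

x_bar = mean(prices)
# Sum of squared deviations about the sample mean, divided by n - 1.
s2 = sum((x - x_bar) ** 2 for x in prices) / (len(prices) - 1)
s = sqrt(s2)

# Differences from the library results; both should be essentially zero.
variance_gap = abs(s2 - variance(prices))
stdev_gap = abs(s - stdev(prices))

print(round(s2, 2), round(s, 2))
```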
Summary of
Variation Measures
Measure                            Formula                                                  Description
Range                              $x_{\text{largest}} - x_{\text{smallest}}$               Total Spread
Standard Deviation (Sample)        $\sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$    Dispersion about Sample Mean
Standard Deviation (Population)    $\sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$          Dispersion about Population Mean
Variance (Sample)                  $\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$           Squared Dispersion about Sample Mean
2.5
Using the Mean and Standard
Deviation to Describe Data
Interpreting Standard Deviation:
Chebyshev’s Theorem
• Applies to any shape data set
• No useful information about the fraction of data in the
interval $\bar{x} - s$ to $\bar{x} + s$
• At least 3/4 of the data lies in the interval
$\bar{x} - 2s$ to $\bar{x} + 2s$
• At least 8/9 of the data lies in the interval
$\bar{x} - 3s$ to $\bar{x} + 3s$
• In general, for k > 1, at least $1 - 1/k^2$ of the data lies
in the interval $\bar{x} - ks$ to $\bar{x} + ks$
Interpreting Standard Deviation:
Chebyshev’s Theorem
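[Figure: number line marked at $\bar{x} - 3s$, $\bar{x} - 2s$, $\bar{x} - s$, $\bar{x}$, $\bar{x} + s$, $\bar{x} + 2s$, $\bar{x} + 3s$.]
• Within $\bar{x} \pm s$: no useful information
• Within $\bar{x} \pm 2s$: at least 3/4 of the data
• Within $\bar{x} \pm 3s$: at least 8/9 of the data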
Chebyshev’s Theorem Example
• Previously we found the mean
closing stock price of new stock
issues is 15.5 and the standard
deviation is 3.34.
• Use this information to form an
interval that will contain at least
75% of the closing stock prices of
new stock issues.
Chebyshev’s Theorem Example
At least 75% of the closing stock prices of new stock
issues will lie within 2 standard deviations of the mean.
$\bar{x} = 15.5$
$s = 3.34$
$(\bar{x} - 2s, \bar{x} + 2s) = (15.5 - 2 \cdot 3.34, 15.5 + 2 \cdot 3.34) = (8.82, 22.18)$
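Chebyshev's interval and its guaranteed fraction can be sketched as a small helper. The function name `chebyshev_interval` is hypothetical; the last lines compare the bound against the eight stock prices from the earlier Thinking Challenge.

```python
def chebyshev_interval(mean, s, k):
    """Return ((low, high), bound): at least `bound` of the data lies within k std devs."""
    if k <= 1:
        raise ValueError("Chebyshev's bound is informative only for k > 1")
    return (mean - k * s, mean + k * s), 1 - 1 / k**2

(low, high), bound = chebyshev_interval(15.5, 3.34, 2)

# Compare with the observed fraction for the stock prices.
prices = [17, 16, 21, 18, 13, 16, 12, 11]
actual = sum(low <= x <= high for x in prices) / len(prices)

print(round(low, 2), round(high, 2), bound, actual)
```

The bound is a minimum: here all eight prices fall inside the interval, which is consistent with "at least 75%".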
Interpreting Standard Deviation:
Empirical Rule
• Applies to data sets that are mound shaped and
symmetric
• Approximately 68% of the measurements lie in
the interval $\bar{x} - s$ to $\bar{x} + s$
• Approximately 95% of the measurements lie in
the interval $\bar{x} - 2s$ to $\bar{x} + 2s$
• Approximately 99.7% of the measurements lie
in the interval $\bar{x} - 3s$ to $\bar{x} + 3s$
Interpreting Standard Deviation:
Empirical Rule
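[Figure: mound-shaped curve over a number line marked at $\bar{x} - 3s$, $\bar{x} - 2s$, $\bar{x} - s$, $\bar{x}$, $\bar{x} + s$, $\bar{x} + 2s$, $\bar{x} + 3s$.]
• Within $\bar{x} \pm s$: approximately 68% of the measurements
• Within $\bar{x} \pm 2s$: approximately 95% of the measurements
• Within $\bar{x} \pm 3s$: approximately 99.7% of the measurements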
Empirical Rule Example
Previously we found the mean
closing stock price of new
stock issues is 15.5 and the
standard deviation is 3.34. If
we can assume the data is
symmetric and mound shaped,
calculate the percentage of the
data that lie within the intervals
$\bar{x} \pm s$, $\bar{x} \pm 2s$, $\bar{x} \pm 3s$.
Empirical Rule Example
• According to the Empirical Rule, approximately 68%
of the data will lie in the interval $(\bar{x} - s, \bar{x} + s)$,
(15.5 – 3.34, 15.5 + 3.34) = (12.16, 18.84)
• Approximately 95% of the data will lie in the interval
$(\bar{x} - 2s, \bar{x} + 2s)$,
(15.5 – 2∙3.34, 15.5 + 2∙3.34) = (8.82, 22.18)
• Approximately 99.7% of the data will lie in the interval
$(\bar{x} - 3s, \bar{x} + 3s)$,
(15.5 – 3∙3.34, 15.5 + 3∙3.34) = (5.48, 25.52)
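As a rough check, the Empirical Rule percentages can be compared with the observed fractions for the eight stock prices. The helper name `fraction_within` is hypothetical, and with so few observations only a loose match should be expected.

```python
def fraction_within(data, mean, s, k):
    """Fraction of observations inside (mean - k*s, mean + k*s)."""
    low, high = mean - k * s, mean + k * s
    return sum(low <= x <= high for x in data) / len(data)

prices = [17, 16, 21, 18, 13, 16, 12, 11]
empirical = {1: 0.68, 2: 0.95, 3: 0.997}   # Empirical Rule values
observed = {k: fraction_within(prices, 15.5, 3.34, k) for k in empirical}

for k in empirical:
    print(f"k={k}: rule ~{empirical[k]:.1%}, observed {observed[k]:.1%}")
```

Five of the eight prices (62.5%) lie within one standard deviation of the mean, and all of them lie within two.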
2.6
Numerical Measures
of Relative Standing
Numerical Measures of
Relative Standing: Percentiles
• Describes the relative location of a
measurement compared to the rest of the data
• The pth percentile is a number such that p% of
the data falls below it and (100 – p)% falls
above it
• Median = 50th percentile
Percentile Example
• You scored 560 on the GMAT exam. This
score puts you in the 58th percentile.
• What percentage of test takers scored lower
than you did?
• What percentage of test takers scored higher
than you did?
Percentile Example
• What percentage of test takers scored lower
than you did?
58% of test takers scored lower than 560.
• What percentage of test takers scored higher
than you did?
(100 – 58)% = 42% of test takers scored
higher than 560.
Numerical Measures of
Relative Standing: z–Scores
• Describes the relative location of a
measurement compared to the rest of the data
• Sample z–score
$$z = \frac{x - \bar{x}}{s}$$
• Population z–score
$$z = \frac{x - \mu}{\sigma}$$
• Measures the number of standard deviations
away from the mean a data value is located
z–Score Example
• The mean time to assemble a
product is 22.5 minutes with a
standard deviation of 2.5 minutes.
• Find the z–score for an item that
took 20 minutes to assemble.
• Find the z–score for an item that
took 27.5 minutes to assemble.
z–Score Example
x = 20, μ = 22.5, σ = 2.5
$$z = \frac{x - \mu}{\sigma} = \frac{20 - 22.5}{2.5} = -1.0$$
x = 27.5, μ = 22.5, σ = 2.5
$$z = \frac{x - \mu}{\sigma} = \frac{27.5 - 22.5}{2.5} = 2.0$$
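The two z-score calculations reduce to a one-line function. This sketch uses a hypothetical name `z_score`; the same function works for a sample (pass the sample mean and s) or a population (pass μ and σ).

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

# Assembly-time example: mean 22.5 minutes, standard deviation 2.5 minutes.
z_fast = z_score(20, 22.5, 2.5)     # below the mean, so negative
z_slow = z_score(27.5, 22.5, 2.5)   # above the mean, so positive

print(z_fast, z_slow)
```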
Interpretation of z–Scores for
Mound-Shaped Distributions
of Data
1. Approximately 68% of the measurements
will have a z-score between –1 and 1.
2. Approximately 95% of the measurements
will have a z-score between –2 and 2.
3. Approximately 99.7% of the measurements
will have a z-score between –3 and 3.
(see the figure on the next slide)
Interpretation of z–Scores
2.7
Methods for Detecting Outliers:
Box Plots and z-Scores
Outlier
An observation (or measurement) that is unusually large
or small relative to the other values in a data set is called
an outlier. Outliers typically are attributable to one of
the following causes:
1. The measurement is observed, recorded, or entered
into the computer incorrectly.
2. The measurement comes from a different
population.
3. The measurement is correct but represents a rare
(chance) event.
Quartiles
Measure of noncentral tendency
Split ordered data into 4 quarters
[Figure: ordered data split into four quarters of 25% each by Q1, Q2, and Q3.]
Lower quartile QL is 25th percentile.
Middle quartile m is the median.
Upper quartile QU is 75th percentile.
Interquartile range: IQR = QU – QL
Quartile (Q2) Example
• Raw Data: 10.3 4.9 8.9 11.7 6.3 7.7
• Ordered: 4.9 6.3 7.7 8.9 10.3 11.7
• Position: 1 2 3 4 5 6
Q2 is the median, the average of the two middle
scores (7.7 + 8.9)/2 = 8.3
Quartile (Q1) Example
• Raw Data: 10.3 4.9 8.9 11.7 6.3 7.7
• Ordered: 4.9 6.3 7.7 8.9 10.3 11.7
• Position: 1 2 3 4 5 6
QL is median of bottom half = 6.3
Quartile (Q3) Example
• Raw Data: 10.3 4.9 8.9 11.7 6.3 7.7
• Ordered: 4.9 6.3 7.7 8.9 10.3 11.7
• Position: 1 2 3 4 5 6
QU is median of top half = 10.3
Interquartile Range
1. Measure of dispersion
2. Also called midspread
3. Difference between third & first quartiles
• Interquartile Range = Q3 – Q1
4. Spread in middle 50%
5. Not affected by extreme values
Thinking Challenge
• You’re a financial analyst for
Prudential-Bache Securities.
You have collected the
following closing stock prices
of new stock issues: 17, 16,
21, 18, 13, 16, 12, 11.
• What are the quartiles, Q1
and Q3, and the interquartile
range?
Quartile Solution*
Q1
Raw Data: 17 16 21 18 13 16 12 11
Ordered: 11 12 13 16 16 17 18 21
Position: 1 2 3 4 5 6 7 8
QL is the median of the bottom half, the average
of the two middle scores (12 + 13)/2 = 12.5
Quartile Solution*
Q3
Raw Data: 17 16 21 18 13 16 12 11
Ordered: 11 12 13 16 16 17 18 21
Position: 1 2 3 4 5 6 7 8
QU is the median of the top half, the average
of the two middle scores (17 + 18)/2 = 17.5
Interquartile Range Solution*
Interquartile Range
Raw Data: 17 16 21 18 13 16 12 11
Ordered: 11 12 13 16 16 17 18 21
Position: 1 2 3 4 5 6 7 8
Interquartile Range = Q3 – Q1 = 17.5 – 12.5 = 5
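The quartile solutions use the median-of-halves method, which can be sketched as below. The function name `quartiles` is hypothetical. The slides only show even-sized samples, so leaving the middle value out of both halves when n is odd is an assumption; other software may use different quartile conventions and give slightly different values.

```python
from statistics import median

def quartiles(data):
    """Return (QL, median, QU) using the median of the bottom and top halves."""
    if len(data) < 2:
        raise ValueError("need at least two observations")
    ordered = sorted(data)
    half = len(ordered) // 2      # assumption: middle value excluded when n is odd
    return median(ordered[:half]), median(ordered), median(ordered[-half:])

prices = [17, 16, 21, 18, 13, 16, 12, 11]
q_l, q2, q_u = quartiles(prices)
iqr = q_u - q_l

print(q_l, q2, q_u, iqr)
```

Applied to the six-value example (10.3, 4.9, 8.9, 11.7, 6.3, 7.7), the same function gives QL = 6.3 and QU = 10.3.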
Box Plot
1. Graphical display of data using 5-number
summary
[Figure: box plot on a scale from 4 to 12 marking Xsmallest, Q1, Median, Q3, and Xlargest.]
Box Plot
1. Draw a rectangle (box) with the ends
(hinges) drawn at the lower and upper
quartiles (QL and QU). The median is
shown by a line or symbol (such as “+”).
2. The points at distances 1.5(IQR) from each
hinge define the inner fences of the data set.
Lines (whiskers) are drawn from each hinge
to the most extreme measurements inside the
inner fences.
Box Plot
3. A second pair of fences, the outer fences, is
defined at a distance of 3(IQR) from the hinges.
One symbol (*) represents measurements falling
between the inner and outer fences, and another (0)
represents measurements beyond the outer fences.
4. Symbols that represent the median and extreme data
points vary depending on software used. You may
use your own symbols if you are constructing a box
plot by hand.
Shape & Box Plot
[Figure: box plots for Left-Skewed, Symmetric, and Right-Skewed data, each marking Q1, Median, and Q3.]
Detecting Outliers
Box Plots: Observations falling between the
inner and outer fences are deemed suspect
outliers. Observations falling beyond the
outer fence are deemed highly suspect
outliers.
z-scores: Observations with z-scores greater than
3 in absolute value are considered outliers.
(For some highly skewed data sets,
observations with z-scores greater than 2 in
absolute value may be outliers.)
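The box plot rule can be sketched with the inner fences at 1.5(IQR) and the outer fences at 3(IQR) from the hinges. The function names and the test values 28 and 40 are hypothetical; the quartiles 12.5 and 17.5 come from the stock price example.

```python
def fences(q_l, q_u):
    """Return (inner, outer) fence pairs built from the lower and upper quartiles."""
    iqr = q_u - q_l
    inner = (q_l - 1.5 * iqr, q_u + 1.5 * iqr)
    outer = (q_l - 3 * iqr, q_u + 3 * iqr)
    return inner, outer

def classify(x, q_l, q_u):
    """Label an observation using the box plot rule."""
    inner, outer = fences(q_l, q_u)
    if x < outer[0] or x > outer[1]:
        return "highly suspect"
    if x < inner[0] or x > inner[1]:
        return "suspect"
    return "not flagged"

inner, outer = fences(12.5, 17.5)
labels = {x: classify(x, 12.5, 17.5) for x in (21, 28, 40)}
print(inner, outer, labels)
```

None of the eight stock prices falls outside the inner fences (5.0, 25.0), so the box plot rule flags no outliers in that data set.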
2.10
Distorting the Truth with
Descriptive Statistics
Errors in Presenting Data
1. Use area to equate to value
2. No relative basis in
comparing data batches
3. Compress the vertical axis
4. No zero point on the vertical
axis
5. Gap in the vertical axis
6. Use of misleading wording
7. Knowing central tendency
without knowing variability
Reader Equates Area to Value
Minimum Wage: 1960: $1.00, 1970: $1.60, 1980: $3.10, 1990: $3.80
[Figure: Bad Presentation and Good Presentation of the minimum wage data; the good presentation plots wage ($, axis 0 to 4) by year (1960, 1970, 1980, 1990).]
No Relative Basis
[Figure: A’s by Class for FR, SO, JR, SR. Bad Presentation: frequency (axis 0 to 300). Good Presentation: percent (axis 0% to 30%).]
Compressing
Vertical Axis
[Figure: Quarterly Sales ($) for Q1 to Q4. Bad Presentation: vertical axis 0 to 200. Good Presentation: vertical axis 0 to 50.]
No Zero Point
on Vertical Axis
[Figure: Monthly Sales ($) for J, M, M, J, S, N. Bad Presentation: vertical axis 36 to 45. Good Presentation: vertical axis 0 to 60.]
Gap in the Vertical Axis
Bad Presentation
Changing the Wording
Changing the title of the graph can influence the reader.
We’re not doing so well.
Still in prime years!
Knowing only central tendency
Knowing ONLY the central tendency might lead one
to purchase Model A. Knowing the variability as
well may change one’s decision!
Key Ideas
Describing Qualitative Data
1. Identify category classes
2. Determine class frequencies
3. Class relative frequency = (class freq)/n
4. Graph relative frequencies
Key Ideas
Graphing Quantitative Data
1 Variable
1. Identify class intervals
2. Determine class interval frequencies
3. Class interval relative frequency =
(class interval frequency)/n
4. Graph class interval relative frequencies
Key Ideas
Graphing Quantitative Data
2 Variables
Scatterplot
Key Ideas
Numerical Description of Quantitative Data
Central Tendency
Mean
Median
Mode
Key Ideas
Numerical Description of Quantitative Data
Variation
Range
Variance
Standard Deviation
Interquartile range
Key Ideas
Numerical Description of Quantitative Data
Relative standing
Percentile score
z-score
Key Ideas
Rules for Detecting Quantitative Outliers
Interval            Chebyshev’s Rule    Empirical Rule
$\bar{x} \pm s$     At least 0%         ≈ 68%
$\bar{x} \pm 2s$    At least 75%        ≈ 95%
$\bar{x} \pm 3s$    At least 89%        ≈ All (99.7%)
Key Ideas
Rules for Detecting Quantitative Outliers
Method      Suspect                                  Highly Suspect
Box plot    Values between inner and outer fences    Values beyond outer fences
z-score     2 < |z| < 3                              |z| > 3