Download chap03

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
Basic Business Statistics
(9th Edition)
Chapter 3
Numerical Descriptive Measures
© 2004 Prentice-Hall, Inc.
Chap 3-1
Chapter Topics

Measures of Central Tendency

Mean, Median, Mode, Geometric Mean

Quartiles

Measures of Variation


Range, Interquartile Range, Variance and Standard
Deviation, Coefficient of Variation
The Empirical Rule and the BienaymeChebyshev Rule
© 2004 Prentice-Hall, Inc.
Chap 3-2
Chapter Topics



(continued)
Shape

Symmetric, Skewed

Using Box-and-Whisker Plots
Coefficient of Correlation
Pitfalls in Numerical Descriptive Measures and
Ethical Issues
© 2004 Prentice-Hall, Inc.
Chap 3-3
Summary Measures
Summary Measures
Central Tendency
Mean
Quartiles
Mode
Median
Range
Variation
Coefficient of
Variation
Variance
Geometric Mean
© 2004 Prentice-Hall, Inc.
Standard Deviation
Chap 3-4
Measures of Central Tendency
Central Tendency
Mean
Median
Mode
n
X 
X
i 1
N

i 1
Geometric Mean
X G   X1  X 2 
n
X
i
 Xn 
1/ n
i
N
© 2004 Prentice-Hall, Inc.
Chap 3-5
Mean (Arithmetic Mean)

Mean (Arithmetic Mean) of Data Values

Sample mean
Sample Size
n
X

X
i 1
i
n
Population mean
N

© 2004 Prentice-Hall, Inc.
X
i 1
N
i
X1  X 2 

n
 Xn
Population Size
X1  X 2 

N
 XN
Chap 3-6
Mean (Arithmetic Mean)
(continued)


The Most Common Measure of Central
Tendency
Affected by Extreme Values (Outliers)
0 1 2 3 4 5 6 7 8 9 10
Mean = 5
© 2004 Prentice-Hall, Inc.
0 1 2 3 4 5 6 7 8 9 10 12 14
Mean = 6
Chap 3-7
Mean (Arithmetic Mean)
From a Frequency Distribution
(continued)

Approximating the Arithmetic Mean

Used when raw data are not available
c

X
m
j 1
j
fj
n
n  sample size
c  number of classes in the frequency distribution
m j  midpoint of the jth class
f j  frequencies of the jth class
© 2004 Prentice-Hall, Inc.
Chap 3-8
Median


Robust Measure of Central Tendency
Not Affected by Extreme Values
0 1 2 3 4 5 6 7 8 9 10
Median = 5

0 1 2 3 4 5 6 7 8 9 10 12 14
Median = 5
In an Ordered Array, the Median is the ‘Middle’
Number


If n or N is odd, the median is the middle number
If n or N is even, the median is the average of the
2 middle numbers
© 2004 Prentice-Hall, Inc.
Chap 3-9
Mode






A Measure of Central Tendency
Value that Occurs Most Often
Not Affected by Extreme Values
There May Not Be a Mode
There May Be Several Modes
Used for Either Numerical or Categorical Data
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
Mode = 9
© 2004 Prentice-Hall, Inc.
0 1 2 3 4 5 6
No Mode
Chap 3-10
Geometric Mean

Useful in the Measure of Rate of Change of a
Variable Over Time
X G   X1  X 2 

 Xn 
1/ n
Geometric Mean Rate of Return

Measures the status of an investment over time
RG  1  R1   1  R2  
© 2004 Prentice-Hall, Inc.
 1  Rn  
1/ n
1
Chap 3-11
Example
An investment of $100,000 declined to $50,000 at the
end of year one and rebounded back to $100,000 at end
of year two:
R1  0.5 (or  50%) R2  1 (or 100% )
Average rate of return:
( 0.5)  (1)
R
 0.25 (or 25%)
2
Geometric rate of return:
RG  1  0.5  1  1 
1/ 2
  0.5   2  
1/ 2
© 2004 Prentice-Hall, Inc.
1
 1  11/ 2  1  0 (or 0%)
Chap 3-12
Quartiles

Split Ordered Data into 4 Quarters
25%
25%
 Q1 

25%
 Q2 
Position of i-th Quartile
25%
Q3 
i  n  1
 Qi  
4
Data in Ordered Array: 11 12 13 16 16 17 18 21 22
1 9  1
Position of Q1 
 2.5
4
Q1
12  13


 12.5
2
Q1 and Q3 are Measures of Noncentral Location
 Q = Median, a Measure of Central Tendency
2

© 2004 Prentice-Hall, Inc.
Chap 3-13
Measures of Variation
Variation
Range
Interquartile
Range
Variance
Population
Variance
Sample
Variance
© 2004 Prentice-Hall, Inc.
Standard
Deviation
Coefficient
of Variation
Population
Standard
Deviation
Sample
Standard
Deviation
Chap 3-14
Range


Measure of Variation
Difference between the Largest and the
Smallest Observations:
Range  X Largest  X Smallest

Ignores How Data are Distributed
Range = 12 - 7 = 5
Range = 12 - 7 = 5
7
8
© 2004 Prentice-Hall, Inc.
9
10
11
12
7
8
9
10
11
12
Chap 3-15
Interquartile Range


Measure of Variation
Also Known as Midspread


Spread in the middle 50%
Difference between the First and Third
Quartiles
Data in Ordered Array: 11 12 13 16 16 17
17 18 21
Interquartile Range  Q3  Q1  17.5  12.5  5

Not Affected by Extreme Values
© 2004 Prentice-Hall, Inc.
Chap 3-16
Variance


Important Measure of Variation
Shows Variation about the Mean

Sample Variance:
n
S 
2

 X
i 1
X
i
2
n 1
Population Variance:
N
 
2
© 2004 Prentice-Hall, Inc.
 X
i 1
i

N
2
Chap 3-17
Standard Deviation



Most Important Measure of Variation
Shows Variation about the Mean
Has the Same Units as the Original Data

Sample Standard Deviation:
S

Population Standard Deviation:

© 2004 Prentice-Hall, Inc.
n
 X
i 1
X
i
2
n 1
N
 X
i 1
i

2
N
Chap 3-18
Standard Deviation
From a Frequency Distribution
(continued)

Approximating the Standard Deviation

Used when the raw data are not available and the
only source of data is a frequency distribution
m
c

S
j 1
 X  fj
2
j
n 1
n  sample size
c  number of classes in the frequency distribution
m j  midpoint of the jth class
© 2004 Prentice-Hall, Inc.
f j  frequencies of the jth class
Chap 3-19
Comparing Standard Deviations
Data A
11 12 13 14 15 16 17 18 19 20 21
Mean = 15.5
s = 3.338
Data B
11 12 13 14 15 16 17 18 19 20 21
Mean = 15.5
s = .9258
Data C
11 12 13 14 15 16 17 18 19 20 21
© 2004 Prentice-Hall, Inc.
Mean = 15.5
s = 4.57
Chap 3-20
Coefficient of Variation

Measure of Relative Variation

Always in Percentage (%)

Shows Variation Relative to the Mean



Used to Compare Two or More Sets of Data
Measured in Different Units
S
CV  
X

100%

Sensitive to Outliers
© 2004 Prentice-Hall, Inc.
Chap 3-21
Comparing Coefficient
of Variation

Stock A:



Stock B:



Average price last year = $50
Standard deviation = $2
Average price last year = $100
Standard deviation = $5
Coefficient of Variation:

Stock A:

Stock B:
© 2004 Prentice-Hall, Inc.
S
CV  
X

 $2 
 100%  
 100%  4%

 $50 
S
CV  
X

 $5 
100%  
100%  5%

 $100 
Chap 3-22
Shape of a Distribution

Describe How Data are Distributed

Measures of Shape

Symmetric or skewed
Left-Skewed
Mean < Median < Mode
© 2004 Prentice-Hall, Inc.
Symmetric
Mean = Median =Mode
Right-Skewed
Mode < Median < Mean
Chap 3-23
Exploratory Data Analysis

Box-and-Whisker Plot

Graphical display of data using 5-number summary
X smallest Q
1
4
© 2004 Prentice-Hall, Inc.
6
Median( Q2)
8
Q3
10
Xlargest
12
Chap 3-24
Distribution Shape &
Box-and-Whisker Plot
Left-Skewed
Q1
© 2004 Prentice-Hall, Inc.
Q2 Q3
Symmetric
Q1Q2Q3
Right-Skewed
Q1 Q2 Q3
Chap 3-25
The Empirical Rule

For Data Sets That Are Approximately Bellshaped:



Roughly 68% of the Observations Fall Within 1
Standard Deviation Around the Mean
Roughly 95% of the Observations Fall Within 2
Standard Deviations Around the Mean
Roughly 99.7% of the Observations Fall Within 3
Standard Deviations Around the Mean
© 2004 Prentice-Hall, Inc.
Chap 3-26
The Bienayme-Chebyshev Rule

The Percentage of Observations Contained
Within Distances of k Standard Deviations
Around the Mean Must Be at Least 1  1/ k 2 100%




Applies regardless of the shape of the data set
At least 75% of the observations must be contained
within distances of 2 standard deviations around the
mean
At least 88.89% of the observations must be
contained within distances of 3 standard deviations
around the mean
At least 93.75% of the observations must be
contained within distances of 4 standard deviations
around the mean
© 2004 Prentice-Hall, Inc.
Chap 3-27
Coefficient of Correlation

Measures the Strength of the Linear
Relationship between 2 Quantitative Variables
n

r
 X
i 1
n
 X
i 1
© 2004 Prentice-Hall, Inc.
i
i
 X Yi  Y 
X
2
n
 Y  Y 
i 1
2
i
Chap 3-28
Features of Correlation
Coefficient

Unit Free

Ranges between –1 and 1



The Closer to –1, the Stronger the Negative
Linear Relationship
The Closer to 1, the Stronger the Positive
Linear Relationship
The Closer to 0, the Weaker Any Linear
Relationship
© 2004 Prentice-Hall, Inc.
Chap 3-29
Scatter Plots of Data with
Various Correlation Coefficients
Y
Y
Y
X
r = -1
X
r = -.6
Y
© 2004 Prentice-Hall, Inc.
X
r=0
Y
r = .6
X
r=1
X
Chap 3-30
Pitfalls in Numerical Descriptive
Measures and Ethical Issues

Data Analysis is Objective


Data Interpretation is Subjective


Should report the summary measures that best
meet the assumptions about the data set
Should be done in a fair, neutral and clear manner
Ethical Issues

Should document both good and bad results

Presentation should be fair, objective and neutral

Should not use inappropriate summary measures
to distort the facts
© 2004 Prentice-Hall, Inc.
Chap 3-31
Chapter Summary

Described Measures of Central Tendency

Mean, Median, Mode, Geometric Mean

Discussed Quartiles

Described Measures of Variation


Range, Interquartile Range, Variance and Standard
Deviation, Coefficient of Variation
Described the Empirical Rule and the
Bienayme-Chebyshev Rule
© 2004 Prentice-Hall, Inc.
Chap 3-32
Chapter Summary



(continued)
Illustrated Shape of Distribution

Symmetric, Skewed

Using Box-and-Whisker Plots
Discussed Correlation Coefficient
Addressed Pitfalls in Numerical Descriptive
Measures and Ethical Issues
© 2004 Prentice-Hall, Inc.
Chap 3-33
Related documents