Download chap03-Numerical Des..

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
IE 340/440
PROCESS IMPROVEMENT
THROUGH PLANNED EXPERIMENTATION
Numerical Descriptive Measures
Dr. Xueping Li
University of Tennessee
© 2003 Prentice-Hall, Inc.
Chap 3-1
Chapter Topics

Measures of Central Tendency

Mean, Median, Mode, Geometric Mean

Quartile

Measure of Variation


Range, Interquartile Range, Variance and Standard
Deviation, Coefficient of Variation
Shape

Symmetric, Skewed, Using Box-and-Whisker Plots
© 2003 Prentice-Hall, Inc.
Chap 3-2
Chapter Topics



(continued)
The Empirical Rule and the BienaymeChebyshev Rule
Coefficient of Correlation
Pitfalls in Numerical Descriptive Measures and
Ethical Issues
© 2003 Prentice-Hall, Inc.
Chap 3-3
Summary Measures
Summary Measures
Central Tendency
Mean
Quartile
Mode
Median
Range
Variation
Coefficient of
Variation
Variance
Geometric Mean
© 2003 Prentice-Hall, Inc.
Standard Deviation
Chap 3-4
Measures of Central Tendency
Central Tendency
Mean
Median
Mode
n
X 
X
i 1
n

i 1
Geometric Mean
X G   X1  X 2 
N
X
i
 Xn 
1/ n
i
N
© 2003 Prentice-Hall, Inc.
Chap 3-5
Mean (Arithmetic Mean)

Mean (Arithmetic Mean) of Data Values

Sample mean
Sample Size
n
X

X
i 1
i
n
Population mean
N

© 2003 Prentice-Hall, Inc.
X
i 1
N
i
X1  X 2 

n
 Xn
Population Size
X1  X 2 

N
 XN
Chap 3-6
Mean (Arithmetic Mean)
(continued)


The Most Common Measure of Central
Tendency
Affected by Extreme Values (Outliers)
0 1 2 3 4 5 6 7 8 9 10
Mean = 5
© 2003 Prentice-Hall, Inc.
0 1 2 3 4 5 6 7 8 9 10 12 14
Mean = 6
Chap 3-7
Mean (Arithmetic Mean)
(continued)

Approximating the Arithmetic Mean

Used when raw data are not available
c

X
m
j 1
j
fj
n
n  sample size
c  number of classes in the frequency distribution
m j  midpoint of the jth class
f j  frequencies of the jth class
© 2003 Prentice-Hall, Inc.
Chap 3-8
Median


Robust Measure of Central Tendency
Not Affected by Extreme Values
0 1 2 3 4 5 6 7 8 9 10
Median = 5

0 1 2 3 4 5 6 7 8 9 10 12 14
Median = 5
In an Ordered Array, the Median is the ‘Middle’
Number


If n or N is odd, the median is the middle number
If n or N is even, the median is the average of the
2 middle numbers
© 2003 Prentice-Hall, Inc.
Chap 3-9
Mode






A Measure of Central Tendency
Value that Occurs Most Often
Not Affected by Extreme Values
There May Not Be a Mode
There May Be Several Modes
Used for Either Numerical or Categorical Data
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
Mode = 9
© 2003 Prentice-Hall, Inc.
0 1 2 3 4 5 6
No Mode
Chap 3-10
Geometric Mean

Useful in the Measure of Rate of Change of a
Variable Over Time
X G   X1  X 2 

 Xn 
1/ n
Geometric Mean Rate of Return

Measures the status of an investment over time
RG  1  R1   1  R2  
© 2003 Prentice-Hall, Inc.
 1  Rn  
1/ n
1
Chap 3-11
Example
An investment of $100,000 declined to $50,000 at the
end of year one and rebounded back to $100,000 at end
of year two:
R1  0.5 (or  50%) R2  1 (or 100% )
Average rate of return:
( 0.5)  (1)
R
 0.25 (or 25%)
2
Geometric rate of return:
RG  1  0.5  1  1 
1/ 2
  0.5   2  
1/ 2
© 2003 Prentice-Hall, Inc.
1
 1  11/ 2  1  0 (or 0%)
Chap 3-12
Quartiles

Split Ordered Data into 4 Quarters
25%
25%
 Q1 

25%
 Q2 
Position of i-th Quartile
25%
Q3 
i  n  1
 Qi  
4
Data in Ordered Array: 11 12 13 16 16 17 18 21 22
1 9  1
Position of Q1 
 2.5
4
Q1
12  13


 12.5
2
Q1 and Q3 are Measures of Noncentral Location
 Q = Median, a Measure of Central Tendency
2

© 2003 Prentice-Hall, Inc.
Chap 3-13
Measures of Variation
Variation
Variance
Range
Population
Variance
Sample
Variance
Interquartile Range
© 2003 Prentice-Hall, Inc.
Standard Deviation
Coefficient
of Variation
Population
Standard
Deviation
Sample
Standard
Deviation
Chap 3-14
Range


Measure of Variation
Difference between the Largest and the
Smallest Observations:
Range  X Largest  X Smallest

Ignores How Data are Distributed
Range = 12 - 7 = 5
Range = 12 - 7 = 5
7
8
© 2003 Prentice-Hall, Inc.
9
10
11
12
7
8
9
10
11
12
Chap 3-15
Interquartile Range


Measure of Variation
Also Known as Midspread


Spread in the middle 50%
Difference between the First and Third
Quartiles
Data in Ordered Array: 11 12 13 16 16 17
17 18 21
Interquartile Range  Q3  Q1  17.5  12.5  5

Not Affected by Extreme Values
© 2003 Prentice-Hall, Inc.
Chap 3-16
Variance


Important Measure of Variation
Shows Variation about the Mean

Sample Variance:
n
S 
2

 X
i 1
X
i
2
n 1
Population Variance:
N
 
2
© 2003 Prentice-Hall, Inc.
 X
i 1
i

N
2
Chap 3-17
Standard Deviation



Most Important Measure of Variation
Shows Variation about the Mean
Has the Same Units as the Original Data

Sample Standard Deviation:
S

Population Standard Deviation:

© 2003 Prentice-Hall, Inc.
n
 X
i 1
X
i
2
n 1
N
 X
i 1
i

2
N
Chap 3-18
Standard Deviation

(continued)
Approximating the Standard Deviation

Used when the raw data are not available and the
only source of data is a frequency distribution
m
c

S
j 1
 X  fj
2
j
n 1
n  sample size
c  number of classes in the frequency distribution
m j  midpoint of the jth class
© 2003 Prentice-Hall, Inc.
f j  frequencies of the jth class
Chap 3-19
Comparing Standard Deviations
Data A
11 12 13 14 15 16 17 18 19 20 21
Mean = 15.5
s = 3.338
Data B
11 12 13 14 15 16 17 18 19 20 21
Mean = 15.5
s = .9258
Data C
11 12 13 14 15 16 17 18 19 20 21
© 2003 Prentice-Hall, Inc.
Mean = 15.5
s = 4.57
Chap 3-20
Coefficient of Variation

Measure of Relative Variation

Always in Percentage (%)

Shows Variation Relative to the Mean



Used to Compare Two or More Sets of Data
Measured in Different Units
S
CV  
X

100%

Sensitive to Outliers
© 2003 Prentice-Hall, Inc.
Chap 3-21
Comparing Coefficient
of Variation

Stock A:



Stock B:



Average price last year = $50
Standard deviation = $2
Average price last year = $100
Standard deviation = $5
Coefficient of Variation:

Stock A:

Stock B:
© 2003 Prentice-Hall, Inc.
S
CV  
X

 $2 
 100%  
 100%  4%

 $50 
S
CV  
X

 $5 
100%  
100%  5%

 $100 
Chap 3-22
Shape of a Distribution

Describe How Data are Distributed

Measures of Shape

Symmetric or skewed
Left-Skewed
Mean < Median < Mode
© 2003 Prentice-Hall, Inc.
Symmetric
Mean = Median =Mode
Right-Skewed
Mode < Median < Mean
Chap 3-23
Exploratory Data Analysis

Box-and-Whisker

Graphical display of data using 5-number summary
X smallest Q
1
4
© 2003 Prentice-Hall, Inc.
6
Median( Q2)
8
Q3
10
Xlargest
12
Chap 3-24
Distribution Shape &
Box-and-Whisker
Left-Skewed
Q1
© 2003 Prentice-Hall, Inc.
Q2 Q3
Symmetric
Q1Q2Q3
Right-Skewed
Q1 Q2 Q3
Chap 3-25
The Empirical Rule



For Most Data Sets, Roughly 68% of the
Observations Fall Within 1 Standard Deviation
Around the Mean
Roughly 95% of the Observations Fall Within
2 Standard Deviations Around the Mean
Roughly 99.7% of the Observations Fall Within
3 Standard Deviations Around the Mean
© 2003 Prentice-Hall, Inc.
Chap 3-26
The Bienayme-Chebyshev Rule

The Percentage of Observations Contained
Within Distances of k Standard Deviations
Around the Mean Must Be at Least 1  1/ k 2 100%




Applies regardless of the shape of the data set
At least 75% of the observations must be contained
within distances of 2 standard deviations around the
mean
At least 88.89% of the observations must be
contained within distances of 3 standard deviations
around the mean
At least 93.75% of the observations must be
contained within distances of 4 standard deviations
around the mean
© 2003 Prentice-Hall, Inc.
Chap 3-27
Coefficient of Correlation

Measures the Strength of the Linear
Relationship between 2 Quantitative Variables
n

r
 X
i 1
n
 X
i 1
© 2003 Prentice-Hall, Inc.
i
i
 X Yi  Y 
X
2
n
 Y  Y 
i 1
2
i
Chap 3-28
Features of Correlation
Coefficient

Unit Free

Ranges between –1 and 1



The Closer to –1, the Stronger the Negative
Linear Relationship
The Closer to 1, the Stronger the Positive
Linear Relationship
The Closer to 0, the Weaker Any Linear
Relationship
© 2003 Prentice-Hall, Inc.
Chap 3-29
Scatter Plots of Data with
Various Correlation Coefficients
Y
Y
Y
X
r = -1
X
r = -.6
Y
© 2003 Prentice-Hall, Inc.
X
r=0
Y
r = .6
X
r=1
X
Chap 3-30
Pitfalls in Numerical Descriptive
Measures and Ethical Issues

Data Analysis is Objective


Data Interpretation is Subjective


Should report the summary measures that best
meet the assumptions about the data set
Should be done in a fair, neutral and clear manner
Ethical Issues

Should document both good and bad results

Presentation should be fair, objective and neutral

Should not use inappropriate summary measures
to distort the facts
© 2003 Prentice-Hall, Inc.
Chap 3-31
Chapter Summary

Described Measures of Central Tendency

Mean, Median, Mode, Geometric Mean

Discussed Quartiles

Described Measures of Variation


Range, Interquartile Range, Variance and Standard
Deviation, Coefficient of Variation
Illustrated Shape of Distribution

Symmetric, Skewed, Using Box-and-Whisker Plots
© 2003 Prentice-Hall, Inc.
Chap 3-32
Chapter Summary



(continued)
Described the Empirical Rule and the
Bienayme-Chebyshev Rule
Discussed Correlation Coefficient
Addressed Pitfalls in Numerical Descriptive
Measures and Ethical Issues
© 2003 Prentice-Hall, Inc.
Chap 3-33
Related documents