```Statistics 800: Quantitative Business
Analysis for Decision Making
Measures of Locations and Variability
Lecture Outline
A. Discussion of Problems From Chapter 3
B. Population and Sample
C. Averages (Mean, Median, Mode)
D. Box Plot; Cumulative Distribution Curves
E. Variance and Standard Deviation; Coefficient of Variation
403.4

Normal Bell Curve
[Figure: normal bell curve; density (0.0 to 0.4) plotted against X from -5 to 5]
Bell Curves Cont’d
[Figure: normal bell curve; density (0.0 to 0.4) plotted against X from -10 to 10]
Population and Sample
•A portion of the population is called a sample
•Sample aspects are called statistics
•Population aspects are called parameters
•Population mean is denoted by µ
•Population standard deviation is denoted by σ
Notations
•Population mean is denoted by µ
•Sample mean is denoted by x̄ (X-bar)
•Population standard deviation is denoted by σ
•Sample standard deviation is denoted by S
Notations
•Capital letters (X, Y, Z, etc.) denote variables
•Lower case letters (x1, x2, x3, etc.) denote observations for X
•n = size of the random sample

Averages (Mean, Median, Mode)

Mean of x1, x2, …, x300 for sample size n = 300:

T = total: T = \sum_{i=1}^{300} x_i

Mean: \bar{x} = T / 300
Mean (cont’d)

Formula for computing the weighted mean, with weights w_i that sum to 1:
\bar{x} = \sum_i w_i x_i
Median & Mode

Median
– has the property that half of the numbers in the data set are smaller and half are larger
– the median need not be a number in the data set

Mode
– the most frequently occurring number in the data set
Ranks and Percentiles

Ranks expressed as percentages are called percentiles of a data set.
[Figure: percentile scale marking the 0th percentile, the 25th percentile (lower quartile), the 50th percentile, the 75th percentile (upper quartile), and the 100th percentile]
Box Plot

Five number summary:
– smallest value (0th percentile)
– lower quartile (25th percentile)
– median (50th percentile)
– upper quartile (75th percentile)
– largest value (100th percentile)

Box Plot is a graphical representation of
the five number summary
Two Box Plots
[Figure: box plot of a data set with median 3 and average 4.35; X axis from 0 to 30]
The distribution is skewed towards extreme values
[Figure: box plot of a data set with median 3 and average 3; X axis from 0 to 5]
The distribution is symmetric around the average
Cumulative Distribution Curves
•Plots data values against their cumulative percentages
•Easy to estimate any percentile value

[Figure: cumulative distribution curve of a data set symmetric around average 3; cumulative percent (0 to 100) plotted against X from 0.0 to 5.0]
One More Cumulative Distribution Curve
[Figure: cumulative distribution curve of a skewed data set with median 3 and average 4.35; cumulative percent (0 to 100) plotted against X from 0 to 30]
Standard Deviation and Coefficient of Variation
•Sampling variation or sampling error: the extent to which repeated samples may differ from each other
•Reliability of an estimate is measured by its sampling error
•Standard deviation of a sample quantifies the extent to which its values vary from their mean

Notation and Formula for Standard Deviation
•s = sample standard deviation
•Formula:

s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}
An Empirical Rule

For a bell-shaped (approximately normal) population:

About 68% of the population is inside the one S.D. interval around the mean:
\mu \pm \sigma

About 95% of the population is inside the two S.D. interval around the mean:
\mu \pm 2\sigma

Nearly 99.7% of the population is inside the three S.D. interval around the mean:
\mu \pm 3\sigma
Coefficient of Variation
•CV is a relative measure of variability; it is the standard deviation divided by the mean
•Useful when the variation is better understood as a percentage

Population CV = \sigma / \mu \times 100\%, and sample CV = s / \bar{x} \times 100\%
Effects of Adding to or Rescaling Data
----------------------------------------------------------------------
Original variable                 X      X + d      cX       cX + d
----------------------------------------------------------------------
Average (median, percentiles)     x̄      x̄ + d      c x̄      c x̄ + d
Standard deviation                s      s          cs       cs
----------------------------------------------------------------------
(The table assumes c > 0; in general the standard deviation scales by |c|.)
```
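The quantities defined in the slides (mean, weighted mean, median, mode, five-number summary, sample standard deviation, coefficient of variation, and the effect of the transformation cX + d) can be sketched in Python as follows. This is a minimal illustration, not part of the lecture: the data set, the weights, the constants c and d, and the helper names are all hypothetical. The quartile convention (`method="inclusive"` in `statistics.quantiles`) is also an assumption, and other conventions give slightly different quartiles.

```python
import math
import statistics

# Hypothetical sample, chosen to be symmetric around 3 (not taken from the slides).
data = [1, 2, 3, 3, 3, 4, 5]


def weighted_mean(values, weights):
    """Weighted mean sum(w_i * x_i); the weights are assumed to sum to 1."""
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must sum to 1")
    return sum(w * x for w, x in zip(weights, values))


def sample_sd(values):
    """Sample standard deviation with the n - 1 divisor."""
    n = len(values)
    xbar = sum(values) / n
    return math.sqrt(sum((x - xbar) ** 2 for x in values) / (n - 1))


def sample_cv(values):
    """Sample coefficient of variation, s / x-bar, as a percentage."""
    return sample_sd(values) / statistics.mean(values) * 100


def five_number_summary(values):
    """Smallest value, lower quartile, median, upper quartile, largest value."""
    # Quartile conventions differ between textbooks and packages;
    # method="inclusive" is one common choice, assumed here.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, q2, q3, max(values))


print("mean:", statistics.mean(data))
print("median:", statistics.median(data))
print("mode:", statistics.mode(data))
print("five-number summary:", five_number_summary(data))
print("sample SD:", sample_sd(data))
print("sample CV (%):", sample_cv(data))
print("weighted mean:", weighted_mean([10, 20, 30], [0.2, 0.3, 0.5]))

# Effect of the transformation cX + d, with hypothetical c = 2 and d = 10:
# the average becomes c * x-bar + d, while the standard deviation becomes c * s.
c, d = 2, 10
transformed = [c * x + d for x in data]
print("mean of cX + d:", statistics.mean(transformed))
print("SD of cX + d:", sample_sd(transformed))
```

Adding d shifts the average but leaves the standard deviation unchanged, which is why only c appears in the last line's result.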