Download Chapter 3 - Personal.kent.edu

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
Business Statistics:
A Decision-Making Approach
6th Edition
Chapter 3
Describing Data Using
Numerical Measures
Fundamentals of Business Statistics - Murali Shanker
Chap 3-1
Chapter Goals
To establish the usefulness of summary measures of data.
The Scientific Method
1. Formulate a theory
2. Collect data to test the theory
3. Analyze the results
4. Interpret the results, and make decisions
Murali Shanker – Fundamentals of Business Statistics
Chap 3-2
Summary Measures
Describing Data Numerically
Center and Location
Other Measures
of Location
Mean
Median
Mode
Weighted Mean
Variation
Range
Percentiles
Interquartile Range
Quartiles
Variance
Standard Deviation
Coefficient of
Variation
Murali Shanker – Fundamentals of Business Statistics
Chap 3-3
Measures of Center and Location
Overview
Center and Location
Mean
Median
Mode
Weighted Mean
n
x
x
i1
i
XW
n

i1
i
i
i
N
x
wx


w
wx


w
i
N
Murali Shanker – Fundamentals of Business Statistics
W
i
i
i
Chap 3-4
Mean (Arithmetic Average)
The mean of a set of quantitative data, X1,X2,…,Xn, is
equal to the sum of the measurements divided by
the number of measurements.

n = Sample Size
Sample mean
n
x

x
i1
x1  x 2    x n

n
i
n
N = Population Size
Population mean
N
x
x1  x 2    xN


N
N
i1
Murali Shanker – Fundamentals of Business Statistics
i
Chap 3-5
Example
Find the mean of the following 5 numbers 5, 3, 8,
5, 6
Murali Shanker – Fundamentals of Business Statistics
Chap 3-6
Mean (Arithmetic Average)
(continued)


Affected by extreme values (outliers)
For non-symmetrical distributions, the mean is located
away from the concentration of items.
0 1 2 3 4 5 6 7 8 9 10
Mean = 3
1  2  3  4  5 15

3
5
5
Murali Shanker – Fundamentals of Business Statistics
0 1 2 3 4 5 6 7 8 9 10
Mean = 4
1  2  3  4  10 20

4
5
5
Chap 3-7
YDI 5.1 and 5.2


Kim’s test scores are 7, 98, 25, 19, and 26.
Calculate Kim’s mean test score. Does the
mean do a good job of capturing Kim’s test
scores?
The mean score for 3 students is 54, and the
mean score for 4 different students is 76. What
is the mean score for all 7 students?
Murali Shanker – Fundamentals of Business Statistics
Chap 3-8
Median
The median Md of a data set is the middle number when
the measurements are arranged in ascending (or
descending) order.
Calculating the Median:
1.
Arrange the n measurements from the smallest to the
largest.
2.
If n is odd, the median is the middle number.
3.
If n is even, the median is the mean (average) of the
middle two numbers.
Example: Calculate the median of 5, 3, 8, 5, 6
Murali Shanker – Fundamentals of Business Statistics
Chap 3-9
Median

Not affected by extreme values
0 1 2 3 4 5 6 7 8 9 10
0 1 2 3 4 5 6 7 8 9 10
Median = 3
Median = 3

In an ordered array, the median is the “middle”
number. What if the values in the data set are
repeated?
Murali Shanker – Fundamentals of Business Statistics
Chap 3-10
Mode
Mode is the measurement that occurs with the greatest
frequency
Example: 5, 3, 8, 6, 6
The modal class in a frequency distribution with equal
class intervals is the class with the largest frequency. If
the frequency polygon has only a single peak, it is said
to be unimodal. If the frequency polygon has two peaks,
it is said to be bimodal.
Murali Shanker – Fundamentals of Business Statistics
Chap 3-11
Review Example

Five houses on a hill by the beach
$2,000 K
House Prices:
$2,000,000
500,000
300,000
100,000
100,000
$500 K
$300 K
$100 K
$100 K
Murali Shanker – Fundamentals of Business Statistics
Chap 3-12
Summary Statistics
House Prices:
$2,000,000
500,000
300,000
100,000
100,000

Mean:

Median:

Mode:
Sum 3,000,000
Murali Shanker – Fundamentals of Business Statistics
Chap 3-13
Which measure of location
is the “best”?

Mean is generally used, unless
extreme values (outliers) exist

Then median is often used, since
the median is not sensitive to
extreme values.

Example: Median home prices may be
reported for a region – less sensitive to
outliers
Murali Shanker – Fundamentals of Business Statistics
Chap 3-14
Shape of a Distribution

Describes how data is distributed

Symmetric or skewed
Left-Skewed
Symmetric
Mean < Median < Mode Mean = Median =
Mode
(Longer tail extends to left)
Murali Shanker – Fundamentals of Business Statistics
Right-Skewed
Mode < Median < Mean
(Longer tail extends to right)
Chap 3-15
Other Location Measures
Other Measures
of Location
Percentiles
Let x1, x2,⋯, xn be a set of n
measurements arranged in increasing
(or decreasing) order. The pth
percentile is a number x such that p%
of the measurements fall below the pth
percentile.
Murali Shanker – Fundamentals of Business Statistics
Quartiles

1st quartile = 25th percentile

2nd quartile = 50th percentile
= median

3rd quartile = 75th percentile
Chap 3-16
Quartiles

Quartiles split the ranked data into 4 equal
groups
25% 25%
25%
25%
Q1

Q2
Q3
Example: Find the first quartile
Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22
Murali Shanker – Fundamentals of Business Statistics
Chap 3-17
Box and Whisker Plot

A Graphical display of data using 5-number
summary:
Minimum -- Q1 -- Median -- Q3 -- Maximum
Example:
25%
Minimum
Minimum
25%
1st
1st
Quartile
Quartile
Murali Shanker – Fundamentals of Business Statistics
25%
Median
Median
25%
3rd
3rd
Quartile
Maximum
Maximum
Quartile
Chap 3-18
Shape of Box and Whisker Plots

The Box and central line are centered between the
endpoints if data is symmetric around the median

A Box and Whisker plot can be shown in either vertical
or horizontal format
Murali Shanker – Fundamentals of Business Statistics
Chap 3-19
Distribution Shape and
Box and Whisker Plot
Left-Skewed
Q1
Q2 Q3
Murali Shanker – Fundamentals of Business Statistics
Symmetric
Q1 Q2 Q3
Right-Skewed
Q1 Q2 Q3
Chap 3-20
Box-and-Whisker Plot Example

Below is a Box-and-Whisker plot for the following
data:
Min
0
Q1
2
2
Q2
2
00 22 33 55

3
3
Q3
4
5
5
Max
10
27
27
27
This data is very right skewed, as the plot depicts
Murali Shanker – Fundamentals of Business Statistics
Chap 3-21
Measures of Variation
Variation
Range
Interquartile
Range
Variance
Standard Deviation
Population
Variance
Population
Standard
Deviation
Sample
Variance
Sample
Standard
Deviation
Murali Shanker – Fundamentals of Business Statistics
Coefficient of
Variation
Chap 3-22
Variation

Measures of variation give information on
the spread or variability of the data
values.
Same center,
different variation
Murali Shanker – Fundamentals of Business Statistics
Chap 3-23
Range


Simplest measure of variation
Difference between the largest and the smallest
observations:
Range = xmaximum – xminimum
Example:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
Murali Shanker – Fundamentals of Business Statistics
Chap 3-24
Disadvantages of the Range


Considers only extreme values
With a frequency distribution, the range of
original data cannot be determined exactly.
Murali Shanker – Fundamentals of Business Statistics
Chap 3-25
Interquartile Range

Can eliminate some outlier problems by using
the interquartile range

Eliminate some high-and low-valued
observations and calculate the range from the
remaining values.

Interquartile range = 3rd quartile – 1st quartile
Murali Shanker – Fundamentals of Business Statistics
Chap 3-26
Interquartile Range
Example:
X
minimum
Q1
25%
12
Murali Shanker – Fundamentals of Business Statistics
Median
(Q2)
25%
30
25%
45
X
Q3
maximum
25%
57
70
Chap 3-27
YDI 5.8
Consider three sampling designs to estimate
the true population mean (the total sample
size is the same for all three designs):
1.
simple random sampling
2.
stratified random sampling taking equal
sample sizes from the two strata
3.
stratified random sampling taking most units
from one strata, but sampling a few units
from the other strata

For which population will design (1) and (2)
be comparably effective?

For which population will design (2) be the
best?

For which population will design (3) be the
best?

Which stratum in this population should
have the higher sample size?
Murali Shanker – Fundamentals of Business Statistics
Chap 3-28
Variance

Average of squared deviations of values from
n
the mean

Sample variance:
Example: 5, 3, 8, 5, 6
Murali Shanker – Fundamentals of Business Statistics
s2 
 (x
i1
i
 x)
2
n -1
Chap 3-29
Variance



The greater the variability of the values in a data set,
the greater the variance is. If there is no variability of the
values — that is, if all are equal and hence all are equal
to the mean — then s2 = 0.
The variance s2 is expressed in units that are the
square of the units of measure of the characteristic
under study. Often, it is desirable to return to the original
units of measure which is provided by the standard
deviation.
The positive square root of the variance is called the
sample standard deviation and is denoted by s
s s
Murali Shanker – Fundamentals of Business Statistics
2
Chap 3-30
Population Variance

Population variance:
N
σ2 

 (x
i1
i
 μ)
2
N
Population Standard Deviation
N
σ
 (x
i1
Murali Shanker – Fundamentals of Business Statistics
i
 μ)
2
N
Chap 3-31
Comparing Standard Deviations
Data A
11
12
13
14
15
16
17
18
19
20 21
Mean = 15.5
s = 3.338
20 21
Mean = 15.5
s = .9258
20 21
Mean = 15.5
s = 4.57
Data B
11
12
13
14
15
16
17
18
19
Data C
11
12
13
14
15
Murali Shanker – Fundamentals of Business Statistics
16
17
18
19
Chap 3-32
Coefficient of Variation

Measures relative variation

Always in percentage (%)

Shows variation relative to mean

Is used to compare two or more sets of data
measured in different units
Population
σ
CV  
μ
  100%
 
Murali Shanker – Fundamentals of Business Statistics
Sample
 s 

CV  
 x   100%


Chap 3-33
YDI

Stock A:
 Average price last year = $50
 Standard deviation = $5

Stock B:


Average price last year = $100
Standard deviation = $5
Murali Shanker – Fundamentals of Business Statistics
Chap 3-34
Linear Transformations
The data on the number of children in a
neighborhood of 10 households is as
follows: 2, 3, 0, 2, 1, 0, 3, 0, 1, 4.
1.
If there are two adults in each of the above
households, what is the mean and
standard deviation of the number of people
(children + adults) living in each
household?
2.
If each child gets an allowance of $3, what
is the mean and standard deviation of the
amount of allowance in each household in
this neighborhood?
Murali Shanker – Fundamentals of Business Statistics
X  1.6
s  1.43
Chap 3-35
Definitions
Let X be the variable representing a set of values,
and sx and X be the standard deviation and
mean of X, respectively. Let Y = aX + b, where
a and b are constants. Then, the mean and
standard deviation of Y are given by
Y  aX b
sY  a s X
Murali Shanker – Fundamentals of Business Statistics
Chap 3-36
Standardized Data Values

A standardized data value refers to
the number of standard deviations a
value is from the mean

Standardized data values are
sometimes referred to as z-scores
Murali Shanker – Fundamentals of Business Statistics
Chap 3-37
Standardized Values
A standardized variable Z has a mean of 0 and a
standard deviation of 1.
xx
z
s
where:
 x = original data value
 x = sample mean
 s = sample standard deviation
 z = standard score
(number of standard deviations x is from the mean)
Murali Shanker – Fundamentals of Business Statistics
Chap 3-38
YDI


During a recent week in
Europe, the temperature X in
Celsius was as follows:
Day
M
T
W
H
F
S
S
X
40
41
39
41
41 40 38
Based on this
X  40
s X  1.14


Calculate the mean and
standard deviation in
Fahrenheit.
Calculate the standardized
score.
Murali Shanker – Fundamentals of Business Statistics
Chap 3-39
Related documents