Chapter Two
Descriptive Statistics
McGraw-Hill/Irwin
Copyright © 2004 by The McGraw-Hill Companies, Inc. All rights reserved.
Descriptive Statistics
2.1 Describing the Shape of a Distribution
2.2 Describing Central Tendency
2.3 Measures of Variation
2.4 Percentiles, Quartiles, and Box-and-Whiskers Displays
2.5 Describing Qualitative Data
*2.6 Using Scatter Plots to Study the Relationship Between Variables
*2.7 Misleading Graphs and Charts
2.1 Stem and Leaf Display: Car Mileage
Example 2.1: The Car Mileage Case
Depth  Stem | Leaves
    1    29 | 8
    5    30 | 1344
   12    30 | 5666889
   21    31 | 001233444
 (11)    31 | 55566777889
   17    32 | 0001122344
    7    32 | 556788
    1    33 | 3
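A display like the one above splits each measurement into a stem (here the whole-number part) and a leaf (the tenths digit). The following is a minimal Python sketch of that idea, not the textbook's procedure: the function name `stem_and_leaf` is hypothetical, it assumes measurements recorded to one decimal place, and it uses only the five mileages listed in Example 2.5 rather than the full data set. Unlike the display above, it does not split each stem into two rows or compute the cumulative counts in the first column.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return {stem: sorted leaves} for values recorded to one decimal place."""
    rows = defaultdict(list)
    for v in values:
        # Work in tenths to avoid floating-point digit errors: 30.8 -> 308 -> (30, 8)
        stem, leaf = divmod(round(v * 10), 10)
        rows[stem].append(leaf)
    return {stem: sorted(leaves) for stem, leaves in sorted(rows.items())}

mileages = [30.8, 31.7, 30.1, 31.6, 32.1]  # first five mileages, Example 2.5
display = stem_and_leaf(mileages)
for stem, leaves in display.items():
    print(stem, "|", "".join(str(leaf) for leaf in leaves))
```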
Stem and Leaf Display: Payment Times
Example 2.2: The Accounts Receivable Case
Depth  Stem | Leaves
    1    10 | 0
    2    11 | 0
    4    12 | 00
    7    13 | 000
   11    14 | 0000
   18    15 | 0000000
   27    16 | 000000000
  (8)    17 | 00000000
   30    18 | 000000
   24    19 | 00000
   19    20 | 000
   16    21 | 000
   13    22 | 000
   10    23 | 00
    8    24 | 000
    5    25 | 00
    3    26 | 0
    2    27 | 0
    1    28 |
    1    29 | 0
Histograms
Example 2.4: The Accounts Receivable Case
[Figures: Frequency Histogram; Relative Frequency Histogram]
The Normal Curve
[Figure: the normal curve]
Skewness
[Figures: Left Skewed; Symmetric; Right Skewed]
Dot Plots
[Figure: Scores on Exams 1 and 2]
2.2 Population Parameters and Sample Statistics
A population parameter is a number calculated from all the population measurements that describes some aspect of the population.
The population mean, denoted $\mu$, is a population parameter and is the average of the population measurements.
A point estimate is a one-number estimate of the value of a population parameter.
A sample statistic is a number calculated using sample measurements that describes some aspect of the sample.
Measures of Central Tendency
Mean, $\mu$
The average or expected value
Median, Md
The middle point of the ordered measurements
Mode, Mo
The most frequent value
The Mean
Population: X1, X2, …, XN
Population Mean: $\mu = \frac{\sum_{i=1}^{N} X_i}{N}$
Sample: x1, x2, …, xn
Sample Mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
The Sample Mean
The sample mean $\bar{x}$ is defined as
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$
and is a point estimate of the population mean, $\mu$.
Example: Car Mileage Case
Example 2.5: Sample mean for the first five car mileages from Table 2.1
30.8, 31.7, 30.1, 31.6, 32.1
$\bar{x} = \frac{\sum_{i=1}^{5} x_i}{5} = \frac{x_1 + x_2 + x_3 + x_4 + x_5}{5} = \frac{30.8 + 31.7 + 30.1 + 31.6 + 32.1}{5} = \frac{156.3}{5} = 31.26$
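The arithmetic in this example can be checked with a few lines of Python. This is only an illustrative sketch: the variable names are assumptions, and `statistics.mean` from the standard library is used as a cross-check rather than as the textbook's method.

```python
from statistics import mean

mileages = [30.8, 31.7, 30.1, 31.6, 32.1]  # first five mileages from Table 2.1

total = sum(mileages)          # x_1 + x_2 + ... + x_5
x_bar = total / len(mileages)  # divide by n = 5

print(round(total, 1), round(x_bar, 2))  # 156.3 31.26

# The library routine should agree up to floating-point rounding
print(round(mean(mileages), 2))
```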
The Median
The population or sample median is a value such that
50% of all measurements lie above (or below) it.
The median Md is found as follows:
1. If the number of measurements is odd, the
median is the middlemost measurement in the
ordered values.
2. If the number of measurements is even, the
median is the average of the two middlemost
measurements in the ordered values.
Example: Sample Median
Example 2.6: Internists’ Salaries (in thousands of dollars)
127 132 138 141 144 146 152 154 165 171 177 192 241
Since n = 13 is odd, the median is the middlemost (7th) measurement: Md = 152
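The odd/even rule for the median translates directly into code. The sketch below is a minimal illustration, assuming a non-empty list of numbers; the function name `median` and the even-sized case (obtained here by dropping the largest salary) are illustrative choices, not part of the example.

```python
def median(values):
    """Median by the odd/even rule: middle value, or mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd n: middlemost measurement
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average of the two middlemost

salaries = [127, 132, 138, 141, 144, 146, 152, 154, 165, 171, 177, 192, 241]

md_salaries = median(salaries)   # n = 13 is odd: the 7th ordered value
md_even = median(salaries[:-1])  # n = 12 is even: mean of the 6th and 7th values

print(md_salaries, md_even)  # 152 149.0
```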
The Mode
The mode, Mo, of a population or sample of measurements is the measurement that occurs most frequently.
Example: Sample Mode
Example 2.2: The Accounts Receivable Case
In the stem-and-leaf display of payment times, the value 16 occurs 9 times, more often than any other value. Therefore:
Mo = 16
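The mode of the payment times can also be found by counting. The sketch below rebuilds the measurements from the number of leaves on each stem of the payment-time display, assuming each leaf `0` stands for one measurement equal to its stem; the names `leaf_counts` and `payment_times` are hypothetical.

```python
from collections import Counter

# Number of leaves on each stem of the payment-time display
leaf_counts = {10: 1, 11: 1, 12: 2, 13: 3, 14: 4, 15: 7, 16: 9, 17: 8,
               18: 6, 19: 5, 20: 3, 21: 3, 22: 3, 23: 2, 24: 3, 25: 2,
               26: 1, 27: 1, 28: 0, 29: 1}

# Expand the counts back into individual measurements
payment_times = [value for value, count in leaf_counts.items() for _ in range(count)]

# most_common(1) returns the single most frequent value with its count
mode_value, mode_count = Counter(payment_times).most_common(1)[0]
print(len(payment_times), mode_value, mode_count)  # 65 16 9
```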
Relationships Among Mean, Median, and Mode
2.3 Measures of Variation
Range
Largest minus the smallest measurement
Variance
The average of the squared deviations from the mean
Standard Deviation
The square root of the variance
The Range
Range = largest measurement - smallest measurement
Example:
Internists’ Salaries (in thousands of dollars)
127 132 138 141 144 146 152 154 165 171 177 192 241
Range = 241 - 127 = 114 ($114,000)
The Variance
Population: X1, X2, …, XN
Population Variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$
Sample: x1, x2, …, xn
Sample Variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
The Standard Deviation
Population Standard Deviation, $\sigma$: $\sigma = \sqrt{\sigma^2}$
Sample Standard Deviation, s: $s = \sqrt{s^2}$
Example: Population Variance/Standard Deviation
Population of annual returns for five junk bond mutual funds:
10.0%, 9.4%, 9.1%, 8.3%, 7.8%
$\mu = \frac{10.0 + 9.4 + 9.1 + 8.3 + 7.8}{5} = \frac{44.6}{5} = 8.92\%$
$\sigma^2 = \frac{(10.0 - 8.92)^2 + (9.4 - 8.92)^2 + (9.1 - 8.92)^2 + (8.3 - 8.92)^2 + (7.8 - 8.92)^2}{5}$
$= \frac{1.1664 + 0.2304 + 0.0324 + 0.3844 + 1.2544}{5} = \frac{3.068}{5} = 0.6136$
$\sigma = \sqrt{\sigma^2} = \sqrt{0.6136} = 0.7833$
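The population variance and standard deviation of the five fund returns can be recomputed as follows. This is a sketch with assumed variable names; `statistics.pvariance` and `statistics.pstdev` (the standard library's population versions, which divide by N) are included only as a cross-check.

```python
from math import sqrt
from statistics import pvariance, pstdev

returns = [10.0, 9.4, 9.1, 8.3, 7.8]  # annual returns (%) for the five funds

N = len(returns)
mu = sum(returns) / N                               # population mean
sigma_sq = sum((x - mu) ** 2 for x in returns) / N  # divide by N, not N - 1
sigma = sqrt(sigma_sq)

print(round(mu, 2), round(sigma_sq, 4), round(sigma, 4))  # 8.92 0.6136 0.7833

# Cross-check against the library's population formulas
print(round(pvariance(returns), 4), round(pstdev(returns), 4))
```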
Example: Sample Variance/Standard Deviation
Example 2.11: Sample variance and standard deviation for the first five car mileages from Table 2.1
30.8, 31.7, 30.1, 31.6, 32.1
$\bar{x} = 31.26$
$s^2 = \frac{\sum_{i=1}^{5} (x_i - \bar{x})^2}{5 - 1}$
$= \frac{(30.8 - 31.26)^2 + (31.7 - 31.26)^2 + (30.1 - 31.26)^2 + (31.6 - 31.26)^2 + (32.1 - 31.26)^2}{4}$
$= \frac{2.572}{4} = 0.643$
$s = \sqrt{s^2} = \sqrt{0.643} = 0.8019$
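The sample version differs from the population version only in the divisor, n - 1 instead of N. A short sketch with assumed variable names, using `statistics.variance` and `statistics.stdev` (the standard library's sample versions) as a cross-check:

```python
from statistics import variance, stdev

mileages = [30.8, 31.7, 30.1, 31.6, 32.1]  # first five mileages from Table 2.1

n = len(mileages)
x_bar = sum(mileages) / n
squared_devs = sum((x - x_bar) ** 2 for x in mileages)  # sum of squared deviations
s_sq = squared_devs / (n - 1)                           # divide by n - 1 = 4
s = s_sq ** 0.5

print(round(squared_devs, 3), round(s_sq, 3), round(s, 4))  # 2.572 0.643 0.8019

# Cross-check against the library's sample formulas
print(round(variance(mileages), 3), round(stdev(mileages), 4))
```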
The Empirical Rule for Normal Populations
If a population has mean $\mu$ and standard deviation $\sigma$ and is described by a normal curve, then
68.26% of the population measurements lie within one standard deviation of the mean: $[\mu - \sigma, \mu + \sigma]$
95.44% of the population measurements lie within two standard deviations of the mean: $[\mu - 2\sigma, \mu + 2\sigma]$
99.73% of the population measurements lie within three standard deviations of the mean: $[\mu - 3\sigma, \mu + 3\sigma]$
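These percentages can be reproduced from the normal distribution itself: the fraction of a normal population within k standard deviations of the mean equals erf(k/√2). The sketch below uses `math.erf` from the standard library; the function name `normal_within` is hypothetical. The computed values round to 68.27% and 95.45%, one unit in the last digit above the slide's 68.26% and 95.44%, presumably because the slide's figures come from a normal table rounded to four decimals (2 × 0.3413 and 2 × 0.4772).

```python
from math import erf, sqrt

def normal_within(k):
    """Fraction of a normal population within k standard deviations of the mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(k, f"{100 * normal_within(k):.2f}%")
# 1 68.27%
# 2 95.45%
# 3 99.73%
```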
Example: The Empirical Rule
Example 2.13: The Car Mileage Case
Chebyshev’s Theorem
Let $\mu$ and $\sigma$ be a population’s mean and standard deviation. Then, for any value k > 1, at least $100(1 - 1/k^2)\%$ of the population measurements lie in the interval $[\mu - k\sigma, \mu + k\sigma]$.
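To get a feel for the bound, it helps to evaluate it for a few values of k. A minimal sketch; the function name `chebyshev_lower_bound` and the chosen values of k are illustrative assumptions.

```python
def chebyshev_lower_bound(k):
    """Minimum percentage of measurements within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("the theorem is stated for k > 1")
    return 100 * (1 - 1 / k ** 2)

bounds = {k: chebyshev_lower_bound(k) for k in (1.5, 2, 3)}
for k, pct in bounds.items():
    print(f"k = {k}: at least {pct:.2f}%")
# k = 1.5: at least 55.56%
# k = 2: at least 75.00%
# k = 3: at least 88.89%
```

Unlike the Empirical Rule, the theorem does not require a normal population, which is why its guarantees (75% for k = 2) are weaker than the normal-curve figures.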
2.4 Percentiles and Quartiles
For a set of measurements arranged in increasing
order, the pth percentile is a value such that p percent
of the measurements fall at or below the value and
(100-p) percent of the measurements fall at or above
the value.
The first quartile Q1 is the 25th percentile
The second quartile (or median) Md is the 50th
percentile
The third quartile Q3 is the 75th percentile.
The interquartile range IQR is Q3 - Q1
Example: Quartiles
20 customer satisfaction ratings:
1 3 5 5 7 8 8 8 8 8 8 9 9 9 9 9 10 10 10 10
Md = (8+8)/2 = 8
Q1 = (7+8)/2 = 7.5
Q3 = (9+9)/2 = 9
IQR = Q3 - Q1 = 9 - 7.5 = 1.5
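The quartiles in this example are the medians of the lower and upper halves of the ordered ratings. The sketch below follows that approach; the function names are hypothetical, it assumes at least two measurements, and for an odd number of measurements it leaves the middle value out of both halves, which is one convention among several (library routines such as `statistics.quantiles` use interpolation rules that can give slightly different values).

```python
def median(values):
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Return (Q1, Md, Q3) as medians of the lower half, whole set, and upper half."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # smallest half of the measurements
    upper = ordered[-half:]   # largest half; middle value excluded when n is odd (assumption)
    return median(lower), median(ordered), median(upper)

ratings = [1, 3, 5, 5, 7, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 10, 10, 10, 10]
q1, md, q3 = quartiles(ratings)
iqr = q3 - q1
print(q1, md, q3, iqr)  # 7.5 8.0 9.0 1.5
```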
Box and Whiskers Plots
2.5 Describing Qualitative Data
Population and Sample Proportions
Population: X1, X2, …, XN
Population Proportion: p
Sample: x1, x2, …, xn
Sample Proportion: $\hat{p} = \frac{\sum_{i=1}^{n} x_i}{n}$
where xi = 1 if the characteristic is present, 0 if not
Example: Sample Proportion
Example 2.16: Marketing Ethics Case
117 out of 205 marketing researchers disapproved of the action taken in a hypothetical scenario
X = 117, number of researchers who disapproved
n = 205, number of researchers surveyed
Sample Proportion: $\hat{p} = \frac{X}{n} = \frac{117}{205} = 0.57$
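The sample proportion is the mean of the 0/1 indicator values, so it can be computed either from the counts or from the coded responses. A short sketch with assumed variable names:

```python
disapproved = 117  # X: researchers who disapproved
surveyed = 205     # n: researchers surveyed

p_hat = disapproved / surveyed

# Equivalent 0/1 coding: x_i = 1 if the researcher disapproved, 0 otherwise
responses = [1] * disapproved + [0] * (surveyed - disapproved)
p_hat_from_indicators = sum(responses) / len(responses)

print(round(p_hat, 2), round(p_hat_from_indicators, 2))  # 0.57 0.57
```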
Bar Chart
Percentage of Automobiles Sold by Manufacturer,
1970 versus 1997
Pie Chart
Percentage of Automobiles Sold by Manufacturer, 1997
Pareto Chart
Pareto Chart of Labeling Defects
2.6 Scatter Plots
Restaurant Ratings: Mean Preference vs. Mean Taste
2.7 Misleading Graphs and Charts: Scale Break
Mean Salaries at a Major University, 1999-2002
Misleading Graphs and Charts: Horizontal Scale Effects
Mean Salary Increases at a Major University, 1999-2002
Descriptive Statistics
Summary:
2.1 Describing the Shape of a Distribution
2.2 Describing Central Tendency
2.3 Measures of Variation
2.4 Percentiles, Quartiles, and Box-and-Whiskers Displays
2.5 Describing Qualitative Data
*2.6 Using Scatter Plots to Study the Relationship Between Variables
*2.7 Misleading Graphs and Charts