Chapter 3 - Gordon State College

Lecture Slides
Elementary Statistics
Eleventh Edition
and the Triola Statistics Series
by Mario F. Triola
Copyright © 2010, 2007, 2004 Pearson Education, Inc.
Chapter 3
Statistics for Describing,
Exploring, and Comparing Data
3-1 Review and Preview
3-2 Measures of Center
3-3 Measures of Variation
3-4 Measures of Relative Standing and
Boxplots
Section 3-3
Measures of Variation
Key Concept
Discuss characteristics of variation, in
particular, measures of variation, such as
standard deviation, for analyzing data.
Make understanding and interpreting the
standard deviation a priority.
Part 1
Basic Concepts of
Measures of Variation
Definition
The range of a set of data values is
the difference between the
maximum data value and the
minimum data value.
Range = (maximum value) – (minimum value)
The range is very sensitive to extreme values, so it is
not as useful as other measures of variation.
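As a minimal Python sketch of the definition above, the range can be computed directly. The function name data_range and the added extreme value of 100 are illustrative assumptions, not part of the slides; the first list is data set A from the standard deviation example in this section.

```python
def data_range(values):
    """Range = (maximum value) - (minimum value)."""
    return max(values) - min(values)

# Data set A from the standard deviation example in this section.
a = [2, 8, 9, 12, 19]
# Hypothetical extreme value appended to show how sensitive the range is.
a_with_outlier = a + [100]

print(data_range(a))               # 17
print(data_range(a_with_outlier))  # 98
```

One added value moves the range from 17 to 98, which illustrates why the range is sensitive to extreme values.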
Round-Off Rule for
Measures of Variation
When rounding the value of a
measure of variation, carry one more
decimal place than is present in the
original set of data.
Round only the final answer, not values in
the middle of a calculation.
Definition
The standard deviation of a set of
sample values, denoted by s, is a
measure of variation of values about
the mean.
Standard Deviation
Let A = {2, 8, 9, 12, 19} with a mean of 10. Use data set A
to find the sample standard deviation.

x (data value)    x – mean (deviation)    (deviation)^2
2                 2 – 10 = -8             (-8)^2 = 64
8                 8 – 10 = -2             (-2)^2 = 4
9                 9 – 10 = -1             (-1)^2 = 1
12                12 – 10 = 2             (2)^2 = 4
19                19 – 10 = 9             (9)^2 = 81
Total                                     154

$$\text{standard deviation} = \sqrt{\frac{\text{sum of (deviations from the mean)}^2}{\text{total number of data values} - 1}}$$

$$\text{standard deviation} = \sqrt{\frac{154}{5 - 1}} \approx 6.2$$
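The same steps can be sketched in Python. This is only an illustration of the table above; the variable names are assumptions chosen for readability.

```python
import math

data = [2, 8, 9, 12, 19]            # data set A from the slide
n = len(data)
mean = sum(data) / n                # 10.0

# Deviation of each value from the mean, then the squared deviations.
deviations = [x - mean for x in data]
squared = [d ** 2 for d in deviations]
total = sum(squared)                # 154.0

# Sample standard deviation: divide by n - 1, then take the square root.
s = math.sqrt(total / (n - 1))
print(round(s, 1))                  # 6.2
```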
Copyright © 2005 Pearson Education, Inc.
Sample Standard
Deviation Formula
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
Rationale for using n – 1
versus n
Dividing by n – 1 yields better results
than dividing by n. It causes s² to
target σ², whereas division by n causes
s² to underestimate σ².
Population Standard
Deviation
$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$
This formula is similar to the sample
formula, but it uses the population mean µ
and the population size N in place of the
sample mean and n – 1.
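To see the difference between the two divisors, here is a short sketch using Python's standard statistics module, where stdev divides by n – 1 and pstdev divides by N. Treating the five values of data set A first as a sample and then as a complete population is an assumption made only for illustration.

```python
import statistics

data = [2, 8, 9, 12, 19]            # data set A from the earlier slide

s = statistics.stdev(data)          # sample formula: divides by n - 1
sigma = statistics.pstdev(data)     # population formula: divides by N

print(round(s, 2), round(sigma, 2))  # 6.2 5.55
```

For the same values, the population formula always gives the smaller result because it divides by the larger number.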
Standard Deviation: Important Properties
• The standard deviation is a measure of
variation of all values from the mean.
• The value of the standard deviation s is
usually positive. It is zero only when all of
the data values are the same, and it is never
negative.
• The value of the standard deviation s can
increase dramatically with the inclusion of
one or more outliers (data values far away
from all others).
• The units of the standard deviation s are the
same as the units of the original data values.
Variance
• The variance of a set of values is a
measure of variation equal to the
square of the standard deviation.
• Sample variance: s² - square of the
sample standard deviation s
• Population variance: σ² - square of
the population standard deviation σ
Unbiased Estimator
The sample variance s² is an
unbiased estimator of the population
variance σ², which means values of
s² tend to target the value of σ²
instead of systematically tending to
overestimate or underestimate σ².
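A small exact check can make "tend to target" concrete. The sketch below assumes a tiny illustrative population {1, 2, 5} (not taken from these slides) and lists every equally likely sample of size 2 drawn with replacement. It then averages the sample variances computed with each divisor. Fractions are used so that the comparison is exact.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 5]    # small illustrative population (an assumption)
N = len(population)
mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N   # population variance

def variance(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    m = Fraction(sum(sample), len(sample))
    return sum((x - m) ** 2 for x in sample) / divisor

# All 9 equally likely samples of size n = 2, drawn with replacement.
n = 2
samples = list(product(population, repeat=n))
mean_s2_n_minus_1 = sum(variance(s, n - 1) for s in samples) / len(samples)
mean_s2_n = sum(variance(s, n) for s in samples) / len(samples)

print(sigma2, mean_s2_n_minus_1, mean_s2_n)   # 26/9 26/9 13/9
```

Averaged over all samples, division by n – 1 reproduces the population variance exactly, while division by n gives a smaller value.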
Variance - Notation
s = sample standard deviation
s² = sample variance
σ = population standard deviation
σ² = population variance
Compare the standard deviations of two
different data sets only if they
use the same scale and units and
have means that are
approximately the same.
When comparing the variation of two
data sets with different units or very
different means, use the
coefficient of variation (CV).
Coefficient of Variation
The coefficient of variation (or CV) for a set
of nonnegative sample or population data,
expressed as a percent, describes the
standard deviation relative to the mean.
Sample:
$$CV = \frac{s}{\bar{x}} \cdot 100\%$$
Population:
$$CV = \frac{\sigma}{\mu} \cdot 100\%$$
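As a rough sketch of the sample formula, the function below computes the CV with Python's statistics module. The function name sample_cv and both data lists are hypothetical; they were chosen only to show two variables measured in different units.

```python
import statistics

def sample_cv(values):
    """Sample CV as a percent: (s / x-bar) * 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100

# Hypothetical measurements in different units (not from the slides).
heights_in = [66, 68, 70, 72, 74]
weights_lb = [140, 160, 180, 200, 220]

print(round(sample_cv(heights_in), 1))   # 4.5
print(round(sample_cv(weights_lb), 1))   # 17.6
```

Because the CV has no units, the two percentages can be compared directly even though inches and pounds cannot.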
Part 2
Beyond the Basics of
Measures of Variation
Range Rule of Thumb
is based on the principle that for many
data sets, the vast majority (such as
95%) of sample values lie within two
standard deviations of the mean.
Range Rule of Thumb for
Interpreting a Known Value of the
Standard Deviation
Informally define usual values in a data set to be
those that are typical and not too extreme. Find
rough estimates of the minimum and maximum
“usual” sample values as follows:
Minimum “usual” value = (mean) – 2 × (standard deviation)
Maximum “usual” value = (mean) + 2 × (standard deviation)
Range Rule of Thumb for
Estimating a Value of the
Standard Deviation s
To roughly estimate the standard
deviation from a collection of known
sample data use
$$s \approx \frac{\text{range}}{4}$$
where
range = (maximum value) – (minimum value)
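A minimal sketch of this estimate, applied to data set A from the earlier standard deviation example, shows how rough it can be. The function name estimate_s is an assumption.

```python
import statistics

def estimate_s(values):
    """Range rule of thumb: s is roughly range / 4."""
    return (max(values) - min(values)) / 4

data = [2, 8, 9, 12, 19]    # data set A from the earlier slide

print(estimate_s(data))                   # 4.25
print(round(statistics.stdev(data), 1))   # 6.2
```

With only five values the estimate (4.25) is noticeably below the computed value (6.2), so it is best treated as a quick approximation rather than a replacement for the formula.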
Properties of the
Standard Deviation
• Measures the variation among data
values
• Values close together have a small
standard deviation, but values with
much more variation have a larger
standard deviation
• Has the same units of measurement
as the original data
Empirical (or 68-95-99.7) Rule
For data sets having a distribution that is
approximately bell shaped, the following
properties apply:
• About 68% of all values fall within 1
standard deviation of the mean.
• About 95% of all values fall within 2
standard deviations of the mean.
• About 99.7% of all values fall within 3
standard deviations of the mean.
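The three percentages can be checked against an exactly normal distribution, which is one common model for "approximately bell shaped" data. The use of the normal model here is an assumption; the slides themselves do not name it. The sketch relies on statistics.NormalDist from Python's standard library (Python 3.8 or later), and the helper name within_k is illustrative.

```python
from statistics import NormalDist

z = NormalDist()    # standard normal: mean 0, standard deviation 1

def within_k(k):
    """Proportion of a normal distribution within k standard deviations."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * within_k(k), 1))
# 1 68.3
# 2 95.4
# 3 99.7
```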
The Empirical Rule
EXAMPLE
A random sample of 50 gas stations in Cook
County, Illinois, resulted in a mean price per gallon
of $1.60 and a standard deviation of $0.07. A
histogram indicated that the data follow a
bell-shaped distribution.
(a) Use the Empirical Rule to determine the
percentage of gas stations that have prices
within three standard deviations of the
mean. What are these gas prices?
(b) Determine the percentage of gas stations
with prices between $1.46 and $1.74, according
to the Empirical Rule.
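One way to work through this example is sketched below. It only does the arithmetic; the percentages in the comments come from the Empirical Rule as stated earlier, and the variable names are assumptions.

```python
mean, s = 1.60, 0.07    # values given in the example

# (a) Within three standard deviations: about 99.7% by the Empirical Rule.
low3, high3 = mean - 3 * s, mean + 3 * s
print(round(low3, 2), round(high3, 2))     # 1.39 1.81

# (b) Express each endpoint as a number of standard deviations from the mean.
k_low = (1.46 - mean) / s
k_high = (1.74 - mean) / s
print(round(k_low, 2), round(k_high, 2))   # -2.0 2.0
```

Since $1.46 and $1.74 are two standard deviations below and above the mean, the Empirical Rule gives about 95% for part (b).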
EXAMPLES
1. The shortest home-run hit by Mark McGwire was 340 ft
and the longest was 550 ft. Use the range rule of thumb
to estimate the standard deviation.
2. Heights of men have a mean of 69.0 in and a standard
deviation of 2.8 in. Use the range rule of thumb to
estimate the minimum and maximum “usual” heights of
men. In this context, is it unusual for a man to be 6 ft, 6
in tall?
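Both examples reduce to a few lines of arithmetic. The following is a sketch of one way to carry them out with the range rule of thumb; the variable names are assumptions.

```python
# Example 1: estimate s from the home-run distances (in feet).
s_estimate = (550 - 340) / 4
print(s_estimate)                                 # 52.5

# Example 2: "usual" heights of men (in inches).
mean, s = 69.0, 2.8
min_usual = mean - 2 * s
max_usual = mean + 2 * s
print(round(min_usual, 1), round(max_usual, 1))   # 63.4 74.6

# 6 ft 6 in converted to inches, then compared with the usual limits.
height = 6 * 12 + 6
is_unusual = height < min_usual or height > max_usual
print(is_unusual)                                 # True
```

A height of 78 in is above the maximum usual value of 74.6 in, so by this rule it would be considered unusual.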
Chebyshev’s Theorem
The proportion (or fraction) of any set of
data lying within K standard deviations
of the mean is always at least 1 – 1/K²,
where K is any positive number greater
than 1.
• For K = 2, at least 3/4 (or 75%) of all
values lie within 2 standard
deviations of the mean.
• For K = 3, at least 8/9 (or 89%) of all
values lie within 3 standard
deviations of the mean.
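The bound is simple enough to express as a small function. This sketch uses exact fractions so that the two cases above come out as 3/4 and 8/9; the function name chebyshev_bound is an assumption.

```python
from fractions import Fraction

def chebyshev_bound(k):
    """Minimum proportion within k standard deviations: 1 - 1/k**2, for k > 1."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / Fraction(k) ** 2

print(chebyshev_bound(2))   # 3/4
print(chebyshev_bound(3))   # 8/9
```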
EXAMPLE
Using the weights of regular Coke listed in Data Set 17 from
Appendix B, we find that the mean is 0.81682 lb and the
standard deviation is 0.00751 lb. What can you conclude
from Chebyshev’s theorem about the percentage of cans of
regular Coke with weights between 0.79429 lb and 0.83935
lb?
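A possible way to approach this example is to find how many standard deviations each limit lies from the mean and then apply the 1 – 1/K² bound. The sketch below does that arithmetic; the variable names are assumptions.

```python
mean, s = 0.81682, 0.00751      # values given in the example
low, high = 0.79429, 0.83935    # weight limits given in the example

# Distance of each limit from the mean, in standard deviations.
k_low = (mean - low) / s
k_high = (high - mean) / s
print(round(k_low, 2), round(k_high, 2))   # 3.0 3.0

# Chebyshev's theorem with K = 3.
k = round(k_high, 2)
bound = 1 - 1 / k ** 2
print(round(100 * bound))                  # 89
```

Both limits are 3 standard deviations from the mean, so Chebyshev's theorem says at least 8/9 (about 89%) of the cans have weights in that interval.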
Recap
In this section we have looked at:
• Range
• Standard deviation of a sample and
population
• Variance of a sample and population
• Range rule of thumb
• Empirical rule
• Chebyshev’s theorem
• Coefficient of variation (CV)