Section 3-3
Measures of Variation
Slide
1
Key Concept
Because this section introduces the concept
of variation, which is something so important
in statistics, this is one of the most important
sections in the entire book.
Place a high priority on how to interpret values
of standard deviation.
Slide
2
Bank Example, p. 93
1) Left side of the class: Find mean, median, mode, and midrange for
the single line system
Right side of the class: Find mean, median, mode, and midrange for
the multi-line system
2) Put the result on the board. Do both systems have the same
measure of center?
3) Examine/compare the two data sets. What is fundamentally
different about them?
Single line:    6.5  6.6  6.7  6.8  7.1  7.3  7.4  7.7  7.7  7.7
Multiple lines: 4.2  5.4  5.8  6.2  6.7  7.7  7.7  8.5  9.3  10.0
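A minimal sketch for checking the board results, assuming the ten values listed for each system are the complete data sets. The helper names `midrange` and `centers` are made up for this illustration.

```python
from statistics import mean, median, multimode

# Waiting times copied from the slide.
single_line = [6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7]
multiple_lines = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]


def midrange(values):
    """Midpoint between the smallest and largest value."""
    return (min(values) + max(values)) / 2


def centers(values):
    """Return the four measures of center as a dictionary."""
    return {
        "mean": mean(values),
        "median": median(values),
        "modes": multimode(values),  # list of all most-frequent values
        "midrange": midrange(values),
    }


for name, data in [("single line", single_line),
                   ("multiple lines", multiple_lines)]:
    print(name, centers(data))
```

Both systems should come out with matching measures of center, which is the point of question 3: the difference between them lies in how spread out the values are.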
Slide
3
Definition
The range of a set of data is the
difference between the maximum
value and the minimum value.
Range = (maximum value) – (minimum value)
Bank 1: Variable waiting lines    6  6  6
Bank 2: Single waiting lines      4  7  7
Bank 3: Multiple waiting lines    1  3  14
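A short sketch of the range calculation, assuming the table reads as Bank 1 = 6, 6, 6; Bank 2 = 4, 7, 7; Bank 3 = 1, 3, 14. The function name `value_range` is made up for this illustration.

```python
# Waiting times for the three banks (assumed reading of the slide table).
banks = {
    "Bank 1": [6, 6, 6],
    "Bank 2": [4, 7, 7],
    "Bank 3": [1, 3, 14],
}


def value_range(values):
    """Range = (maximum value) - (minimum value)."""
    return max(values) - min(values)


ranges = {name: value_range(times) for name, times in banks.items()}
print(ranges)  # {'Bank 1': 0, 'Bank 2': 3, 'Bank 3': 13}
```

Because the range uses only the two extreme values, it says nothing about how the values in between are spread.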
Slide
4
Definition
The standard deviation of a set of sample
values is a measure of variation of values
about the mean.
If the values are close together: small s
If the values are far apart: large s.
Slide
5
Sample Standard
Deviation Formula (3-4)
$s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
Slide
6
Sample Standard Deviation
(Shortcut Formula: 3-5)
n(x ) - (x)
n (n - 1)
2
s=
2
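A minimal sketch showing that the shortcut formula (3-5) and the defining formula (3-4) give the same value of s. The data list is made up for illustration and does not come from the slides; the function names are also hypothetical.

```python
import math


def s_definition(values):
    """Formula 3-4: square root of sum((x - mean)^2) / (n - 1)."""
    n = len(values)
    x_bar = sum(values) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))


def s_shortcut(values):
    """Formula 3-5: square root of (n*sum(x^2) - (sum x)^2) / (n(n - 1))."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))


sample = [2, 4, 4, 5, 10]  # made-up data, chosen so s is a whole number
print(s_definition(sample), s_shortcut(sample))  # 3.0 3.0
```

The shortcut needs only three running totals (n, the sum of x, and the sum of x squared), which is why it suits hand or calculator work.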
Slide
7
Banking Example
Jefferson Valley (single line):
6.5  6.6  6.7  6.8  7.1  7.3  7.4  7.7  7.7  7.7
Providence (multiple lines):
4.2  5.4  5.8  6.2  6.7  7.7  7.7  8.5  9.3  10.0
Slide
8
Using the formula for standard
deviation (TI83)
1) Enter the two data sets in L1 and L2. Run 2-Var Stats. The mean for L1 is
____. Find the deviation of each value from the mean. (L3: L1___).
2) Show that the sum of deviations from Step 1 is 0. Will it always
be 0? (sum L3)
3) We want to avoid the canceling out of the positive and
negative deviations – so we square the deviations (L4: L3^2)
4) We need a mean of those squared deviations, so we find the
mean by dividing by n-1 (degrees of freedom). (sum L4/(n-1))
5) Track the units. If the original times are in minutes, the
deviations are in minutes, the squared deviations are in
minutes squared (what?), and the mean is in minutes squared
(huh?)
6) Since minutes squared doesn’t make much sense, take the
square root to get back to the original units. (sqr rt of ans).
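The six steps above can be sketched in Python instead of calculator lists. This is only an illustration, assuming L1 holds the Jefferson Valley (single line) times and that they are in minutes; the variable names are made up.

```python
import math

# L1: Jefferson Valley (single line) waiting times, assumed to be in minutes.
times = [6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7]
n = len(times)

# Step 1: the mean, then each value's deviation from it (the L3 column).
mean = sum(times) / n
deviations = [x - mean for x in times]

# Step 2: the deviations sum to 0 (up to floating-point round-off).
sum_dev = sum(deviations)

# Step 3: square each deviation so positives and negatives cannot cancel (L4).
squared = [d ** 2 for d in deviations]

# Step 4: divide the total by n - 1 to get the sample variance.
# Step 5: its units are minutes squared.
variance = sum(squared) / (n - 1)

# Step 6: the square root returns to the original units (minutes).
s = math.sqrt(variance)

print(round(mean, 2), round(s, 2))  # 7.15 0.48
```

The answer to the question in step 2 is yes: deviations from the mean always sum to 0, which is why they have to be squared before averaging.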
Slide
9
Example
1) Use the data set (1, 3, 14) from the multiple line
system to find s using formula 3-5.
2) YOU: Use the data set (4, 7, 7 minutes) from a
single line system to find s using formula 3-5.
3) Which standard deviation is smaller? So which line
is better?
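For part 1, a worked version of formula 3-5 might look like the following, using n = 3, a sum of the values of 18, and a sum of the squared values of 1 + 9 + 196 = 206. It assumes the times are in minutes, as in part 2.

```latex
s = \sqrt{\frac{n \sum x^2 - \left(\sum x\right)^2}{n(n - 1)}}
  = \sqrt{\frac{3(206) - 18^2}{3(2)}}
  = \sqrt{\frac{618 - 324}{6}}
  = \sqrt{49}
  = 7.0 \text{ min}
```

Part 2 follows the same pattern with the second data set.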
Slide
10
Standard Deviation Important Properties
• The standard deviation is a measure of
variation of all values from the mean.
• The value of the standard deviation s is
usually positive. It is zero only when all of
the data values are the same, and it is never
negative.
• The value of the standard deviation s can
increase dramatically with the inclusion of
one or more outliers (data values far away
from all others).
• The units of the standard deviation s are the
same as the units of the original data
values.
Slide
11
Population Standard
Deviation
$\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}}$
This formula is similar to the previous formula, but
instead, the population mean and population size
are used.
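A small sketch of the difference between the two formulas, using Python's standard `statistics` module. The values 1, 3, 14 are treated first as a sample and then, purely for illustration, as if they were an entire population.

```python
from statistics import pstdev, stdev

values = [1, 3, 14]

# Sample standard deviation: divides by n - 1.
s = stdev(values)

# Population standard deviation: divides by N.
sigma = pstdev(values)

print(s)      # 7.0
print(sigma)  # smaller than s, because the divisor N is larger than n - 1
```

For the same list of numbers, the population formula gives the smaller result whenever the values are not all equal.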
Slide
12
Definition
• The variance of a set of values is a measure of
variation equal to the square of the standard
deviation.
(A general description of the amount that values vary
among themselves)
Also: dispersion/spread
• Sample variance: Square of the sample standard
deviation s
• Population variance: Square of the population
standard deviation σ
Slide
13
Variance - Notation
Notation (standard deviation squared):
$s^2$ = Sample variance
$\sigma^2$ = Population variance
Slide
14
Round-off Rule
for Measures of Variation
Carry one more decimal place than
is present in the original set of
data.
Round only the final answer, not values in
the middle of a calculation.
Slide
15
Day 2 Warm Up:
Heart Rate Activity
Is there a difference between male and
female heart rates?
Male: 60 67 59 64 80 55 72 84 59 67 69 65 66
88 56 82 55 72 64 66 58 70 60 80 63 66 85
66 71 64
Female: 83 56 57 63 60 69 70 86 70 57 67 75
72 75 57 76 69 79 84 75 56 72 70 62 67 66
60 74 81 60
Compare the range, standard deviation, and
variances of these samples.
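One way to set up the comparison, sketched with Python's standard `statistics` module. The function name `summarize` is made up; the two lists are copied from the slide and treated as samples.

```python
from statistics import stdev, variance

male = [60, 67, 59, 64, 80, 55, 72, 84, 59, 67, 69, 65, 66,
        88, 56, 82, 55, 72, 64, 66, 58, 70, 60, 80, 63, 66, 85,
        66, 71, 64]
female = [83, 56, 57, 63, 60, 69, 70, 86, 70, 57, 67, 75,
          72, 75, 57, 76, 69, 79, 84, 75, 56, 72, 70, 62, 67, 66,
          60, 74, 81, 60]


def summarize(values):
    """Return the range, sample standard deviation, and sample variance."""
    return {
        "range": max(values) - min(values),
        "s": stdev(values),
        "s^2": variance(values),
    }


for label, data in [("Male", male), ("Female", female)]:
    print(label, summarize(data))
```

Remember the round-off rule when reporting the printed values: carry one more decimal place than the original data.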
Slide
16
Estimation of Standard Deviation
Range Rule of Thumb
For estimating a value of the standard deviation s,
Use $s \approx \dfrac{\text{range}}{4}$
Where range = (maximum value) – (minimum value)
CRUDE estimate
Based on the principle: For many data sets, the
vast majority (about 95%) of values lie within 2
standard deviations of the mean
Simple rule to help us interpret standard deviations
Slide
17
Age of Best Actresses
• Use the range rule of thumb to find a rough estimate
of the standard deviation of the sample of 76 ages of
actresses who won Oscars.
• Max age: 80
• Min age: 21
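A quick sketch of the range rule of thumb applied to the two ages given above. The function name `range_rule_estimate` is made up for this illustration.

```python
def range_rule_estimate(maximum, minimum):
    """Crude estimate of s: (maximum value - minimum value) / 4."""
    return (maximum - minimum) / 4


estimate = range_rule_estimate(80, 21)
print(estimate)  # 14.75
```

This uses only the two extreme ages, so it is a rough guide to the size of s rather than a replacement for computing it from all 76 values.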
Slide
18
Estimation of Standard Deviation
Range Rule of Thumb
For interpreting a known value of the standard deviation s,
find rough estimates of the minimum and maximum
“usual” sample values by using:
Minimum “usual” value = (mean) – 2 × (standard deviation)
Maximum “usual” value = (mean) + 2 × (standard deviation)
Slide
19
Example
A statistics professor finds the times (in seconds)
required to complete a quiz have a mean of 180 sec
and a standard deviation of 30 secs. Is a time of 90
secs unusual? Why or why not?
YOU: Typical IQ tests have a mean of 100 and a
standard deviation of 15. Use the range rule of thumb
to find the usual IQ scores. Is a value of 140
unusual?
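The quiz-time question can be checked with a short sketch of the range rule of thumb. The function names `usual_interval` and `is_unusual` are made up for this illustration.

```python
def usual_interval(mean, sd):
    """Minimum and maximum 'usual' values: mean - 2*sd and mean + 2*sd."""
    return mean - 2 * sd, mean + 2 * sd


def is_unusual(value, mean, sd):
    """A value is 'unusual' if it falls outside the usual interval."""
    low, high = usual_interval(mean, sd)
    return value < low or value > high


low, high = usual_interval(180, 30)
print(low, high)                 # 120 240
print(is_unusual(90, 180, 30))   # True: 90 sec is below the minimum usual value
```

The same two functions can be reused for the IQ question by changing the mean and standard deviation.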
Slide
20
Definition
Empirical (68-95-99.7) Rule
For data sets having a distribution that is approximately
bell shaped, the following properties apply:
• About 68% of all values fall within 1 standard
deviation of the mean.
• About 95% of all values fall within 2 standard
deviations of the mean.
• About 99.7% of all values fall within 3 standard
deviations of the mean.
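The three percentages can be checked against an exactly normal (bell-shaped) distribution with `statistics.NormalDist` from the Python standard library. This is a sketch: the default mean of 100 and standard deviation of 15 are borrowed from the IQ examples in these slides, and the function name `share_within` is made up.

```python
from statistics import NormalDist


def share_within(k, mean=100.0, sd=15.0):
    """Share of a normal distribution within k standard deviations of the mean."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)


for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

The result does not depend on the mean or standard deviation chosen, only on k, which is why the rule can be stated once for all bell-shaped data.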
Slide
21
The Empirical Rule
Slide
22
The Empirical Rule
(applies only to bell-shaped distributions)
Slide
23
The Empirical Rule
Slide
24
Example: IQ Scores
• IQ scores are bell shaped with a mean of 100 and a
standard deviation of 15. What percent of IQ scores
fall between 70 and 130?
Slide
25
Definition
Chebyshev’s Theorem
Applies to any distribution, but the results are only
approximate (they give a lower bound, not an exact share).
The proportion (or fraction) of any set of data lying
within K standard deviations of the mean is always at
least $1 - 1/K^2$, where K is any number greater
than 1.
• For K = 2, at least 3/4 (or 75%) of all values lie
within 2 standard deviations of the mean.
• For K = 3, at least 8/9 (or about 89%) of all values lie
within 3 standard deviations of the mean.
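A minimal sketch of the bound in Chebyshev's Theorem. The function name `chebyshev_lower_bound` is made up for this illustration.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("K must be greater than 1")
    return 1 - 1 / k ** 2


print(chebyshev_lower_bound(2))            # 0.75
print(round(chebyshev_lower_bound(3), 3))  # 0.889
```

Compare these with the empirical rule: for bell-shaped data about 95% of values fall within 2 standard deviations, while Chebyshev guarantees only 75% because it has to hold for every possible distribution.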
Slide
26
Example
• IQ scores have a mean of 100 and a standard
deviation of 15 (pretend we don’t know that IQ
scores are bell-shaped).
• According to Chebyshev’s Theorem, what can we
conclude about:
1) 75% of the IQ scores?
2) 89% of the IQ scores?
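One way to work this out is to match each percentage to its K (75% to K = 2, 89% to K = 3) and build the interval from the mean of 100 and the standard deviation of 15:

```latex
K = 2:\quad 1 - \frac{1}{2^2} = \frac{3}{4}, \qquad 100 \pm 2(15) \;\Rightarrow\; 70 \text{ to } 130
K = 3:\quad 1 - \frac{1}{3^2} = \frac{8}{9}, \qquad 100 \pm 3(15) \;\Rightarrow\; 55 \text{ to } 145
```

So at least 75% of IQ scores lie between 70 and 130, and at least 89% lie between 55 and 145.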
Slide
27
Rationale for using n-1
versus n
The end of Section 3-3 has a detailed
explanation of why n – 1 rather than n
is used. The student should study it
carefully.
Slide
28
Definition
The coefficient of variation (or CV) for a set of
sample or population data, expressed as a
percent, describes the standard deviation relative
to the mean.
Free of specific units of measure
For comparing variation for values taken from
different populations
Sample:      $CV = \dfrac{s}{\bar{x}} \cdot 100\%$
Population:  $CV = \dfrac{\sigma}{\mu} \cdot 100\%$
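A short sketch of the sample CV. The heights and weights below are made-up numbers for illustration only; they are not taken from Data Set 1, and the function name `coefficient_of_variation` is hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample CV as a percent: (s / x-bar) * 100."""
    return stdev(values) / mean(values) * 100


# Made-up illustrative data (NOT from the textbook data set).
heights_in = [66, 68, 70, 72, 74]
weights_lb = [140, 160, 170, 180, 200]

cv_height = coefficient_of_variation(heights_in)
cv_weight = coefficient_of_variation(weights_lb)
print(round(cv_height, 1), round(cv_weight, 1))
```

Because s and the mean carry the same units, the units cancel in the ratio, so the two CVs can be compared directly even though one data set is in inches and the other in pounds.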
Slide
29
Example: Statdisk
Height and Weight data for the 40 males
Data Set 1, Appendix B
• Although the difference in units makes it impossible
to compare the two standard deviations (inches vs.
pounds), we can compare CV’s (which have no
units).
• Find the CV for weights and the CV for heights.
Slide
30