Download Measures of Variation

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Bootstrapping (statistics) wikipedia , lookup

History of statistics wikipedia , lookup

Taylor's law wikipedia , lookup

Student's t-test wikipedia , lookup

Transcript
Section 2.4
Measures of Variation
Statistics
Mrs. Spitz
Fall 2008
Objectives
§ How to find the range of a data set
§ How to find the variance and standard
deviation of a population and of a sample
§ How to use the Empirical Rule and
Chebychev’s Theorem to interpret
standard deviation
§ How to approximate the sample standard
deviation for grouped data
Larson/Farber Ch 2
Assignment
§ Pp. 78-83 #1-8, 10-22, 24-32
Larson/Farber Ch 2
Range
§ The range of a data set is the difference
between the maximum and minimum
entries in the set.
Range = (maximum entry) – (minimum entry)
Larson/Farber Ch 2
Two Data Sets
Closing prices for two stocks were recorded on ten
successive Fridays. Calculate the mean, median and mode
for each.
Stock A
Mean = 61.5
Median = 62
Mode = 67
Larson/Farber Ch 2
56
56
57
58
61
63
63
67
67
67
33
42
48
52
57
67
67
77
82
90
Stock B
Mean = 61.5
Median = 62
Mode = 67
Measures of Variation
Range = Maximum value – Minimum value
Range for A = 67 – 56 = $11
Range for B = 90 – 33 = $57
The range is easy to compute but only uses two
numbers from a data set.
If only one number changes, the range can be vastly
changed. If a $56 stock dropped to $12, the range would
change. You need to different symbols for population
parameters
and sample statistics.
Larson/Farber Ch 2
Measures of Variation
To learn to calculate measures of variation that use each
and every value in the data set, you first want to know
about deviations.
The deviation for each value x is the difference between
the value of x and the mean of the data set.
In a population, the deviation for each value x is:
In a sample, the deviation for each value x is:
Larson/Farber Ch 2
Deviations
Stock A Deviation
56
– 5.5
56 – 61.5
56
– 5.5
56 – 61.5
57
– 4.5
57 – 61.5
58
– 3.5
58 – 61.5
61
– 0.5
63
1.5
63
1.5
67
5.5
67
5.5
67
5.5
Larson/Farber Ch 2
The sum of the deviations is always zero.
Population Variance
Population Variance: The sum of the squares of the
deviations, divided by N.
x
56
56
57
58
61
63
63
67
67
67
– 5.5
– 5.5
– 4.5
– 3.5
– 0.5
1.5
1.5
5.5
5.5
5.5
30.25
30.25
20.25
12.25
0.25
2.25
2.25
30.25
30.25
30.25
188.50
Larson/Farber Ch 2
Sum of squares
Population Standard Deviation
Population Standard Deviation: The square root of
the population variance.
The population standard deviation is $4.34.
Larson/Farber Ch 2
Sample Variance and Standard Deviation
To calculate a sample variance divide the sum of
squares by n – 1.
The sample standard deviation, s, is found by
taking the square root of the sample variance.
Larson/Farber Ch 2
Summary
Range = Maximum value – Minimum value
Population Variance
Population Standard Deviation
Sample Variance
Sample Standard Deviation
Larson/Farber Ch 2
Empirical Rule (68-95-99.7%)
Data with symmetric bell-shaped distribution have the
following characteristics.
13.5%
13.5%
2.35%
–4
–3
2.35%
–2
–1
0
1
2
3
4
About 68% of the data lies within 1 standard deviation of the mean
About 95% of the data lies within 2 standard deviations of the mean
About 99.7% of the data lies within 3 standard deviations of the mean
Larson/Farber Ch 2
Using the Empirical Rule
The mean value of homes on a street is $125 thousand with a
standard deviation of $5 thousand. The data set has a bell shaped
distribution. Estimate the percent of homes between $120 and
$135 thousand.
105
110
115
120
125
130
$120 thousand is 1 standard deviation below
the mean and $135 thousand is 2 standard
deviations above the mean.
135
140
145
68% + 13.5% = 81.5%
So, 81.5% have a value between $120 and $135 thousand.
Larson/Farber Ch 2
Chebychev’s Theorem
For any distribution regardless of shape the
portion of data lying within k standard
deviations (k > 1) of the mean is at least 1 – 1/k2.
For k = 2, at least 1 – 1/4 = 3/4 or 75% of the data lie
within 2 standard deviation of the mean.
For k = 3, at least 1 – 1/9 = 8/9 = 88.9% of the data lie
within 3 standard deviation of the mean.
Larson/Farber Ch 2
Chebychev’s Theorem
The mean time in a women’s 400-meter dash is
52.4 seconds with a standard deviation of 2.2
sec. Apply Chebychev’s theorem for k = 2.
Mark a number line in
standard deviation units.
2 standard deviations
A
45.8
48
50.2
52.4
54.6
56.8
59
At least 75% of the women’s 400-meter dash
times will fall between 48 and 56.8 seconds.
Larson/Farber Ch 2