Section 2.4
Measures of Variation
Section 2.4 Objectives
• Determine the range of a data set
• Determine the variance and standard deviation of a population and of a sample
• Use the Empirical Rule and Chebychev’s Theorem to interpret standard deviation
• Approximate the sample standard deviation for grouped data
Range
• The difference between the maximum and minimum data entries in the set.
• The data must be quantitative.
• Range = (Max. data entry) – (Min. data entry)
Example: Finding the Range
A corporation hired 10 graduates. The starting salaries for each
graduate are shown. Find the range of the starting salaries.
Starting salaries (1000s of dollars)
41 38 39 45 47 41 44 41 37 42
Solution: Finding the Range
• Ordering the data helps to find the least and greatest salaries.
37 38 39 41 41 41 42 44 45 47
(minimum = 37, maximum = 47)
• Range = (Max. salary) – (Min. salary) = 47 – 37 = 10
The range of starting salaries is 10, or $10,000.
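The range calculation above can be sketched in a few lines of Python. This is only an illustration: the helper name `data_range` is made up for this example, and the list holds the starting salaries (in thousands of dollars) from the slide.

```python
# Starting salaries from the example, in thousands of dollars.
salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]


def data_range(values):
    """Return the range: maximum entry minus minimum entry."""
    return max(values) - min(values)


# Sorting is not required for max/min, but it mirrors the slide's ordered list.
print(sorted(salaries))

salary_range = data_range(salaries)
print(salary_range)  # 47 - 37 = 10, i.e. $10,000
```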
Deviation, Variance, and Standard Deviation
Deviation
• The difference between the data entry, x, and the mean of the data set.
• Population data set: Deviation of x = x – μ
• Sample data set: Deviation of x = x – x̄
Example: Finding the Deviation
A corporation hired 10 graduates. The starting salaries for each
graduate are shown. Find the deviation of the starting salaries.
Starting salaries (1000s of dollars)
41 38 39 45 47 41 44 41 37 42
Solution:
• First determine the mean starting salary.
\[ \mu = \frac{\sum x}{N} = \frac{415}{10} = 41.5 \]
Solution: Finding the Deviation
• Determine the deviation for each data entry.

Salary ($1000s), x    Deviation: x – μ
41                    41 – 41.5 = –0.5
38                    38 – 41.5 = –3.5
39                    39 – 41.5 = –2.5
45                    45 – 41.5 = 3.5
47                    47 – 41.5 = 5.5
41                    41 – 41.5 = –0.5
44                    44 – 41.5 = 2.5
41                    41 – 41.5 = –0.5
37                    37 – 41.5 = –4.5
42                    42 – 41.5 = 0.5
Σx = 415              Σ(x – μ) = 0
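As a quick check of the deviation table, here is a small Python sketch (variable names are illustrative, not from the slides). It also shows why the deviations themselves cannot measure spread: they always sum to zero, which is the motivation for squaring them in the variance.

```python
# Starting salaries from the example, in thousands of dollars.
salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]

# Population mean: 415 / 10 = 41.5
mu = sum(salaries) / len(salaries)

# Deviation of each entry from the mean: x - mu
deviations = [x - mu for x in salaries]

for x, d in zip(salaries, deviations):
    print(f"{x:>2}  {d:+.1f}")

# Positive and negative deviations cancel, so the total is zero.
print(sum(deviations))
```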
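Deviation, Variance, and Standard Deviation
Population Variance
\[ \sigma^2 = \frac{\sum (x - \mu)^2}{N} \]
The numerator, \( \sum (x - \mu)^2 \), is the sum of squares, SSx.
Population Standard Deviation
\[ \sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (x - \mu)^2}{N}} \]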
Finding the Population Variance & Standard Deviation
In Words                                           In Symbols
1. Find the mean of the population data set.       \( \mu = \frac{\sum x}{N} \)
2. Find the deviation of each entry.               \( x - \mu \)
3. Square each deviation.                          \( (x - \mu)^2 \)
4. Add to get the sum of squares.                  \( SS_x = \sum (x - \mu)^2 \)
5. Divide by N to get the population variance.     \( \sigma^2 = \frac{\sum (x - \mu)^2}{N} \)
6. Find the square root to get the population      \( \sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} \)
   standard deviation.
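The six steps can be followed one by one in code. The sketch below is a minimal illustration in Python, not a prescribed method: the function name `population_variance_and_sd` is hypothetical, and the data are the starting salaries used in this section's examples.

```python
import math


def population_variance_and_sd(data):
    """Follow the six steps for a population data set."""
    n = len(data)
    mu = sum(data) / n                       # 1. mean of the population
    deviations = [x - mu for x in data]      # 2. deviation of each entry
    squares = [d ** 2 for d in deviations]   # 3. square each deviation
    ss_x = sum(squares)                      # 4. sum of squares, SSx
    variance = ss_x / n                      # 5. divide by N
    sd = math.sqrt(variance)                 # 6. square root
    return variance, sd


# Starting salaries, in thousands of dollars.
salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]
variance, sd = population_variance_and_sd(salaries)
print(round(variance, 2), round(sd, 1))
```

Python's standard library offers `statistics.pvariance` and `statistics.pstdev` for the same population quantities, so the hand-written function is mainly useful for seeing the steps.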
Example: Finding the Population
Standard Deviation
A corporation hired 10 graduates. The starting salaries for each
graduate are shown. Find the population variance and standard
deviation of the starting salaries.
Starting salaries (1000s of dollars)
41 38 39 45 47 41 44 41 37 42
Recall μ = 41.5.
Solution: Finding the Population Standard Deviation
• Determine SSx.
• N = 10
• Note that SSx = Σ(x – μ)²

Salary, x    Deviation: x – μ     Squares: (x – μ)²
41           41 – 41.5 = –0.5     (–0.5)² = 0.25
38           38 – 41.5 = –3.5     (–3.5)² = 12.25
39           39 – 41.5 = –2.5     (–2.5)² = 6.25
45           45 – 41.5 = 3.5      (3.5)² = 12.25
47           47 – 41.5 = 5.5      (5.5)² = 30.25
41           41 – 41.5 = –0.5     (–0.5)² = 0.25
44           44 – 41.5 = 2.5      (2.5)² = 6.25
41           41 – 41.5 = –0.5     (–0.5)² = 0.25
37           37 – 41.5 = –4.5     (–4.5)² = 20.25
42           42 – 41.5 = 0.5      (0.5)² = 0.25
             Σ(x – μ) = 0         SSx = 88.5
Solution: Finding the Population Standard Deviation
Population Variance
• \( \sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{88.5}{10} = 8.85 \approx 8.9 \)
Population Standard Deviation
• \( \sigma = \sqrt{\sigma^2} = \sqrt{8.85} \approx 3.0 \)
The population standard deviation is about 3.0, or $3000.
Deviation, Variance, and Standard Deviation
Sample Variance
\[ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \]
Sample Standard Deviation
\[ s = \sqrt{s^2} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \]
Finding the Sample Variance & Standard Deviation
In Words                                           In Symbols
1. Find the mean of the sample data set.           \( \bar{x} = \frac{\sum x}{n} \)
2. Find the deviation of each entry.               \( x - \bar{x} \)
3. Square each deviation.                          \( (x - \bar{x})^2 \)
4. Add to get the sum of squares.                  \( SS_x = \sum (x - \bar{x})^2 \)
5. Divide by n – 1 to get the sample variance.     \( s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \)
6. Find the square root to get the sample          \( s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \)
   standard deviation.
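The sample version differs from the population version only in the divisor, n – 1 instead of N. A minimal Python sketch of the steps follows; the function name `sample_variance_and_sd` is hypothetical, and the data are the starting salaries from this section, treated here as a sample.

```python
import math
import statistics


def sample_variance_and_sd(data):
    """Follow the six steps for a sample data set (divisor n - 1)."""
    n = len(data)
    x_bar = sum(data) / n                        # 1. sample mean
    ss_x = sum((x - x_bar) ** 2 for x in data)   # 2-4. sum of squared deviations
    variance = ss_x / (n - 1)                    # 5. divide by n - 1
    sd = math.sqrt(variance)                     # 6. square root
    return variance, sd


# Starting salaries, in thousands of dollars.
salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]
variance, sd = sample_variance_and_sd(salaries)
print(round(variance, 1), round(sd, 1))

# The standard library's sample functions use the same n - 1 divisor.
print(round(statistics.variance(salaries), 1), round(statistics.stdev(salaries), 1))
```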
Sample Standard Deviation Shortcut Formula
\[ s = \sqrt{\frac{n \sum x^2 - \left( \sum x \right)^2}{n(n - 1)}} \]
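The shortcut formula is algebraically the same as the defining formula. The short derivation below is added for illustration; it uses only \( \bar{x} = \frac{\sum x}{n} \), and the final line checks it against the starting-salary data (Σx = 415, Σx² = 17311, n = 10).

```latex
\begin{align*}
\sum (x - \bar{x})^2
  &= \sum x^2 - 2\bar{x}\sum x + n\bar{x}^2 \\
  &= \sum x^2 - \frac{2\left(\sum x\right)^2}{n} + \frac{\left(\sum x\right)^2}{n} \\
  &= \sum x^2 - \frac{\left(\sum x\right)^2}{n} \\[6pt]
s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}
  &= \frac{n\sum x^2 - \left(\sum x\right)^2}{n(n - 1)} \\[6pt]
\text{Salaries: } s^2
  &= \frac{10(17311) - 415^2}{10(9)} = \frac{885}{90} \approx 9.8,
  \qquad s \approx 3.1
\end{align*}
```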
Symbols for Standard Deviation
                                   Sample     Population
Textbook                           s          σ
Some graphics calculators          Sx         σx
Some non-graphics calculators      xσn-1      xσn
Example: Finding the Sample Standard
Deviation
The starting salaries are for the Chicago branches of a
corporation. The corporation has several other branches, and
you plan to use the starting salaries of the Chicago branches to
estimate the starting salaries for the larger population. Find the
sample standard deviation of the starting salaries.
Starting salaries (1000s of dollars)
41 38 39 45 47 41 44 41 37 42
Solution: Finding the Sample Standard Deviation
• Determine SSx.
• n = 10
• Note that SSx = Σ(x – x̄)²

Salary, x    Deviation: x – x̄     Squares: (x – x̄)²
41           41 – 41.5 = –0.5     (–0.5)² = 0.25
38           38 – 41.5 = –3.5     (–3.5)² = 12.25
39           39 – 41.5 = –2.5     (–2.5)² = 6.25
45           45 – 41.5 = 3.5      (3.5)² = 12.25
47           47 – 41.5 = 5.5      (5.5)² = 30.25
41           41 – 41.5 = –0.5     (–0.5)² = 0.25
44           44 – 41.5 = 2.5      (2.5)² = 6.25
41           41 – 41.5 = –0.5     (–0.5)² = 0.25
37           37 – 41.5 = –4.5     (–4.5)² = 20.25
42           42 – 41.5 = 0.5      (0.5)² = 0.25
             Σ(x – x̄) = 0         SSx = 88.5
Solution: Finding the Sample Standard Deviation
Sample Variance
• \( s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{88.5}{10 - 1} \approx 9.8 \)
Sample Standard Deviation
• \( s = \sqrt{s^2} = \sqrt{\frac{88.5}{9}} \approx 3.1 \)
The sample standard deviation is about 3.1, or $3100.
Example: Using Technology to Find the
Standard Deviation
Sample office rental rates (in dollars per
square foot per year) for Miami’s central
business district are shown in the table.
Use a calculator or a computer to find
the mean rental rate and the sample
standard deviation. (Adapted from: Cushman
& Wakefield Inc.)
Office Rental Rates
35.00  33.50  37.00
23.75  26.50  31.25
36.50  40.00  32.00
39.25  37.50  34.75
37.75  37.25  36.75
27.00  35.75  26.00
37.00  29.00  40.50
24.50  33.00  38.00
Solution: Using Technology to Find the Standard Deviation
[Figure: technology output with the Sample Mean and the Sample Standard Deviation labeled]
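One way to carry out this example with a computer is Python's standard `statistics` module. This is a sketch of one possible tool, not necessarily the calculator or software shown on the original slide; the variable names are illustrative.

```python
import statistics

# Sample office rental rates (dollars per square foot per year), from the table.
rental_rates = [
    35.00, 33.50, 37.00, 23.75, 26.50, 31.25, 36.50, 40.00,
    32.00, 39.25, 37.50, 34.75, 37.75, 37.25, 36.75, 27.00,
    35.75, 26.00, 37.00, 29.00, 40.50, 24.50, 33.00, 38.00,
]

sample_mean = statistics.mean(rental_rates)
sample_sd = statistics.stdev(rental_rates)  # sample formula, divisor n - 1

print(f"n = {len(rental_rates)}")
print(f"sample mean = {sample_mean:.2f}")               # about 33.73
print(f"sample standard deviation = {sample_sd:.2f}")   # about 5.09
```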
Interpreting Standard Deviation
• Standard deviation is a measure of the typical amount an entry deviates from the mean.
• The more the entries are spread out, the greater the standard deviation.
Usual Sample Values
minimum ‘usual’ value = (mean) – 2 × (standard deviation)
minimum = x̄ – 2s
maximum ‘usual’ value = (mean) + 2 × (standard deviation)
maximum = x̄ + 2s
Interpreting Standard Deviation:
Empirical Rule (68 – 95 – 99.7 Rule)
For data with a (symmetric) bell-shaped distribution, the standard
deviation has the following characteristics:
• About 68% of the data lie within one standard
deviation of the mean.
• About 95% of the data lie within two standard
deviations of the mean.
• About 99.7% of the data lie within three standard
deviations of the mean.
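Interpreting Standard Deviation:
Empirical Rule (68 – 95 – 99.7 Rule)
[Figure: bell-shaped curve with 68% of the data within 1 standard deviation of the mean, 95% within 2, and 99.7% within 3. Area by interval: 2.35% from x̄ – 3s to x̄ – 2s, 13.5% from x̄ – 2s to x̄ – s, 34% from x̄ – s to x̄, 34% from x̄ to x̄ + s, 13.5% from x̄ + s to x̄ + 2s, 2.35% from x̄ + 2s to x̄ + 3s.]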
Example: Using the Empirical Rule
In a survey conducted by the National Center for Health Statistics,
the sample mean height of women in the United States (ages 20-29)
was 64 inches, with a sample standard deviation of 2.71 inches.
Estimate the percent of the women whose heights are between 64
inches and 69.42 inches.
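Solution: Using the Empirical Rule
• Because the distribution is bell-shaped, you can use the Empirical Rule.
[Figure: bell-shaped curve with axis marks x̄ – 3s = 55.87, x̄ – 2s = 58.58, x̄ – s = 61.29, x̄ = 64, x̄ + s = 66.71, x̄ + 2s = 69.42, x̄ + 3s = 72.13; the interval from 64 to 66.71 is labeled 34% and the interval from 66.71 to 69.42 is labeled 13.5%.]
34% + 13.5% = 47.5% of women are between 64 and 69.42 inches tall.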
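The Empirical Rule bookkeeping can be sketched in Python as follows. This is only an illustration: the names `marks`, `band_percent`, and `percent_between` are made up for the example, and the band percentages are the approximate Empirical Rule values, which apply only to bell-shaped data.

```python
# Sample mean and sample standard deviation from the example (inches).
x_bar = 64.0
s = 2.71

# Axis marks at x_bar + k*s for k = -3, ..., 3.
marks = {k: x_bar + k * s for k in range(-3, 4)}

# Approximate percent of data in each one-standard-deviation band,
# keyed by (lower k, upper k).
band_percent = {
    (-3, -2): 2.35, (-2, -1): 13.5, (-1, 0): 34.0,
    (0, 1): 34.0, (1, 2): 13.5, (2, 3): 2.35,
}


def percent_between(k_low, k_high):
    """Add up the bands lying between x_bar + k_low*s and x_bar + k_high*s."""
    return sum(p for (lo, hi), p in band_percent.items()
               if lo >= k_low and hi <= k_high)


print(round(marks[2], 2))      # 69.42, so 69.42 inches is x_bar + 2s
print(percent_between(0, 2))   # 34 + 13.5 = 47.5
```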
Chebychev’s Theorem
• The portion of any data set lying within k standard deviations (k > 1) of the mean is at least
\[ 1 - \frac{1}{k^2} \]
• k = 2: In any data set, at least \( 1 - \frac{1}{2^2} = \frac{3}{4} \), or 75%, of the data lie within 2 standard deviations of the mean.
• k = 3: In any data set, at least \( 1 - \frac{1}{3^2} = \frac{8}{9} \), or about 88.9%, of the data lie within 3 standard deviations of the mean.
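The bound is easy to evaluate for any k > 1, including non-integer values. A minimal Python sketch follows; the function name `chebychev_lower_bound` is hypothetical.

```python
def chebychev_lower_bound(k):
    """Minimum portion of any data set within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebychev's Theorem requires k > 1")
    return 1 - 1 / k ** 2


for k in (2, 3):
    print(k, f"{chebychev_lower_bound(k):.1%}")
```

Unlike the Empirical Rule, this bound makes no assumption about the shape of the distribution, so it is a guaranteed minimum rather than an estimate.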
Example: Using Chebychev’s Theorem
The age distribution for Florida is shown in the histogram. Apply
Chebychev’s Theorem to the data using k = 2. What can you
conclude?
Solution: Using Chebychev’s Theorem
Here μ = 39.2 and σ = 24.8.
k = 2: μ – 2σ = 39.2 – 2(24.8) = –10.4 (use 0 since age can’t be negative)
μ + 2σ = 39.2 + 2(24.8) = 88.8
At least 75% of the population of Florida is between 0 and 88.8 years old.
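The interval in this solution can be reproduced with a short Python sketch. The variable names are illustrative, and the values of μ and σ are the ones used in the calculation above.

```python
# Mean and standard deviation of the Florida age distribution, from the solution.
mu = 39.2
sigma = 24.8
k = 2

# Interval within k standard deviations of the mean.
raw_low = mu - k * sigma      # -10.4, not a possible age
high = mu + k * sigma         # 88.8

# Ages cannot be negative, so the lower end is clamped to 0.
low = max(0.0, raw_low)

# Chebychev's Theorem: at least 1 - 1/k^2 of the data lie in the interval.
portion = 1 - 1 / k ** 2

print(f"At least {portion:.0%} of ages are between {low:.1f} and {high:.1f}")
```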
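Estimation of Standard Deviation
Range Rule of Thumb
[Figure: number line centered at x̄, with x̄ – 2s (minimum usual value) at the left end and x̄ + 2s (maximum usual value) at the right end]
Range ≈ 4s
or
\[ s \approx \frac{\text{Range}}{4} = \frac{\text{highest value} - \text{lowest value}}{4} \]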
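To see how rough the Range Rule of Thumb is, the sketch below applies it to the starting-salary data from earlier in this section and compares the estimate with the sample standard deviation computed from the formula. The variable names are illustrative.

```python
import statistics

# Starting salaries, in thousands of dollars.
salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]

# Range Rule of Thumb: s is roughly (highest value - lowest value) / 4.
estimate = (max(salaries) - min(salaries)) / 4

# Sample standard deviation from the defining formula, for comparison.
actual = statistics.stdev(salaries)

print(estimate)           # 10 / 4 = 2.5
print(round(actual, 1))   # about 3.1
```

The estimate (2.5) and the computed value (about 3.1) agree only in order of magnitude, which is all the rule is meant to provide.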
Standard Deviation for Grouped Data
Sample standard deviation for a frequency distribution
\[ s = \sqrt{\frac{\sum (x - \bar{x})^2 f}{n - 1}} \]
where n = Σf (the number of entries in the data set)
• When a frequency distribution has classes, estimate the sample mean and standard deviation by using the midpoint of each class.
Example: Finding the Standard
Deviation for Grouped Data
You collect a random sample of the
number of children per household in a
region. Find the sample mean and the
sample standard deviation of the data set.
Number of Children in 50 Households
1 3 1 1 1 1 2 2 1 0
1 1 0 0 0 1 5 0 3 6
3 0 3 1 1 1 1 6 0 1
3 6 6 1 2 2 3 0 1 1
4 1 1 2 2 0 3 0 2 4
Solution: Finding the Standard Deviation for Grouped Data
• First construct a frequency distribution.

x      f      xf
0      10     0(10) = 0
1      19     1(19) = 19
2      7      2(7) = 14
3      7      3(7) = 21
4      2      4(2) = 8
5      1      5(1) = 5
6      4      6(4) = 24
       Σf = 50    Σ(xf) = 91

• Find the mean of the frequency distribution.
\[ \bar{x} = \frac{\sum xf}{n} = \frac{91}{50} \approx 1.8 \]
The sample mean is about 1.8 children.
Solution: Finding the Standard Deviation for Grouped Data
• Determine the sum of squares.

x    f     x – x̄             (x – x̄)²           (x – x̄)² f
0    10    0 – 1.8 = –1.8    (–1.8)² = 3.24     3.24(10) = 32.40
1    19    1 – 1.8 = –0.8    (–0.8)² = 0.64     0.64(19) = 12.16
2    7     2 – 1.8 = 0.2     (0.2)² = 0.04      0.04(7) = 0.28
3    7     3 – 1.8 = 1.2     (1.2)² = 1.44      1.44(7) = 10.08
4    2     4 – 1.8 = 2.2     (2.2)² = 4.84      4.84(2) = 9.68
5    1     5 – 1.8 = 3.2     (3.2)² = 10.24     10.24(1) = 10.24
6    4     6 – 1.8 = 4.2     (4.2)² = 17.64     17.64(4) = 70.56
                                                Σ(x – x̄)² f = 145.40
Solution: Finding the Standard Deviation for Grouped Data
• Find the sample standard deviation.
\[ s = \sqrt{\frac{\sum (x - \bar{x})^2 f}{n - 1}} = \sqrt{\frac{145.40}{50 - 1}} \approx 1.7 \]
The standard deviation is about 1.7 children.
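The grouped-data calculation can be sketched in Python with the frequency distribution stored as a dictionary. The names `freq`, `mean`, and `ss` are illustrative. One assumption to note: the code keeps the unrounded mean (1.82), so its sum of squares is 145.38 rather than the 145.40 obtained in the table from the rounded mean 1.8; the standard deviation still rounds to 1.7.

```python
import math

# Frequency distribution from the solution: value x -> frequency f.
freq = {0: 10, 1: 19, 2: 7, 3: 7, 4: 2, 5: 1, 6: 4}

n = sum(freq.values())                               # n = sum of f = 50
mean = sum(x * f for x, f in freq.items()) / n       # 91 / 50 = 1.82

# Sum of squares, weighting each squared deviation by its frequency.
ss = sum((x - mean) ** 2 * f for x, f in freq.items())

s = math.sqrt(ss / (n - 1))

print(n, round(mean, 2), round(ss, 2), round(s, 1))
```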
Standard Deviation from a Frequency Table
Shortcut Formula
\[ s = \sqrt{\frac{n \left[ \sum (f \cdot x^2) \right] - \left[ \sum (f \cdot x) \right]^2}{n(n - 1)}} \]
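As a check, the shortcut formula can be applied to the children-per-household frequency table from the preceding example. The Python sketch below uses illustrative variable names; it needs only the totals Σ(f·x) and Σ(f·x²), so no deviations have to be computed.

```python
import math

# Frequency distribution from the preceding example: value x -> frequency f.
freq = {0: 10, 1: 19, 2: 7, 3: 7, 4: 2, 5: 1, 6: 4}

n = sum(freq.values())                              # 50
sum_fx = sum(f * x for x, f in freq.items())        # sum of f*x   = 91
sum_fx2 = sum(f * x * x for x, f in freq.items())   # sum of f*x^2 = 311

# Shortcut formula: s = sqrt((n*sum(f*x^2) - (sum(f*x))^2) / (n*(n - 1)))
s = math.sqrt((n * sum_fx2 - sum_fx ** 2) / (n * (n - 1)))

print(sum_fx, sum_fx2, round(s, 1))
```

The result matches the value of about 1.7 children found from the defining formula.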
Practice Questions
Q(2.11)
The number of incidents in which police were needed for a sample of ten schools in Allegheny County is
7, 37, 3, 8, 48, 11, 6, 0, 10, 3. Assume the data represent a sample.
Compute the sample variance and sample standard deviation.
Practice Questions
Q(2.12)
Compute the variance and standard deviation of the given grouped data.

Number      f
27-90       13
91-154      2
155-218     0
219-282     5
283-346     0
347-410     2
411-474     0
475-538     1
539-602     2
Practice Questions
Q(2.13)
The mean of a distribution is 20 and the standard deviation is 2. Use Chebychev’s Theorem to answer the following questions.
(1) At least what percentage of the values will fall between 10 and 30?
(2) At least what percentage of the values will fall between 12 and 28?
Practice Questions
Q(2.14)
The average U.S. yearly per capita consumption of citrus fruits is 26.8 pounds. Suppose that the distribution of the amount of citrus fruit consumed is bell-shaped with a standard deviation of 4.2 pounds.
What percentage of Americans would you expect to consume more than 31 pounds of citrus fruit per year?
Section 2.4 Summary
• Determined the range of a data set
• Determined the variance and standard deviation of a population and of a sample
• Used the Empirical Rule and Chebychev’s Theorem to interpret standard deviation
• Approximated the sample standard deviation for grouped data