Section 3 – 2A:
The Standard Deviation as a Measure of Variation
The Empirical Rule
for Data Under a Bell Shaped Curve
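[Figure: bell-shaped curve. Areas between successive standard deviation marks, left to right: 0.15%, 2.35%, 13.5%, 34%, 34%, 13.5%, 2.35%, 0.15%]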
The 68 – 95 – 99.7 Empirical Rule for Data Under a Bell Shaped Curve
Approximately 68% of all the data points fall within ± 1 standard deviation of the mean.
Approximately 95% of all the data points fall within ± 2 standard deviations of the mean.
Approximately 99.7% of all the data points fall within ± 3 standard deviations of the mean.
Note: The percents are approximate due to round off. Other sources may round differently and
produce slightly different decimal approximations for the areas at the ends of the graph.
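The rounded 68, 95 and 99.7 figures can be compared with the areas under an exact normal curve. The sketch below is only an illustration, assuming the bell-shaped data follow a normal distribution. It uses `statistics.NormalDist` from the Python standard library; the helper name `area_within` is made up for this example.

```python
from statistics import NormalDist

def area_within(k):
    """Percent of a normal distribution within k standard deviations of the mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return 100 * (z.cdf(k) - z.cdf(-k))

for k in (1, 2, 3):
    print(k, round(area_within(k), 2))
```

The printed areas should be about 68.27, 95.45 and 99.73, which is why the rule is stated with the word "approximately".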
Section 3 – 2A Lecture
Page 1 of 11
© 2012 Eitel
The condensed version of the Empirical Rule
We will use a condensed version of the graph above for lecture notes and homework.
The marks on the number line show where 1, 2, and 3 standard deviations from the mean fall.
The percent of data that falls between each pair of adjacent marks is shown between those marks.
  .15%   |  2.35%  |  13.5%  |   34%   |   34%   |  13.5%  |  2.35%  |  .15%
       –3 SD     –2 SD     –1 SD     mean      1 SD      2 SD      3 SD
A1) 34% of the data falls between 0 and +1 standard deviation from the mean.
A2) 34% of the data falls between 0 and –1 standard deviation from the mean.
B1) 13.5% of the data falls between +1 and +2 standard deviations from the mean.
B2) 13.5% of the data falls between –1 and –2 standard deviations from the mean.
C1) 2.35% of the data falls between +2 and +3 standard deviations from the mean.
C2) 2.35% of the data falls between –2 and –3 standard deviations from the mean.
D1) .15% of the data falls beyond +3 standard deviations from the mean.
D2) .15% of the data falls beyond –3 standard deviations from the mean.
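As a quick arithmetic check, the eight band percentages listed above add up to 100%, and the inner bands combine to give the 68, 95 and 99.7 figures. A small sketch of that check (the list name `bands` is chosen only for illustration):

```python
# Percent of data in each band, from left of -3 SD to right of +3 SD.
bands = [0.15, 2.35, 13.5, 34.0, 34.0, 13.5, 2.35, 0.15]

total = sum(bands)
within_1 = sum(bands[3:5])   # -1 SD to +1 SD
within_2 = sum(bands[2:6])   # -2 SD to +2 SD
within_3 = sum(bands[1:7])   # -3 SD to +3 SD

# Rounding hides tiny floating-point errors in the sums.
print(round(total, 2), round(within_1, 2), round(within_2, 2), round(within_3, 2))
```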
Example 1
The scores for all high school seniors taking the verbal section of the Scholastic Aptitude Test (SAT) in
2009 had a population mean of 490 and a population standard deviation of 100. The
distribution of SAT scores is bell-shaped.
The problem states that µ = 490 and σ = 100.
1A) 68% of all the data points fall between ± 1 standard deviation of the mean. Find the range of
numbers that make this statement true for the data above.
68% of all the data falls between ± 1 standard deviation of the mean:
µ − 1σ to µ + 1σ
490 − 100 to 490 + 100
390 to 590
68% of the data falls within 390 to 590.
1B) 95% of all the data points fall between ± 2 standard deviations of the mean. Find the range of
numbers that make this statement true for the data above.
95% of all the data falls between ± 2 standard deviations of the mean:
µ − 2σ to µ + 2σ
490 − 2(100) to 490 + 2(100)
490 − 200 to 490 + 200
290 to 690
95% of the data falls within 290 to 690.
1C) 99.7% of all the data points fall between ± 3 standard deviations of the mean. Find the range of
numbers that make this statement true for the data above.
99.7% of all the data falls between ± 3 standard deviations of the mean:
µ − 3σ to µ + 3σ
490 − 3(100) to 490 + 3(100)
490 − 300 to 490 + 300
190 to 790
99.7% of the data falls within 190 to 790.
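The three calculations in this example follow the same pattern: mean minus k standard deviations to mean plus k standard deviations. A minimal sketch of that pattern, where the function name `empirical_rule_ranges` is an assumption made for this illustration:

```python
def empirical_rule_ranges(mean, sd):
    """Return {percent: (low, high)} for 1, 2 and 3 standard deviations."""
    percents = {1: 68, 2: 95, 3: 99.7}
    return {pct: (mean - k * sd, mean + k * sd) for k, pct in percents.items()}

# SAT example: mean 490, standard deviation 100.
ranges = empirical_rule_ranges(490, 100)
for pct, (low, high) in ranges.items():
    print(f"{pct}% of the data falls within {low} to {high}")
```

The same function can be reused for any bell-shaped data set by passing in its mean and standard deviation.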
3 different ways to present the solution
The scores for all high school seniors taking the verbal section of the Scholastic Aptitude Test (SAT) in
1999 had a population mean of 490 and a population standard deviation of 100. The
distribution of SAT scores is bell-shaped.
English Wording
68% of the data falls within 390 to 590
95% of the data falls within 290 to 690
99.7% of the data falls within 190 to 790
Bell Shaped Graph
[Figure: bell-shaped curve]
Line Graph
  .15%   |  2.35%  |  13.5%  |   34%   |   34%   |  13.5%  |  2.35%  |  .15%
       –3 SD     –2 SD     –1 SD     mean      1 SD      2 SD      3 SD
        190       290       390       490       590       690       790
The bell shaped graph with its colored area is very impressive for presentations. The line graph has
a compact form and presents the information in the format that is most helpful for answering the type of
questions we will ask in the homework and on the test.
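One typical use of the line graph is to find the percent of data between two marks by adding the bands that lie between them. The sketch below assumes the marks are whole numbers of standard deviations from –3 to 3; the names `BANDS` and `percent_between` are hypothetical and used only for this illustration.

```python
# Percent of data between consecutive SD marks, read off the line graph.
# Key (a, b) means "from a SD to b SD".
BANDS = {(-3, -2): 2.35, (-2, -1): 13.5, (-1, 0): 34.0,
         (0, 1): 34.0, (1, 2): 13.5, (2, 3): 2.35}

def percent_between(low_sd, high_sd):
    """Add the bands between two whole-number SD marks (from -3 to 3)."""
    return sum(BANDS[(k, k + 1)] for k in range(low_sd, high_sd))

# SAT example: 390 is -1 SD and 690 is +2 SD from the mean of 490.
print(round(percent_between(-1, 2), 2))
```

For the SAT data this adds 34% + 34% + 13.5%, so about 81.5% of the scores fall between 390 and 690.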
Example 2
IQ scores of all adults who take the Wechsler IQ test have a population mean of 100 and a
population standard deviation of 15. The distribution of IQ scores is normal (bell-shaped). Find
the x values that correspond to the 68%, 95% and 99.7% mentioned in the Empirical Rule.
The problem states that µ = 100 and σ = 15.
68% of all the data falls between ± 1 standard deviation of the mean:
µ − 1σ to µ + 1σ
100 − 1(15) to 100 + 1(15)
100 − 15 to 100 + 15
85 to 115
95% of all the data falls between ± 2 standard deviations of the mean:
µ − 2σ to µ + 2σ
100 − 2(15) to 100 + 2(15)
100 − 30 to 100 + 30
70 to 130
99.7% of all the data falls between ± 3 standard deviations of the mean:
µ − 3σ to µ + 3σ
100 − 3(15) to 100 + 3(15)
100 − 45 to 100 + 45
55 to 145
  .15%   |  2.35%  |  13.5%  |   34%   |   34%   |  13.5%  |  2.35%  |  .15%
       –3 SD     –2 SD     –1 SD     mean      1 SD      2 SD      3 SD
         55        70        85       100       115       130       145
1 SD = 15, mean = 100
85 to 115: 68% of data (1 standard deviation either side of the mean)
70 to 130: 95% of data (2 standard deviations either side of the mean)
55 to 145: 99.7% of data (3 standard deviations either side of the mean)
Example 3
The heights of a sample of 100 5th grade students at a local school form a bell shaped graph. The
heights have a sample mean of 42.5 inches and a sample standard deviation of 5 inches.
Find the x values that correspond to the 68%, 95% and 99.7% mentioned in the Empirical Rule.
The problem states that x̄ = 42.5 inches and s = 5 inches.
68% of all the data falls between ± 1 standard deviation of the mean:
x̄ − 1s to x̄ + 1s
42.5 − 1(5) to 42.5 + 1(5)
42.5 − 5 to 42.5 + 5
37.5 to 47.5
95% of all the data falls between ± 2 standard deviations of the mean:
x̄ − 2s to x̄ + 2s
42.5 − 2(5) to 42.5 + 2(5)
42.5 − 10 to 42.5 + 10
32.5 to 52.5
99.7% of all the data falls between ± 3 standard deviations of the mean:
x̄ − 3s to x̄ + 3s
42.5 − 3(5) to 42.5 + 3(5)
42.5 − 15 to 42.5 + 15
27.5 to 57.5
  .15%   |  2.35%  |  13.5%  |   34%   |   34%   |  13.5%  |  2.35%  |  .15%
       –3 SD     –2 SD     –1 SD     mean      1 SD      2 SD      3 SD
        27.5      32.5      37.5      42.5      47.5      52.5      57.5
1 SD = 5, mean = 42.5
37.5 to 47.5: 68% of data (1 standard deviation either side of the mean)
32.5 to 52.5: 95% of data (2 standard deviations either side of the mean)
27.5 to 57.5: 99.7% of data (3 standard deviations either side of the mean)
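Because the Empirical Rule is approximate, real or simulated bell-shaped data will land near, but not exactly on, 68%, 95% and 99.7%. The sketch below illustrates this with synthetic heights drawn from a normal distribution with mean 42.5 and standard deviation 5. The data are randomly generated for illustration and are not the students' actual heights; the function name `fraction_within` is an assumption.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable
mean, sd = 42.5, 5
# Synthetic, not real data: 100,000 simulated heights in inches.
heights = [random.gauss(mean, sd) for _ in range(100_000)]

def fraction_within(data, mean, sd, k):
    """Fraction of values within k standard deviations of the mean."""
    inside = [x for x in data if mean - k * sd <= x <= mean + k * sd]
    return len(inside) / len(data)

for k in (1, 2, 3):
    print(k, round(100 * fraction_within(heights, mean, sd, k), 1))
```

The printed percentages should be close to 68, 95 and 99.7, with small differences from both randomness and the rounding in the rule itself.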
Usual and Unusual Values
For data that is bell shaped, we consider all values within 2 standard deviations of the mean to be USUAL.
For data that is bell shaped, we consider all values outside of 2 standard deviations of the mean to be UNUSUAL.
The mean of a bell shaped data set is at the center of the graph, where values occur most frequently. Data
points close to the mean are very common. Data points farther from the mean are less common.
Values at the far ends of a data set occur at such a low frequency that their occurrence is considered
unusual. For the purposes of this book we define all data points that are outside of 2 standard
deviations of the mean to be unusual.
The word “unusual” does not mean there is a problem with the data point. It means only
that such a point does not occur as frequently as the points closer to the mean. Since about 95% of
bell shaped data falls within 2 standard deviations of the mean, the remaining 5% is split evenly between
the two ends: the top 2.5% of the data and the bottom 2.5% of the data are considered unusual.
Example 1
A bell shaped data set contains sample data. The data set has a mean of 25 and a standard
deviation of 3.
A) What is the range for usual data?
B) What is the range for unusual data?
C) Is a value of 13 unusual?
Solution
95% of all the data falls between ± 2 standard deviations of the mean:
x̄ − 2s to x̄ + 2s
25 − 2(3) to 25 + 2(3)
25 − 6 to 25 + 6
19 to 31
95% of the data falls within 19 to 31.
A) The range for usual data is between 19 and 31.
B) The range for unusual data is below 19 or greater than 31.
C) Yes. 13 is below 19, so it is unusual.
Example 2
A random sample of local gas stations produced the following results. The prices for 87 octane gas
form a bell shaped data set with a mean of $4.15 a gallon and a standard deviation of 25 cents
a gallon.
A) What is the range for usual data?
B) What is the range for unusual data?
C) Is a price of $4.62 a gallon unusual?
Solution
95% of all the data falls between ± 2 standard deviations of the mean:
x̄ − 2s to x̄ + 2s
4.15 − 2(.25) to 4.15 + 2(.25)
4.15 − .50 to 4.15 + .50
3.65 to 4.65
95% of the data falls within 3.65 to 4.65.
A) The range for usual data is between 3.65 and 4.65.
B) The range for unusual data is below 3.65 or greater than 4.65.
C) No. 4.62 is between 3.65 and 4.65, so it is usual.
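The usual/unusual test in these examples can be written as a short check: compute the range within 2 standard deviations of the mean, then see whether the value falls outside it. This is a minimal sketch; the function names `usual_range` and `is_unusual` are assumptions made for illustration.

```python
def usual_range(mean, sd):
    """Range of usual values: within 2 standard deviations of the mean."""
    return mean - 2 * sd, mean + 2 * sd

def is_unusual(value, mean, sd):
    """True if value is more than 2 standard deviations from the mean."""
    low, high = usual_range(mean, sd)
    return value < low or value > high

print(is_unusual(13, 25, 3))          # data set with mean 25, SD 3
print(is_unusual(4.62, 4.15, 0.25))   # gas prices with mean $4.15, SD $0.25
```

The first call should print True and the second False, matching the answers to part C of each example.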
Data Distribution for a Bell Shaped Curve
  .15%   |  2.35%  |  13.5%  |   34%   |   34%   |  13.5%  |  2.35%  |  .15%
       –3 SD     –2 SD     –1 SD     mean      1 SD      2 SD      3 SD
Optional
Notation for Population Data
English Wording | Statistics Meaning | Notation | Contains
Within 1 standard deviation in both directions from the population mean | the data between (µ − 1σ) and (µ + 1σ) | µ ± 1σ | 68% of all the data
Within 2 standard deviations in both directions from the population mean | the data between (µ − 2σ) and (µ + 2σ) | µ ± 2σ | 95% of all the data
Within 3 standard deviations in both directions from the population mean | the data between (µ − 3σ) and (µ + 3σ) | µ ± 3σ | 99.7% of all the data
Notation for Sample Data
English Wording | Statistics Meaning | Notation | Contains
Within 1 standard deviation in both directions from the sample mean | the data between (x̄ − 1s) and (x̄ + 1s) | x̄ ± 1s | 68% of all the data
Within 2 standard deviations in both directions from the sample mean | the data between (x̄ − 2s) and (x̄ + 2s) | x̄ ± 2s | 95% of all the data
Within 3 standard deviations in both directions from the sample mean | the data between (x̄ − 3s) and (x̄ + 3s) | x̄ ± 3s | 99.7% of all the data
Percent of Data that is | Statistics Meaning | Percent of Data
More than 3 standard deviations from the sample mean | to the right of (x̄ + 3s) | .15%
Between 2 and 3 standard deviations from the sample mean | from (x̄ + 2s) to (x̄ + 3s) | 2.35%
Between 1 and 2 standard deviations from the sample mean | from (x̄ + 1s) to (x̄ + 2s) | 13.5%
Between 0 and 1 standard deviations from the sample mean | from x̄ to (x̄ + 1s) | 34%
Between –1 and 0 standard deviations from the sample mean | from (x̄ − 1s) to x̄ | 34%
Between –1 and –2 standard deviations from the sample mean | from (x̄ − 2s) to (x̄ − 1s) | 13.5%
Between –2 and –3 standard deviations from the sample mean | from (x̄ − 3s) to (x̄ − 2s) | 2.35%
Less than –3 standard deviations from the sample mean | to the left of (x̄ − 3s) | .15%