Download Standard Deviation - South Miami Senior High School

Standard Deviation and Interpreting Standard Deviation
The Mean
Mean: The sum of the data items divided by the number of items.
\[ \text{Mean} = \frac{\sum x}{n} \]
where \(\sum x\) represents the sum of all the data items and \(n\) represents the number of items.
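The definition of the mean can be sketched in a few lines of Python. The function name `mean` and the data set `scores` are made up for illustration and are not part of the original slides.

```python
def mean(data):
    """Return the sum of the data items divided by the number of items."""
    if not data:
        raise ValueError("mean requires at least one data item")
    return sum(data) / len(data)


# Hypothetical data set, chosen only to illustrate the formula.
scores = [2, 4, 6, 8]
print(mean(scores))  # (2 + 4 + 6 + 8) / 4 = 5.0
```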
The Median
• Median is the data item in the middle of each set of ranked, or ordered, data.
• To find the median of a group of data items:
1. Arrange the data items in order, from smallest to largest.
2. If the number of data items is odd, the median is the data item in the middle of the list.
3. If the number of data items is even, the median is the mean of the two middle data items.
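The three steps for finding the median translate fairly directly into code. This is a minimal sketch: the function name `median` and the sample lists are illustrative assumptions, not something given in the slides.

```python
def median(data):
    """Follow the three steps: sort, then pick or average the middle item(s)."""
    ordered = sorted(data)          # Step 1: smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("median requires at least one data item")
    mid = n // 2
    if n % 2 == 1:                  # Step 2: odd count, take the middle item
        return ordered[mid]
    # Step 3: even count, take the mean of the two middle items
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical data sets for illustration.
print(median([7, 1, 5]))     # 5
print(median([7, 1, 5, 3]))  # 4.0
```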
Ex:
Five employees in a
manufacturing plant earn
salaries of $19,700, $20,400,
$21,500, $22,600 and
$23,000 annually. The section
manager has an annual salary
of $95,000.
a. Find the median annual
salary for the six people.
b. Find the mean annual
salary for the six people.
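One way to check parts (a) and (b) is with Python's standard `statistics` module. The variable names below are illustrative; the six salaries are the ones given in the example.

```python
from statistics import mean, median

# The five employee salaries plus the section manager's salary.
salaries = [19_700, 20_400, 21_500, 22_600, 23_000, 95_000]

median_salary = median(salaries)  # mean of the two middle values
mean_salary = mean(salaries)      # sum of all six divided by 6

print(f"median = ${median_salary:,.0f}")  # median = $22,050
print(f"mean   = ${mean_salary:,.0f}")    # mean   = $33,700
```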
Note:
• In the last example, the median annual salary is $22,050 and the mean annual salary is $33,700. Why such a big difference between these two measures of central tendency?
• The relatively high annual salary of the section manager, $95,000, pulls the mean salary to a value considerably higher than the median salary.
• When one or more data items are much greater than the other items, these extreme values can greatly influence the mean. In cases like this, the median, rather than the mean, is used to summarize the incomes.
Calculate the mean and median for birth weights and mothers' ages.
The Mode
• Mode is the data value that occurs most often in a data set.
• If more than one data value has the highest frequency, then each of these data values is a mode.
• If no data items are repeated, then the data set has no mode.
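The three rules for the mode can be sketched as follows. The helper name `modes` and the sample lists are assumptions made for illustration. Returning an empty list for "no mode" is a design choice in this sketch that follows the slide's definition; the standard library's `statistics.multimode` instead returns every value when nothing repeats.

```python
from collections import Counter


def modes(data):
    """Return every value with the highest frequency, or [] if nothing repeats."""
    counts = Counter(data)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # no data item is repeated, so there is no mode
    return [value for value, count in counts.items() if count == highest]


# Hypothetical data sets for illustration.
print(modes([1, 2, 2, 3]))     # [2]
print(modes([1, 1, 2, 2, 3]))  # [1, 2]
print(modes([1, 2, 3]))        # []
```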
Range
• Used to describe the spread of data items in a data set.
• Range: The difference between the highest and the lowest data values in a data set:
Range = highest data value − lowest data value
Ex: Honolulu's hottest day is 89° and its coldest day is 61°. The range in temperature is 89 − 61 = 28°.
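As a quick check of the Honolulu example, the range is just the maximum minus the minimum. The function name `data_range` is an illustrative choice.

```python
def data_range(data):
    """Return the highest data value minus the lowest data value."""
    return max(data) - min(data)


# Honolulu's hottest and coldest temperatures from the example.
print(data_range([89, 61]))  # 28
```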
Standard Deviation
• A second measure of dispersion, and one that is dependent on all of the data items, is called the standard deviation.
• The standard deviation is found by determining how much each data item differs from the mean.
Population
• Population standard deviation:
\[ \sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} \]
where the \(x\)'s represent the data values, \(\mu\) represents the mean, and \(N\) represents the total number of data values.
Steps to Calculate the Population Standard Deviation:
1. Find the mean, \(\mu\), of the data values
2. Find the difference, \(x - \mu\), between each data value (\(x\)) and the mean (\(\mu\))
3. Square each difference, \((x - \mu)^2\)
4. Sum these values
5. Divide by the total number of data values, \(N\)
6. Take the square root
\[ \sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} \]
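The six steps can be written out one per line in Python. This is a sketch rather than a definitive implementation: the function name `population_std` and the sample data are assumptions made for illustration.

```python
import math


def population_std(data):
    """Population standard deviation, computed step by step."""
    n = len(data)
    if n == 0:
        raise ValueError("need at least one data value")
    mu = sum(data) / n                       # Step 1: mean
    differences = [x - mu for x in data]     # Step 2: x - mu
    squares = [d ** 2 for d in differences]  # Step 3: (x - mu)^2
    total = sum(squares)                     # Step 4: sum
    variance = total / n                     # Step 5: divide by N
    return math.sqrt(variance)               # Step 6: square root


# Hypothetical data set: mean 5, squared differences sum to 32, 32 / 8 = 4.
print(population_std([2, 4, 4, 4, 5, 5, 7, 9]))  # 2.0
```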
Ex:
Ms. Mosier measured the height of her trees growing at home. The heights of the 5 trees are listed below, in inches:
45, 60, 67, 83, 95
Find the standard deviation of the heights of Ms. Mosier's trees.
Step 1: Find the mean, \(\mu\), of the data values
\[ \mu = \frac{45 + 60 + 67 + 83 + 95}{5} = \frac{350}{5} = 70 \]
Step 2: Find the difference, \(x - \mu\), between each data value (\(x\)) and the mean (\(\mu\))
45 − 70 = −25
60 − 70 = −10
67 − 70 = −3
83 − 70 = 13
95 − 70 = 25
Step 3: Square each difference, \((x - \mu)^2\)
\((-25)^2 = 625\)
\((-10)^2 = 100\)
\((-3)^2 = 9\)
\((13)^2 = 169\)
\((25)^2 = 625\)
Step 4: Sum these values
625 + 100 + 9 + 169 + 625 = 1528
Step 5: Divide by the total number of data values
\[ \frac{1528}{5} = 305.6 \]
Step 6: Take the square root
\[ \sigma = \sqrt{305.6} \approx 17.48 \]
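The result of the worked example can be cross-checked with `pstdev` from Python's standard `statistics` module, which computes the population standard deviation. The variable names are illustrative.

```python
from statistics import pstdev

# Heights of the five trees, in inches.
heights = [45, 60, 67, 83, 95]

sigma = pstdev(heights)  # divides by N, matching the population formula
print(round(sigma, 2))   # 17.48
```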
On Your Own:
The table displays the number of hurricanes in the Atlantic Ocean from
1992 to 1997. What are the mean and standard deviation?
Answer:
Mean = \(\mu\) = 35/6 ≈ 5.83
Standard deviation = \(\sigma\) ≈ 3.34
Sampling
• A sample is part of a population.
• If you choose a sample carefully, the statistics for the sample can be used to make general conclusions about the larger population.
• Suppose you want to know what percent of high school students in the US use Twitter every day. It would likely be impossible to get that answer from every student. So instead you select a sample of the students (like those at South Miami High) to estimate the percentage who use Twitter every day.
Sample Standard Deviation
• If you are given only a sample of a population, you cannot compute the population standard deviation. You have only part of the population, so you calculate the sample standard deviation instead. Dividing by \(n - 1\) rather than \(n\) makes the sample variance, \(s^2\), an unbiased estimator of the population variance.
• Sample standard deviation:
\[ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \]
where the \(x\)'s represent the data values, \(\bar{x}\) represents the sample mean, and \(n\) represents the number of data values in the sample.
Steps to Calculate the Sample Standard Deviation:
1. Find the sample mean, \(\bar{x}\), of the sample data values
2. Find the difference, \(x - \bar{x}\), between each data value (\(x\)) and the sample mean (\(\bar{x}\))
3. Square each difference, \((x - \bar{x})^2\)
4. Sum these values
5. Divide by the number of data values in the sample minus 1, \(n - 1\)
6. Take the square root
\[ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \]
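The sample version differs from the population version only in the divisor. Here is a minimal sketch, assuming a hypothetical function name `sample_std` and made-up data:

```python
import math


def sample_std(data):
    """Sample standard deviation: same steps, but divide by n - 1."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two data values")
    x_bar = sum(data) / n                          # Step 1: sample mean
    total = sum((x - x_bar) ** 2 for x in data)    # Steps 2-4: sum of squares
    return math.sqrt(total / (n - 1))              # Steps 5-6


# Hypothetical data set: squared differences sum to 32, and 32 / 7 is about 4.571.
print(round(sample_std([2, 4, 4, 4, 5, 5, 7, 9]), 3))  # 2.138
```

For the same data, the sample value is always a little larger than the population value because the divisor is smaller.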
Ex:
The blood alcohol concentrations of a sample of drivers involved in fatal crashes and then convicted with jail sentences are given below (based on data from the U.S. Department of Justice):
0.27, 0.17, 0.29
Find the sample mean and sample standard deviation.
Step 1: Find the sample mean, \(\bar{x}\), of the sample data values
\[ \bar{x} = \frac{0.27 + 0.17 + 0.29}{3} = \frac{0.73}{3} \approx 0.24 \]
Step 2: Find the difference, \(x - \bar{x}\), between each data value (\(x\)) and the sample mean (\(\bar{x}\))
0.27 − 0.24 = 0.03
0.17 − 0.24 = −0.07
0.29 − 0.24 = 0.05
Step 3: Square each difference, \((x - \bar{x})^2\)
\((0.03)^2 = 0.0009\)
\((-0.07)^2 = 0.0049\)
\((0.05)^2 = 0.0025\)
Step 4: Sum these values
0.0009 + 0.0049 + 0.0025 = 0.0083
Step 5: Divide by the number of data values in the sample minus 1
\[ \frac{0.0083}{2} = 0.00415 \]
Step 6: Take the square root
\[ s = \sqrt{0.00415} \approx 0.06 \]
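As a cross-check, the standard `statistics` module provides `stdev`, which uses the \(n - 1\) divisor. Note one difference from the hand calculation: the code keeps the unrounded mean (about 0.2433) instead of 0.24, yet both approaches give the same answer to two decimal places. The variable names are illustrative.

```python
from statistics import mean, stdev

# Blood alcohol concentrations from the example.
bac = [0.27, 0.17, 0.29]

x_bar = mean(bac)  # sample mean, not rounded
s = stdev(bac)     # sample standard deviation (divides by n - 1)

print(round(x_bar, 2))  # 0.24
print(round(s, 2))      # 0.06
```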
Calculate the standard deviation for
birth weights.
1. The standard deviation for the skewed distribution is 2.6. This is significantly greater than the symmetric distribution's. Explain why this makes sense.
_____________________________________________________________
_____________________________________________________________
2. Which measures of center and spread would you report for the
symmetric distribution? For the skewed distribution? Explain your
reasoning.
_____________________________________________________________
_____________________________________________________________
_____________________________________________________________
Ex:
Two fifth-grade classes have nearly identical mean scores on an
aptitude test, but one class has a standard deviation three times that of
the other. All other factors being equal, which class is easier to teach,
and why?
Ex:
Shown below are the means and standard deviations of the yearly
returns on two investments from 1926 through 2004.
a. Use the means to determine which investment provided the greater
yearly return.
b. Use the standard deviation to determine which investment has the
greater risk. Explain your answer.
Interpreting and Understanding Standard Deviation
• As stated earlier, standard deviation measures the variation among values. Values close together will yield a small standard deviation, whereas values spread farther apart will yield a larger standard deviation.
• Many common statistics (such as human height, weight, or blood pressure) gathered from samples in the natural world tend to have a normal distribution about their mean. A normal distribution has a symmetric bell shape centered on the mean.
• We will develop a sense for values of standard deviations using the Empirical Rule: about 68% of values fall within 1 standard deviation of the mean, about 95% within 2, and about 99.7% within 3. This rule applies only to normal distributions.
Interactive
• http://www.shodor.org/interactivate/activities/NormalDistribution/
• Questions to ask:
  • What happens to the curve as the standard deviation gets larger?
  • What happens to the data as the number of trials increases?
  • What does this remind you of that we discussed?
Ex:
IQ scores of normal adults on the Wechsler test have a bell-shaped distribution with a mean of 100 and a standard deviation of 15. What percentage of adults have IQ scores between 55 and 145?
Let's first draw the distribution.
[Figure: bell curve centered on the mean, with brackets marking 1, 2, and 3 standard deviations (s.d.) on either side]
Since the mean is 100, we put 100 in the center.
With a standard deviation of 15, we know that between 85 and 115, we are within 1 standard deviation of the mean.
Similarly, between 70 and 130, we are within 2 standard deviations of the mean.
Finally, between 55 and 145, we are within 3 standard deviations of the mean.
Now to answer the question. If we want to know the percentage of adults who have IQ scores between 55 and 145, we need to remember the Empirical Rule: about 99.7% of data fall within 3 standard deviations of the mean.
Therefore, the answer is about 99.7%.
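The 99.7% figure can be checked numerically. The sketch below goes a step beyond the slides by assuming an exactly normal distribution and using its cumulative distribution function, written with `math.erf`. The function name `coverage_between` is a hypothetical helper.

```python
import math


def coverage_between(low, high, mean, sd):
    """Fraction of a normal distribution that lies between low and high."""

    def cdf(value):
        # Standard normal CDF applied to the z-score of `value`.
        z = (value - mean) / sd
        return 0.5 * (1 + math.erf(z / math.sqrt(2)))

    return cdf(high) - cdf(low)


# IQ example: mean 100, standard deviation 15, scores between 55 and 145.
print(f"{coverage_between(55, 145, 100, 15):.1%}")  # 99.7%
```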
Ex:
The table displays the number of U.S. hurricane strikes by decade from the years 1851 to 2000. Let's say we're given that the mean is 17.6 and the standard deviation is 3.5. Within how many standard deviations of the mean do all the values fall?
Ex:
On Your Own:
For an English class, the average score on a research project was 82 and
the standard deviation of the normally distributed scores was 5. Sketch
a normal curve showing the project scores and three standard
deviations from the mean.
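To label the sketch, it helps to list the scores that sit 1, 2, and 3 standard deviations on either side of the mean. This is a small illustrative helper; the name `sd_bands` is an assumption, not something from the slides.

```python
def sd_bands(mean, sd, max_k=3):
    """Return (k, low, high) for k = 1..max_k standard deviations from the mean."""
    return [(k, mean - k * sd, mean + k * sd) for k in range(1, max_k + 1)]


# Research project scores: mean 82, standard deviation 5.
for k, low, high in sd_bands(82, 5):
    print(f"{k} s.d.: {low} to {high}")
# 1 s.d.: 77 to 87
# 2 s.d.: 72 to 92
# 3 s.d.: 67 to 97
```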