Activity: Standard Deviation of Participant Heights
Step 1. Write down our sample data points. These are the heights (in inches) of five fictional
participants in an ITEP training course.
60
72
62
64
74
Step 2. Calculate the mean of our sample data points.
Sum of the heights = 332
Divide by the number of heights we measured (5)
The mean = 66.4 inches
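The arithmetic in Step 2 can be checked with a few lines of Python. This is a minimal sketch; the variable names (`heights`, `total`, `n`, `mean`) are chosen here for illustration and are not part of the original activity.

```python
# Heights (in inches) of the five fictional participants from Step 1.
heights = [60, 72, 62, 64, 74]

total = sum(heights)   # sum of the heights
n = len(heights)       # number of heights we measured
mean = total / n       # the mean: the sum divided by the count

print(total, n, mean)  # 332 5 66.4
```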
Step 3. Write down the amount that each sample data point deviates from the mean.
60 - 66.4 = -6.4
72 - 66.4 = 5.6
62 - 66.4 = -4.4
64 - 66.4 = -2.4
74 - 66.4 = 7.6
Sum = 0.0
Because the positive and negative deviations cancel each other, the sum of the deviations is
zero. This happens for any data set. If we try to average these deviations, we will get an
average of zero, which tells us nothing about how our data as a whole deviate from the mean.
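The deviations in Step 3 can be reproduced with the sketch below. The names are illustrative, and the rounding to one decimal place is there only to hide floating-point noise (for example, 60 - 66.4 is stored as a value very slightly different from -6.4).

```python
heights = [60, 72, 62, 64, 74]
mean = sum(heights) / len(heights)

# Deviation of each data point from the mean, rounded for display.
deviations = [round(h - mean, 1) for h in heights]
print(deviations)

# The positive and negative deviations cancel, so the sum is zero
# (up to floating-point rounding error).
deviation_sum = sum(deviations)
print(round(deviation_sum, 1))
```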
Step 4. So, let’s square the deviations to get rid of the positive and negative signs. All the
values will now be in square inches.
(-6.4)² = 40.96
(5.6)² = 31.36
(-4.4)² = 19.36
(-2.4)² = 5.76
(7.6)² = 57.76
Step 5. Now sum the squared deviations.
40.96
31.36
19.36
5.76
+ 57.76
155.20
Revised 12/3/14 pe
The sum of the squared deviations = 155.20 square inches.
This value is called Sum of Squares and is much-used in statistical work.
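Steps 4 and 5 can be sketched together in Python: square each deviation, then add the squares. The variable names are illustrative, and results are rounded to two decimal places to match the handout.

```python
heights = [60, 72, 62, 64, 74]
mean = sum(heights) / len(heights)

# Step 4: square each deviation from the mean (units: square inches).
squared_deviations = [(h - mean) ** 2 for h in heights]
print([round(sq, 2) for sq in squared_deviations])

# Step 5: the Sum of Squares is the total of the squared deviations.
sum_of_squares = sum(squared_deviations)
print(round(sum_of_squares, 2))
```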
Step 6. Divide the sum of squares by the number of data points (5).
The result is 31.04 square inches.
This is the mean of the squared deviations.
Other names for this number are Mean Square or Variance.
Variance is much-used in statistical work.
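A short sketch of Step 6 follows. It divides the sum of squares by the number of data points, as this activity does. For comparison it also calls two functions from Python's standard `statistics` module: `pvariance`, which divides by n and so should match Step 6, and `variance`, which divides by n - 1 and therefore gives a larger value. Which divisor is appropriate depends on the analysis; this handout uses n.

```python
import statistics

heights = [60, 72, 62, 64, 74]
n = len(heights)
mean = sum(heights) / n
sum_of_squares = sum((h - mean) ** 2 for h in heights)

# Step 6: the variance is the mean of the squared deviations.
variance = sum_of_squares / n
print(round(variance, 2))

# Standard-library equivalents, shown only for comparison.
population_variance = statistics.pvariance(heights)  # divides by n
sample_variance = statistics.variance(heights)       # divides by n - 1
print(round(population_variance, 2), round(sample_variance, 2))
```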
Step 7. Since variance is still a squared value, we need to “un-square” it in order to return
to the units (inches) of our original data set.
The square root of the variance = 5.57 inches. We can round this to 5.6 inches.
This is the Standard Deviation for our data set. Standard deviation is a statistic that tells you
how tightly all the data are clustered around the mean.
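All seven steps can be collected into one small function. This is a sketch under the handout's convention of dividing by the number of data points; the function name `standard_deviation` is made up for illustration. The standard library's `statistics.pstdev` uses the same convention and is shown as a cross-check.

```python
import math
import statistics

def standard_deviation(values):
    """Square root of the mean of squared deviations (divides by n, as in Step 6)."""
    n = len(values)
    mean = sum(values) / n
    sum_of_squares = sum((x - mean) ** 2 for x in values)
    variance = sum_of_squares / n
    return math.sqrt(variance)

heights = [60, 72, 62, 64, 74]
sd = standard_deviation(heights)
print(round(sd, 2))                          # 5.57
print(round(statistics.pstdev(heights), 2))  # 5.57
```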
The calculation we have just performed can be symbolized in the following way:
$$\text{Standard Deviation} = \sqrt{\frac{\sum (X - \bar{X})^2}{n}}$$
In words, we can say: the standard deviation is the square root of the average of squared
deviations from the mean.
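As a check, substituting the five deviations from Step 3 into this definition reproduces the result from Step 7. Here n = 5 is the number of heights and the numerator is the Sum of Squares from Step 5:

$$\sqrt{\frac{(-6.4)^2 + (5.6)^2 + (-4.4)^2 + (-2.4)^2 + (7.6)^2}{5}} = \sqrt{\frac{155.20}{5}} = \sqrt{31.04} \approx 5.57 \text{ inches}$$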

Standard Deviation and Normal Curves
Mean
For a data set, the mean is the sum of all the data values divided by the number of values.
The mean is useful for understanding our data; however, there are additional statistics that can
increase our understanding of the data.
Range
The range tells us the entire spread of our data from lowest value to highest value. However, the
range can be unduly influenced by a single outlier.
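The sketch below computes the range of the five heights from the activity and then shows how one extreme value changes it. The sixth height of 90 inches is a hypothetical outlier invented for this illustration; it is not part of the original data.

```python
heights = [60, 72, 62, 64, 74]

# Range: highest value minus lowest value.
data_range = max(heights) - min(heights)
print(data_range)  # 14

# Hypothetical outlier (an assumption for illustration, not source data).
with_outlier = heights + [90]
range_with_outlier = max(with_outlier) - min(with_outlier)
print(range_with_outlier)  # 30
```

A single added value more than doubles the range, even though the other five heights are unchanged.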
Standard Deviation
A better measure of the dispersion of the data around the mean takes into account all the data
values, with values that show up more frequently carrying more weight. Standard
Deviation is such a measure.
When the data are bunched together and the bell-shaped curve is steep, the standard deviation is
small. When the data are spread apart and the bell curve is flattened, the standard deviation is
large.
Diagram 1
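To see the difference between bunched and spread-out data in numbers, the sketch below compares two data sets with the same mean. The `tight` list is a hypothetical data set made up for this comparison; `spread` is the set of heights from the activity. The standard deviation is computed with `statistics.pstdev`, which divides by the number of data points as the activity does.

```python
import statistics

tight = [65, 66, 66, 67, 68]   # hypothetical data, bunched near the mean
spread = [60, 72, 62, 64, 74]  # the heights from the activity

tight_sd = statistics.pstdev(tight)
spread_sd = statistics.pstdev(spread)

# Both data sets have the same mean, but very different standard deviations.
print(statistics.mean(tight), round(tight_sd, 2))
print(statistics.mean(spread), round(spread_sd, 2))
```

Both means are 66.4 inches, yet the standard deviation is roughly 1.0 inch for the bunched data and roughly 5.6 inches for the spread-out data.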
A common assumption among users of statistics is that data are "normally" distributed. When the
variation in the data comes from many small, independent random effects, the data tend to fit a
normal (bell-shaped) curve. Many data sets do not vary in this way, and handling them requires
more complicated statistics. For the purpose of this introduction, we will assume the data are
normally distributed.
For data that are "normally distributed" we expect that about 68.3% of the data will be within 1
standard deviation from the mean. In general there is a relationship between the percentage of
the included data and the number of standard deviations from the mean. This relationship is
illustrated in Table 1 and Diagram 2.
Table 1
Percent of the data    Number of Standard Deviations from the Mean
50.0                   0.674
68.3                   1.000
90.0                   1.645
95.0                   1.960
95.4                   2.000
99.0                   2.576
99.7                   3.000
Diagram 2
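The values in Table 1 can be reproduced from the normal distribution itself. For a normal curve, the fraction of data within k standard deviations of the mean equals erf(k / √2), where erf is the error function available as `math.erf` in Python. The sketch below assumes perfectly normal data; the function name `percent_within` is made up for illustration.

```python
import math

def percent_within(k):
    """Percent of a normal distribution within k standard deviations of the mean."""
    return 100 * math.erf(k / math.sqrt(2))

# (percent of the data, number of standard deviations), copied from Table 1.
table_1 = [
    (50.0, 0.674),
    (68.3, 1.000),
    (90.0, 1.645),
    (95.0, 1.960),
    (95.4, 2.000),
    (99.0, 2.576),
    (99.7, 3.000),
]

for listed_percent, k in table_1:
    print(f"{k:.3f}  table: {listed_percent:.1f}  computed: {percent_within(k):.1f}")
```

Each computed percentage should agree with the table to one decimal place.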
For more information on statistics, visit the following websites:
http://www.ltcconline.net/greenl/courses/201/descstat/mean.htm
http://www.robertniles.com/stats/stdev.shtml
http://en.wikipedia.org/wiki/Statistics