Chapter 3: Variability: How Values Disperse
It’s not enough to know an average value. Understanding how the values vary gives context to their central
tendency. [p. 61]
Measuring Variability with the Range [p. 62]
Just as there are three primary ways to measure the central tendency in a frequency distribution, there’s
more than one way to measure variability. Two of these methods, the standard deviation and the
variance, are closely related and take up most of the discussion in this chapter.
A third way of measuring variability is the range: the maximum value in a set minus the minimum value.
The size of the range depends entirely on the largest and the smallest values in the set. The
range does not change until and unless there’s a change in one or both of those values, the maximum
and the minimum. All the other values in the frequency distribution could change and the range would
remain the same. The other values could be distributed more homogeneously, or they could bunch up
near one or two modes, and the range would still not change.
Excel has no RANGE() function. To get the range, you must use something such as the following,
substituting the appropriate range address for the one shown: [p. 64]
=MAX(A2:A21) - MIN(A2:A21)
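For readers working outside Excel, the same calculation can be sketched in Python. The data values and the helper name value_range are made up for illustration and are not from the book. The second list keeps the same minimum and maximum but bunches the other values together, to show that the range does not notice.

```python
# Hypothetical data; the values are invented for illustration only.
values = [12, 15, 15, 18, 22, 30]

def value_range(data):
    """Return the maximum minus the minimum, like =MAX(...) - MIN(...)."""
    return max(data) - min(data)

# Same minimum and maximum as above, but the middle values bunch up near 29.
bunched = [12, 29, 29, 29, 29, 30]

range_original = value_range(values)
range_bunched = value_range(bunched)

# Both ranges are identical even though the distributions differ.
print(range_original, range_bunched)
```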
The Concept of a Standard Deviation
It’s more or less the average deviation of all data points from the mean (strictly, the square root of the average squared deviation).
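A minimal sketch of that idea in Python, using invented data. It assumes the population form of the standard deviation (dividing by n); the sample form divides by n - 1 instead. The plain average of the absolute deviations is computed alongside it to show that the two are close but not the same number.

```python
import math
import statistics

# Hypothetical data, invented for illustration.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(values) / len(values)

# Square each deviation from the mean, average the squares, take the root.
squared_devs = [(x - mean) ** 2 for x in values]
variance = sum(squared_devs) / len(values)   # population variance
std_dev = math.sqrt(variance)                # population standard deviation

# Plain average of the absolute deviations, for comparison.
mean_abs_dev = sum(abs(x - mean) for x in values) / len(values)

# The standard library's population standard deviation should agree.
library_std_dev = statistics.pstdev(values)

print(std_dev, mean_abs_dev, library_std_dev)
```

The squaring step gives large deviations more weight, which is why the standard deviation here comes out larger than the average absolute deviation.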
Arranging for a Standard
A standard deviation acts much like the fictitious unit described in the prior section. In any frequency
distribution that follows a normal curve, these statements are true:
You find about 34% of the records between the mean and one standard deviation from the mean.
You find about 14% of the records between one and two standard deviations from the mean.
You find about 2% of the records between two and three standard deviations from the mean.
These standards are displayed in Figure 3.4.
The key here is whether the data follow a normal distribution. As mentioned in class, this is a popular
distribution; however, there are others, e.g. Poisson, and in some cases your data will be unusual enough
that it does not fit squarely into a well-studied distribution. That is okay: we can still use the mean,
standard deviation, etc., to describe the data. That said, the normal distribution is useful because it has
various properties, such as those listed here, that can help when understanding your data.
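The rounded percentages for a normal curve can be checked with Python's standard library. This is only a sketch: it uses the standard normal distribution (mean 0, standard deviation 1), and each share covers one side of the mean only.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
z = NormalDist()

# Share of records in each band, measured on one side of the mean.
within_0_1 = z.cdf(1) - z.cdf(0)   # mean to one standard deviation
within_1_2 = z.cdf(2) - z.cdf(1)   # one to two standard deviations
within_2_3 = z.cdf(3) - z.cdf(2)   # two to three standard deviations

for label, share in [("0 to 1 SD", within_0_1),
                     ("1 to 2 SD", within_1_2),
                     ("2 to 3 SD", within_2_3)]:
    print(f"{label}: {share:.1%}")
```

The exact shares are roughly 34.1%, 13.6%, and 2.1%, which round to the 34%, 14%, and 2% quoted above.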
A z-score, or sometimes z-value, tells you how many standard deviations above or below the mean a
record is. If someone tells you that your height in z-score units is +1.0, it’s the same as saying that your
height is one standard deviation above the mean height.
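As a small illustration, a z-score is the distance from the mean divided by the standard deviation. The function name z_score and the height figures below are assumptions made up for this sketch, not values from the book.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / std_dev

# Hypothetical figures: mean height 170 cm, standard deviation 10 cm.
mean_height = 170.0
sd_height = 10.0

tall = z_score(180.0, mean_height, sd_height)    # one standard deviation above the mean
short = z_score(155.0, mean_height, sd_height)   # one and a half below the mean

print(tall, short)
```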
It used to be that one would hear the term z-score only in academia, though I am seeing it more and more
in mainstream reporting. 538, Nate Silver's political blog, sometimes uses z-scores.