Limitations of Range

Range
In statistics, range is defined simply as the
difference between the maximum and minimum
observations. The reasoning behind this definition
is intuitive: range should suggest how widely
spread out the values are, and by computing the
difference between the maximum and minimum
values, we can get an estimate of the spread of
the data.
Example
For example, suppose an experiment involves
finding out the weight of lab rats and the values
in grams are 320, 367, 423, 471 and 480. In this
case, the range is simply computed as 480-320 =
160 grams.
Source: http://www.experimentresources.com/range-instatistics.html#ixzz22LkgVypq
The Range:
Range is defined as the difference
between the maximum and the
minimum observation of the given data. If
$x_{\max}$ denotes the maximum observation and
$x_{\min}$ denotes the minimum observation, then
the range is defined as
Range = $x_{\max} - x_{\min}$
Source:
http://www.emathzone.com/tutorials/basicstatistics/range-and-coefficient-of-range.html
Formula for Range
The calculation of the range is very
straightforward. All we need to do is find the
difference between the largest data value in our
set and the smallest data value. Stated succinctly,
we have the following formula:
Range = Maximum Value – Minimum Value.
For example, the data set 4, 6, 10, 15, 18 has a
maximum of 18, a minimum of 4 and a range of 18 –
4 = 14.
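The formula can be tried out with a short Python sketch. The function name `data_range` is chosen here for illustration and is not from the source; the two data sets are the ones used in the examples above.

```python
def data_range(values):
    """Return the range: maximum value minus minimum value."""
    if not values:
        raise ValueError("range is undefined for an empty data set")
    return max(values) - min(values)


small_set = [4, 6, 10, 15, 18]             # data set from the formula example
rat_weights_g = [320, 367, 423, 471, 480]  # lab rat weights in grams

print(data_range(small_set))      # 18 - 4
print(data_range(rat_weights_g))  # 480 - 320
```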
Limitations of Range
The range is a very crude measurement of the
spread of data because it is extremely sensitive
to outliers. A single data value can greatly affect
the value of the range. For example, consider the
set of data 1, 2, 3, 4, 6, 7, 7, 8. The maximum
value is 8, the minimum is 1 and the range is 7.
Now consider the same set of data, only with the
value 100 included. The range now becomes 100
– 1 = 99. The addition of a single extra data point
greatly affected the value of the range.
The standard deviation is another measure of
spread that is less susceptible to outliers than the
range, although it is still affected by them. The
drawback is that the calculation of the standard
deviation is much more complicated.
Source:
http://statistics.about.com/od/DescriptiveStatistics/a/What-Is-The-Range-In-Statistics.htm
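The effect of the extra value 100 can be checked with a small Python sketch. It uses the sample standard deviation (`statistics.stdev`, which divides by n - 1); the source does not say which form of the standard deviation it has in mind, so that choice is an assumption.

```python
import statistics


def data_range(values):
    """Range: maximum value minus minimum value."""
    return max(values) - min(values)


original = [1, 2, 3, 4, 6, 7, 7, 8]
with_outlier = original + [100]  # same data with the value 100 included

for label, data in (("original", original), ("with 100", with_outlier)):
    spread = statistics.stdev(data)  # sample standard deviation (assumed form)
    print(label, "range:", data_range(data), "std dev:", round(spread, 2))
```

On this data the range jumps from 7 to 99. The standard deviation also rises sharply (from roughly 2.6 to roughly 31.8), so it is less sensitive than the range here but far from immune to the outlier.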
Variance
Statistical variance gives a measure of how the
data are distributed about the mean or
expected value. Unlike the range, which looks only at
the extremes, the variance takes every data
point into account and measures how far each lies from the mean.
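As a minimal sketch of the idea, the following Python computes the population variance directly from its definition (the mean of the squared deviations from the mean) and cross-checks it against the standard library. The function name `population_variance` is illustrative, and the rat weights from the range example are reused as data.

```python
import statistics


def population_variance(values):
    """Mean of the squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


rat_weights_g = [320, 367, 423, 471, 480]  # lab rat weights in grams

print(population_variance(rat_weights_g))
print(statistics.pvariance(rat_weights_g))  # standard-library cross-check
```

Every observation contributes to the result, which is the contrast with the range drawn above.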
Usage
The concept of variance can be extended to
continuous distributions too. In that case, instead of
summing up the squared differences from the
mean, we need to integrate them. This approach
is also useful when the number of data points is
very large, like the population of a country.
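For reference, the discrete and continuous forms can be written side by side. The notation here is an assumption, not taken from the source: $\mu$ is the mean, $N$ the number of data points, and $f(x)$ the probability density of the continuous variable.

```latex
% Discrete (population) variance: average squared deviation from the mean
\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2

% Continuous variance: the sum becomes an integral weighted by the density f(x)
\sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 \, f(x) \, dx
```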
Variance is extensively used in probability
theory, where general conclusions need to be
drawn from a smaller sample set.
This is because variance gives us an idea about
the distribution of data around the mean, and
thus from this distribution, we can work out
where we can expect an unknown data point.
Source: http://www.experimentresources.com/statisticalvariance.html#ixzz22LvEJVE5
Source: http://www.quickmba.com/stats/standard-deviation/
Coefficient of Variation
Variance and Standard Deviation
While the range is useful for describing the
"borders" of a data set, it tells us almost nothing about
the points that fall between the two extremes. In most
cases, we are interested in knowing how far each of the
data points is from the mean or median of the data set,
as this would enable us to see the spread in all of the
data points. Since the median doesn't take the
magnitude of each data point into account and the
mean does, the mean is the better option for a central
point.
Source:
http://esa21.kennesaw.edu/activities/stats/stats.pdf
A coefficient of variation (CV) can be calculated and
interpreted in two different settings: analyzing a single
variable and interpreting a model. The standard
formulation of the CV, the ratio of the standard
deviation to the mean, applies in the single variable
setting. In the modeling setting, the CV is calculated as
the ratio of the root mean squared error (RMSE) to the
mean of the dependent variable. In both settings, the
CV is often presented as the given ratio multiplied by
100. The CV for a single variable aims to describe the
dispersion of the variable in a way that does not
depend on the variable's measurement unit. The higher
the CV, the greater the dispersion in the variable. The
CV for a model aims to describe the model fit in terms
of the relative sizes of the squared residuals and
outcome values. The lower the CV, the smaller the
residuals relative to the predicted value. This is
suggestive of a good model fit.
Source: http://www.ats.ucla.edu/stat/mult_pkg/faq/general/coefficient_of_variation.htm
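The two settings described above can be sketched in Python as follows. The function names and the sample numbers are hypothetical, invented only to exercise the code. The RMSE here divides by the number of observations; some regression software divides by the residual degrees of freedom instead, so treat that as an assumption.

```python
import math
import statistics


def cv_single_variable(values):
    """Standard deviation divided by the mean, as a percentage."""
    return 100 * statistics.stdev(values) / statistics.mean(values)


def cv_model(observed, predicted):
    """RMSE divided by the mean of the dependent variable, as a percentage."""
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(observed, predicted)]
    rmse = math.sqrt(sum(squared_errors) / len(squared_errors))  # divides by n (assumed)
    return 100 * rmse / statistics.mean(observed)


# Hypothetical values, invented only to exercise the functions.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 11.0, 15.0, 15.0]

print(cv_single_variable(observed))   # dispersion of the variable itself
print(cv_model(observed, predicted))  # size of residuals relative to the mean outcome
```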
Coefficient of Variation
The coefficient of variation (abbreviated CV) is a way to
quantify scatter. It is defined as the standard deviation
of a group of values divided by their mean. Often that
ratio is multiplied by 100 to express the coefficient of
variation as a percent (abbreviated %CV).
The CV is useful for comparing scatter of variables
measured in different units. You could ask, for example,
whether the variation in pulse rate is greater or less than
the variation in the concentration of serum sodium. The
pulse rate and sodium are measured in completely
different units, so comparing their standard deviation
would be nonsense. Comparing their coefficients of
variation might prove useful to some physiological
investigations of homeostasis.
Source:
http://www.graphpad.com/support/faqid/1088/
Coefficient of Variation (CV)
If you know nothing about the data other than the
mean, one way to interpret the relative magnitude of
the standard deviation is to divide it by the mean. This
is called the coefficient of variation. For example, if the
mean is 80 and the standard deviation is 12, the CV = 12/80
= 0.15 or 15%.
If the standard deviation is 0.20 and the mean is 0.50,
then the CV = 0.20/0.50 = 0.4 or 40%. So knowing nothing
else about the data, the CV helps us see that even a
lower standard deviation doesn't mean less variable
data.
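The two worked examples can be reproduced with a few lines of Python. The function name `coefficient_of_variation` is illustrative; the numbers are the ones given above.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation divided by the mean (as a fraction, not a percent)."""
    if mean == 0:
        raise ValueError("the coefficient of variation is undefined for a zero mean")
    return std_dev / mean


# Larger standard deviation, but smaller relative spread.
print(f"{coefficient_of_variation(12, 80):.0%}")
# Smaller standard deviation, but larger relative spread.
print(f"{coefficient_of_variation(0.20, 0.50):.0%}")
```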