Transcript
```Mean/Median NOTES
n = number of individuals in the data set.
The symbol ∑ represents a sum. It is the capital Greek letter sigma, which corresponds to the letter S.
CENTER
MEAN: the average of a data set. The sample mean is denoted by $\bar{x}$ and is pronounced “x bar”.
$$\bar{x} = \frac{\sum x}{n} = \frac{1}{n} \sum x$$
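EXAMPLE (made-up data, for illustration only): for the data set 2, 4, 4, 6, 9 we have n = 5 and
∑x = 2 + 4 + 4 + 6 + 9 = 25, so the mean is 25 / 5 = 5.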
MEDIAN: the 50th percentile of a data set.
This says that 50% of the individuals have data values less than or equal to the median.
CASE: n is odd. The median is the middle number in the ordered list.
CASE: n is even. The median is the average of the two middle numbers in the ordered list.
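EXAMPLE (made-up data, for illustration only):
n odd: the ordered list 2, 4, 4, 6, 9 has n = 5, so the median is the 3rd value, 4.
n even: the ordered list 2, 4, 6, 9 has n = 4, so the median is the average of the 2nd and 3rd values, (4 + 6) / 2 = 5.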
The mean is affected by outliers and strong skewness. The median is more resistant (much less affected)
to outliers and skewness. However, the mean provides more information than the median because it uses
the value of every data point. We prefer the mean as the measure of center except when the data have
outliers or definite skewness; in that case we use the median as the measure of center.
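EXAMPLE (made-up data, for illustration only): the data set 2, 4, 4, 6, 9 has mean 25 / 5 = 5 and median 4.
If the largest value is replaced by an outlier, giving 2, 4, 4, 6, 90, the mean jumps to 106 / 5 = 21.2
while the median is still 4.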
FIVE-NUMBER SUMMARY: minX, Q1, Median, Q3, maxX.
VARIANCE
Loosely speaking, the variance is the average squared distance of the data values from the mean.
$$\text{VAR} = \frac{\sum (x - \bar{x})^2}{n - 1}$$
The formula squares the distances of the data from the mean so that we are adding nonnegative numbers
together in the sum; otherwise we get $\sum (x - \bar{x}) = 0$. We divide by n − 1 (this is the “degrees of
freedom” as we shall see later on in inferential statistics) because if you have n − 1 of the differences then
the last one, or nth one, is just the additive inverse of their sum (since $\sum (x - \bar{x}) = 0$).
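EXAMPLE (made-up data, for illustration only): for the data set 2, 4, 4, 6, 9 the mean is 25 / 5 = 5.
The differences from the mean are −3, −1, −1, 1, 4, which add up to 0. Knowing the first four (sum −4) forces the last to be 4.
The squared differences are 9, 1, 1, 1, 16, which add up to 28, so VAR = 28 / (5 − 1) = 7.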
STANDARD DEVIATION: the square root of the variance.
The sample standard deviation is denoted by $S_x$.
$$S_x = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
$$S_x = \sqrt{\text{VAR}}$$
$$\text{VAR} = (S_x)^2$$
Since we are summing up squares in the variance, we take the square root of the VAR in order that the
units of measure (meters, gallons, \$, etc.) are not squared. This gives us the standard deviation of data