Measures of Variability
A single summary figure that describes the spread of observations within a distribution.
[Figure: four frequency histograms (panels #1 to #4) of FEAR scores; x-axis: FEAR, y-axis: Frequency (0 to 6)]
DESCRIBING VARIABILITY
The amount by which scores are dispersed/spread/scattered in a distribution
Range
¨  Difference between the smallest and largest observations
¨  Pros and Cons
¤  Easy! (pro)
¤  Values exist in the data set (pro)
¤  Value depends on only two scores (con)
¤  Very sensitive to outliers (con)
¨  Examples:
¤  Fear scores: 1, 1, 5, 7, 9, 3, 1
¤  Height
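A minimal Python sketch of the range calculation for the fear scores listed above. The variable names and the extra score of 30 are illustrative assumptions, not part of the slides.

```python
# Fear scores from the example above.
fear_scores = [1, 1, 5, 7, 9, 3, 1]

# Range = largest observation minus smallest observation.
score_range = max(fear_scores) - min(fear_scores)
print(score_range)  # 8

# Hypothetical outlier (30 is made up) to show how sensitive the range is:
# only the two most extreme scores matter.
with_outlier = fear_scores + [30]
outlier_range = max(with_outlier) - min(with_outlier)
print(outlier_range)  # 29
```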
Deviations
¨  The average amount that a score deviates from the typical score.
¨  Score – Mean = Difference Score

Score   Score – Mean   Difference
1       1 – 3          -2
2       2 – 3          -1
3       3 – 3           0
4       4 – 3           1
5       5 – 3           2

¤  Mean: $\bar{X} = \frac{\sum X}{n} = \frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3$
¤  Average (mean) difference score: $\sum(\text{deviations}) = 0$
*Deviations always sum to zero.
To fix this, square each one…
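The claim that deviations sum to zero can be checked with a short Python sketch on the scores 1 through 5 used above. Variable names are illustrative.

```python
scores = [1, 2, 3, 4, 5]

# Mean = sum of scores divided by the number of scores.
mean = sum(scores) / len(scores)          # 15 / 5 = 3.0

# Difference score = score - mean.
deviations = [x - mean for x in scores]   # [-2.0, -1.0, 0.0, 1.0, 2.0]
print(sum(deviations))                    # 0.0

# Squaring removes the signs, so the total no longer cancels out.
squared = [d ** 2 for d in deviations]    # [4.0, 1.0, 0.0, 1.0, 4.0]
print(sum(squared))                       # 10.0
```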
Variance
¨  $\sigma^2$ (“sigma” squared)
¨  Mean of all squared deviation scores
Steps
¤  1. Calculate sample mean: $\bar{X} = \frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3$
¤  2. Calculate difference scores: score - mean
¤  3. Square the difference scores
¤  4. Add them up (this total is the Sum of Squares [SS]): $\sum(\text{deviations})^2 = 10$
¤  5. Take the average: $\frac{\sum(\text{deviations})^2}{N} = \frac{10}{5} = 2$

Fear Score   Score - Mean   Difference   Difference²
1            1-3            -2           4
2            2-3            -1           1
3            3-3             0           0
4            4-3             1           1
5            5-3             2           4
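The five steps above can be sketched as one small Python function. The function name `mean_squared_deviation` is hypothetical, chosen here for illustration; it divides by N, as in step 5.

```python
def mean_squared_deviation(scores):
    """Follow the five steps: mean, differences, squares, sum, average."""
    n = len(scores)
    mean = sum(scores) / n                      # step 1: mean
    differences = [x - mean for x in scores]    # step 2: score - mean
    squared = [d ** 2 for d in differences]     # step 3: square each difference
    sum_of_squares = sum(squared)               # step 4: add them up (SS)
    return sum_of_squares / n                   # step 5: average over N


print(mean_squared_deviation([1, 2, 3, 4, 5]))  # 2.0
```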
Variance: Definitional Formula
¨  Population
$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
¤  $\sigma$: your old friend “sigma” …but lower case!
¤  $\mu$: “mu”
¨  Sample
$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$
¤  $S^2$: symbol for sample variance
*Note the “n-1” in the sample formula!
** Degrees of freedom (df)
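As a rough illustration of the two definitional formulas, the Python sketch below applies both to the scores 1 through 5. The function names are hypothetical; `statistics.pvariance` and `statistics.variance` from Python's standard library are used only as a cross-check.

```python
import statistics


def population_variance(scores):
    """Sum of squared deviations from the mean, divided by N."""
    mu = sum(scores) / len(scores)
    return sum((x - mu) ** 2 for x in scores) / len(scores)


def sample_variance(scores):
    """Sum of squared deviations from the mean, divided by n - 1."""
    x_bar = sum(scores) / len(scores)
    return sum((x - x_bar) ** 2 for x in scores) / (len(scores) - 1)


scores = [1, 2, 3, 4, 5]
print(population_variance(scores))  # 2.0  (10 / 5)
print(sample_variance(scores))      # 2.5  (10 / 4)

# The standard library implements the same two definitions.
assert population_variance(scores) == statistics.pvariance(scores)
assert sample_variance(scores) == statistics.variance(scores)
```

The only difference between the two functions is the denominator, which is why the sample value is always the larger of the two for the same data.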
Variance
¨  Use the definitional formula to calculate the variance.
$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$
$S^2 = \frac{(3-6)^2 + (4-6)^2 + (4-6)^2 + (4-6)^2 + (6-6)^2 + (7-6)^2 + (7-6)^2 + (8-6)^2 + (8-6)^2 + (9-6)^2}{10 - 1}$
$S^2 = \frac{40}{9} = 4.44$
Variance: Computational Formula
¨  Population
$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{N \sum X^2 - (\sum X)^2}{N^2}$
¨  Sample
$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}$
Variance
¨  Use the computational formula to calculate the variance.

X       X²
3       9
4       16
4       16
4       16
6       36
7       49
7       49
8       64
8       64
9       81
Sum: 60 Sum: 400

$S^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}$
$S^2 = \frac{400 - \frac{(60)^2}{10}}{9}$
$S^2 = \frac{400 - 360}{9}$
$S^2 = 4.44$
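A short Python sketch, using the ten scores from this example, suggests that the definitional and computational sample formulas give the same answer. Variable names are illustrative.

```python
scores = [3, 4, 4, 4, 6, 7, 7, 8, 8, 9]
n = len(scores)

# Definitional formula: sum of squared deviations over n - 1.
mean = sum(scores) / n                                   # 6.0
definitional = sum((x - mean) ** 2 for x in scores) / (n - 1)

# Computational formula: uses only the sum of X and the sum of X squared.
sum_x = sum(scores)                                      # 60
sum_x2 = sum(x * x for x in scores)                      # 400
computational = (sum_x2 - sum_x ** 2 / n) / (n - 1)

print(round(definitional, 2))    # 4.44
print(round(computational, 2))   # 4.44
```

The computational form avoids calculating each deviation, which is convenient by hand; in floating-point code it can lose precision when the scores are large relative to their spread.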
Standard Deviation
¨  Rough measure of the average amount by which scores deviate on either side of the mean
¨  Steps:
¤  1. Calculate variance (we just did this)
¤  2. Take the square root
¨  Population
$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (X - \mu)^2}{N}}$
¨  Sample
$S = \sqrt{S^2} = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$
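Variability Example: Standard Deviation
Definitional formula:
$S = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$
$S = \sqrt{\frac{(3-6)^2 + (4-6)^2 + (4-6)^2 + (4-6)^2 + (6-6)^2 + (7-6)^2 + (7-6)^2 + (8-6)^2 + (8-6)^2 + (9-6)^2}{10 - 1}}$
$S = \sqrt{\frac{40}{9}} = 2.11$
Computational formula:
$S = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}}$
$S = \sqrt{\frac{400 - \frac{(60)^2}{10}}{9}}$
$S = \sqrt{\frac{400 - 360}{9}}$
$S = \sqrt{4.44}$
$S = 2.11$
Mean: 6
Standard Deviation: 2.11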
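The two steps (variance, then square root) can be sketched in Python for the same ten scores. `statistics.variance` and `statistics.stdev` are the standard-library sample (n - 1) versions; the variable names are illustrative.

```python
import math
import statistics

scores = [3, 4, 4, 4, 6, 7, 7, 8, 8, 9]

# Step 1: sample variance (divides by n - 1).
variance = statistics.variance(scores)

# Step 2: take the square root.
sd = math.sqrt(variance)

print(round(variance, 2))                  # 4.44
print(round(sd, 2))                        # 2.11
print(round(statistics.stdev(scores), 2))  # 2.11, same result in one call
```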
Practice!
¨  Calculate the range, variance, and standard deviation for the following set of “fear” scores
¨  Do this for the population AND the sample formulas
¨  10, 8, 5, 0, 3, 4
Practice!
10, 8, 5, 0, 3, 4
¨  Mean = 5
¨  10-5 = 5 → 25
¨  8-5 = 3 → 9
¨  5-5 = 0 → 0
¨  0-5 = -5 → 25
¨  3-5 = -2 → 4
¨  4-5 = -1 → 1
¨  Sum of Squares = 64
¨  Population
¤  Range: 10-0 = 10
¤  Variance: 64/6 = 10.67
¤  SD = 3.27
¨  Sample
¤  Range: 10
¤  Variance: 64/5 = 12.8
¤  SD = 3.58
¨  What is the ONLY difference between the two formulas? (N vs. n-1)
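One way to check the practice answers is with Python's standard `statistics` module, where the `p`-prefixed functions use the population formula (N) and the others use the sample formula (n - 1). Variable names are illustrative.

```python
import statistics

fear_scores = [10, 8, 5, 0, 3, 4]

score_range = max(fear_scores) - min(fear_scores)
print(score_range)                                    # 10

# Population formulas: divide the sum of squares (64) by N = 6.
pop_var = statistics.pvariance(fear_scores)
pop_sd = statistics.pstdev(fear_scores)
print(round(pop_var, 2), round(pop_sd, 2))            # 10.67 3.27

# Sample formulas: divide the sum of squares (64) by n - 1 = 5.
samp_var = statistics.variance(fear_scores)
samp_sd = statistics.stdev(fear_scores)
print(round(samp_var, 2), round(samp_sd, 2))          # 12.8 3.58
```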
Standard Deviation
¨  A majority (about 68% for a normal distribution) of all scores are within one standard deviation on either side of the mean
¨  Only a small minority (about 5% for a normal distribution) is more than two standard deviations on either side of the mean
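The 68% and 5% figures can be reproduced, under the assumption of an exactly normal distribution, with `statistics.NormalDist` from Python's standard library. This is a sketch on a standard normal curve (mean 0, standard deviation 1), not on the fear data.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
z = NormalDist(mu=0, sigma=1)

# Proportion of scores within one standard deviation of the mean.
within_one = z.cdf(1) - z.cdf(-1)

# Proportion of scores more than two standard deviations from the mean.
beyond_two = 1 - (z.cdf(2) - z.cdf(-2))

print(round(within_one, 4))   # 0.6827
print(round(beyond_two, 4))   # 0.0455
```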
Pros and Cons of Standard Deviation
¨  Pros
¤  Used in calculating many other measures.
¤  Roughly the average amount by which scores deviate from the mean.
¤  Majority of data within 1 s.d. above or below the mean.
¤  Combined with mean:
n  Efficiently describes a distribution with just two numbers
n  Allows comparisons between distributions with different scales
¨  Cons
¤  Influenced by extreme scores.
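To illustrate the last point, the Python sketch below replaces one score with a made-up extreme value and compares the sample standard deviations. The score of 30 is a hypothetical outlier, not data from the slides.

```python
import statistics

scores = [3, 4, 4, 4, 6, 7, 7, 8, 8, 9]

# Hypothetical data: same scores, but the largest one (9) is replaced by 30.
scores_with_outlier = [3, 4, 4, 4, 6, 7, 7, 8, 8, 30]

sd_before = statistics.stdev(scores)
sd_after = statistics.stdev(scores_with_outlier)

# One extreme score is enough to inflate the standard deviation.
print(round(sd_before, 2))
print(round(sd_after, 2))
print(sd_after > sd_before)  # True
```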