Measures of Variability

A single summary figure that describes the spread of observations within a distribution.

[Figure: four frequency histograms (#1 to #4) of FEAR scores; x-axis: FEAR, y-axis: Frequency (0 to 6)]

DESCRIBING VARIABILITY

The amount by which scores are dispersed/spread/scattered in a distribution.

Range

Difference between the smallest and largest observations.

- Pros and Cons
  - Pro: Easy!
  - Pro: Values exist in the data set
  - Con: Value depends on only two scores
  - Con: Very sensitive to outliers
- Examples:
  - Fear scores: 1, 1, 5, 7, 9, 3, 1
  - Height

Deviations

The average amount that a score deviates from the typical score.

Score − Mean = Difference Score

Score   Score − Mean   Difference
1       1 − 3          −2
2       2 − 3          −1
3       3 − 3           0
4       4 − 3           1
5       5 − 3           2

Mean:

$$\bar{X} = \frac{\sum X}{n} = \frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3$$

*Deviations always sum to zero: $\sum(\text{deviations}) = 0$. To fix this, square each one…

Variance ($\sigma^2$, "sigma" squared)

Mean of all squared deviation scores.

Steps:
1. Calculate sample mean: $\bar{X} = \frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3$
2. Calculate difference scores: score − mean
3. Square the difference scores
4. Add them up (aka the Sum of Squares [SS]): $\sum(\text{deviations}^2) = 10$
5. Take the average: $\frac{\sum(\text{deviations}^2)}{N} = \frac{10}{5} = 2$

Fear Score   Score − Mean   Difference   Difference²
1            1 − 3          −2           4
2            2 − 3          −1           1
3            3 − 3           0           0
4            4 − 3           1           1
5            5 − 3           2           4

Variance: Definitional Formula

- Population ($\mu$ is "mu"; $\sigma$ is your old friend "sigma" …but lower case!)

$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$

- Sample ($S^2$ is the symbol for sample variance)

$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$

*Note the "n − 1" in the sample formula!
**Degrees of freedom (df)

Variance

- Use the definitional formula to calculate the variance $S^2$.
$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$

$$S^2 = \frac{(3-6)^2 + (4-6)^2 + (4-6)^2 + (4-6)^2 + (6-6)^2 + (7-6)^2 + (7-6)^2 + (8-6)^2 + (8-6)^2 + (9-6)^2}{10 - 1}$$

$$S^2 = \frac{40}{9} = 4.44$$

Variance: Computational Formula

- Population

$$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{N \sum X^2 - (\sum X)^2}{N^2}$$

- Sample

$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}$$

Variance

- Use the computational formula to calculate the variance.

X     X²
3     9
4     16
4     16
4     16
6     36
7     49
7     49
8     64
8     64
9     81
Sum: 60   Sum: 400

$$S^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1} = \frac{400 - \frac{(60)^2}{10}}{9} = \frac{400 - 360}{9} = 4.44$$

Standard Deviation

Rough measure of the average amount by which scores deviate on either side of the mean.

- Steps:
  1. Calculate variance (we just did this)
  2. Take the square root
- Population

$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (X - \mu)^2}{N}}$$

- Sample

$$S = \sqrt{S^2} = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$

Variability Example: Standard Deviation

Definitional formula:

$$S = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{(3-6)^2 + (4-6)^2 + (4-6)^2 + (4-6)^2 + (6-6)^2 + (7-6)^2 + (7-6)^2 + (8-6)^2 + (8-6)^2 + (9-6)^2}{10 - 1}} = \sqrt{\frac{40}{9}} = 2.11$$

Computational formula:

$$S = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}} = \sqrt{\frac{400 - \frac{(60)^2}{10}}{9}} = \sqrt{\frac{400 - 360}{9}} = \sqrt{4.44} = 2.11$$

Mean: 6
Standard Deviation: 2.11

Practice!

- Calculate the range, variance, and standard deviation for the following set of "fear" scores
- Do this for the population AND the sample formulas
- 10, 8, 5, 0, 3, 4

Practice! 10, 8, 5, 0, 3, 4

- Mean = 5
- 10 − 5 = 5 → 25
- 8 − 5 = 3 → 9
- 5 − 5 = 0 → 0
- 0 − 5 = −5 → 25
- 3 − 5 = −2 → 4
- 4 − 5 = −1 → 1
- Sum of Squares = 64
- Population
  - Range: 10 − 0 = 10
  - Variance: 64/6 = 10.67
  - SD = 3.27
- Sample
  - Range: 10
  - Variance: 64/5 = 12.8
  - SD = 3.58
- What is the ONLY difference between the two formulas? (N vs. n − 1)

Standard Deviation

- A majority (68% for a normal distribution) of all scores are within one standard deviation on either side of the mean
- Only a small minority (5% for a normal distribution) is more than two standard deviations on either side of the mean

Pros and Cons of Standard Deviation

- Pros
  - Used in calculating many other measures.
  - Roughly the average amount by which scores deviate from the mean.
  - Majority of data within 1 s.d. above or below the mean.
  - Combined with mean:
    - Efficiently describes a distribution with just two numbers
    - Allows comparisons between distributions with different scales
- Cons
  - Influenced by extreme scores.
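
The hand calculations in these slides can be checked with a short script. What follows is a minimal sketch in Python, not part of the original slides. The function names (`value_range`, `sum_of_squares`, `sum_of_squares_computational`, `variance`, `std_dev`) are hypothetical and chosen only for illustration. The sketch assumes the scores are a plain list of numbers, and it reuses the "fear" practice scores and the ten-score example from the slides.

```python
import math


def value_range(scores):
    """Range: largest observation minus smallest observation."""
    return max(scores) - min(scores)


def sum_of_squares(scores):
    """Definitional SS: sum of squared deviations from the mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)


def sum_of_squares_computational(scores):
    """Computational SS: sum(X^2) - (sum(X))^2 / n."""
    n = len(scores)
    return sum(x * x for x in scores) - sum(scores) ** 2 / n


def variance(scores, sample=True):
    """Divide SS by n - 1 (sample formula) or by N (population formula)."""
    n = len(scores)
    divisor = n - 1 if sample else n
    return sum_of_squares(scores) / divisor


def std_dev(scores, sample=True):
    """Standard deviation: square root of the variance."""
    return math.sqrt(variance(scores, sample=sample))


# Practice data from the slides.
fear = [10, 8, 5, 0, 3, 4]
print(value_range(fear))                         # 10
print(sum_of_squares(fear))                      # 64.0
print(round(variance(fear, sample=False), 2))    # 10.67
print(round(std_dev(fear, sample=False), 2))     # 3.27
print(round(variance(fear, sample=True), 2))     # 12.8
print(round(std_dev(fear, sample=True), 2))      # 3.58

# Ten-score example from the slides (mean 6, SS 40).
ten_scores = [3, 4, 4, 4, 6, 7, 7, 8, 8, 9]
print(sum_of_squares_computational(ten_scores))  # 40.0
print(round(variance(ten_scores), 2))            # 4.44
print(round(std_dev(ten_scores), 2))             # 2.11
```

The `sample` flag captures the only difference between the two formulas: the divisor is n − 1 for a sample and N for a population. Python's standard `statistics` module provides the same quantities as `variance`, `pvariance`, `stdev`, and `pstdev`, which can serve as an independent check.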