STAT 3660 – Introduction to Statistics
Chapter 6 – Measures of Central Tendency
Objectives
• By the end of this material, you will be able to:
– Explain the purposes of measures of central tendency and interpret
the information they provide
– Calculate, explain and compare modes, medians and means
– Understand other measures of central tendency
– Select an appropriate measure of central tendency according to the level of measurement and the characteristics of the distribution
Measures of Central Tendency
• Measures of Central Tendency
– Values calculated from the scores of a distribution to represent the location, or center, of the distribution. They are important descriptive statistics
• There are three (3) common measures of central tendency
– Mean (µ for a population, x̄ for a sample): the arithmetic center of a distribution and the most commonly used measure. It reports the average score of a distribution
– Median (x̃): the center score of the distribution. Half of the values are below the median and half are above it
– Mode (x̂): the most frequently occurring score of the distribution. A distribution may have more than one mode, or none
Measures of Central Tendency
• Mean (x̄, average)
– The mean is calculated by totaling all of the scores or values and dividing by the number of scores or values that were summed
$\bar{x} = \frac{\sum x_i}{N}$
where
$\sum x_i$ = the sum of all values
$N$ = the number of values in the summation
Measures of Central Tendency
• Three characteristics of the mean
1. The mean balances all of the scores because it acts like a fulcrum. It is the point around which all of the scores cancel out
2. The mean is the point in a distribution around which the variation is minimized (least squares)
3. The mean can be misleading if the distribution is skewed or contains unusual (outlier) values. Distributions with significant positive skewness will have inflated means (and those with negative skewness, deflated means). This will be discussed further in the discussion of the median
Measures of Central Tendency
• Example
– Given the following values:
Sample   Value
1        2
2        6
3        4
4        4
5        4
Total    20
$\sum x_i = 2 + 6 + 4 + 4 + 4 = 20$
$\bar{x} = \frac{\sum x_i}{N} = \frac{20}{5} = 4$
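The calculation in this example can be sketched in a few lines of Python. This is only an illustration of the formula (sum of the values divided by the number of values); the function name `mean` is chosen here and is not part of the course material.

```python
# Sample values from the example (five scores).
values = [2, 6, 4, 4, 4]


def mean(scores):
    """Arithmetic mean: the sum of the scores divided by how many there are."""
    return sum(scores) / len(scores)


print(sum(values))   # 20
print(mean(values))  # 4.0
```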
Measures of Central Tendency
• The mean balances all of the scores because it acts like a fulcrum. It is the point around which all of the scores cancel out
Sample   Data Value   Distance from Mean
1        3            3 – 4 = –1
2        4            4 – 4 = 0
3        7            7 – 4 = 3
4        5            5 – 4 = 1
5        1            1 – 4 = –3
Total    20           0
$\bar{x} = \frac{\sum x_i}{N} = \frac{20}{5} = 4$
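As a quick check of the fulcrum property, the short sketch below recomputes the distances from the mean for the five data values in the table and confirms that they cancel out. The variable names are illustrative only.

```python
# Data values from the table.
data = [3, 4, 7, 5, 1]

x_bar = sum(data) / len(data)           # 20 / 5 = 4.0
deviations = [x - x_bar for x in data]  # distance of each score from the mean

print(deviations)       # [-1.0, 0.0, 3.0, 1.0, -3.0]
print(sum(deviations))  # 0.0
```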
Measures of Central Tendency
• The mean is the point in a distribution around which the variation is minimized (least squares)
Sample   Data Value   x – x̄         (x – x̄)²   x – 3         (x – 3)²   x – 5         (x – 5)²
1        3            3 – 4 = –1    1          3 – 3 = 0     0          3 – 5 = –2    4
2        4            4 – 4 = 0     0          4 – 3 = 1     1          4 – 5 = –1    1
3        7            7 – 4 = 3     9          7 – 3 = 4     16         7 – 5 = 2     4
4        5            5 – 4 = 1     1          5 – 3 = 2     4          5 – 5 = 0     0
5        1            1 – 4 = –3    9          1 – 3 = –2    4          1 – 5 = –4    16
Total    20           0             20         5             25         –5            25
Notice that the sum of squared differences increases as you move away from the mean (20 around the mean of 4, but 25 around either 3 or 5)
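The least-squares property can be checked numerically. The sketch below computes the sum of squared differences around the three candidate centers used in the table (3, 4 and 5); `sum_sq_dev` is a hypothetical helper name introduced for this illustration.

```python
# Data values from the table.
data = [3, 4, 7, 5, 1]


def sum_sq_dev(scores, center):
    """Sum of squared differences between each score and a chosen center."""
    return sum((x - center) ** 2 for x in scores)


for center in (3, 4, 5):
    print(center, sum_sq_dev(data, center))
# 3 25
# 4 20
# 5 25
```

The smallest total occurs at the mean (4), which is what "least squares" refers to.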
Measures of Central Tendency
• To illustrate the mean we use the following dotplot
The arithmetic center of the distribution, or balance point, is 4, as calculated on the previous slide
Measures of Central Tendency
• Strengths and weaknesses of the mean
– Strengths
• All data from a variable are used to compute the mean
– Weaknesses
• Every score affects the mean
• A single extreme score (very high or very low) will skew the distribution and thereby give a misleading interpretation of the mean
Mean of a distribution with outliers
[Figure: distribution of percent of people dying, shown without and with the outliers]
Without the outliers: x̄ = 3.4
With the outliers: x̄ = 4.2
The mean is pulled far to the right by the outliers (from 3.4 to 4.2).
Measures of Central Tendency
• Median (center value)
– The median is found by locating the center value of the ordered data
– If there is an odd number of data values, order the values and remove them in pairs, one from the beginning and one from the end, until the center value remains, e.g.:
2 4 4 4 6
The median is 4
– If there is an even number of data values, find the two center values and average them, e.g.:
2 4 4 4 5 6
$\text{Median} = \frac{4 + 4}{2} = \frac{8}{2} = 4$
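The two cases (odd and even number of values) can be sketched as a small function. This is a minimal illustration of the procedure described on this slide; the function name `median` is chosen here, and Python's standard `statistics.median` gives the same results.

```python
def median(scores):
    """Center value of the ordered scores (average of the two center values if even)."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single center value exists.
        return ordered[mid]
    # Even count: average the two center values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([2, 4, 4, 4, 6]))     # 4
print(median([2, 4, 4, 4, 5, 6]))  # 4.0
```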
Measures of Central Tendency
• Characteristics of the median
– The median can be calculated for ordinal results as well as numeric
data, but cannot be calculated for nominal data as the data has no
order
– The median is always the exact center of the distribution
– One-half of the values within the distribution are below the median
and one-half of the values are above the median
– Notice that the mean and median are not always the same value:
• In 2004, household income had a mean of $60,528 and a median of $43,384. The mean is almost 40% higher. Why do you think this is the case? [1]
[1] From the U.S. Census Bureau
Mean and median of a distribution with outliers
[Figure: distribution of percent of people dying, shown without and with the outliers]
Without the outliers: x̄ = 3.4
With the outliers: x̄ = 4.2
The mean is pulled far to the right by the outliers (from 3.4 to 4.2).
The median, on the other hand, is only slightly pulled to the right by the outliers (from 3.4 to 3.6).
Measures of Central Tendency
• Mode (most frequent)
– The mode is the most frequent value (or values)
Values   Frequency (f)
2        1
4        3
6        1
The most frequent value is 4, since there are 3 of them, so the mode is 4
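A frequency count like the one in this table can be sketched with `collections.Counter`. The function name `modes` is illustrative. Treating a distribution in which every value occurs exactly once as having no mode is an assumed convention for this sketch; textbooks differ on that point.

```python
from collections import Counter


def modes(scores):
    """Return the most frequent value(s) as a sorted list; empty if there is no mode."""
    counts = Counter(scores)
    if not counts:
        return []
    top = max(counts.values())
    # Assumed convention: if every value occurs exactly once, report no mode.
    if top == 1 and len(counts) > 1:
        return []
    return sorted(value for value, freq in counts.items() if freq == top)


print(modes([2, 4, 4, 4, 6]))  # [4]
print(modes([1, 1, 2, 2, 3]))  # [1, 2]  (two modes)
print(modes([1, 2, 3]))        # []      (no mode)
```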
Measures of Central Tendency
• Characteristics of the mode:
– The most common score
– All three levels of measurement can use the mode
– Commonly used with categorical-nominal variables
– Some distributions have no mode
– Some distributions have multiple modes
– For categorical-ordinal and numerical-continuous variables, the mode may not be central to the distribution
Measures of Central Tendency
• Notice that in our example the mean, median and mode all have the same value (4), which is not always the case
• You can tell a lot about the “shape” of a distribution by comparing these three values
[Figure: two distributions, each marked with its mode, median and mean]
• In the first distribution the mean, median and mode are the same value
• In the second distribution the mean, median and mode are different values
• Notice that the mean will always be pulled in the direction of the “skew” of the distribution
Distinguishing the Median and Mean of a Density Curve
• The median of a distribution is the equal-areas point―the point that
divides the area under the curve in half.
• The mean is the balance point, at which the curve would balance if made
of solid material.
Measures of Central Tendency
• When a dataset has an extreme value, the mean will be pulled
in the direction of the extreme scores
– For a positive skew (skewed to the right), the mean will be greater
than the median
– For a negative skew, the mean will be less than the median
• When a numerical variable has a pronounced skew, the
median may be the more trustworthy measure of central
tendency
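The effect of an extreme value on the mean and the median can be illustrated with a small sketch. The scores below are hypothetical, made up for this illustration rather than taken from the slides; `statistics.mean` and `statistics.median` are from Python's standard library.

```python
import statistics

# Hypothetical scores (not from the course data).
symmetric = [2, 3, 3, 4, 4, 5]
right_skewed = symmetric + [40]  # add one extreme high value

for label, scores in (("symmetric", symmetric), ("right-skewed", right_skewed)):
    mean = statistics.mean(scores)
    median = statistics.median(scores)
    print(f"{label}: mean = {mean:.2f}, median = {median:.2f}")
# symmetric: mean = 3.50, median = 3.50
# right-skewed: mean = 8.71, median = 4.00
```

One extreme score moves the mean from 3.5 to about 8.7, while the median moves only from 3.5 to 4.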
Question
• Given the following values, calculate the following:
3 5 2 6 8 6
– Calculate the mean:
a) 5.5
b) 5
c) 6
d) 4
– Calculate the median:
a) 5.5
b) 5
c) 6
d) 4
– Calculate the mode:
a) 5.5
b) 6
c) 5
d) 4
Measures of Central Tendency
• The relationship between level of measurement and measures of central tendency:
                               Level of Measurement
Measure of Central Tendency    Nominal   Ordinal   Interval-Ratio
Mode                           Yes       Yes       Yes
Median                         No        Yes       Yes
Mean                           No        Yes (?)   Yes
Measures of Central Tendency
• Choosing a measure of central tendency
Use the Mode when:
1. The variable is measured at the nominal level
2. You want a quick and easy measure for ordinal and
numerical variables
3. You want to report the most frequent score
Use the Median when:
1. The variable is measured at the ordinal level
2. A variable measured at the numerical level has a highly
skewed distribution (or outliers are present)
3. You want to report the central score. The median
always lies at the exact center of a distribution
Use the Mean when:
1. The variable is measured at the numerical level (except when the variable is highly skewed)
2. You want to report the typical score. The mean is “the fulcrum that exactly balances all of the scores”
3. You anticipate additional statistical analysis
Question
• A friend of yours used SPSS to report the mean political affiliation, where 1 = Democrat, 2 = Independent, 3 = Republican, and 4 = other. You kindly state that:
a) SPSS is not useful for calculating a mode.
b) the variable political affiliation is nominal and therefore you should
calculate a mode(s) for this variable.
c) the variable political affiliation is ordinal and therefore you should
calculate a median for this variable.
d) to correctly calculate the mean, you needed to omit the “other” category.
Other Measures of Location
• Percentiles: the point below which a specific percentage of cases fall
• Quartiles: divide the distribution into quarters (at the 25th, 50th and 75th percentiles)
• E.g., the median falls at the 50th percentile (or the 2nd quartile)
Other Measures of Location
• To calculate a percentile:
– Sort scores in order from low to high
– Multiply the number of cases (N) by the proportional value of the percentile (for example, 0.8 for the 80th percentile)
– The resulting value indicates the position in the ordered array of cases
• Example
– In a sample of 70 test grades, we want to find the 3rd quartile (or the 75th percentile)
– 70 × 0.75 = 52.5, which rounds to 53, so the 53rd case is the 75th percentile
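The position rule above can be sketched as follows. The slide rounds 52.5 to 53 but does not state a general rounding rule, so rounding up to the next whole position is an assumption of this sketch; the function names are hypothetical, and statistical packages use several other percentile conventions that may give slightly different answers.

```python
import math


def percentile_position(n_cases, percentile):
    """1-based position in the sorted cases for a percentile given as 1-100.

    Assumption: fractional positions are rounded up (52.5 becomes 53).
    """
    return math.ceil(n_cases * percentile / 100)


def percentile_value(scores, percentile):
    """Score found at the percentile's position after sorting low to high."""
    ordered = sorted(scores)
    return ordered[percentile_position(len(ordered), percentile) - 1]


print(percentile_position(70, 75))  # 53

# Stand-in data: the integers 1..70 in place of real test grades.
print(percentile_value(range(1, 71), 75))  # 53
```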
Chapter 6 – Measures of Dispersion
Objectives
• By the end of this material, you will be able to:
– Explain the purpose of measures of dispersion and the information
they convey
– Compute and explain:
• The range (R)
• The inner quartile range, more commonly called the interquartile range (IQR) [note: the book uses the symbol Q for this statistic, but the more common symbol is IQR]
• The variance (σ² or s²)
• The standard deviation (σ or s)
– Select the appropriate measure of dispersion
– We will not discuss Average Absolute Deviation (AAD) or Median
Absolute Deviation (MAD)
Measures of Dispersion
• Notice that all of the measures of dispersion that we will
discuss are for numerical data
• The measures of dispersion for categorical data will NOT be
discussed in this material
Measures of Dispersion
• Measures of Dispersion
– In the prior material you learned how to describe a variable using graphical techniques and measures of center (or representative values)
– In this material we will discuss the spread (or width) of the variable
• The measures of dispersion that we will discuss in this material are:
– The range (R): the maximum minus the minimum, which is the total spread of the data
– The inner quartile range (IQR): Q3 (3rd quartile) – Q1 (1st quartile), which is commonly referred to as the inner spread and contains the middle 50% of the data
– The variance (σ² or s²) and the standard deviation (σ or s), which are discussed in detail later
• The Index of Qualitative Variation (IQV) will NOT be discussed
Measures of Dispersion
• Range (R)
– The range is the easiest measure of dispersion to calculate and is
simply the total spread of the data
Range = R = Highest score (Max) - Lowest score (Min)
Measures of Dispersion
• Characteristics of the Range (R)
– Although the range is simple to calculate and interpret, there are significant limitations to its value
– It is highly affected by extreme values (outliers), so it can easily suggest more variation (spread) in the data than is actually there
– It uses information from only two of the data values, so it has limited power for interpretation and for more advanced statistical techniques
Measures of Dispersion
• Example
– Given the following values:
Sample   Value
1        2
2        6
3        4
4        4
5        4
Total    20
The lowest score: Minimum (Min) = 2
The highest score: Maximum (Max) = 6
R = Max – Min = 6 – 2 = 4
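The range calculation is short enough to sketch directly. The second call adds one hypothetical extreme value (40, not part of the example data) to show how strongly a single outlier changes the result; `value_range` is an illustrative name.

```python
def value_range(scores):
    """Range: highest score minus lowest score."""
    return max(scores) - min(scores)


values = [2, 6, 4, 4, 4]           # example data
print(value_range(values))         # 4
print(value_range(values + [40]))  # 38, after adding a hypothetical outlier
```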
Question
• Given the following data, what is the range?
Data: 1 2 3 4 5
a) 3
b) 5
c) 4
d) 6
Measures of Dispersion
• The Inner Quartile Range (IQR) avoids some of the problems associated with the range by considering only the middle 50% of the distribution
Inner Quartile Range = IQR = Q3 – Q1
– To find the inner quartile range, arrange the scores from lowest to highest and then divide the distribution into four (4) quarters
– Find the first quartile (Q1), the value below which 25% of the values fall, and the third quartile (Q3), the value below which 75% of the values fall
– The difference between Q3 and Q1 is the inner quartile range (IQR)
Measures of Dispersion
• Characteristics of the IQR
– The IQR extracts the middle 50% of the data
– IQR avoids some of the problems of the range being exaggerated by
extreme or unusual values (outliers)
– Like the range, the IQR is calculated from only two (2) data values, so it carries limited information and has limited use in further statistical techniques, but it is easily understood and interpreted
Measures of Dispersion
• Example
– Given the data from the text in Table 4.3, Percent of Population Aged 25 and Older with a College Degree, 2007 [sample of 20 states], on page 94:
• Maximum = Max = 34.7 (Connecticut)
• Q3 = 27.0 (20 × 0.75 = 15, so the 15th value: Montana)
• Q2 (median) = 25.45
• Q1 = 22.1 (20 × 0.25 = 5, so the 5th value: Indiana)
• Minimum = Min = 17.3 (West Virginia)
Note: The IQR = 27.0 – 22.1 = 4.9
[Figure: Minitab boxplot of the data]
Outliers would be beyond:
Lower Bound = Q1 – 1.5 × IQR = 22.1 – 1.5 × 4.9 = 14.75
Upper Bound = Q3 + 1.5 × IQR = 27.0 + 1.5 × 4.9 = 34.35
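The outlier bounds can be reproduced with a short sketch. Only the quartiles, minimum and maximum quoted on this slide are used, since the full table is not reproduced here; `outlier_bounds` is a hypothetical helper name.

```python
def outlier_bounds(q1, q3, k=1.5):
    """Lower and upper bounds from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


lower, upper = outlier_bounds(q1=22.1, q3=27.0)
print(round(lower, 2), round(upper, 2))  # 14.75 34.35

# Compare the extremes quoted on the slide with the bounds.
print(34.7 > upper)  # True: the maximum (Connecticut) lies above the upper bound
print(17.3 < lower)  # False: the minimum (West Virginia) is within the bounds
```

The results are rounded for display because floating-point subtraction of 22.1 from 27.0 is not exact.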
Example: Consider our New York travel times data. Construct a boxplot.
Travel times as recorded:
10 30 5 25 40 20 10 15 30 20 15 20 85 15 65 15 60 60 40 45
Travel times sorted from low to high:
5 10 10 15 15 15 15 20 20 20 25 30 30 40 40 45 60 60 65 85
Min = 5, Q1 = 15, M = 22.5, Q3 = 42.5, Max = 85
[Figure: boxplot of TravelTime on an axis from 0 to 90; the value 85 is marked as an outlier by the 1.5 x IQR rule]
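The five values behind this boxplot can be reproduced with the sketch below. It assumes the quartiles are taken as the medians of the lower and upper halves of the sorted data, which matches the values shown here (Q1 = 15, Q3 = 42.5); other quartile conventions exist, and the function names are illustrative.

```python
def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2


def five_number_summary(scores):
    """Min, Q1, median, Q3, max, with quartiles as medians of each half."""
    ordered = sorted(scores)
    n = len(ordered)
    lower_half = ordered[: n // 2]
    upper_half = ordered[n // 2 + n % 2 :]  # skip the middle value when n is odd
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])


times = [10, 30, 5, 25, 40, 20, 10, 15, 30, 20,
         15, 20, 85, 15, 65, 15, 60, 60, 40, 45]

minimum, q1, med, q3, maximum = five_number_summary(times)
print(minimum, q1, med, q3, maximum)  # 5 15.0 22.5 42.5 85

iqr = q3 - q1                 # 27.5
upper_fence = q3 + 1.5 * iqr  # 83.75
lower_fence = q1 - 1.5 * iqr  # -26.25
outliers = [t for t in times if t < lower_fence or t > upper_fence]
print(outliers)  # [85]
```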
Measures of Dispersion
• The boxplot provides a very useful and informative chart for describing a distribution
– The boxplot uses what is called the 5-number summary:
• Minimum
• Q1 (1st quartile)
• Q2 (2nd quartile or median)
• Q3 (3rd quartile)
• Maximum
– The boxplot can illustrate the shape of the distribution by the length of the whiskers (a longer whisker on one side indicates a skewed distribution)
– Outliers are identified as values more than 1.5 × IQR above Q3 or below Q1
Question
• Given Q3 = 20 and Q1 = 10, an outlier would be any value that is:
a) Below 10
b) Above 30
c) Below 5
d) None of the above
Measures of Dispersion
• Variance (σ² or s²) and Standard Deviation (σ or s)
– A good measure of dispersion needs to have the following characteristics:
• It uses all the scores in the distribution, so the statistic draws on all the information available
• It describes the “average” or typical deviation of the scores, giving an idea of how far the scores are from each other or from the center of the distribution
• It increases in value as the scores become more diverse and decreases in value as the scores become less diverse, which allows comparison between different distributions
Measures of Dispersion
• Example of Variance (σ² or s²)
– Given the following values:
Sample   Value   x – x̄                 (x – x̄)²
1        66      66 – 72.2 = –6.2      (–6.2)² = 38.44
2        75      75 – 72.2 = +2.8      (2.8)² = 7.84
3        69      69 – 72.2 = –3.2      (–3.2)² = 10.24
4        72      72 – 72.2 = –0.2      (–0.2)² = 0.04
5        84      84 – 72.2 = +11.8     (11.8)² = 139.24
6        90      90 – 72.2 = +17.8     (17.8)² = 316.84
7        96      96 – 72.2 = +23.8     (23.8)² = 566.44
8        70      70 – 72.2 = –2.2      (–2.2)² = 4.84
9        55      55 – 72.2 = –17.2     (–17.2)² = 295.84
10       45      45 – 72.2 = –27.2     (–27.2)² = 739.84
Totals   722     0                     2,119.60
$\text{Average} = \bar{x} = \frac{\sum x_i}{N} = \frac{722}{10} = 72.2$
Range = R = Max – Min = 96 – 45 = 51
• Recall from the discussion of the mean that the mean acts as the balance point of the distribution, so the deviations from the mean add to zero (0)
• To keep the deviations from adding to zero, we remove the signs of the differences by squaring each of them
Measures of Dispersion
• The formula for calculating the variance is as follows:
$\text{Population Variance} = \sigma^2 = \frac{\sum (x - \bar{x})^2}{N}$
$\text{Sample Variance} = s^2 = \frac{\sum (x - \bar{x})^2}{N - 1}$
Measures of Dispersion
• Example of Variance (σ² or s²) (cont’d)
– From the previous slide:
$\text{Population Variance} = \sigma^2 = \frac{\sum (x - \bar{x})^2}{N} = \frac{2{,}119.6}{10} = 211.96$
$\text{Sample Variance} = s^2 = \frac{\sum (x - \bar{x})^2}{N - 1} = \frac{2{,}119.6}{10 - 1} = \frac{2{,}119.6}{9} = 235.51$
Note: The variance is difficult to interpret in relation to the data scores, but it is a very powerful statistic and has many uses in statistical analyses, as will be discussed later
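The two variance figures can be checked with the sketch below, which uses the ten example scores (66, 75, 69, 72, 84, 90, 96, 70, 55, 45). The only difference between the two results is the divisor (N versus N – 1). The cross-check against `statistics.pvariance` and `statistics.variance` from Python's standard library is added for illustration.

```python
import math
import statistics

# The ten values from the variance example.
scores = [66, 75, 69, 72, 84, 90, 96, 70, 55, 45]

n = len(scores)
x_bar = sum(scores) / n                            # 72.2
sum_sq = sum((x - x_bar) ** 2 for x in scores)     # about 2,119.6

population_variance = sum_sq / n        # divide by N
sample_variance = sum_sq / (n - 1)      # divide by N - 1

print(round(population_variance, 2))  # 211.96
print(round(sample_variance, 2))      # 235.51

# Cross-check with the standard library.
print(math.isclose(population_variance, statistics.pvariance(scores)))  # True
print(math.isclose(sample_variance, statistics.variance(scores)))       # True
```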
Measures of Dispersion
• Another example of Variance (σ² or s²)
– From the example in the text, page 99, comparing ages from two campuses
Residential Campus
Sample   Ages   Deviations (x – x̄)   Deviations Squared (x – x̄)²
1        18     18 – 19 = –1          (–1)² = 1
2        19     19 – 19 = 0           (0)² = 0
3        20     20 – 19 = 1           (1)² = 1
4        18     18 – 19 = –1          (–1)² = 1
5        20     20 – 19 = 1           (1)² = 1
Totals   95     0                     4
$\bar{x} = \frac{\sum x_i}{N} = \frac{95}{5} = 19$
$\text{Sample Variance} = s^2 = \frac{\sum (x - \bar{x})^2}{N - 1} = \frac{4}{5 - 1} = \frac{4}{4} = 1$
Measures of Dispersion
• Example of Variance (σ² or s²)
– Continuing the example
Urban Campus
Sample   Ages   Deviations (x – x̄)   Deviations Squared (x – x̄)²
1        20     20 – 23 = –3          (–3)² = 9
2        22     22 – 23 = –1          (–1)² = 1
3        18     18 – 23 = –5          (–5)² = 25
4        25     25 – 23 = +2          (2)² = 4
5        30     30 – 23 = +7          (7)² = 49
Totals   115    0                     88
$\bar{x} = \frac{\sum x_i}{N} = \frac{115}{5} = 23$
$\text{Sample Variance} = s^2 = \frac{\sum (x - \bar{x})^2}{N - 1} = \frac{88}{5 - 1} = \frac{88}{4} = 22$
Measures of Dispersion
• Example of Variance (σ² or s²)
– Continuing the example
The ages of residential campus students clearly show less variation than the ages of urban campus students
Residential: Mean = 19, Variance (residential) = 1
Urban: Mean = 23, Variance (urban) = 22
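The comparison of the two campuses can be sketched as below, using the five ages listed for each campus in this example. `sample_variance` is a hypothetical helper that follows the N – 1 formula used in these slides.

```python
def sample_variance(scores):
    """Sample variance: sum of squared deviations from the mean, divided by N - 1."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / (n - 1)


residential = [18, 19, 20, 18, 20]  # ages, residential campus
urban = [20, 22, 18, 25, 30]        # ages, urban campus

print(sum(residential) / len(residential), sample_variance(residential))  # 19.0 1.0
print(sum(urban) / len(urban), sample_variance(urban))                    # 23.0 22.0
```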
Measures of Dispersion
• A good measure of dispersion needs to have the following
characteristics:
– Uses all the scores in the distribution – Clearly the variance uses all
the scores within the distribution
– Describe the “average” or typical deviation of the scores – The variance does provide a type of average, although this result is difficult to interpret
– Increase in value as the scores become more diverse and decrease in
value as the scores become less diverse – The diversity of scores for
the urban campus is greater than the diversity in the residential
campus and the variances increase with the amount of diversity
present in the data
• Variance is a good measure of dispersion
Measures of Dispersion
• A better measure of dispersion is the standard deviation:
$\text{Population Standard Deviation} = \sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (x - \bar{x})^2}{N}}$
$\text{Sample Standard Deviation} = s = \sqrt{s^2} = \sqrt{\frac{\sum (x - \bar{x})^2}{N - 1}}$
Measures of Dispersion
• Looking back over the previous example
$\text{Sample Standard Deviation (residential)} = s = \sqrt{s^2} = \sqrt{1} = 1$
$\text{Sample Standard Deviation (urban)} = s = \sqrt{s^2} = \sqrt{22} = 4.69$
– Since the squaring of the deviations from the mean is “undone” by
taking the square-root, the standard deviation can be thought of as an
“average” deviation from the mean
– It can be loosely stated that on average each score deviates from the
mean by the standard deviation
– This would mean that there is more than 4 times the variation in the
urban campus than in the residential campus
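A minimal sketch of the same comparison in code, assuming the campus ages from the earlier example (residential: 18, 19, 20, 18, 20; urban: 20, 22, 18, 25, 30). The helper name `sample_std` is illustrative; `statistics.stdev` in Python's standard library computes the same quantity.

```python
import math


def sample_std(scores):
    """Sample standard deviation: square root of the N - 1 variance."""
    n = len(scores)
    mean = sum(scores) / n
    variance = sum((x - mean) ** 2 for x in scores) / (n - 1)
    return math.sqrt(variance)


residential = [18, 19, 20, 18, 20]
urban = [20, 22, 18, 25, 30]

s_res = sample_std(residential)
s_urb = sample_std(urban)

print(round(s_res, 2), round(s_urb, 2))  # 1.0 4.69
print(round(s_urb / s_res, 2))           # 4.69, the ratio of urban to residential spread
```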
Example of calculating standard deviation
Consider the following data on the number of pets owned by a group of nine children.
[Figure: dotplot of NumberOfPets (Number of Pets) on an axis from 0 to 8, with the mean x̄ = 5 marked]
1. Calculate the mean.
2. Calculate each deviation.
deviation = observation – mean
deviation: 1 – 5 = –4
deviation: 8 – 5 = 3
Example of calculating standard deviation
3. Square each deviation.
4. Find the “average” squared deviation. Calculate the sum of the squared deviations divided by (n – 1). This is called the variance.
5. Calculate the square root of the variance. This is the standard deviation.
xi       (xi – mean)    (xi – mean)²
1        1 – 5 = –4     (–4)² = 16
3        3 – 5 = –2     (–2)² = 4
4        4 – 5 = –1     (–1)² = 1
4        4 – 5 = –1     (–1)² = 1
4        4 – 5 = –1     (–1)² = 1
5        5 – 5 = 0      (0)² = 0
7        7 – 5 = 2      (2)² = 4
8        8 – 5 = 3      (3)² = 9
9        9 – 5 = 4      (4)² = 16
         Sum = ?        Sum = ?
“Average” squared deviation = 52/(9 – 1) = 6.5. This is the variance.
Standard deviation = square root of variance = $\sqrt{6.5} = 2.55$
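The five steps can be followed one at a time in code. The sketch below uses the nine pet counts from the table; variable names are illustrative. It also fills in the two column sums that the table leaves as "Sum = ?".

```python
import math

# Number of pets owned by each of the nine children.
pets = [1, 3, 4, 4, 4, 5, 7, 8, 9]
n = len(pets)

mean = sum(pets) / n                    # step 1: 45 / 9 = 5.0
deviations = [x - mean for x in pets]   # step 2: observation - mean
squared = [d ** 2 for d in deviations]  # step 3: square each deviation
variance = sum(squared) / (n - 1)       # step 4: 52 / 8 = 6.5
std_dev = math.sqrt(variance)           # step 5: square root of the variance

print(sum(deviations), sum(squared))  # 0.0 52.0
print(variance, round(std_dev, 2))    # 6.5 2.55
```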
Question
• Given the following data, what is the sample standard deviation?
Data: 1 2 3 4 5
a) 6
b) 1.581
c) 4
d) 2.500
Measures of Dispersion
• Summary of measures of dispersion
Range
1. The range is easy to calculate and interpret
2. The range is significantly affected by unusual or outlying scores
3. The range is not very useful in later statistical analyses because it does not use all of the information available in the data (it uses only two scores, the max and the min)
Inner Quartile Range
1. The inner quartile range is relatively easy to calculate and interpret
2. The inner quartile range is not affected by unusual or outlying scores
3. The inner quartile range is useful in identifying unusual scores (outliers) and utilizes the relative position of all scores
Variance or Standard Deviation
1. Both the variance and standard deviation are more calculation intensive
2. Both the variance and standard deviation are affected by unusual or outlying scores
3. The variance and standard deviation are very useful in later statistical analyses since they contain the information from all data (scores)