Basic Statistics
Measures of Central Tendency
Characteristics of Distributions
• Location or Center
– Can be indexed by using a measure of central tendency
• Variability or Spread
– Can be indexed by using a measure of variability
Consider the following distribution of scores:
How do the red and blue distributions differ?
How do the red and green distributions differ?
Consider the following distributions:
How do the green and blue distributions differ?
Consider the following two distributions:
How do the green and red distributions differ?
Characteristics of Distributions
• Location or Central Tendency
• Variability
• Symmetry
• Kurtosis
Measures of Central Tendency
Summarizing Data
Measures of central tendency give you one score or measure that represents, or is typical of, an entire group of scores:
• The Mean
• The Median
• The Mode
Central Tendency
Most scores tend to center toward a point in the distribution.
[Figure: frequency plotted against score]
Measures of Central Tendency
Are statistics that describe typical, average, or representative scores.
The most common measures of central tendency (mean, median, and mode) are quite different in conception and calculation.
These three statistics reflect different notions of the “center” of a distribution.
“The Mode”
The score that occurs most frequently in an ungrouped frequency distribution
Unimodal Distribution: one mode
Bimodal Distribution: two modes
Mode and Measurement Scales
Can you find a mode for each data set?
Nominal Scale: Nationality (1=American, 2=Asian, 3=Mexican)
1 2 1 3
3 2 3 3
3 1 2 1
2 3 3 2
1 2 3 2
Mode = 3
Ordinal Scale: Football Poll (1=first, 2=second, 3=third, 4=fourth)
1 2 3 4
4 3 4 3
2 4 4 2
1 2 4 4
3 2 3 4
Mode = 4
Interval Scale: IQ score
112 132
112 113
112 150
125 114
Mode = 112
Ratio Scale: Weight
68 56 39
56 44 56
45 56 75
81 67 59
Mode = 56
“The Mode”
It is not affected by extremely large or small values and is therefore a valuable measure of central tendency when such values occur.
It can be found for ratio-level, interval-level, ordinal-level and nominal-level data
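As a rough illustration of these two properties, the sketch below finds modes with Python's standard `statistics.multimode`. The IQ and weight values are the ones shown on the measurement-scale slide; the nationality labels, the bimodal list, and the replacement value 999 are hypothetical and added only for illustration.

```python
from statistics import multimode

# Values from the measurement-scale slide.
iq_scores = [112, 132, 112, 113, 112, 150, 125, 114]
weights = [68, 56, 39, 56, 44, 56, 45, 56, 75, 81, 67, 59]

# Hypothetical nominal labels: the mode needs no numeric scale.
nationalities = ["American", "Mexican", "Mexican", "Asian", "Mexican"]

# multimode returns a list so that ties (bimodal data) are reported too.
iq_modes = multimode(iq_scores)
weight_modes = multimode(weights)
nationality_modes = multimode(nationalities)

# Hypothetical bimodal data: two values share the highest frequency.
bimodal_modes = multimode([1, 2, 2, 3, 4, 4, 5])

# Replace the largest IQ score with an extreme (hypothetical) value:
# the most frequent score does not change.
iq_with_outlier = [999 if x == 150 else x for x in iq_scores]
outlier_modes = multimode(iq_with_outlier)

print(iq_modes, weight_modes, nationality_modes, bimodal_modes, outlier_modes)
```

Returning a list rather than a single value is a design choice that keeps the unimodal and bimodal cases in one code path.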
“The Median”
The Median is the 50th percentile of a distribution
- The point where half of the observations fall below and half of the observations fall above
In any distribution there will always be an equal number of cases above and below the Median.
Oh my !!
Where is the
median?
Location
For an odd number of untied scores (11, 13, 18, 19, 20)
[Figure: number line from 11 to 20]
The Median is the middle score when scores are arranged in rank order
Median Location = (N+1)/2 = (5+1)/2 = 3rd score
Median Score = 18
For an even number of untied scores (11, 15, 19, 20)
[Figure: number line from 11 to 20]
The Median is halfway between the two central values when scores are arranged in rank order
Median Location = (N+1)/2 = (4+1)/2 = 2.5th score
Median = (15+19)/2 = 17
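The location rule for odd and even N can be sketched in a few lines of Python. The function names `median_location` and `median_from_location` are illustrative, not from the slides; the sketch assumes a non-empty list of numeric scores.

```python
import math

def median_location(n):
    """1-based rank of the median: (N + 1) / 2."""
    return (n + 1) / 2

def median_from_location(scores):
    """Median found by ranking the scores and reading off the median location."""
    ordered = sorted(scores)
    loc = median_location(len(ordered))
    # For odd N the location is a whole rank, so lower and upper coincide.
    # For even N it ends in .5, so we average the two neighbouring ranks.
    lower = ordered[math.floor(loc) - 1]
    upper = ordered[math.ceil(loc) - 1]
    return (lower + upper) / 2

odd_scores = [11, 13, 18, 19, 20]
even_scores = [11, 15, 19, 20]

print(median_location(len(odd_scores)), median_from_location(odd_scores))
print(median_location(len(even_scores)), median_from_location(even_scores))
```

With the two slide examples this gives location 3 and median 18 for the odd case, and location 2.5 and median 17 for the even case.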
The Median of a group of scores is that point on the number line such that the sum of the distances of all scores to that point is no larger than the sum of the distances to any other point.
There is a unique median for each data set.
It is not affected by extremely large or small values and is therefore a valuable measure of central tendency when such values occur.
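The minimum-distance property can be checked numerically. The sketch below reuses the odd-N example scores (11, 13, 18, 19, 20) and compares the summed absolute distance at the median with the same sum at a grid of other candidate points; the grid spacing of 0.5 and the helper name `sum_abs_dist` are arbitrary choices made for illustration.

```python
from statistics import median

def sum_abs_dist(scores, point):
    """Sum of absolute distances from every score to the given point."""
    return sum(abs(x - point) for x in scores)

scores = [11, 13, 18, 19, 20]
m = median(scores)

# Candidate points 11.0, 11.5, ..., 20.0 along the number line.
candidates = [11 + 0.5 * k for k in range(19)]
best_point = min(candidates, key=lambda p: sum_abs_dist(scores, p))

print(m, sum_abs_dist(scores, m), best_point)
```

For an even number of scores, every point between the two central values gives the same minimal sum, so the minimum is shared rather than attained at a single point.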
The Median
• Can be computed for:
– Ordinal-level data
– Interval-level data
– Ratio-level data
Median and Levels of Measurement
Can you find a median for each type of data?
Nationality
1 2 1 3
3 2 3 3
3 1 2 1
2 3 3 2
1 2 3 2
Median? No
Football Poll
1 2 3 4
4 3 4 3
2 4 4 2
1 2 4 4
3 2 3 4
Median? Yes
IQ score
112 132
112 113
112 150
125 114
Median? Yes
Weight
68 56 39
56 44 56
45 56 75
81 67 59
Median? Yes
The Mean
The Population Mean
For ungrouped data, the population mean is the
sum of all the population values divided by the
total number of population values. To compute
the population mean, use the following formula.
$$\mu = \frac{\sum X}{N}$$
where $\mu$ is the population mean, $\sum$ (sigma) denotes summation, $X$ is an individual value, and $N$ is the population size.
The Sample Mean
For ungrouped data, the sample mean is the sum
of all the sample values divided by the number of
sample values. To compute the sample mean,
use the following formula.
$$\bar{X} = \frac{\sum X}{n}$$
where $\bar{X}$ is the sample mean, $\sum$ (sigma) denotes summation, $X$ is an individual value, and $n$ is the sample size.
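A minimal sketch of the calculation, assuming the data arrive as a plain list of numbers. The helper name `mean` is illustrative, and the values are the weight and IQ scores used earlier in the slides; the same sum-divided-by-count arithmetic applies whether the list is treated as a sample or as a whole population.

```python
def mean(values):
    """Sum of all values divided by the number of values."""
    values = list(values)
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)

weights = [68, 56, 39, 56, 44, 56, 45, 56, 75, 81, 67, 59]
iq_scores = [112, 132, 112, 113, 112, 150, 125, 114]

print(mean(weights))    # 702 / 12
print(mean(iq_scores))  # 970 / 8
```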
Characteristics of The Mean
Center of Gravity of a Distribution
[Figure: number line from 1 to 8 with the Mean marked between 4 and 5]
How much error do you expect for each case?
Data set:          25   27   29   31   33   35   37
Deviation scores:  -6   -4   -2    0    2    4    6
Mean = 31
The Mean
“It’s too hot!” “It’s too cold!” “On average, I feel fine.”
The Mean of a group of scores is the point on the number line such that the sum of the squared differences between the scores and the mean is smaller than the sum of the squared differences to any other point.
If you summed the differences without squaring them, the result would be zero.
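Both properties can be verified on the deviation-score data set from the earlier slide (25, 27, 29, 31, 33, 35, 37). This is only a sketch; the helper name `sum_squared_diff` and the comparison points 30 and 32 are chosen here for illustration.

```python
data = [25, 27, 29, 31, 33, 35, 37]
mean = sum(data) / len(data)

# Deviation scores: each value minus the mean.
deviations = [x - mean for x in data]

def sum_squared_diff(values, point):
    """Sum of squared differences between every value and the given point."""
    return sum((x - point) ** 2 for x in values)

print(mean)                          # centre of gravity of the data
print(deviations, sum(deviations))   # unsquared deviations cancel out
print(sum_squared_diff(data, mean))  # smallest possible sum of squares
print(sum_squared_diff(data, 30), sum_squared_diff(data, 32))
```

Moving the reference point one unit away from the mean in either direction raises the sum of squares by the same amount, which is what makes the mean the least-squares centre.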
Mean and Measurement Scales
Every set of interval-level and ratio-level data has a mean.
Can you find the Mean for the following data sets?
Nominal data: Nationality (1=American, 2=Asian, 3=Mexican)
1 2 3
Mean? NO
Ordinal data: Football Poll (1=first, 2=second, 3=third)
1 2 3
Mean? NO
Interval data: IQ Test
1 2 2 3
Mean? YES
Ratio data: Weight
1 2 2 3
Mean? YES
All the values are included in computing the mean.
$$\bar{X} = \frac{\sum X}{n}$$
A set of data has a unique mean, and the mean is affected by unusually large or small data values (outliers).
[Figure: example of the effect of an outlier on the mean]
The Mean
• Every set of interval-level and ratio-level data has a mean.
• All the values are included in computing the mean.
• A set of data has a unique mean.
• The mean is affected by unusually large or small data values.
• The arithmetic mean is the only measure of central tendency where the sum of the deviations of each value from the mean is zero.
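The sensitivity of the mean to extreme values can be seen by adding one outlier to a small data set and comparing the mean with the median. The values below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import mean, median

# Hypothetical scores without and with one unusually large value.
scores = [3, 5, 7, 9]
scores_with_outlier = scores + [54]

mean_before, median_before = mean(scores), median(scores)
mean_after, median_after = mean(scores_with_outlier), median(scores_with_outlier)

# The mean uses every value, so the outlier pulls it upward;
# the median only depends on the middle ranks and barely moves.
print(mean_before, median_before)
print(mean_after, median_after)
```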
The Relationships between
Measures of Central Tendency
and Shape of a Distribution
Normal Distribution
Symmetric
Unimodal
Mean = Median = Mode
Positively Skewed Distribution
Mode < Median < Mean
The median falls closer to the mean than to the mode
Negatively Skewed Distribution
Mode > Median > Mean
The median falls closer to the mean than to the mode
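The ordering of the three measures under skew can be illustrated with a small made-up data set. The scores below are hypothetical: the first list has a long right tail, and the second is its mirror image, obtained by subtracting each score from 10. The helper name `summarize` is also just for this sketch.

```python
from statistics import mean, median, mode

def summarize(scores):
    """Return (mode, median, mean) for a list of scores."""
    return mode(scores), median(scores), mean(scores)

# Hypothetical positively skewed scores (long tail to the right).
positive = [1, 1, 1, 2, 3, 3, 4, 5, 6, 9]
# Mirror image: long tail to the left.
negative = [10 - x for x in positive]

pos_mode, pos_median, pos_mean = summarize(positive)
neg_mode, neg_median, neg_mean = summarize(negative)

print(pos_mode, pos_median, pos_mean)  # mode < median < mean
print(neg_mode, neg_median, neg_mean)  # mode > median > mean
```

In both lists the median sits nearer to the mean than to the mode, in line with the rule of thumb above; this holds for these particular values and is not guaranteed for every skewed data set.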
Bimodal Distribution
Mode1 < Mean = Median < Mode2
SUMMARY
There are three common measures of central tendency. The mean is the
most widely used and the most precise for inferential purposes and is the
foundation for statistical concepts that will be introduced in subsequent
classes. The mean is the ratio of the sum of the observations to the number of
observations. The value of the mean is influenced by the value of every score
in a distribution. Consequently, in skewed distributions it is drawn toward
the elongated tail more than is the median or mode.
The median is the 50th percentile of a distribution. It is the point in a
distribution from which the sum of the absolute differences of all scores is
at a minimum. In perfectly symmetrical distributions the median and mean
have the same value. When the mean and median differ greatly, the median is
usually the most meaningful measure of central tendency for descriptive
purposes.
The mode, unlike the mean and median, has descriptive meaning even with
nominal scales of measurement. The mode is the most frequently occurring
observation. When the median or mean is applicable, the mode is the least
useful measure of central tendency. In a symmetrical unimodal distribution the
mode, median, and mean have the same value.